Study on sentence similarity based on quantum theory
Abstract: Superposition, entanglement, incompatibility, and interference make quantum theory a capable modeling framework. This paper examines its potential for natural language understanding. Taking sentence matching as the task, we explore how well quantum theory serves as a formal framework for capturing the meaning of words and sentences and for modeling semantic processes. We use quantum states to construct a semantic Hilbert space for sentences and compute the fidelity of information as a sentence is transformed. We combine this with word embedding, which represents words or concepts in a high-dimensional, low-rank vector space, to obtain the similarity of sentences. On a question-matching dataset constructed from a real business scenario, simulation results show that the proposed method outperforms classical methods (F1 of 68.31% against at most 63.72% for the baselines in Tab. 3). The approach offers a new idea for similarity research over multiple sentences and contributes to interdisciplinary research between computer science and quantum theory, in line with current research trends.
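The abstract does not spell out how sentence states are built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each sentence is a pure state formed as a weighted superposition of unit word vectors, and it uses the pure-state fidelity |⟨ψ|φ⟩|² as the similarity score. The toy embeddings, the part-of-speech tags, and the function names are all hypothetical; only the weight values are taken from Tab. 2.

```python
import math

# Part-of-speech weights as reported in Tab. 2.
POS_WEIGHTS = {"verb": 1.02, "noun": 1.01, "modifier": 0.95, "other": 0.92}

# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
TOY_EMBEDDINGS = {
    "modify": [0.9, 0.1, 0.2],
    "change": [0.8, 0.2, 0.3],
    "repayment": [0.1, 0.9, 0.3],
    "left": [0.2, 0.1, 0.9],
}


def normalize(vec):
    """Scale a real vector to unit length so it can act as a state vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vec]


def sentence_state(tagged_words):
    """Build a sentence state as a weighted superposition of word states.

    Assumption: amplitudes are the part-of-speech weights, and the sum is
    renormalized. The source does not confirm this construction.
    """
    dim = len(next(iter(TOY_EMBEDDINGS.values())))
    state = [0.0] * dim
    for word, pos in tagged_words:
        unit = normalize(TOY_EMBEDDINGS[word])
        weight = POS_WEIGHTS.get(pos, POS_WEIGHTS["other"])
        state = [s + weight * u for s, u in zip(state, unit)]
    return normalize(state)


def fidelity(psi, phi):
    """Pure-state fidelity |<psi|phi>|^2 for real-valued unit vectors."""
    overlap = sum(a * b for a, b in zip(psi, phi))
    return overlap ** 2


if __name__ == "__main__":
    s1 = sentence_state([("modify", "verb"), ("repayment", "noun")])
    s2 = sentence_state([("change", "verb"), ("repayment", "noun")])
    s3 = sentence_state([("left", "other")])
    print("similar pair:   ", fidelity(s1, s2))
    print("dissimilar pair:", fidelity(s1, s3))
```

With real-valued vectors the fidelity reduces to the squared cosine similarity of the two sentence vectors. A mixed-state formulation with density matrices would instead need the Uhlmann fidelity, which requires a matrix square root.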
Key words: quantum theory; natural language; fidelity; sentence similarity
Tab. 1 Examples from simulation data

| Sentence 1 | Sentence 2 | Similarity label |
| --- | --- | --- |
| How does the bank modify the repayment? | How to change the repayment card? | 1 |
| When can I use the particle loan? | When can I borrow money from the particles? | 1 |
| Does WeChat count? | How much is left? | 0 |
| Will prepayment interest be reduced? | How to clear the certificate? | 0 |

Tab. 2 Weight assigned to various parts of speech

| Verbs | Nouns | Modifiers | Other |
| --- | --- | --- | --- |
| 1.02 | 1.01 | 0.95 | 0.92 |

Tab. 3 Comparison of test set similarity calculations

| Model | A/(%) | P/(%) | R/(%) | F1/(%) |
| --- | --- | --- | --- | --- |
| Jaccard_similarity | 65.15 | 63.06 | 64.39 | 63.72 |
| Cosine_similarity_TF | 63.13 | 63.37 | 62.42 | 62.89 |
| Cosine_similarity_TFIDF | 63.11 | 63.34 | 62.48 | 62.91 |
| Proposed method | 65.93 | 63.81 | 73.50 | 68.31 |
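As a rough illustration of the Jaccard baseline and the F1 column in Tab. 3, the sketch below computes a token-level Jaccard similarity and the F1 score as the harmonic mean of precision and recall. The tokenizer and function names are assumptions made for this example. The source does not say how the baselines tokenized the questions, and the original dataset may not be in English.

```python
import re


def tokenize(sentence):
    """Lowercase and split into alphanumeric tokens (an assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard_similarity(sentence_a, sentence_b):
    """Size of the token intersection divided by the size of the token union."""
    tokens_a = tokenize(sentence_a)
    tokens_b = tokenize(sentence_b)
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # First sentence pair from Tab. 1.
    print(jaccard_similarity(
        "How does the bank modify the repayment?",
        "How to change the repayment card?",
    ))
    # P and R of the proposed method from Tab. 3, in percent.
    print(round(f1_score(63.81, 73.50), 2))
```

Recomputing F1 from the reported P and R values reproduces the F1 column for all four rows of Tab. 3 to two decimal places.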