Automatic Question Answering Based on Convolutional Neural Networks

JIN Li-jiao, FU Yun-bin, DONG Qi-wen

Citation: JIN Li-jiao, FU Yun-bin, DONG Qi-wen. The auto-question answering system based on convolution neural network[J]. Journal of East China Normal University (Natural Sciences), 2017, (5): 66-79. doi: 10.3969/j.issn.1000-5641.2017.05.007

doi: 10.3969/j.issn.1000-5641.2017.05.007
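Funding:

National Key Research and Development Program of China 2016YFB1000905

Joint Key Program of the National Natural Science Foundation of China and Guangdong Province U1401256

National Natural Science Foundation of China 61672234

National Natural Science Foundation of China 61402177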

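Details
    Author biography:

    JIN Li-jiao, female, master's student. Research interests: natural language processing and automatic question answering. E-mail: 51164500102@stu.ecnu.edu.cn

    Corresponding author:

    FU Yun-bin, male, postdoctoral researcher. Research interests: data science and machine learning. E-mail: fuyunbin2012@163.com

  • CLC number: TP391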

The auto-question answering system based on convolution neural network

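  • Abstract: Automatic question answering is an active research topic in natural language processing. An automatic question answering system answers a user's question directly with a short, precise answer, and so provides a more precise information service. Such a system must solve two key problems: the semantic representation of natural language questions and answers, and the semantic matching between a question and its candidate answers. The convolutional neural network is a classic deep network architecture; in recent years it has shown strong language representation ability in natural language processing and has been widely applied to automatic question answering. This paper reviews and summarizes automatic question answering techniques based on convolutional neural networks, surveys knowledge-base-oriented and text-oriented question answering from the two main perspectives of semantic representation and semantic matching, and points out the current research difficulties.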
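    The two problems named in the abstract, semantic representation and semantic matching, can be illustrated with a minimal sketch. It is not the model of any paper surveyed here: it assumes random toy word vectors, untrained random filters, a tanh activation and cosine similarity, and every name in it (`embed`, `conv_max_pool`, `match_score`, `DIM`, `WIDTH`, `N_FILTERS`) is hypothetical. It only shows the general shape of encoding the question and the answer separately with a convolution plus max-pooling step and then comparing the two sentence vectors.

```python
import math
import random

def embed(tokens, table, dim, rng):
    """Look up a toy vector per token, creating a random one on first sight."""
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return [table[tok] for tok in tokens]

def conv_max_pool(embeddings, filters, width):
    """Apply each filter to every window of `width` words, then max-pool over positions."""
    dim = len(filters[0]) // width
    # Pad short (or empty) sentences with zero vectors so one window always exists.
    pad = [[0.0] * dim for _ in range(max(0, width - len(embeddings)))]
    padded = embeddings + pad
    windows = [
        [x for vec in padded[i:i + width] for x in vec]  # concatenated word vectors
        for i in range(len(padded) - width + 1)
    ]
    return [
        max(math.tanh(sum(w * x for w, x in zip(f, win))) for win in windows)
        for f in filters
    ]

def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Assumed toy hyperparameters; a real system would learn embeddings and filters.
DIM, WIDTH, N_FILTERS = 8, 2, 4
rng = random.Random(0)
table = {}
filters = [[rng.uniform(-1.0, 1.0) for _ in range(DIM * WIDTH)]
           for _ in range(N_FILTERS)]

def sentence_vector(sentence):
    """Semantic representation: one fixed-length vector per sentence."""
    tokens = sentence.lower().split()
    return conv_max_pool(embed(tokens, table, DIM, rng), filters, WIDTH)

def match_score(question, answer):
    """Semantic matching: compare the two independently encoded sentences."""
    return cosine(sentence_vector(question), sentence_vector(answer))

if __name__ == "__main__":
    print(match_score("who first voiced meg on family guy",
                      "lacey chabert voiced meg"))
```

    Because the filters are untrained, the score carries no meaning beyond showing the data flow; in practice the embeddings and filters would be learned from question-answer pairs.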
  • Fig. 1  Architecture of the convolutional network

    Fig. 2  Parallel matching model

    Fig. 3  Interactive matching model

    Fig. 4  CNNSM model

    Fig. 5  Query graph that represents the question "Who first voiced Meg on Family Guy"

    Fig. 6  MCCNN model

    Fig. 7  Deep learning architecture for matching QA sentences

    Fig. 8  Architecture of the interactive matching model

Publication history
  • Received: 2017-06-23
  • Published: 2017-09-25
