
Transfer learning based QA model of FAQ using CQA data

SHAO Ming-rui, MA Deng-hao, CHEN Yue-guo, QIN Xiong-pai, DU Xiao-yong

Citation: SHAO Ming-rui, MA Deng-hao, CHEN Yue-guo, QIN Xiong-pai, DU Xiao-yong. Transfer learning based QA model of FAQ using CQA data[J]. Journal of East China Normal University (Natural Sciences), 2019, (5): 74-84. doi: 10.3969/j.issn.1000-5641.2019.05.006


doi: 10.3969/j.issn.1000-5641.2019.05.006
Funding:

National Natural Science Foundation of China - Guangdong Joint Fund for Big Data Science Center, U1711261

National Natural Science Foundation of China, 61432006

Author information:

    SHAO Ming-rui, male, master's degree candidate; research interests: natural language processing and semantic search. E-mail: dhucstsmr@163.com

    Corresponding author:

    CHEN Yue-guo, male, professor and doctoral supervisor; research interests: semantic search and knowledge graphs. E-mail: chenyueguo@ruc.edu.cn

  • CLC number: TP391


  • Abstract: Building intelligent customer-service systems on FAQ (Frequently Asked Questions) question-answering technology is an approach widely adopted in industry. A QA system built on an FAQ returns stable, reliable, high-quality answers, but its recognition ability is limited by the scale of the manually annotated knowledge base, so it easily hits a bottleneck. To address the limited size of FAQ datasets, this paper gives solutions at both the data level and the model level. At the data level, relevant data are crawled from Baidu Zhidao and semantically equivalent questions are mined from them, which ensures the relevance and consistency of the data. At the model level, a deep neural network oriented to transfer learning, transAT, is proposed; it combines the strong feature-extraction capability of the Transformer with an attention mechanism and is well suited to computing the semantic similarity between sentence pairs. Experiments show that the approach significantly improves performance on the FAQ QA task and, to a certain extent, alleviates the problem of limited FAQ dataset size.
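
    The abstract describes transAT as a Transformer-based feature extractor combined with an attention mechanism for computing semantic similarity between sentence pairs. Below is a minimal PyTorch sketch of such a sentence-pair model; the layer sizes, the attention pooling, the fusion of the two sentence vectors, and the omission of positional encodings are illustrative assumptions, not the authors' exact transAT architecture.

        # Minimal sketch (assumed architecture, not the paper's exact transAT):
        # a Transformer encoder for feature extraction plus attention pooling,
        # followed by a binary "semantically equivalent or not" classifier.
        import torch
        import torch.nn as nn

        class SentencePairSimilarity(nn.Module):
            def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=2):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, d_model)   # positional encodings omitted for brevity
                layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, num_layers)
                self.attn_pool = nn.Linear(d_model, 1)           # attention weights over token positions
                self.classifier = nn.Sequential(
                    nn.Linear(d_model * 4, d_model), nn.ReLU(), nn.Linear(d_model, 2)
                )

            def encode(self, ids):
                h = self.encoder(self.embed(ids))                # (batch, seq, d_model)
                w = torch.softmax(self.attn_pool(h), dim=1)      # attention over the sequence
                return (w * h).sum(dim=1)                        # attention-weighted sentence vector

            def forward(self, q1_ids, q2_ids):
                v1, v2 = self.encode(q1_ids), self.encode(q2_ids)
                feats = torch.cat([v1, v2, torch.abs(v1 - v2), v1 * v2], dim=-1)
                return self.classifier(feats)                    # logits: equivalent / not equivalent
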
  • Fig. 1  Structure of the semantic computing neural network

    Fig. 2  Local fine-tuning scheme of the transAT neural network

    Fig. 3  Visualization of attention weights

    Tab. 1  Comparison of data from the source and target domains

    Domain | Standard question | Extended question
    Source | How do I spend the money in Alipay? | How do I change the payment method in Alipay?
    Source | Why is my Huawei Mate 7 overheating so badly? | What should I do if my phone overheats?
    Source | What should I do if I cannot receive verification codes after my phone is restricted? | Unable to receive text messages
    Source | Why does network positioning drain the battery so quickly on the Honor 6? | What should I do about fast battery drain on a Xiaomi Note 2?
    Source | How do I delete the built-in browser on a Huawei phone? | What happens if the Huawei phone browser is uninstalled?
    Target | What should I do if the product has quality problems? | What should I do if the goods I received are damaged in appearance?
    Target | Why can I not find any logistics information? | There has been no logistics update at all for the item I bought
    Target | How do I confirm receipt of goods? | How do I confirm receipt after the shipment arrives?
    Target | Are the reviews genuine? | Why are there negative reviews?
    Target | How do I arrange a return? | Can I return this item?

    Tab. 2  Partitioning of the experimental datasets for the semantic equivalence task

                    CQA dataset    FAQ dataset
    Total           808 708        87 112
    Training set    646 966        69 690
    Test set        161 742        17 422

    Tab. 3  Test results of the models on the source-domain dataset

    Model      Precision    Recall    F1        Time cost/s
    LSTM       0.9083       0.9583    0.9326    1 223
    BCNN       0.8700       0.9700    0.9200    4 200
    PWIM       0.9371       0.9287    0.9329    172 800
    transAT    0.9825       0.9837    0.9831    3 927

    Tab. 4  Test results of the models on the target-domain dataset

    Model      Precision    Recall    F1        Time cost/s
    LSTM       0.8408       0.9447    0.8897    420
    BCNN       0.8900       0.9400    0.9200    1 188
    PWIM       0.9517       0.9522    0.9520    622 182
    transAT    0.9653       0.9592    0.9622    540

    Tab. 5  Results of different transfer learning strategies

    Transfer learning strategy    Precision    Recall    F1
    transAT                       0.9653       0.9592    0.9622
    transAT(fine_tune0)           0.8402       0.8012    0.8202
    transAT(fine_tune1)           0.8413       0.8009    0.8206
    transAT(fine_tune_all)        0.9538       0.9618    0.9578
    transAT(CSBC)                 0.9598       0.9634    0.9615
    transAT(BCCS)                 0.9703       0.9682    0.9692
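    Table 5 above compares strategies that fine-tune different portions of the network pretrained on the source-domain (CQA) data before training on the FAQ data. The following is a minimal sketch of such local fine-tuning, reusing the SentencePairSimilarity sketch given after the abstract; exactly which layers each variant (fine_tune0, fine_tune1, fine_tune_all) freezes is not reproduced here, so the split below is an assumption for illustration.

        # Local fine-tuning sketch (assumed split): freeze the Transformer encoder
        # trained on the CQA data and update only the attention pooling and the
        # classifier on the FAQ (target-domain) data.
        import torch

        model = SentencePairSimilarity(vocab_size=50000)
        # model.load_state_dict(torch.load("cqa_pretrained.pt"))  # hypothetical source-domain checkpoint

        for p in model.encoder.parameters():    # freeze the feature extractor
            p.requires_grad = False

        optimizer = torch.optim.Adam(
            (p for p in model.parameters() if p.requires_grad), lr=1e-4
        )
        loss_fn = torch.nn.CrossEntropyLoss()   # binary equivalent / not-equivalent labels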

    Tab. 6  FAQ QA results with different negative-example construction methods

    Model                P@1(random)    P@1(BM25)
    LSTM                 0.6434         0.6561
    BCNN                 0.6331         0.6552
    PWIM                 0.7421         0.7611
    transAT              0.7722         0.8053
    transAT(pretrain)    0.7621         0.8060
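    Table 6 above contrasts negatives sampled at random with harder negatives retrieved by BM25, i.e. questions that score highly against a standard question without being semantically equivalent to it. Below is a minimal sketch of the BM25-based construction; the rank_bm25 package and whitespace tokenization are illustrative choices, not the paper's implementation (Chinese text would need a proper tokenizer).

        # Hard-negative construction with BM25: for each standard question, take the
        # highest-scoring question from the pool that is not a known equivalent.
        from rank_bm25 import BM25Okapi

        def bm25_negatives(standard_qs, pool, equivalents):
            """standard_qs: list of standard questions; pool: list of candidate questions;
            equivalents: dict mapping a standard question to the set of questions
            known to be equivalent to it."""
            bm25 = BM25Okapi([q.split() for q in pool])   # whitespace tokens for illustration
            negatives = {}
            for q in standard_qs:
                scores = bm25.get_scores(q.split())
                ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
                for i in ranked:
                    if pool[i] != q and pool[i] not in equivalents.get(q, set()):
                        negatives[q] = pool[i]            # hardest non-equivalent candidate
                        break
            return negatives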

    Tab. 7  FAQ QA results with different ratios of positive to negative examples

    Model                P@1(1:1)    P@1(1:2)    P@1(1:3)    P@1(1:5)
    BM25                 0.6012      0.6012      0.6012      0.6012
    LSTM                 0.6434      0.6532      0.6678      0.6321
    BCNN                 0.6331      0.6345      0.6458      0.6200
    PWIM                 0.7421      0.7512      0.7420      0.7303
    transAT              0.7722      0.7856      0.7848      0.7511
    transAT(pretrain)    0.7621      0.7866      0.7858      0.7427
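    Tables 6 and 7 report P@1: the fraction of test questions for which the model's top-ranked candidate is a correct (semantically equivalent) match. A minimal sketch of the computation, with hypothetical data structures, follows.

        # P@1 (precision at 1): share of queries whose highest-scoring candidate is relevant.
        def precision_at_1(ranked_candidates, gold):
            """ranked_candidates: dict query -> candidate ids sorted by model score, best first;
            gold: dict query -> set of ids of correct (equivalent) candidates."""
            hits = sum(1 for q, cands in ranked_candidates.items() if cands and cands[0] in gold[q])
            return hits / len(ranked_candidates)
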
  • [1] TURNEY P D, PANTEL P. From frequency to meaning: Vector space models of semantics[J]. Journal of Artificial Intelligence Research, 2010, 37: 141-188. doi: 10.1613/jair.2934
    [2] ROBERTSON S, ZARAGOZA H. The probabilistic relevance framework: BM25 and beyond[J]. Foundations and Trends in Information Retrieval, 2009, 3(4): 333-389.
    [3] KATO S, TOGASHI R, MAEDA H, et al. LSTM vs BM25 for open-domain QA: A hands-on comparison of effectiveness and efficiency[C]//Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2017: 1309-1312.
    [4] WANG Z G, ITTYCHERIAH A. FAQ-based question answering via word alignment[J]. arXiv preprint, 2015, arXiv: 1507.02628.
    [5] MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[C]//Proceedings of the 2013 Conference on Neural Information Processing Systems (NIPS). 2013: 3111-3119.
    [6] PENNINGTON J, SOCHER R, MANNING C D. GloVe: Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). ACL, 2014: 1532-1543.
    [7] KIM Y. Convolutional neural networks for sentence classification[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). ACL, 2014: 1746-1751.
    [8] LIPTON Z C. A critical review of recurrent neural networks for sequence learning[J]. arXiv preprint, 2015, arXiv: 1506.00019.
    [9] SAK H, SENIOR A W, BEAUFAYS F. Long short-term memory recurrent neural network architectures for large scale acoustic modeling[C]//Proceedings of the 2014 Conference of the International Speech Communication Association (INTERSPEECH). 2014: 338-342.
    [10] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems 30 (NIPS 2017). 2017: 6000-6010.
    [11] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). 2019: 4171-4186.
    [12] YU J F, QIU M H, JIANG J, et al. Modelling domain relationships for transfer learning on retrieval-based question answering systems in e-commerce[C]//Proceedings of the 11th ACM International Conference on Web Search and Data Mining. 2018: 682-690.
    [13] NIE Y X, BANSAL M. Shortcut-stacked sentence encoders for multi-domain inference[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2017: 41-45.
    [14] CONNEAU A, KIELA D, SCHWENK H, et al. Supervised learning of universal sentence representations from natural language inference data[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). ACL, 2017: 670-680.
    [15] KRATZWALD B, FEUERRIEGEL S. Putting question-answering systems into practice: Transfer learning for efficient domain customization[J]. ACM Transactions on Management Information Systems, 2019, 9(4): 15:1-15:20 (Article No. 15).
    [16] HE H, LIN J J. Pairwise word interaction modeling with deep neural networks for semantic similarity measurement[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). 2016: 937-948.
    [17] CHEN Q, ZHU X D, LING Z H, et al. Enhanced LSTM for natural language inference[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. ACL, 2017: 1657-1668.
Publication history
  • Received:  2019-07-27
  • Published:  2019-09-25
