Transfer learning based QA model of FAQ using CQA data

SHAO Ming-rui; MA Deng-hao; CHEN Yue-guo; QIN Xiong-pai; DU Xiao-yong

doi:10.3969/j.issn.1000-5641.2019.05.006

Issue 5

Dec. 2019

SHAO Ming-rui, MA Deng-hao, CHEN Yue-guo, QIN Xiong-pai, DU Xiao-yong. Transfer learning based QA model of FAQ using CQA data[J]. Journal of East China Normal University (Natural Sciences), 2019, (5): 74-84. doi: 10.3969/j.issn.1000-5641.2019.05.006

Information college, Renmin University of China, Beijing 100872, China

Received Date: 2019-07-27
Publish Date: 2019-09-25

Abstract

Building an intelligent customer service system on FAQ (frequently asked questions) data is a common industry practice. FAQ-based question answering systems offer stability, reliability, and high answer quality. However, because a manually annotated knowledge base is difficult to scale, models trained on it often have limited recognition ability and quickly reach a performance bottleneck. To address the limited scale of FAQ datasets, this paper offers a solution at both the data level and the model level. At the data level, we crawl relevant data from Baidu Knows and mine semantically equivalent questions, ensuring the relevance and consistency of the data. At the model level, we propose transAT, a transfer-learning-oriented deep neural network that combines a transformer network with an attention network and is suited to computing semantic similarity between sentence pairs. Experiments show that the proposed solution significantly improves model performance on FAQ datasets and, to a certain extent, alleviates the problem of their limited scale.
- transfer learning
- deep neural network
- FAQ (frequently asked questions) question answering
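
The abstract frames FAQ question answering as scoring the semantic similarity between a user query and each stored FAQ question, then returning the answer attached to the best match. The sketch below illustrates only that retrieval step. It is not the transAT model: a bag-of-words cosine similarity stands in for the learned sentence-pair scorer, and the names `bow`, `cosine`, `best_match`, and the sample `FAQ` entries are hypothetical.

```python
import math
from collections import Counter

# Hypothetical FAQ knowledge base: (question, answer) pairs.
FAQ = [
    ("how do i reset my password", "Use the 'Forgot password' link."),
    ("how do i change my shipping address", "Edit it under Account > Addresses."),
]


def bow(text):
    """Lowercased bag-of-words counts; a stand-in for a learned sentence encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, faq):
    """Return (score, question, answer) for the FAQ entry most similar to the query."""
    q = bow(query)
    return max((cosine(q, bow(question)), question, answer)
               for question, answer in faq)


if __name__ == "__main__":
    score, question, answer = best_match("reset password", FAQ)
    print(f"{score:.3f} | {question} | {answer}")
```

In a system like the one described, the `cosine(bow(...), bow(...))` call would be replaced by a trained sentence-pair model, which is where additional question pairs mined from community QA data could help when the FAQ set itself is small.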
