Relation extraction via distant supervision technology
-
Abstract: Relation extraction is a classic natural language processing task that aims to extract the semantic relation between a target entity pair. It is widely used in knowledge graph construction and completion, knowledge base question answering, and text summarization. To construct large-scale supervised corpora efficiently, distant supervision was proposed, which annotates text automatically by aligning it with an existing knowledge base. However, the overly strong assumptions it relies on give rise to a series of challenges, which have attracted considerable research attention. This paper first introduces the concept of distant supervision relation extraction and its formal description. It then compares related methods and their respective strengths and weaknesses from three perspectives: noisy data, insufficient information, and data imbalance. Next, it explains and compares the benchmark datasets and evaluation metrics. Finally, it discusses the new challenges facing distant supervision relation extraction and the trends of future research, and closes with a summary.
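The automatic annotation step described in the abstract can be illustrated with a minimal sketch. It assumes a toy in-memory knowledge base and naive string matching for entity mentions; the names `KB`, `SENTENCES`, and `distant_label` are hypothetical and do not come from the paper. Every sentence that mentions both entities of a knowledge-base pair inherits that pair's relation.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KB = {("Obama", "US"): "PlaceOfBirth"}

# Hypothetical corpus: (sentence, head entity, tail entity).
SENTENCES = [
    ("Obama was born in the US.", "Obama", "US"),
    ("Obama will leave the US for China.", "Obama", "US"),
    ("Obama visited China.", "Obama", "China"),
]


def distant_label(sentences, kb):
    """Group sentences into bags by entity pair and label each bag from the KB."""
    bags = defaultdict(list)
    for text, head, tail in sentences:
        # Naive mention check (an assumption): both entity strings occur in the text.
        if head in text and tail in text:
            bags[(head, tail)].append(text)
    # Entity pairs missing from the KB fall back to the "NA" (no relation) label.
    return {pair: (kb.get(pair, "NA"), texts) for pair, texts in bags.items()}


labeled = distant_label(SENTENCES, KB)
for pair, (relation, texts) in labeled.items():
    print(pair, relation, len(texts))
```

Both sentences about "Obama" and "US" receive the label PlaceOfBirth, although only the first one expresses it. This is the kind of label noise that the strong alignment assumption introduces.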
-
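Tab. 1 Research problems and related methods for distant supervision relation extraction

| Research challenge | Technique category | Representative methods | Description |
|---|---|---|---|
| Noise | Rules and statistics | Kernel methods and dependency relations [18], probabilistic graphical models [13,30], matrix completion [31-32] | Use entity-relation rules to judge whether a sentence matches its label |
| Noise | Multi-instance learning | PCNN [17], multi-label learning [28-29], EM algorithm [33-35], attention mechanisms [36-41], regularization [42], language models [43] | Use the bag as the unit of classification and apply multi-instance learning to reduce the impact of noise on classification |
| Noise | Adversarial and reinforcement learning | Adversarial example training [19,44], generative adversarial networks [45-47], policy gradient [11,48-50], Q-learning [49] | Automatically filter noise out of the corpus and train on the remaining high-quality data to improve classification |
| Insufficient information | Auxiliary information enhancement | Entity-relation information [20-22,51], knowledge representation [52-53], conditional constraints [54-55] | Introduce additional knowledge to compensate for the information shortage caused by an incomplete knowledge base |
| Insufficient information | Joint learning | Joint supervised and semi-supervised learning [56], joint entity and relation extraction [57-59] | Learn end to end together with other tasks |
| Imbalance | Few-shot learning | Multi-task learning [60], grammatical rules [23,61], hierarchical relation representations [62-63] | Capture the rich correlations between head and tail data to mitigate inaccurate prediction of long-tail relations |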
Tab. 2 Examples of noisy distant supervision data

| Example | Aligned label | Correct label |
|---|---|---|
| ... [Obama] was born in [US.]... | PlaceOfBirth | PlaceOfBirth |
| ... [Obama] have said he loved [US.]... | PlaceOfBirth | NA |
| ... [Obama] was lived in [US.] last year... | PlaceOfBirth | PlaceOfLived |
| ... [Obama] was the president of [US.] during 2008 and 2016... | PlaceOfBirth | President |
| ... [Obama] will leave [US.] for China... | PlaceOfBirth | NA |
Tab. 3 Statistics of the evaluation datasets
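Tab. 4 Evaluation metrics for distant supervision relation extraction

| Metric | Description | Purpose |
|---|---|---|
| Precision | Proportion of the test-set samples predicted as a given relation class that are correct; usually reported as a micro- or macro-average | Measures how accurate the relation predictions are |
| Recall | Proportion of all test-set samples of a given relation class that are predicted correctly; usually reported as a micro- or macro-average | Measures how complete the relation predictions are |
| $ {F}_{\beta } $ | Combined measure of precision and recall, $ {F}_{\beta }=\left(1+{\beta }^{2}\right)\frac{{\rm{Precision}}\times {\rm{Recall}}}{\left({\beta }^{2}\times {\rm{Precision}}\right)+{\rm{Recall}}} $ | Jointly evaluates relation extraction in terms of precision and recall |
| P-R curve | Curve with Recall on the horizontal axis and Precision on the vertical axis | Evaluates the quality of a classifier |
| AUC | Area enclosed by the ROC curve and the coordinate axes, $0\leqslant {\rm{AUC}}\leqslant 1$ | Evaluates the quality of a classifier |
| P@N | Precision over the top N (or top N%) predictions when predictions are ranked in descending order of confidence | Avoids the influence of false negatives on the evaluation of relation predictions |
| Hits@K | 1 if the true label appears among the top K predicted relations, 0 otherwise | Evaluates accuracy when relation extraction is treated as a similarity-based ranking problem |
| MRR | Mean reciprocal rank: the reciprocal of the rank at which the true label appears, averaged over all test samples | Evaluates accuracy when relation extraction is treated as a similarity-based ranking problem |
| Recall@K | Expectation of Hits@K over the test set, $ \text{Recall}@K=\frac{1}{N}\sum \limits_{i=1}^{N}\text{Hit}{\text{s}}_{i}@K $ | Evaluates recall when relation extraction is treated as a similarity-based ranking problem |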
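Several of the metrics in Tab. 4 can be computed in a few lines. The following is a minimal sketch rather than a reference implementation. It assumes the common conventions that P@N is the precision over the N highest-confidence predictions and that MRR is the mean (not the sum) of reciprocal ranks. All function names and the toy inputs are hypothetical.

```python
def f_beta(precision, recall, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); returns 0.0 if undefined."""
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


def precision_at_n(scored, n):
    """scored: list of (confidence, is_correct); precision over the top-n by confidence."""
    top = sorted(scored, key=lambda item: item[0], reverse=True)[:n]
    if not top:
        return 0.0
    return sum(1 for _, ok in top if ok) / len(top)


def hits_at_k(ranked, gold, k):
    """1 if the gold relation is among the first k ranked relations, else 0."""
    return 1 if gold in ranked[:k] else 0


def recall_at_k(rankings, golds, k):
    """Average of Hits@K over all test samples."""
    return sum(hits_at_k(r, g, k) for r, g in zip(rankings, golds)) / len(golds)


def mrr(rankings, golds):
    """Mean reciprocal rank; a sample whose gold relation is absent contributes 0."""
    total = 0.0
    for ranked, gold in zip(rankings, golds):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)
    return total / len(golds)


# Toy inputs (hypothetical): two test samples, each with a ranked relation list.
rankings = [["A", "B", "C"], ["B", "A", "C"]]
golds = ["A", "A"]
scored = [(0.9, True), (0.8, False), (0.7, True), (0.1, False)]

print(f_beta(0.5, 0.5))
print(precision_at_n(scored, 2))
print(recall_at_k(rankings, golds, 1))
print(mrr(rankings, golds))
```

With `beta=1` the first function reduces to the usual F1 score; larger values of `beta` weight recall more heavily than precision.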
[1] LIU Q, LI Y, DUAN H, et al. A survey of knowledge graph construction techniques [J]. Journal of Computer Research and Development, 2016, 53(3): 582-600. (in Chinese)
[2] KEJRIWAL M, SEQUEDA J, LOPEZ V, et al. Knowledge graphs: Construction, management and querying: Editorial [J]. Social Work, 2019, 10(6): 961-962.
[3] YU M, YIN W, HASAN K S, et al. Improved neural relation detection for knowledge base question answering [C]// Meeting of the Association for Computational Linguistics. 2017: 571-581.
[4] ALLAHYARI M, POURIYEH S, ASSEFI M, et al. Text summarization techniques: A brief survey [J]. International Journal of Advanced Computer Science and Applications, 2017, 8(10): 397-405.
[5] HASEGAWA T, SEKINE S, GRISHMAN R, et al. Discovering relations among named entities from large corpora [C]// Meeting of the Association for Computational Linguistics. 2004: 415-422.
[6] ETZIONI O, BANKO M, SODERLAND S, et al. Open information extraction from the web [J]. Communications of the ACM, 2008, 51(12): 68-74.
[7] LI F, ZHANG M, FU G, et al. A Bi-LSTM-RNN model for relation classification using low-cost sequence features [J]. ArXiv: Computation and Language, 2016.
[8] YAO C, LIU X, GAO H, et al. Entity relation extraction based on syntactic and semantic features [J]. Communications Technology, 2018, 51(8): 1828-1835. (in Chinese)
[9] CRAVEN M, KUMLIEN J. Constructing biological knowledge bases by extracting information from text sources [C]// Proc Int Conf Intell Syst Mol Biol. 1999: 77-86.
[10] MINTZ M, BILLS S, SNOW R, et al. Distant supervision for relation extraction without labeled data [C]// International Joint Conference on Natural Language Processing. 2009: 1003-1011.
[11] ZENG X, HE S, LIU K, et al. Large scaled relation extraction with reinforcement learning [C]// National Conference on Artificial Intelligence. 2018: 5658-5665.
[12] YANG D, YANG D, GU H, et al. Research on knowledge point relation extraction for elementary mathematics [J]. Journal of East China Normal University (Natural Science), 2019(5): 53-65. (in Chinese)
[13] RIEDEL S, YAO L, MCCALLUM A, et al. Modeling relations and their mentions without labeled text [C]// European Conference on Machine Learning. 2010: 148-163.
[14] BOLLACKER K, EVANS C, PARITOSH P, et al. Freebase: A collaboratively created graph database for structuring human knowledge [C]// International Conference on Management of Data. 2008: 1247-1250.
[15] JAT S, KHANDELWAL S, TALUKDAR P P, et al. Improving distantly supervised relation extraction using word and entity based attention [J]. ArXiv: Computation and Language, 2018.
[16] HAN X, ZHU H, YU P, et al. FewRel: A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation [C]// Empirical Methods in Natural Language Processing. 2018: 4803-4809.
[17] ZENG D, LIU K, CHEN Y, et al. Distant supervision for relation extraction via piecewise convolutional neural networks [C]// Empirical Methods in Natural Language Processing. 2015: 1753-1762.
[18] ZELENKO D, AONE C, RICHARDELLA A, et al. Kernel methods for relation extraction [J]. Journal of Machine Learning Research, 2003, 3(6): 1083-1106.
[19] SHI G, FENG C, HUANG L, et al. Genre separation network with adversarial training for cross-genre relation extraction [C]// Empirical Methods in Natural Language Processing. 2018: 1018-1023.
[20] VASHISHTH S, JOSHI R, PRAYAGA S S, et al. RESIDE: Improving distantly-supervised neural relation extraction using side information [C]// Empirical Methods in Natural Language Processing. 2018: 1257-1266.
[21] LI Y, LONG G, SHEN T, et al. Self-attention enhanced selective gate with entity-aware embedding for distantly supervised relation extraction [C]// National Conference on Artificial Intelligence. 2020.
[22] KUANG J, CAO Y, ZHENG J, et al. Improving neural relation extraction with implicit mutual relations [C]// International Conference on Data Engineering. 2020.
[23] KRAUSE S, LI H, USZKOREIT H, et al. Large-scale learning of relation-extraction rules with distant supervision from the web [C]// International Semantic Web Conference. 2012: 263-278.
[24] BAI L, JIN X, XI P, et al. A survey of relation extraction based on distant supervision [J]. Journal of Chinese Information Processing, 2019, 33(10): 10-17. (in Chinese)
[25] E H, ZHANG W, XIAO S, et al. A survey of entity relation extraction based on deep learning [J]. Journal of Software, 2019, 30(6): 1793-1818. (in Chinese)
[26] SUCHANEK F M, KASNECI G, WEIKUM G, et al. Yago: A core of semantic knowledge [C]// The Web Conference. 2007: 697-706.
[27] ZHOU P, SHI W, TIAN J, et al. Attention-based bidirectional long short-term memory networks for relation classification [C]// Meeting of the Association for Computational Linguistics. 2016: 207-212.
[28] HOFFMANN R, ZHANG C, LING X, et al. Knowledge-based weak supervision for information extraction of overlapping relations [C]// Meeting of the Association for Computational Linguistics. 2011: 541-550.
[29] SURDEANU M, TIBSHIRANI J, NALLAPATI R, et al. Multi-instance multi-label learning for relation extraction [C]// Empirical Methods in Natural Language Processing. 2012: 455-465.
[30] TAKAMATSU S, SATO I, NAKAGAWA H, et al. Reducing wrong labels in distant supervision for relation extraction [C]// Meeting of the Association for Computational Linguistics. 2012: 721-729.
[31] FAN M, ZHAO D, ZHOU Q, et al. Distant supervision for relation extraction with matrix completion [C]// Meeting of the Association for Computational Linguistics. 2014: 839-849.
[32] ZHANG Q, WANG H. Noise-clustered distant supervision for relation extraction: A nonparametric bayesian perspective [C]// Empirical Methods in Natural Language Processing. 2017: 1808-1813.
[33] MIN B, GRISHMAN R, WAN L, et al. Distant supervision for relation extraction with an incomplete knowledge base [C]// North American Chapter of the Association for Computational Linguistics. 2013: 777-782.
[34] XU W, HOFFMANN R, ZHAO L, et al. Filling knowledge base gaps for distant supervision of relation extraction [C]// Meeting of the Association for Computational Linguistics. 2013: 665-670.
[35] RITTER A, ZETTLEMOYER L, ETZIONI O, et al. Modeling missing data in distant supervision for information extraction [C]// Transactions of the Association for Computational Linguistics. 2013: 367-378.
[36] LIN Y, SHEN S, LIU Z, et al. Neural relation extraction with selective attention over instances [C]// Meeting of the Association for Computational Linguistics. 2016: 2124-2133.
[37] JI G, LIU K, HE S, et al. Distant supervision for relation extraction with sentence-level attention and entity descriptions [C]// National Conference on Artificial Intelligence. 2017: 3060-3066.
[38] JAT S, KHANDELWAL S, TALUKDAR P P, et al. Improving distantly supervised relation extraction using word and entity based attention [J]. ArXiv: Computation and Language, 2018.
[39] WU S, FAN K, ZHANG Q, et al. Improving distantly supervised relation extraction with neural noise converter and conditional optimal selector [J]. National Conference on Artificial Intelligence, 2019, 33(1): 7273-7280.
[40] YE Z, LING Z. Distant supervision relation extraction with intra-bag and inter-bag attentions [C]// North American Chapter of the Association for Computational Linguistics. 2019: 2810-2819.
[41] YUAN Y, LIU L, TANG S, et al. Cross-relation cross-bag attention for distantly-supervised relation extraction [J]. National Conference on Artificial Intelligence, 2019, 33(1): 419-426.
[42] JIA W, DAI D, XIAO X, et al. ARNOR: Attention regularization based noise reduction for distant supervision relation classification [C]// Meeting of the Association for Computational Linguistics. 2019: 1399-1408.
[43] ALT C, HUBNER M, HENNIG L, et al. Fine-tuning pre-trained transformer language models to distantly supervised relation extraction [C]// Meeting of the Association for Computational Linguistics. 2019: 1388-1398.
[44] WU Y, BAMMAN D, RUSSELL S, et al. Adversarial training for relation extraction [C]// Empirical Methods in Natural Language Processing. 2017: 1778-1783.
[45] QIN P, XU W, WANG W Y, et al. DSGAN: Generative adversarial training for robust distant supervision relation extraction [C]// Meeting of the Association for Computational Linguistics. 2018: 496-505.
[46] LI P, ZHANG X, JIA W, et al. GAN driven semi-distant supervision for relation extraction [C]// North American Chapter of the Association for Computational Linguistics. 2019: 3026-3035.
[47] HAN X, LIU Z, SUN M, et al. Denoising distant supervision for relation extraction via instance-level adversarial training [J]. ArXiv: Computation and Language, 2018.
[48] FENG J, HUANG M, ZHAO L, et al. Reinforcement learning for relation classification from noisy data [C]// National Conference on Artificial Intelligence. 2018: 5779-5786.
[49] HE Z, CHEN W, WANG Y, et al. Improving neural relation extraction with positive and unlabeled learning [C]// National Conference on Artificial Intelligence. 2020.
[50] QIN P, XU W, WANG W Y, et al. Robust distant supervision relation extraction via deep reinforcement learning [C]// Meeting of the Association for Computational Linguistics. 2018: 2137-2147.
[51] SU Y, LIU H, YAVUZ S, et al. Global relation embedding for relation extraction [C]// North American Chapter of the Association for Computational Linguistics. 2018: 820-830.
[52] XU P, BARBOSA D. Investigations on knowledge base embedding for relation prediction and extraction [J]. ArXiv: Computation and Language, 2018.
[53] XU P, BARBOSA D. Connecting language and knowledge with heterogeneous representations for neural relation extraction [C]// North American Chapter of the Association for Computational Linguistics. 2019: 3201-3206.
[54] LIU Y, LIU K, XU L, et al. Exploring fine-grained entity type constraints for distantly supervised relation extraction [C]// International Conference on Computational Linguistics. 2014: 2107-2116.
[55] YE Y, FENG Y, LUO B, et al. Integrating relation constraints with neural relation extractors [C]// National Conference on Artificial Intelligence. 2020.
[56] BELTAGY I, LO K, AMMAR W, et al. Combining distant and direct supervision for neural relation extraction [C]// North American Chapter of the Association for Computational Linguistics. 2019: 1858-1867.
[57] WEI Z, SU J, WANG Y, et al. A novel hierarchical binary tagging framework for joint extraction of entities and relations [J]. ArXiv: Computation and Language, 2019.
[58] REN X, WU Z, HE W, et al. CoType: Joint extraction of typed entities and relations with knowledge bases [C]// The Web Conference. 2017: 1015-1024.
[59] TAKANOBU R, ZHANG T, LIU J, et al. A hierarchical framework for relation extraction with reinforcement learning [J]. National Conference on Artificial Intelligence, 2019, 33(1): 7072-7079.
[60] YE W, LI B, XIE R, et al. Exploiting entity BIO tag embeddings and multi-task learning for relation extraction with imbalanced data [C]// Meeting of the Association for Computational Linguistics. 2019: 1351-1360.
[61] GUI Y, LIU Q, ZHU M, et al. Exploring long tail data in distantly supervised relation extraction [C]// LIN C Y, XUE N, ZHAO D, et al. Natural Language Understanding and Intelligent Applications. ICCPOL 2016, NLPCC 2016. Lecture Notes in Computer Science, 2016.
[62] ZHANG N, DENG S, SUN Z, et al. Long-tail relation extraction via knowledge graph embeddings and graph convolution networks [C]// North American Chapter of the Association for Computational Linguistics. 2019: 3016-3025.
[63] HAN X, YU P, LIU Z, et al. Hierarchical relation extraction with coarse-to-fine grained attention [C]// Empirical Methods in Natural Language Processing. 2018: 2236-2245.
[64] MIKOLOV T, CHEN K, CORRADO G S, et al. Efficient estimation of word representations in vector space [C]// International Conference on Learning Representations. 2013.
[65] PENNINGTON J, SOCHER R, MANNING C D, et al. Glove: Global vectors for word representation [C]// Empirical Methods in Natural Language Processing. 2014: 1532-1543.
[66] DEVLIN J, CHANG M, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [C]// North American Chapter of the Association for Computational Linguistics. 2019: 4171-4186.
[67] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets [C]// Neural Information Processing Systems. 2014: 2672-2680.
[68] SALVARIS M, DEAN D, TOK W H, et al. Generative adversarial networks [J]. ArXiv: Machine Learning, 2018: 187-208.
[69] ANDREW A M. Reinforcement learning: An introduction [J]. Kybernetes, 1998, 27(9): 1093-1096.
[70] SUN T, ZHANG C, JI Y, et al. Reinforcement learning for distantly supervised relation extraction [J]. IEEE Access, 2019(7): 98023-98033.
[71] TANG J, QU M, WANG M, et al. LINE: Large-scale information network embedding [C]// The Web Conference. 2015: 1067-1077.
[72] HOCHREITER S, SCHMIDHUBER J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.
[73] BORDES A, USUNIER N, GARCIA-DURAN A, et al. Translating embeddings for modeling multi-relational data [C]// Neural Information Processing Systems. 2013: 2787-2795.
[74] KIPF T, WELLING M. Semi-supervised classification with graph convolutional networks [C]// International Conference on Learning Representations. 2017.
[75] HENDRICKX I, KIM S N, KOZAREVA Z, et al. SemEval-2010 task 8: Multi-way classification of semantic relations between pairs of nominals [C]// North American Chapter of the Association for Computational Linguistics. 2009: 94-99.
[76] SURDEANU M, GUPTA S, BAUER J, et al. Stanford's distantly-supervised slot-filling system [R]. Stanford, CA: Stanford University, 2011.
[77] JI H, GRISHMAN R, et al. Overview of the TAC 2010 knowledge base population track [C]// Text Analysis Conference. 2009.
[78] JI H, GRISHMAN R, DANG H. Overview of the TAC2011 knowledge base population track [C]// Text Analysis Conference. 2011.
[79] GAO T, HAN X, ZHU H, et al. FewRel 2.0: Towards more challenging few-shot relation classification [C]// International Joint Conference on Natural Language Processing. 2019: 6249-6254.
[80] XU J, WEN J, SUN X, et al. A discourse-level named entity recognition and relation extraction dataset for Chinese literature text [J]. ArXiv: Computation and Language, 2017.
[81] HAN X, GAO T, YAO Y, et al. OpenNRE: An open and extensible toolkit for neural relation extraction [C]// International Joint Conference on Natural Language Processing. 2019: 169-174.
[82] LIU T, ZHANG X, ZHOU W, et al. Neural relation extraction via inner-sentence noise reduction and transfer learning [C]// Empirical Methods in Natural Language Processing. 2018: 2195-2204.
[83] REN Z, WANG X, ZHANG N, et al. Deep reinforcement learning-based image captioning with embedding reward [C]// Computer Vision and Pattern Recognition. 2017: 1151-1159.
[84] SHANG Y M, HUANG H, MAO X, et al. Are noisy sentences useless for distant supervised relation extraction [C]// National Conference on Artificial Intelligence. 2020.
[85] CAO Z, HIDALGO G, SIMON T, et al. OpenPose: Realtime multi-person 2D pose estimation using part affinity fields [J]. ArXiv: Computer Vision and Pattern Recognition, 2018.