A survey on coreference resolution
-
Abstract: Coreference resolution aims to identify all expressions in a text that refer to the same entity, and is widely used in text summarization, machine translation, question answering, and knowledge graph construction. As a classic problem in natural language processing, it is NP-hard. This paper first introduces the basic concepts of coreference resolution, clarifies easily confused related concepts, and discusses the significance and difficulties of the research. We then review the development of coreference resolution, divide it into several stages from a technical perspective, introduce the representative models of each stage, and discuss their advantages and disadvantages, focusing on rule-based, machine-learning, global-optimization, knowledge-base, and deep-learning models. Next, we introduce the benchmark conferences for coreference resolution, and explain and compare the associated corpora and common evaluation metrics. Finally, we point out the open problems that current coreference resolution models have not yet solved and discuss trends and directions of future research.
-
Tab. 1 Examples of four coreference types
Coreference type | Definition | Example | Explanation
Anaphora | The anaphor is a personal pronoun that appears after its antecedent | [Xiaoqiang] is always willing to help others, so [he] has a good reputation in the class. | [he] is a personal pronoun appearing after the noun phrase [Xiaoqiang]
Cataphora | The anaphor is a personal pronoun that appears before its antecedent | "[I] have failed completely this time," [Manager Liu] said, shaking his head helplessly. | [I] is a personal pronoun appearing before the noun phrase [Manager Liu]
Noun-phrase coreference | Both the anaphor and the antecedent are noun phrases rather than personal pronouns | Data released in 2010 showed that [China] had overtaken Japan in the second quarter and become [the world's second-largest economy]. | [China] and [the world's second-largest economy] are both noun phrases
Split antecedents | One anaphor corresponds to the combination of several antecedents | [Messi] and [Cristiano Ronaldo] are both world-class players, and [they] admire each other. | The combination of the antecedents [Messi] and [Cristiano Ronaldo] corefers with the anaphor [they]
Tab. 2 Research stages and characteristics of coreference resolution
Research stage | Starting period | Representative methods | Characteristics
Rule-based methods | 1978 | Hobbs' algorithm and its refinements [13-14, 33-34], Centering Theory [15, 35-36] | Easy to understand and implement; complex linguistic rules lead to poor generalization
Machine-learning methods | 1995 | Supervised (decision trees [16], naive Bayes [37], maximum entropy [17], SVM [18], CRF [38]); unsupervised (clustering [19-20], graph partitioning [21], EM [39], LDA [40]); semi-supervised (co-training [22], multi-view learning [41]) | Training on large amounts of data greatly improves generalization; performance depends heavily on feature engineering; global dependencies and contradictions are not modeled, which limits effectiveness (a minimal mention-pair sketch follows this table)
Global optimization methods | Early 2000s | Integer programming [23], contradiction resolution [42], pattern discovery [43], multi-pass sieves [24, 44-45], latent structures [46-51], singleton detection [12, 52-53] | Globally optimal strategies greatly improve overall performance
Knowledge-base methods | 2011 | Crowdsourcing systems [25], encyclopedic knowledge [26-29] | Introducing open knowledge as additional features largely avoids prediction errors caused by knowledge scarcity
Deep-learning methods | 2016 | Feed-forward neural networks [54-55], neural language models [56], reinforcement learning [31], end-to-end models [32], ELMo [57], coarse-to-fine inference [58] | Deep learning greatly improves the model's capacity for deep semantic learning and its generalization ability
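To make the feature-engineering burden of the machine-learning stage concrete, the sketch below shows a minimal mention-pair classifier in the spirit of that paradigm: each candidate (antecedent, anaphor) pair is mapped to a handful of hand-crafted features and a binary classifier decides whether the two mentions corefer. The Mention attributes, the feature set, and the toy training pairs are illustrative assumptions, not the features of any system cited in Tab. 2; scikit-learn's DecisionTreeClassifier merely stands in for the learners listed there.

```python
# Minimal mention-pair sketch (illustrative only): hand-crafted pairwise
# features plus a binary classifier deciding whether two mentions corefer.
from dataclasses import dataclass
from sklearn.tree import DecisionTreeClassifier

@dataclass
class Mention:
    text: str         # surface string of the mention
    index: int        # position of the mention in the document
    is_pronoun: bool
    gender: str       # hypothetical coarse attribute: "m", "f", or "n"
    number: str       # "sg" or "pl"

def pair_features(antecedent: Mention, anaphor: Mention) -> list:
    """Typical hand-crafted features of the mention-pair paradigm."""
    return [
        int(antecedent.text.lower() == anaphor.text.lower()),  # exact string match
        int(antecedent.gender == anaphor.gender),               # gender agreement
        int(antecedent.number == anaphor.number),               # number agreement
        int(anaphor.is_pronoun),                                # anaphor is a pronoun
        anaphor.index - antecedent.index,                       # mention distance
    ]

# Toy training data: (antecedent, anaphor, corefers?) triples.
bob  = Mention("Bob", 0, False, "m", "sg")
mary = Mention("Mary", 1, False, "f", "sg")
he   = Mention("he", 2, True, "m", "sg")
pairs = [(bob, he, 1), (mary, he, 0), (bob, mary, 0)]

X = [pair_features(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(clf.predict([pair_features(bob, he)]))  # -> [1]: "he" is resolved to "Bob"
```

In practice such pipelines relied on dozens of carefully designed features and on separate heuristics for turning pairwise decisions into entity partitions; those limitations are exactly what the global-optimization and deep-learning stages in Tab. 2 later addressed.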
Tab. 3 Conferences and corpora of coreference resolution
Conference | Years held | Coreference task years | Corpus | Characteristics
MUC | 1987-1997 | 1995, 1998 | MUC dataset | Topics cover only military and technology; English only
ACE | 2000-2008 | 2003-2008 | ACE dataset | Corpora drawn from newswire, broadcast, and newspapers; Chinese included for the first time
TAC | 2008-present | 2009-2017 | TAC dataset | Succeeded the ACE evaluations; the coreference task began shifting toward Wikipedia-based entity linking
SemEval | 1998-present | 2010 | OntoNotes 2.0 dataset | Singletons are not annotated, which increases the difficulty of coreference resolution
CoNLL | 1999-present | 2011, 2012 | OntoNotes 4.0 and OntoNotes 5.0 datasets | OntoNotes 4.0 (CoNLL 2011) supports English only; OntoNotes 5.0 (CoNLL 2012) adds Chinese and Arabic and is currently the most widely used benchmark
Tab. 4 An example of coreference partition
[Bob]$_{1}$ planned to go out today, so [he]$_{2}$ called [Charlie]$_{3}$ to go to [the beach]$_{4}$ together. However, [Charlie]$_{5}$ did not answer [his]$_{6}$ call, because [he]$_{7}$ was already at [the beach]$_{8}$.
Key: {1, 2, 6}$_{\mbox{Bob}}$, {3, 5, 7}$_{\mbox{Charlie}}$, {4, 8}$_{\mbox{beach}}$
Response 1: {1, 2, 6, 7}$_{\mbox{Bob}}$, {3, 5}$_{\mbox{Charlie}}$, {4, 8}$_{\mbox{beach}}$
Response 2: {1, 2, 3, 5, 6, 7}$_{\mbox{Bob/Charlie}}$, {4, 8}$_{\mbox{beach}}$
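The partitions in Tab. 4 can be written directly as sets of mention indices, which is the representation the evaluation metrics of Tab. 5 operate on. The short sketch below is an illustrative assumption rather than part of any cited scorer; it also enumerates the coreference links implied by a partition, since link counts are what the MUC metric compares.

```python
# The partitions from Tab. 4, written as sets of mention IDs (1-8).
# A partition is a list of clusters; singletons would simply be size-1 sets.
from itertools import combinations

key        = [{1, 2, 6}, {3, 5, 7}, {4, 8}]    # gold: Bob, Charlie, beach
response_1 = [{1, 2, 6, 7}, {3, 5}, {4, 8}]    # mention 7 wrongly attached to Bob
response_2 = [{1, 2, 3, 5, 6, 7}, {4, 8}]      # Bob and Charlie merged entirely

def links(partition):
    """All coreference links (unordered mention pairs) implied by a partition."""
    return {frozenset(p) for cluster in partition for p in combinations(cluster, 2)}

print(len(links(key)))                      # 3 + 3 + 1 = 7 gold links
print(len(links(key) & links(response_2)))  # 7: Response 2 recovers every gold link
```

Note that the over-merged Response 2 recovers all seven gold links, a first hint of why purely link-based scoring can reward over-merged outputs.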
Tab. 5 Evaluation metrics of coreference resolution
Metric | Focus | Advantages/disadvantages
MUC-score [91] | Counts the coreference links shared by the Key and the Response | Simple to compute; but it cannot score documents in which all mentions are singletons, and it treats coreference-link errors of different severity equally
ACE-value [87] | Besides the predicted coreference chains, also scores whether entity and mention types are predicted correctly | Takes entity types into account; but it differs from the current standard coreference task, applies only to the ACE dataset, and is now rarely used
B-CUBED [92] | Scores mentions one by one from the perspective of the partition | Overcomes the drawbacks of MUC-score; but recall is always 100% when all Key mentions corefer and precision is always 100% when all Key mentions are singletons, which is clearly undesirable
CEAF [93] | Builds a one-to-one mapping between the coreference chains in the Key and the Response, which can be viewed as bipartite graph matching | Overcomes the drawbacks of B-CUBED; but it ignores correct Response chains left unmatched and ignores the size of coreference sets
BLANC [94-95] | Considers the precision and recall of both coreferent and non-coreferent mention pairs and averages them | Overcomes the drawbacks of CEAF and is a relatively new metric; but it is overly sensitive to whether singletons are identified, so it has not been widely adopted
LEA [96] | Based on the number of links in the intersection of Key and Response coreference sets, weighted by the size of each set | Overcomes the drawbacks of BLANC and considers both the completeness of coreference chains and the size of coreference sets; but, having been proposed relatively recently, it is not yet widely used and is slightly more complex to compute than earlier metrics
-
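As a worked illustration of the first two metrics in Tab. 5, the sketch below is a simplified reading of the MUC [91] and B-CUBED [92] definitions (not a substitute for the official CoNLL scorer) applied to the Key and the two Responses of Tab. 4.

```python
# MUC and B-CUBED applied to the Tab. 4 example; both assume that the Key and
# the Response contain exactly the same mentions, as they do in Tab. 4.
key        = [{1, 2, 6}, {3, 5, 7}, {4, 8}]    # Bob, Charlie, beach
response_1 = [{1, 2, 6, 7}, {3, 5}, {4, 8}]
response_2 = [{1, 2, 3, 5, 6, 7}, {4, 8}]

def muc(key, response):
    """MUC: recall counts the links needed to re-merge each Key cluster after it
    has been split by the Response; precision swaps the two roles."""
    def half(gold, system):
        numerator = denominator = 0
        for cluster in gold:
            pieces, unaligned = set(), 0
            for m in cluster:
                hits = [i for i, s in enumerate(system) if m in s]
                if hits:
                    pieces.add(hits[0])   # system cluster this mention fell into
                else:
                    unaligned += 1        # unresolved mention: its own piece
            numerator += len(cluster) - (len(pieces) + unaligned)
            denominator += len(cluster) - 1
        return numerator / denominator
    recall, precision = half(key, response), half(response, key)
    return precision, recall, 2 * precision * recall / (precision + recall)

def b_cubed(key, response):
    """B-CUBED: average, over mentions, of the overlap between the Key cluster
    and the Response cluster containing that mention."""
    def cluster_of(partition):
        return {m: c for c in partition for m in c}
    k_of, r_of = cluster_of(key), cluster_of(response)
    mentions = sorted(k_of)
    recall = sum(len(k_of[m] & r_of[m]) / len(k_of[m]) for m in mentions) / len(mentions)
    precision = sum(len(k_of[m] & r_of[m]) / len(r_of[m]) for m in mentions) / len(mentions)
    return precision, recall, 2 * precision * recall / (precision + recall)

for name, resp in [("Response 1", response_1), ("Response 2", response_2)]:
    print(name,
          "MUC F1 = %.3f" % muc(key, resp)[2],
          "B3 F1 = %.3f" % b_cubed(key, resp)[2])
# Response 1 MUC F1 = 0.800 B3 F1 = 0.823
# Response 2 MUC F1 = 0.909 B3 F1 = 0.769
```

The output makes the complaint about MUC-score in Tab. 5 concrete: MUC ranks the heavily over-merged Response 2 above Response 1 because it only counts missing and spurious links, whereas B-CUBED, which scores mention by mention, penalizes the large merged cluster and ranks Response 1 higher.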
[1] LIU Q, LI Y, DUAN H, et al. Knowledge graph construction techniques[J]. Journal of Computer Research and Development, 2016, 53(3): 582-600.
[2] WANG H F. Basic approaches and implementation techniques of anaphora resolution[J]. Journal of Chinese Information Processing, 2002, 16(6): 9-17. doi: 10.3969/j.issn.1003-0077.2002.06.002
[3] GETOOR L, MACHANAVAJJHALA A. Entity resolution: Theory, practice & open challenges[J]. Proceedings of the VLDB Endowment, 2012, 5(12): 2018-2019.
[4] MELLI G, ESTER M. Supervised identification and linking of concept mentions to a domain-specific ontology[C]//Proceedings of the 19th ACM International Conference on Information & Knowledge Management. 2010: 1717-1720.
[5] JURAFSKY D, MARTIN J H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition[M]. New Delhi: Pearson Education, 2000.
[6] LANG J, QIN B, LIU T, et al. Intra-document coreference resolution: The state of the art[J]. Journal of Chinese Language and Computing, 2008, 17(4): 227-253.
[7] SONG Y, WANG H F. A survey of research methods for coreference resolution[J]. Journal of Chinese Information Processing, 2015, 29(1): 1-12. doi: 10.3969/j.issn.1003-0077.2015.01.001
[8] LAMPLE G, BALLESTEROS M, SUBRAMANIAN S, et al. Neural architectures for named entity recognition[C]//Proceedings of NAACL-HLT. 2016: 260-270.
[9] GAO Y H, LI A P, DUAN L G. Multi-feature graph model based entity disambiguation method for entity linking[J]. Application Research of Computers, 2017, 34(10): 2909-2914. doi: 10.3969/j.issn.1001-3695.2017.10.007
[10] LI Y, WANG C, HAN F Q, et al. Mining evidences for named entity disambiguation[C]//Proceedings of the 19th International Conference on Knowledge Discovery and Data Mining. 2013: 1070-1078.
[11] DEEMTER K V, KIBBLE R. On coreferring: Coreference in MUC and related annotation schemes[J]. Computational Linguistics, 2000, 26(4): 629-637. doi: 10.1162/089120100750105966
[12] MITKOV R. Anaphora resolution: The state of the art[D]. Wolverhampton: University of Wolverhampton, 1999.
[13] HOBBS J R. Resolving pronoun references[J]. Lingua, 1978, 44: 311-338. doi: 10.1016/0024-3841(78)90006-2
[14] WALKER M A. Evaluating discourse processing algorithms[C]//Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics. Vancouver, 1989.
[15] GROSZ B, JOSHI A, WEINSTEIN S. Centering: A framework for modelling the local coherence of discourse[J]. Computational Linguistics, 1995, 21(2): 203-225.
[16] MCCARTHY J, LEHNERT W. Using decision trees for coreference resolution[C]//Proceedings of the 14th International Joint Conference on Artificial Intelligence. 1995.
[17] PONZETTO S P, STRUBE M. Exploiting semantic role labeling, WordNet and Wikipedia for coreference resolution[C]//Proceedings of the Main Conference on Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics. 2006: 192-199.
[18] RAHMAN A, NG V. Supervised models for coreference resolution[C]//Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing. 2009: 968-977.
[19] CARDIE C, WAGSTAFF K. Noun phrase coreference as clustering[C]//Proceedings of the Joint Conference on Empirical Methods in NLP and Very Large Corpora. 1999: 277-308.
[20] XIE Y K, ZHOU Y Q, HUANG X J. A spectral clustering based approach to coreference resolution[J]. Journal of Chinese Information Processing, 2007, 21(2): 77-82. doi: 10.3969/j.issn.1003-0077.2007.02.012
[21] ZHOU J S, HUANG S J, CHEN J J, et al. An unsupervised graph partitioning based algorithm for Chinese anaphora resolution[J]. Journal of Chinese Information Processing, 2007, 21(2): 77-82. doi: 10.3969/j.issn.1003-0077.2007.02.012
[22] MULLER C, RAPP S, STRUBE M. Applying co-training to reference resolution[C]//Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 2002: 352-359.
[23] DENIS P, BALDRIDGE J. Joint determination of anaphoricity and coreference resolution using integer programming[C]//Proceedings of Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics. 2007: 236-243.
[24] RAGHUNATHAN K, LEE H, RANGARAJAN S, et al. A multi-pass sieve for coreference resolution[C]//Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. 2010.
[25] VESDAPUNT N, BELLARE K, DALVI N. Crowdsourcing algorithms for entity resolution[C]//Proceedings of the VLDB Endowment. 2014: 1071-1082.
[26] RAHMAN A, NG V. Coreference resolution with world knowledge[C]//Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. 2011: 814-824.
[27] RATINOV L, ROTH D. Learning-based multi-sieve co-reference resolution with knowledge[M]. Association for Computational Linguistics, 2012: 1234-1244.
[28] DURRETT G, KLEIN D. Easy victories and uphill battles in coreference resolution[M]. Association for Computational Linguistics, 2013: 1971-1982.
[29] SORALUZE A, ARREGI O, ARREGI X, et al. Enriching Basque coreference resolution system using semantic knowledge sources[C]//Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes. Association for Computational Linguistics, 2017: 8-16.
[30] WISEMAN S, RUSH A M, SHIEBER S M. Learning global features for coreference resolution[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.
[31] CLARK K, MANNING C D. Deep reinforcement learning for mention-ranking coreference models[C]//Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016: 2256-2262.
[32] LEE K, HE L H, LEWIS M, et al. End-to-end neural coreference resolution[C]//Conference on Empirical Methods in Natural Language Processing. 2017: 188-197.
[33] HAGHIGHI A, KLEIN D. Simple coreference resolution with rich syntactic and semantic features[C]//Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing. 2009: 1152-1161.
[34] CONVERSE S P. Pronominal anaphora resolution in Chinese[D]. Philadelphia: University of Pennsylvania, 2006.
[35] SIDNER C. Focusing for interpretation of pronouns[J]. Computational Linguistics, 1981, 7(4): 217-231.
[36] BRENNAN S E, FRIEDMAN M W, POLLARD C. A centering approach to pronouns[C]//Proceedings of the 25th Annual Meeting of the Association for Computational Linguistics. 1987: 155-162.
[37] GE N Y, HALE J, CHARNIAK E. A statistical approach to anaphora resolution[C]//Proceedings of the ACL 1998 Workshop on Very Large Corpora. 1998.
[38] MCCALLUM A, WELLNER B. Conditional models of identity uncertainty with application to noun coreference[C]//International Conference on Neural Information Processing Systems. 2004: 905-912.
[39] NG V. Unsupervised models for coreference resolution[C]//Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing. 2008: 640-649.
[40] BHATTACHARYA I, GETOOR L. A latent Dirichlet model for unsupervised entity resolution[C]//SIAM International Conference on Data Mining. 2006.
[41] RAGHAVAN P, FOSLER-LUSSIER E, LAI A M. Exploring semi-supervised coreference resolution of medical concepts using semantic and temporal features[C]//Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2012: 731-741.
[42] MCCALLUM A, WELLNER B. Conditional models of identity uncertainty with application to noun coreference[C]//Proceedings of Neural Information Processing Systems. 2004: 905-912.
[43] YANG X, SU J. Coreference resolution using semantic relatedness information from automatically discovered patterns[C]//Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics. 2007: 528-535.
[44] CHEN C, NG V. Combining the best of two worlds: A hybrid approach to multilingual coreference resolution[C]//Joint Conference on EMNLP and CoNLL - Shared Task. Association for Computational Linguistics, 2012: 56-63.
[45] LEE H, PEIRSMAN Y, CHANG A, et al. Stanford's multi-pass sieve coreference resolution system at the CoNLL-2011 shared task[C]//Proceedings of the 15th Conference on Computational Natural Language Learning: Shared Task. 2011: 28-34.
[46] FERNANDES E R, SANTOS C N, MILIDIU R L. Latent trees for coreference resolution[J]. Computational Linguistics, 2014, 40(4): 801-835. doi: 10.1162/COLI_a_00200
[47] FERNANDES E R, MILIDIU R L. Entropy-guided feature generation for structured learning of Portuguese dependency parsing[C]//Computational Processing of the Portuguese Language. 2012: 146-156.
[48] YU C N J, JOACHIMS T. Learning structural SVMs with latent variables[C]//Proceedings of the 26th Annual International Conference on Machine Learning. 2009: 1169-1176.
[49] DAUME H, MARCU D. Learning as search optimization: Approximate large margin methods for structured prediction[C]//Proceedings of the 22nd International Conference on Machine Learning. 2005: 169-176.
[50] BJORKELUND A, KUHN J. Learning structured perceptrons for coreference resolution with latent antecedents and non-local features[C]//Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. 2014: 47-57.
[51] MARTSCHAT S, STRUBE M. Latent structures for coreference resolution[J]. Transactions of the Association for Computational Linguistics, 2015, 3: 405-418.
[52] RECASENS M, MARNEFFE M C, POTTS C. The life and death of discourse entities: Identifying singleton mentions[C]//The 2013 Annual Conference of the North American Chapter of the Association for Computational Linguistics. 2013: 627-633.
[53] MARNEFFE M C, RECASENS M, POTTS C, et al. Modeling the lifespan of discourse entities with application to coreference resolution[J]. Journal of Artificial Intelligence Research, 2015, 52: 445-475. doi: 10.1613/jair.4565
[54] PARK C, CHOI K H, LEE C K, et al. Korean coreference resolution with guided mention pair model using deep learning[J]. ETRI Journal, 2016, 38(6): 1207-1217. doi: 10.4218/etr2.2016.38.issue-6
[55] CLARK K, MANNING C D. Improving coreference resolution by learning entity-level distributed representations[EB/OL]. [2019-05-03]. https://arxiv.org/pdf/1606.01323.pdf.
[56] MIKOLOV T, KARAFIAT M, BURGET L, et al. Recurrent neural network based language model[C]//Conference of the International Speech Communication Association. 2010: 1045-1048.
[57] PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations[C]//North American Chapter of the Association for Computational Linguistics. 2018: 2227-2237.
[58] LEE K, HE L H, ZETTLEMOYER L. Higher-order coreference resolution with coarse-to-fine inference[C]//North American Chapter of the Association for Computational Linguistics. 2018: 687-692.
[59] LAPPIN S, LEASS H J. An algorithm for pronominal anaphora resolution[J]. Computational Linguistics, 1994, 20(4): 535-561.
[60] POESIO M, STEVENSON R, EUGENIO B D, et al. Centering: A parametric theory and its instantiations[J]. Computational Linguistics, 2004, 30(3): 309-363. doi: 10.1162/0891201041850911
[61] NG V, CARDIE C. Improving machine learning approaches to coreference resolution[C]//Meeting of the Association for Computational Linguistics. 2002: 104-111.
[62] PONZETTO S P, STRUBE M. Exploiting semantic role labeling, WordNet and Wikipedia for coreference resolution[C]//Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL. 2006: 192-199.
[63] DENIS P, BALDRIDGE J. Specialized models and ranking for coreference resolution[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2008: 660-669.
[64] YANG X, ZHOU G, SU J, et al. Coreference resolution using competitive learning approach[C]//Proceedings of the Association for Computational Linguistics. 2003: 176-183.
[65] YANG X F, SU J, LANG J, et al. An entity-mention model for coreference resolution with inductive logic programming[C]//Proceedings of the Annual Meeting of the Association for Computational Linguistics. 2008: 843-851.
[66] RAHMAN A, NG V. Narrowing the modeling gap: A cluster-ranking approach to coreference resolution[J]. Journal of Artificial Intelligence Research, 2011, 40: 469-521. doi: 10.1613/jair.3120
[67] NEWMAN M E J, GIRVAN M. Finding and evaluating community structure in networks[J]. Physical Review E, 2004, 69(2): 026113. doi: 10.1103/PhysRevE.69.026113
[68] BLUM A, MITCHELL T. Combining labeled and unlabeled data with co-training[C]//Proceedings of the 11th Annual Conference on Computational Learning Theory. 1998: 92-100.
[69] GANCHEV K, GRACA J, GILLENWATER J. Posterior regularization for structured latent variable models[J]. Journal of Machine Learning Research, 2010, 11(1): 2001-2049.
[70] MOOSAVI N S, STRUBE M. Search space pruning: A simple solution for better coreference resolvers[C]//Proceedings of NAACL-HLT 2016. Association for Computational Linguistics, 2016: 1005-1011.
[71] WISEMAN S, RUSH A M, SHIEBER S M, et al. Learning anaphoricity and antecedent ranking features for coreference resolution[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. 2015: 1416-1426.
[72] MA C, DOPPA J R, ORR J W, et al. Prune-and-score: Learning for greedy coreference resolution[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014.
[73] SUCHANEK F, KASNECI G, WEIKUM G. YAGO: A core of semantic knowledge unifying WordNet and Wikipedia[C]//Proceedings of the World Wide Web Conference. 2007: 697-706.
[74] BAKER C F, FILLMORE C J, LOWE J B. The Berkeley FrameNet project[C]//Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and the 17th International Conference on Computational Linguistics. 1998: 86-90.
[75] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[EB/OL]. [2019-05-10]. https://arxiv.org/pdf/1301.3781.pdf.
[76] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9: 1735-1780. doi: 10.1162/neco.1997.9.8.1735
[77] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[EB/OL]. [2019-06-02]. https://arxiv.org/pdf/1409.0473.pdf.
[78] LECUN Y, BENGIO Y, HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436-444. doi: 10.1038/nature14539
[79] CLARK K, MANNING C D. Entity-centric coreference resolution with model stacking[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. 2015: 1405-1415.
[80] HINTON G, TIELEMAN T. Lecture 6.5-RMSProp: Divide the gradient by a running average of its recent magnitude[J]. COURSERA: Neural Networks for Machine Learning, 2012, 4: 26-30.
[81] HINTON G, SRIVASTAVA N, KRIZHEVSKY A, et al. Improving neural networks by preventing co-adaptation of feature detectors[EB/OL]. [2019-06-20]. https://arxiv.org/pdf/1207.0580.pdf.
[82] WILLIAMS R J. Simple statistical gradient-following algorithms for connectionist reinforcement learning[J]. Machine Learning, 1992, 8(3/4): 229-256. doi: 10.1023/A:1022672621406
[83] JI Y F, TAN C H, MARTSCHAT S, et al. Dynamic entity representations in neural language models[EB/OL]. [2019-06-10]. https://arxiv.org/pdf/1708.00781.pdf.
[84] PENNINGTON J, SOCHER R, MANNING C D. GloVe: Global vectors for word representation[C]//Conference on Empirical Methods in Natural Language Processing. 2014: 1532-1543.
[85] TURIAN J, RATINOV L, BENGIO Y. Word representations: A simple and general method for semi-supervised learning[C]//Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. 2010: 384-394.
[86] GRISHMAN R, SUNDHEIM B. Message Understanding Conference-6: A brief history[C]//Proceedings of the 16th Conference on Computational Linguistics. 1996: 466-471.
[87] NIST. The ACE 2003 evaluation plan[R]. US National Institute of Standards and Technology (NIST), 2003.
[88] RECASENS M, MARQUEZ L, SAPENA E, et al. SemEval-2010 Task 1 OntoNotes English: Coreference resolution in multiple languages[M]. Philadelphia: Linguistic Data Consortium, 2011.
[89] PRADHAN S S, RAMSHAW L, MARCUS M, et al. CoNLL-2011 shared task: Modeling unrestricted coreference in OntoNotes[C]//Proceedings of the Shared Task of the 15th Conference on Computational Natural Language Learning. 2011: 1-27.
[90] PRADHAN S, MOSCHITTI A, XUE N W, et al. CoNLL-2012 shared task: Modeling multilingual unrestricted coreference in OntoNotes[C]//Proceedings of the Shared Task of the 16th Conference on Computational Natural Language Learning. 2012: 1-40.
[91] VILAIN M, BURGER J, ABERDEEN J, et al. A model-theoretic coreference scoring scheme[C]//Proceedings of the 6th Conference on Message Understanding. 1995: 45-52.
[92] BAGGA A, BALDWIN B. Algorithms for scoring coreference chains[C]//Proceedings of the Linguistic Coreference Workshop at the First International Conference on Language Resources and Evaluation. 1998: 563-566.
[93] LUO X. On coreference resolution performance metrics[C]//Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. 2005: 25-32.
[94] RECASENS M, HOVY E. BLANC: Implementing the Rand index for coreference evaluation[J]. Natural Language Engineering, 2011, 17(4): 485-510. doi: 10.1017/S135132491000029X
[95] LUO X, PRADHAN S, RECASENS M, et al. An extension of BLANC to system mentions[C]//Meeting of the Association for Computational Linguistics. 2014: 24.
[96] MOOSAVI N S, STRUBE M. Which coreference evaluation metric do you trust? A proposal for a link-based entity aware metric[C]//Meeting of the Association for Computational Linguistics. 2016: 7-12.
[97] KUHN H W. The Hungarian method for the assignment problem[J]. Naval Research Logistics Quarterly, 1955, 2(1/2): 83-97.
[98] MUNKRES J. Algorithms for the assignment and transportation problems[J]. Journal of the Society for Industrial and Applied Mathematics, 1957, 5(1): 32-38.
[99] PENG H R, KHASHABI D, ROTH D. Solving hard coreference problems[EB/OL]. [2019-05-01]. https://arxiv.org/pdf/1907.05524.pdf.
[100] ZHOU Z H. A brief introduction to weakly supervised learning[J]. National Science Review, 2017, 5(1): 44-53.
[101] LEE D H. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks[C]//International Conference on Machine Learning. 2013.
[102] RASMUS A, VALPOLA H, HONKALA M, et al. Semi-supervised learning with ladder networks[J]. Computer Science, 2015: 1-9.
[103] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529: 484-489. doi: 10.1038/nature16961
[104] MA S, SUN X, LIN J Y, et al. A hierarchical end-to-end model for jointly improving text summarization and sentiment classification[C]//International Joint Conference on Artificial Intelligence. 2018.
[105] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[C]//Conference on Empirical Methods in Natural Language Processing. 2014: 1724-1734.