ABSTRACT
Entity Linking (EL) is the task of linking name mentions in Web text to their referent entities in a knowledge base. Traditional EL methods usually link the name mentions in a document independently of one another. However, EL decisions are often interdependent: the entities mentioned in the same document should be semantically related to each other. In such cases, Collective Entity Linking, which links the name mentions in a document jointly by exploiting this interdependence, can improve linking accuracy.
This paper proposes a graph-based collective EL method that models and exploits the global interdependence between different EL decisions. Specifically, we first propose a graph-based representation, called the Referent Graph, that captures this interdependence. We then propose a collective inference algorithm that jointly infers the referent entities of all name mentions using the interdependence captured in the Referent Graph. The key benefits of our method come from 1) the global interdependence model of EL decisions and 2) the purely collective nature of the inference algorithm, in which evidence for related EL decisions is reinforced into high-probability decisions. Experimental results show that our method achieves a significant performance improvement over traditional EL methods.
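The abstract does not spell out how the graph is built or how inference runs, so the following Python sketch is only an illustration of the general idea, not the paper's algorithm. It assumes a toy graph with mention-to-entity edges weighted by a local compatibility score and entity-to-entity edges weighted by semantic relatedness, and it swaps in a plain random walk with restart to propagate evidence from the mentions. The function names, the example mentions, and all weights are hypothetical.

```python
from collections import defaultdict


def propagate_evidence(edges, initial, restart=0.1, iterations=50):
    """Random walk with restart over a weighted directed graph.

    edges:   {source: {target: weight}}
    initial: {node: initial evidence}, used as the restart distribution
    Nodes without outgoing edges simply absorb evidence (an assumption of
    this sketch), so scores need not sum to 1.
    """
    # Normalize outgoing weights into transition probabilities.
    transition = {}
    for src, targets in edges.items():
        total = sum(targets.values())
        transition[src] = {dst: w / total for dst, w in targets.items()}

    nodes = set(initial) | set(edges) | {d for t in edges.values() for d in t}
    norm = sum(initial.values())
    start = {n: initial.get(n, 0.0) / norm for n in nodes}

    score = dict(start)
    for _ in range(iterations):
        # Restart mass returns to the mentions; the rest flows along edges.
        nxt = {n: restart * start[n] for n in nodes}
        for src, targets in transition.items():
            for dst, p in targets.items():
                nxt[dst] += (1.0 - restart) * p * score[src]
        score = nxt
    return score


def link_mentions(candidates, relatedness, restart=0.1):
    """Jointly pick one entity per mention.

    candidates:  {mention: {entity: local compatibility}}
    relatedness: {(entity_a, entity_b): semantic relatedness}, symmetric
    """
    edges = defaultdict(dict)
    for mention, cands in candidates.items():
        for entity, compat in cands.items():
            edges[("mention", mention)][entity] = compat
    for (a, b), weight in relatedness.items():
        edges[a][b] = weight
        edges[b][a] = weight

    # Every mention starts with the same amount of evidence (assumption).
    initial = {("mention", m): 1.0 for m in candidates}
    score = propagate_evidence(edges, initial, restart)

    # Combine local compatibility with the globally propagated evidence.
    return {
        m: max(cands, key=lambda e: cands[e] * score[e])
        for m, cands in candidates.items()
    }


# Hypothetical toy document mentioning "Bulls" and "Jordan".
candidates = {
    "Bulls": {"Chicago Bulls": 0.4, "Bull (animal)": 0.6},
    "Jordan": {"Michael Jordan": 0.5, "Jordan (country)": 0.5},
}
relatedness = {("Chicago Bulls", "Michael Jordan"): 1.0}

links = link_mentions(candidates, relatedness)
print(links)
```

In this toy input the local compatibility alone would favor "Bull (animal)" for the mention "Bulls". The two related entities reinforce each other during propagation, which is the kind of effect the abstract attributes to collective inference.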
Index Terms
- Collective entity linking in web text: a graph-based method