ABSTRACT
In many information retrieval and selection tasks it is valuable to score how much a text is about a certain entity and how much the text discusses that entity from a certain viewpoint. In this paper we assign an aboutness score to a text when the input query is a person's name, measuring aboutness with respect to the biographical data of that person. We present a graph-based algorithm and compare its results with those of other approaches.
- Measuring aboutness of an entity in a text