ABSTRACT
In this paper we describe the development of a geographic co-occurrence model and how it can be applied to geographic information retrieval. The model is built by mining co-occurrences of placenames from Wikipedia and then mapping these placenames to locations in the Getty Thesaurus of Geographical Names. We begin by quantifying the accuracy of our model and computing theoretical bounds for the accuracy achievable when it is applied to placename disambiguation in free text. We conclude with a discussion of the improvement such a model could provide for placename disambiguation and geographic relevance ranking over traditional methods.
Index Terms
- Geographic co-occurrence as a tool for gir.