DOI: 10.1145/2675354.2675356
research-article

Improving Wikipedia-based place name disambiguation in short texts using structured data from DBpedia

Published: 04 November 2014

ABSTRACT

Place name disambiguation is an important task for improving the accuracy of geographic information retrieval, and it becomes more challenging when the input texts are short. Wikipedia provides information about places and has often been employed for named entity recognition. However, the natural language representation of Wikipedia articles limits more effective use of this rich knowledge base. DBpedia is the Semantic Web version of Wikipedia, which provides structured and machine-understandable knowledge mined from Wikipedia articles. This paper presents an approach for combining Wikipedia and DBpedia to disambiguate place names in short texts. We discuss the pros and cons of the two knowledge bases, and argue that a combination of both performs better than either of them alone. We evaluate our proposed method by conducting experiments against three established methods as baselines. The results indicate that our method generally achieves higher precision and recall. While our study employs DBpedia, the proposed method is generic and can be extended to other structured Linked Datasets such as Freebase or Wikidata.
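
The abstract does not spell out how the two knowledge bases are combined, so the following is only a minimal illustrative sketch of the general idea, not the paper's method. It assumes each candidate place comes with some Wikipedia article text and a list of labels of related entities taken from DBpedia, and it scores a candidate by a weighted sum of text similarity and structured-label overlap. The function names, the `alpha` weight, and the toy data are all hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def structured_overlap(context_tokens, labels):
    """Fraction of related-entity labels whose words all occur in the context."""
    if not labels:
        return 0.0
    ctx = set(context_tokens)
    hits = sum(1 for label in labels if all(tok in ctx for tok in tokens(label)))
    return hits / len(labels)


def disambiguate(context, candidates, alpha=0.5):
    """Return the best-scoring candidate and all scores.

    alpha weights the unstructured (article text) evidence against the
    structured (related-entity) evidence; 0.5 is an arbitrary assumption.
    """
    ctx_tokens = tokens(context)
    ctx_vec = Counter(ctx_tokens)
    scores = {}
    for name, cand in candidates.items():
        text_score = cosine(ctx_vec, Counter(tokens(cand["article_text"])))
        struct_score = structured_overlap(ctx_tokens, cand["related_labels"])
        scores[name] = alpha * text_score + (1 - alpha) * struct_score
    return max(scores, key=scores.get), scores


# Toy data, invented for illustration; real inputs would come from
# Wikipedia article text and DBpedia properties.
candidates = {
    "Washington (state)": {
        "article_text": "Washington is a state in the Pacific Northwest region of the United States",
        "related_labels": ["Seattle", "Olympia", "Pacific Northwest"],
    },
    "Washington, D.C.": {
        "article_text": "Washington is the capital city of the United States on the Potomac River",
        "related_labels": ["Potomac River", "White House", "Maryland"],
    },
}

best, scores = disambiguate("Ferry from Seattle across Washington", candidates)
print(best)
```

In this toy case the short text shares almost no words with either article, so the text similarity barely separates the candidates, while the related-entity label "Seattle" does. That is one way structured data might help when the context is short.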

Published in

GIR '14: Proceedings of the 8th Workshop on Geographic Information Retrieval
November 2014
94 pages
ISBN: 9781450331357
DOI: 10.1145/2675354

Copyright © 2014 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

GIR '14 paper acceptance rate: 11 of 15 submissions, 73%. Overall acceptance rate: 46 of 61 submissions, 75%.
