Research article · DOI: 10.1145/2911451.2911536

When a Knowledge Base Is Not Enough: Question Answering over Knowledge Bases with External Text Data

Published: 07 July 2016

ABSTRACT

One of the major challenges for automated question answering over Knowledge Bases (KBQA) is translating a natural language question into Knowledge Base (KB) entities and predicates. Previous systems have used a limited amount of training data to learn a lexicon that is later used for question answering. This approach does not make use of other potentially relevant text data outside the KB, which could supplement the available information. We introduce a new system, Text2KB, that enriches question answering over a knowledge base by using external text data. Specifically, we revisit different phases of the KBQA process and demonstrate that text resources improve question interpretation, candidate generation, and ranking. Building on a state-of-the-art traditional KBQA system, Text2KB utilizes web search results, community question answering data, and a general text document collection to detect question topic entities, map question phrases to KB predicates, and enrich the features of the candidates derived from the KB. Text2KB significantly improves performance over the baseline KBQA method, as measured on the popular WebQuestions dataset. The results and insights developed in this work can guide future efforts on combining textual and structured KB data for question answering.
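To make the pipeline described above concrete, the following is a minimal, hypothetical Python sketch of the general idea: detect a topic entity for the question, generate candidate answers from the KB around that entity, and re-rank the candidates using textual evidence from web search snippets. The function names (search_web, kb_candidates, text_evidence_score), the placeholder data, and the term-overlap scoring are illustrative assumptions only, not the actual Text2KB implementation described in the paper.

```python
# Hypothetical sketch: enrich KBQA candidate ranking with textual
# evidence from external web text, as motivated in the abstract.
from collections import Counter


def search_web(question):
    """Placeholder: return search-result snippets for the question.
    A real system would call a web search API here."""
    return [
        "Barack Obama graduated from Columbia University and Harvard Law School.",
        "Obama attended Occidental College before transferring to Columbia.",
    ]


def kb_candidates(topic_entity):
    """Placeholder: return (predicate, answer) pairs retrieved from the KB
    neighborhood of the detected topic entity."""
    return [
        ("education.institution", "Harvard Law School"),
        ("education.institution", "Columbia University"),
        ("people.person.spouse", "Michelle Obama"),
    ]


def text_evidence_score(answer, snippets, topic_entity):
    """Crude proxy for textual support: how often the answer's terms
    (excluding the topic entity's own terms) appear in the snippets."""
    snippet_terms = Counter(
        term.lower().strip(".,") for s in snippets for term in s.split()
    )
    topic_terms = {t.lower() for t in topic_entity.split()}
    answer_terms = [t.lower() for t in answer.split() if t.lower() not in topic_terms]
    if not answer_terms:
        return 0.0
    return sum(snippet_terms[t] for t in answer_terms) / len(answer_terms)


def answer(question, topic_entity):
    snippets = search_web(question)
    candidates = kb_candidates(topic_entity)
    # Rank KB-derived candidates by their support in external text.
    return sorted(
        candidates,
        key=lambda pa: text_evidence_score(pa[1], snippets, topic_entity),
        reverse=True,
    )


if __name__ == "__main__":
    for pred, ans in answer("where did barack obama go to college?", "Barack Obama"):
        print(pred, "->", ans)
```

In this toy run, KB candidates that are well supported by the web snippets (the educational institutions) rank above a candidate reached through an unrelated predicate, illustrating how external text can act as an additional ranking signal on top of KB-derived features.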

Published in

SIGIR '16: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2016, 1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451

Copyright © 2016 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance rates: SIGIR '16 paper acceptance rate: 62 of 341 submissions (18%). Overall acceptance rate: 792 of 3,983 submissions (20%).
