
Japanese question-answering system using A* search and its improvement

Published: 01 September 2005

Abstract

We have proposed a method that introduces A* search control into the sentential matching mechanism of Japanese question-answering systems in order to reduce turnaround time while maintaining answer accuracy. With this method, no preprocessing of the document database is required, and any information retrieval system can be used by writing a simple wrapper program. Its disadvantage, however, is that the accuracy is not sufficiently high: the mean reciprocal rank (MRR) is approximately 0.3 in NTCIR3 QAC1, an evaluation workshop for question-answering systems. To improve the accuracy, we propose several measures of the degree of sentence matching and a variant of a voting method, both of which can be integrated with our controlled-search system. Using these techniques, the system achieves a higher MRR of 0.5 in the evaluation workshop NTCIR4 QAC2.
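The evaluation metric quoted above, mean reciprocal rank, averages the reciprocal of the rank at which the first correct answer appears. A minimal sketch (function and variable names are illustrative, not from the paper):

```python
def mean_reciprocal_rank(ranked_answers, correct_answers):
    """MRR: average over questions of 1/rank of the first correct
    answer; a question contributes 0 if no returned answer is correct."""
    total = 0.0
    for answers, gold in zip(ranked_answers, correct_answers):
        for rank, answer in enumerate(answers, start=1):
            if answer in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_answers)

# Three questions: correct answer at rank 1, at rank 2, and not found.
print(mean_reciprocal_rank(
    [["a", "b"], ["x", "a"], ["y", "z"]],
    [{"a"}, {"a"}, {"a"}],
))  # (1 + 0.5 + 0) / 3 = 0.5
```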



    Reviews

    Jonathan P. E. Hodgson

Typical question-answering systems accept questions of the types who, when, where, what, and how. The response is generated by searching a document base. Because this collection may include documents from the Web, not all of the documents can be preprocessed. The system retrieves sentences from the document base using keywords derived from the question; it must then evaluate these sentences and offer the best ones as potential answers. This paper describes such a system for Japanese-language questions, with answers derived from documents in Japanese.

A retrieved sentence, or a sentence chain (when several retrieved sentences must be combined to cover all of the keywords in the question), is ranked by the sum of four subscores. These are, in order (the order turns out to be important): a matching score in terms of 2-grams, a matching score in terms of keywords, a matching score based on dependency relations in a form of parse tree, and a matching score in terms of question type. Note that these values are progressively more expensive to compute.

The novelty in this paper is the use of an A* algorithm to find the best answers. This is possible because, except for the last evaluator, one can find approximate over-estimators for each of the evaluations. The required search structure is obtained by treating the evaluation as a tree, with levels corresponding to the successive evaluators taken in the order given above.

The paper includes test results from a workshop on Japanese question-answering systems, which show that the system achieved better results than other pruning-based systems. Two of the evaluations, the use of 2-grams and the dependency metric, seem to depend on the language being Japanese; it would be interesting to see to what extent the methods can be transferred to other languages.
The paper is highly recommended to those interested in question-answering systems.

Online Computing Reviews Service
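The search control the review describes can be sketched as a best-first search in which a candidate's priority is the sum of its already-computed subscores plus optimistic upper bounds for the evaluators not yet applied, so cheap evaluators filter candidates before expensive ones run. This is a minimal sketch under stated assumptions: the evaluator names, the uniform bound of 1.0 per evaluator, and the candidate representation are all illustrative, not the paper's actual over-estimators.

```python
import heapq

# Evaluators in order of increasing cost, each paired with an optimistic
# upper bound used in place of subscores not yet computed (hypothetical
# bounds; the paper derives real over-estimators for the first three).
EVALUATORS = [
    ("bigram",     lambda cand: cand["bigram"],     1.0),
    ("keyword",    lambda cand: cand["keyword"],    1.0),
    ("dependency", lambda cand: cand["dependency"], 1.0),
    ("qtype",      lambda cand: cand["qtype"],      1.0),
]

def a_star_rank(candidates, top_n=1):
    """Best-first search over candidates: priority = computed subscores
    plus optimistic bounds for the rest. Because bounds never underestimate,
    the first fully evaluated candidate popped is guaranteed optimal."""
    heap = []
    for i, cand in enumerate(candidates):
        bound = sum(b for _, _, b in EVALUATORS)
        heapq.heappush(heap, (-bound, i, 0, 0.0, cand))
    results = []
    while heap and len(results) < top_n:
        _, i, level, score, cand = heapq.heappop(heap)
        if level == len(EVALUATORS):
            results.append((cand, score))      # all subscores computed
            continue
        _, evaluate, _ = EVALUATORS[level]
        score += evaluate(cand)                # pay for one more evaluator
        bound = sum(b for _, _, b in EVALUATORS[level + 1:])
        heapq.heappush(heap, (-(score + bound), i, level + 1, score, cand))
    return results

cands = [
    {"name": "s1", "bigram": 0.9, "keyword": 0.8, "dependency": 0.7, "qtype": 1.0},
    {"name": "s2", "bigram": 0.4, "keyword": 0.3, "dependency": 0.2, "qtype": 0.0},
]
best, score = a_star_rank(cands)[0]
print(best["name"])  # s1; s2's optimistic bound falls below s1's score
```

Note that s2 is never evaluated past its cheap bigram subscore: once its optimistic total drops below s1's accumulating score, it is never popped again, which is exactly how the A* control saves the cost of the expensive evaluators.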
