Domain-specific FAQ retrieval using independent aspects

Published: 01 March 2005

Abstract

This investigation presents an approach to domain-specific FAQ (frequently asked question) retrieval using independent aspects. Data analysis first classifies the questions in the collected QA (question-answer) pairs into ten question types according to their question stems. The answers in the QA pairs are then segmented into paragraphs and clustered using latent semantic analysis and the K-means algorithm. For semantic representation of the aspects, a domain-specific ontology is constructed based on WordNet and HowNet. A probabilistic mixture model is then used to interpret the query and the QA pairs based on independent aspects, so the retrieval process can be viewed as a maximum likelihood estimation problem. The expectation-maximization (EM) algorithm is employed to estimate the optimal mixing weights in the probabilistic mixture model. Experimental results indicate that the proposed approach outperformed the FAQ-Finder system in medical FAQ retrieval.
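The abstract does not give the form of the mixture model, so the following is only a generic sketch of how EM can estimate mixing weights when the per-aspect likelihoods are held fixed. The function name `em_mixing_weights`, the `likelihoods` matrix layout, and the stopping parameters are all assumptions made for illustration and are not taken from the paper.

```python
def em_mixing_weights(likelihoods, n_iter=100, tol=1e-9):
    """Estimate mixture weights by EM with fixed component likelihoods.

    likelihoods[i][k] is a hypothetical probability that aspect k assigns
    to observation i (for example, one query/QA-pair match). Only the
    mixing weights are learned; the likelihoods themselves never change.
    """
    n_aspects = len(likelihoods[0])
    weights = [1.0 / n_aspects] * n_aspects  # start from uniform weights

    for _ in range(n_iter):
        totals = [0.0] * n_aspects
        for row in likelihoods:
            # E-step: posterior responsibility of each aspect for this row.
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # no aspect explains this row, so skip it
            for k, value in enumerate(joint):
                totals[k] += value / norm

        total = sum(totals)
        if total == 0.0:
            break  # nothing to learn from; keep the current weights

        # M-step: each weight becomes the average responsibility.
        new_weights = [t / total for t in totals]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break

    return weights


if __name__ == "__main__":
    # Toy data: three observations scored under two aspects.
    toy = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
    print(em_mixing_weights(toy, n_iter=5))
```

With the weights in hand, a candidate QA pair could be scored by the weighted sum of its per-aspect likelihoods, which is one way to read the abstract's framing of retrieval as maximum likelihood estimation.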

