Abstract
This investigation presents an approach to domain-specific FAQ (frequently asked question) retrieval using independent aspects. The data analysis classifies the questions in the collected QA (question-answer) pairs into ten question types according to their question stems. The answers in the QA pairs are then segmented into paragraphs and clustered using latent semantic analysis and the K-means algorithm. For semantic representation of the aspects, a domain-specific ontology is constructed based on WordNet and HowNet. A probabilistic mixture model is then used to interpret the query and the QA pairs based on the independent aspects; hence, the retrieval process can be viewed as a maximum likelihood estimation problem. The expectation-maximization (EM) algorithm is employed to estimate the optimal mixing weights in the probabilistic mixture model. Experimental results indicate that the proposed approach outperforms the FAQ-Finder system in medical FAQ retrieval.
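The mixing-weight estimation mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each aspect already supplies a fixed, strictly positive likelihood for every observation, and it uses EM only to fit the weights that combine those aspects. The function name `em_mixing_weights` and the toy likelihood values are hypothetical.

```python
import math


def em_mixing_weights(likelihoods, iterations=100, tol=1e-9):
    """Estimate mixture weights by EM when component likelihoods are fixed.

    likelihoods[i][k] is the (assumed, strictly positive) likelihood of
    observation i under aspect k. Returns one weight per aspect.
    """
    n = len(likelihoods)
    k = len(likelihoods[0])
    weights = [1.0 / k] * k  # start from a uniform mixture
    prev_ll = -math.inf
    for _ in range(iterations):
        # E-step: mixture likelihood of each observation, then the
        # responsibility of each aspect for that observation.
        totals = [sum(w * p for w, p in zip(weights, row)) for row in likelihoods]
        resp = [
            [w * p / total for w, p in zip(weights, row)]
            for row, total in zip(likelihoods, totals)
        ]
        # M-step: each weight becomes the average responsibility of its aspect.
        weights = [sum(r[j] for r in resp) / n for j in range(k)]
        # Log-likelihood under the weights used in this E-step.
        ll = sum(math.log(total) for total in totals)
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return weights


# Toy data (hypothetical): three observations favour aspect 0, one favours aspect 1.
toy = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]
print(em_mixing_weights(toy))
```

In this sketch, the weights always sum to one because each M-step averages responsibilities that are themselves normalized per observation.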