ABSTRACT
Community Question Answering has emerged as a popular and effective paradigm for a wide range of information needs. For example, to find an obscure piece of trivia, it is now possible and even very effective to post a question on a popular community QA site such as Yahoo! Answers, and to rely on other users to provide answers, often within minutes. The importance of such community QA sites is magnified as they create archives of millions of questions and hundreds of millions of answers, many of which are invaluable for the information needs of other searchers. However, to make this immense body of knowledge accessible, effective answer retrieval is required. In particular, as any user can contribute an answer to a question, the majority of the content reflects personal, often unsubstantiated opinions. A ranking that combines both relevance and quality is required to make such archives usable for factual information retrieval. This task is challenging, as the structure and the contents of community QA archives differ significantly from the web setting. To address this problem, we present a general ranking framework for factual information retrieval from social media. Results of a large-scale evaluation demonstrate that our method is highly effective at retrieving well-formed, factual answers to questions, as evaluated on a standard factoid QA benchmark. We also show that our learning framework can be tuned with a minimum of manual labeling. Finally, we provide result analysis to gain a deeper understanding of which features are significant for social media search and retrieval. Our system can be used as a crucial building block for combining results from a variety of social media content with general web search results, and to better integrate social media content for effective information access.
Index Terms
- Finding the right facts in the crowd: factoid question answering over social media