DOI: 10.1145/1401890.1401994
research-article

Identifying authoritative actors in question-answering forums: the case of Yahoo! answers

Published: 24 August 2008

ABSTRACT

We consider the problem of identifying authoritative users in Yahoo! Answers. A common approach is to use link analysis techniques in order to provide a ranked list of users based on their degree of authority. A major problem for such an approach is determining how many users should be chosen as authoritative from a ranked list. To address this problem, we propose a method for automatic identification of authoritative actors. In our approach, we propose to model the authority scores of users as a mixture of gamma distributions. The number of components in the mixture is estimated by the Bayesian Information Criterion (BIC) while the parameters of each component are estimated using the Expectation-Maximization (EM) algorithm. This method allows us to automatically discriminate between authoritative and non-authoritative users. The suitability of our proposal is demonstrated in an empirical study using datasets from Yahoo! Answers.
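The model-selection step described in the abstract can be illustrated with a minimal sketch. It assumes the common BIC convention (BIC = -2 ln L + k ln n, lower is better) and counts k = 3M - 1 free parameters for an M-component gamma mixture (M shapes, M scales, M - 1 independent weights); the paper's exact formulation may differ. The function names (`gamma_logpdf`, `mixture_loglik`, `bic`, `responsibilities`) are hypothetical and are not taken from the paper. The M-step that re-estimates the parameters is omitted.

```python
import math


def gamma_logpdf(x, shape, scale):
    """Log-density of a gamma distribution (shape, scale) at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def _component_logs(x, weights, shapes, scales):
    """log(weight_j) + log p_j(x) for every mixture component j."""
    return [math.log(w) + gamma_logpdf(x, a, s)
            for w, a, s in zip(weights, shapes, scales)]


def mixture_loglik(scores, weights, shapes, scales):
    """Log-likelihood of the authority scores under a gamma mixture."""
    total = 0.0
    for x in scores:
        logs = _component_logs(x, weights, shapes, scales)
        m = max(logs)  # log-sum-exp trick for numerical stability
        total += m + math.log(sum(math.exp(v - m) for v in logs))
    return total


def bic(scores, weights, shapes, scales):
    """BIC = -2 ln L + k ln n with k = 3M - 1 (assumed convention)."""
    k = 3 * len(weights) - 1
    loglik = mixture_loglik(scores, weights, shapes, scales)
    return -2.0 * loglik + k * math.log(len(scores))


def responsibilities(x, weights, shapes, scales):
    """E-step quantity: posterior probability of each component given x."""
    logs = _component_logs(x, weights, shapes, scales)
    m = max(logs)
    probs = [math.exp(v - m) for v in logs]
    z = sum(probs)
    return [p / z for p in probs]
```

Under these assumptions, one would fit mixtures with 1, 2, 3, ... components by EM, evaluate `bic` for each fitted model, and keep the model with the lowest value. Users could then be assigned to the component with the largest responsibility, which is one plausible way to separate authoritative from non-authoritative users.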


        Published in

          KDD '08: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
          August 2008
          1116 pages
          ISBN: 9781605581934
          DOI: 10.1145/1401890
          General Chair: Ying Li
          Program Chairs: Bing Liu, Sunita Sarawagi

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          KDD '08 Paper Acceptance Rate: 118 of 593 submissions, 20%
          Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
