DOI: 10.1145/1277741.1277836
Article

Broad expertise retrieval in sparse data environments

Published: 23 July 2007

ABSTRACT

Expertise retrieval has been largely unexplored on data other than the W3C collection. At the same time, many intranets of universities and other knowledge-intensive organisations offer examples of relatively small but clean multilingual expertise data, covering broad ranges of expertise areas. We first present two main expertise retrieval tasks, along with a set of baseline approaches based on generative language modeling, aimed at finding expertise relations between topics and people. For our experimental evaluation, we introduce (and release) a new test set based on a crawl of a university site. Using this test set, we conduct two series of experiments. The first is aimed at determining the effectiveness of baseline expertise retrieval methods applied to the new test set. The second is aimed at assessing refined models that exploit characteristic features of the new test set, such as the organizational structure of the university, and the hierarchical structure of the topics in the test set. Expertise retrieval models are shown to be robust with respect to environments smaller than the W3C collection, and current techniques appear to be generalizable to other settings.

Published in

SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
July 2007
946 pages
ISBN: 9781595935977
DOI: 10.1145/1277741

Copyright © 2007 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
