DOI: 10.1145/1553374.1553481
research-article

Independent factor topic models

Published: 14 June 2009

ABSTRACT

Topic models such as Latent Dirichlet Allocation (LDA) and the Correlated Topic Model (CTM) have recently emerged as powerful statistical tools for text document modeling. In this paper, we improve upon CTM and propose Independent Factor Topic Models (IFTM), which use linear latent variable models to uncover the hidden sources of correlation between topics. This work makes two main contributions. First, by using a sparse source prior model, we can directly visualize sparse patterns of topic correlations. Second, the conditional independence assumption implied by the use of latent source variables allows the objective function to factorize, leading to a fast Newton-Raphson-based variational inference algorithm. Experimental results on synthetic and real data show that IFTM runs on average 3--5 times faster than CTM, while giving competitive performance as measured by perplexity and log-likelihood of held-out data.
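The abstract's generative picture can be sketched concretely: sparse latent sources are mixed linearly into topic-space log-proportions, so correlations between topics arise only through the shared sources. The sketch below is illustrative only; the dimensions (`K`, `J`, `V`, `N`), the Laplace distribution standing in for the sparse source prior, and all parameter values are assumptions for demonstration, not the paper's actual parameterization or its variational inference procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5    # number of topics (assumed)
J = 2    # number of latent correlation sources, J < K (assumed)
V = 20   # vocabulary size (assumed)
N = 50   # words per document (assumed)

# Hypothetical parameters: A mixes the J sources into K topic dimensions.
A = rng.normal(size=(K, J))
mu = np.zeros(K)                      # mean of topic log-proportions
beta = rng.dirichlet(np.ones(V), K)   # per-topic word distributions, shape (K, V)

def sample_document():
    # Sparse sources: a heavy-tailed, zero-peaked prior (Laplace here)
    # means each document activates only a few correlation sources.
    s = rng.laplace(loc=0.0, scale=1.0, size=J)
    # Linear latent variable model: topic correlations enter only
    # through the shared sources s, plus independent per-topic noise.
    eta = A @ s + mu + rng.normal(scale=0.1, size=K)
    theta = np.exp(eta) / np.exp(eta).sum()   # softmax -> topic proportions
    # Standard topic-model word generation given theta.
    z = rng.choice(K, size=N, p=theta)        # topic assignment per word
    w = np.array([rng.choice(V, p=beta[k]) for k in z])
    return w, theta

words, theta = sample_document()
```

Because `J < K`, the topic covariance induced by `A @ s` is low-rank, and the conditional independence of topics given `s` is what lets the variational objective factorize in the paper's inference algorithm.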

References

  1. Attias, H. (2000). A variational Bayesian framework for graphical models. Advances in Neural Information Processing Systems (NIPS) (pp. 209--215).
  2. Blei, D. M., Griffiths, T., Jordan, M. I., & Tenenbaum, J. (2004). Hierarchical topic models and the nested Chinese restaurant process. Advances in Neural Information Processing Systems (NIPS) (pp. 17--24).
  3. Blei, D. M., & Lafferty, J. D. (2006). Correlated topic models. Advances in Neural Information Processing Systems (NIPS) (pp. 147--154).
  4. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993--1022.
  5. Everitt, B. S. (1984). An Introduction to Latent Variable Models. London: Chapman and Hall.
  6. Girolami, M. (2001). A variational method for learning sparse and overcomplete representations. Neural Computation, 13, 2517--2532.
  7. Griffiths, T., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences (pp. 5228--5235).
  8. Jaakkola, T. S. (1997). Variational methods for inference and estimation in graphical models. Doctoral dissertation, MIT.
  9. Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32, 443--482.


Published in

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009
1331 pages
ISBN: 9781605585161
DOI: 10.1145/1553374
Copyright © 2009 by the author(s)/owner(s).

                    Publisher

                    Association for Computing Machinery

                    New York, NY, United States



                    Acceptance Rates

Overall acceptance rate: 140 of 548 submissions, 26%
