DOI: 10.1145/1390156.1390267
research-article

Bayesian probabilistic matrix factorization using Markov chain Monte Carlo

Published: 05 July 2008

ABSTRACT

Low-rank matrix approximation methods provide one of the simplest and most effective approaches to collaborative filtering. Such models are usually fitted to data by finding a MAP estimate of the model parameters, a procedure that can be performed efficiently even on very large datasets. However, unless the regularization parameters are tuned carefully, this approach is prone to overfitting because it finds a single point estimate of the parameters. In this paper we present a fully Bayesian treatment of the Probabilistic Matrix Factorization (PMF) model in which model capacity is controlled automatically by integrating over all model parameters and hyperparameters. We show that Bayesian PMF models can be efficiently trained using Markov chain Monte Carlo methods by applying them to the Netflix dataset, which consists of over 100 million movie ratings. The resulting models achieve significantly higher prediction accuracy than PMF models trained using MAP estimation.
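To make the contrast with a single point estimate concrete, here is a minimal sketch of prediction by averaging over Markov chain Monte Carlo samples of a low-rank factorization. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the model details, so the Gaussian likelihood, the zero-mean Gaussian priors on the factors, the Gibbs updates, and every name (`gibbs_pmf`, `alpha`, `lam`, `rank`) are assumptions. The noise precision `alpha` and prior precision `lam` are also held fixed here, whereas the fully Bayesian treatment described above integrates over hyperparameters as well.

```python
import numpy as np

def gibbs_pmf(ratings, n_users, n_items, rank=2, alpha=2.0, lam=1.0,
              n_samples=50, burn_in=10, seed=0):
    """Return the predicted rating matrix averaged over Gibbs samples.

    ratings: list of (user, item, value) triples.
    alpha: assumed fixed noise precision of the Gaussian likelihood.
    lam:   assumed fixed precision of the zero-mean Gaussian factor priors.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_users, rank))  # user factors
    V = rng.normal(scale=0.1, size=(n_items, rank))  # item factors

    # Index the observed ratings by user and by item.
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))

    def resample(rows, other, observed):
        # Each row has a Gaussian conditional given the other factor matrix:
        # precision = lam*I + alpha * sum(v v^T), mean = precision^-1 * lin.
        for k, pairs in enumerate(observed):
            prec = lam * np.eye(rank)
            lin = np.zeros(rank)
            for idx, r in pairs:
                prec += alpha * np.outer(other[idx], other[idx])
                lin += alpha * r * other[idx]
            mean = np.linalg.solve(prec, lin)
            # Draw from N(mean, prec^-1) using the Cholesky factor of prec.
            chol = np.linalg.cholesky(prec)
            rows[k] = mean + np.linalg.solve(chol.T, rng.standard_normal(rank))

    pred_sum = np.zeros((n_users, n_items))
    for t in range(burn_in + n_samples):
        resample(U, V, by_user)
        resample(V, U, by_item)
        if t >= burn_in:
            pred_sum += U @ V.T  # accumulate this sample's predictions
    return pred_sum / n_samples

# Toy usage: 3 users, 2 items, one rating held out.
toy = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 2.0), (1, 1, 4.0), (2, 0, 3.0)]
print(gibbs_pmf(toy, n_users=3, n_items=2))
```

A MAP fit would return one pair of factor matrices. The sketch instead averages the product of the factors over many sampled pairs, which is the sense in which the prediction reflects uncertainty in the parameters rather than a single estimate.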

                    • Published in

                      ICML '08: Proceedings of the 25th international conference on Machine learning
                      July 2008
                      1310 pages
                      ISBN:9781605582054
                      DOI:10.1145/1390156

                      Copyright © 2008 ACM

                      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                      Publisher

                      Association for Computing Machinery

                      New York, NY, United States

                      Acceptance Rates

                      Overall Acceptance Rate: 140 of 548 submissions, 26%
