ABSTRACT
Relational learning is concerned with predicting unknown values of a relation, given a database of entities and observed relations among entities. An example of relational learning is movie rating prediction, where entities could include users, movies, genres, and actors. Relations encode users' ratings of movies, movies' genres, and actors' roles in movies. A common prediction technique given one pairwise relation, for example a #users x #movies ratings matrix, is low-rank matrix factorization. In domains with multiple relations, represented as multiple matrices, we may improve predictive accuracy by exploiting information from one relation while predicting another. To this end, we propose a collective matrix factorization model: we simultaneously factor several matrices, sharing parameters among factors when an entity participates in multiple relations. Each relation can have a different value type and error distribution; we therefore allow nonlinear relationships between the parameters and outputs, using Bregman divergences to measure error. We extend standard alternating projection algorithms to our model, and derive an efficient Newton update for the projection. Furthermore, we propose stochastic optimization methods to deal with large, sparse matrices. Our model generalizes several existing matrix factorization methods, and therefore yields new large-scale optimization algorithms for these problems. Our model can handle any pairwise relational schema and a wide variety of error models. We demonstrate its efficiency, as well as the benefit of sharing parameters among relations.
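To illustrate the core idea of parameter sharing, the following is a minimal sketch of collective factorization for the squared-loss (identity-link) special case, not the paper's full Bregman-divergence machinery or Newton projection. Two relations share the movie factors V: a users x movies matrix X ≈ UVᵀ and a movies x genres matrix Y ≈ VWᵀ. The variable names, ridge term, and synthetic data are illustrative assumptions; each alternating update is a closed-form least-squares solve, and the update for the shared factor V pools both losses.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, n_genres, k = 30, 40, 5, 4

# Synthetic noiseless data generated from a shared low-rank structure.
U_true = rng.normal(size=(n_users, k))
V_true = rng.normal(size=(n_movies, k))
W_true = rng.normal(size=(n_genres, k))
X = U_true @ V_true.T          # users x movies "ratings" relation
Y = V_true @ W_true.T          # movies x genres relation

lam = 0.1                      # small ridge term keeps each solve well-posed
U = rng.normal(size=(n_users, k))
V = rng.normal(size=(n_movies, k))
W = rng.normal(size=(n_genres, k))

def solve(A, B, lam):
    """Ridge least-squares solve for F minimizing ||A - F B^T||^2 + lam ||F||^2."""
    return A @ B @ np.linalg.inv(B.T @ B + lam * np.eye(B.shape[1]))

for _ in range(50):
    U = solve(X, V, lam)       # U only touches the ratings relation
    W = solve(Y.T, V, lam)     # W only touches the genre relation
    # V participates in both relations, so its normal equations sum
    # contributions from X and Y -- this is the parameter sharing.
    V = (X.T @ U + Y @ W) @ np.linalg.inv(
        U.T @ U + W.T @ W + lam * np.eye(k))

err = np.linalg.norm(X - U @ V.T) / np.linalg.norm(X)
```

The V update comes from setting the gradient of ||X - UVᵀ||² + ||Y - VWᵀ||² + λ||V||² to zero, giving V(UᵀU + WᵀW + λI) = XᵀU + YW. In the paper's general setting, each reconstruction error is a Bregman divergence through a possibly nonlinear link, and the closed-form solve is replaced by a per-row Newton update.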