ABSTRACT
Dyadic data matrices, such as co-occurrence matrices, rating matrices, and proximity matrices, arise frequently in many important applications. A fundamental problem in dyadic data analysis is to find the hidden block structure of the data matrix. In this paper, we present a new co-clustering framework, block value decomposition (BVD), for dyadic data, which factorizes the dyadic data matrix into three components: the row-coefficient matrix R, the block value matrix B, and the column-coefficient matrix C. Under this framework, we focus on a special yet very popular case, non-negative dyadic data, and propose a novel co-clustering algorithm that iteratively computes the three decomposition matrices using multiplicative update rules. Extensive experimental evaluations demonstrate the effectiveness and potential of the framework and of the specific algorithm for co-clustering, and in particular for discovering the hidden block structure in dyadic data.
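To make the framework concrete, the following is a minimal NumPy sketch of the non-negative case, using the standard multiplicative updates for a tri-factorization X ≈ RBC that monotonically decrease the squared Frobenius reconstruction error. The function name `nbvd`, the random initialization, and the fixed iteration count are illustrative choices; the paper's exact update rules and stopping criterion may differ in detail.

```python
import numpy as np

def nbvd(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Sketch of non-negative block value decomposition: X ~= R @ B @ C.

    X : (m, n) non-negative data matrix
    k : number of row clusters
    l : number of column clusters
    Returns non-negative factors R (m, k), B (k, l), C (l, n).
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    R = rng.random((m, k))
    B = rng.random((k, l))
    C = rng.random((l, n))
    for _ in range(n_iter):
        # Multiplicative updates: each factor is rescaled elementwise by the
        # ratio of the gradient's positive and negative parts, so entries stay
        # non-negative and ||X - RBC||_F^2 does not increase.
        BC = B @ C
        R *= (X @ BC.T) / (R @ BC @ BC.T + eps)
        B *= (R.T @ X @ C.T) / (R.T @ R @ B @ (C @ C.T) + eps)
        RB = R @ B
        C *= (RB.T @ X) / (RB.T @ RB @ C + eps)
    return R, B, C
```

Under one common reading of such coefficient matrices, row i of the data is assigned to the row cluster `np.argmax(R[i])` and column j to the column cluster `np.argmax(C[:, j])`, while B summarizes the value of each (row cluster, column cluster) block.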