DOI: 10.1145/1081870.1081949 (KDD conference proceedings)
Article

Co-clustering by block value decomposition

Published: 21 August 2005

ABSTRACT

Dyadic data matrices, such as co-occurrence, rating, and proximity matrices, arise frequently in various important applications. A fundamental problem in dyadic data analysis is to find the hidden block structure of the data matrix. In this paper, we present a new co-clustering framework for dyadic data, block value decomposition (BVD), which factorizes the dyadic data matrix into three components: the row-coefficient matrix R, the block value matrix B, and the column-coefficient matrix C. Under this framework, we focus on a special yet very popular case, non-negative dyadic data, and propose a novel co-clustering algorithm that iteratively computes the three decomposition matrices using multiplicative updating rules. Extensive experimental evaluations demonstrate the effectiveness and potential of this framework, as well as of the specific algorithm for co-clustering, in particular for discovering the hidden block structure in dyadic data.
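
The abstract names the three factors but does not state the updating rules, so the following is only a minimal sketch of how such a decomposition could be computed. It assumes a squared Frobenius objective ||X - RBC||^2 and Lee-Seung-style multiplicative updates, which may differ from the rules in the paper. The function names (`bvd`, `matmul`, `mult_update`, `frob_error`), the random initialization, and the small `eps` guard are illustrative choices, not taken from the source. Plain Python lists are used to keep the sketch dependency-free.

```python
import random

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(u * v for u, v in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def mult_update(m, numer, denom, eps=1e-12):
    """Elementwise m * numer / denom; eps (an assumption) avoids division by zero."""
    return [[v * p / (q + eps) for v, p, q in zip(rm, rp, rq)]
            for rm, rp, rq in zip(m, numer, denom)]

def frob_error(x, r, b, c):
    """Squared Frobenius norm of X - R B C."""
    approx = matmul(matmul(r, b), c)
    return sum((xv - av) ** 2
               for rx, ra in zip(x, approx) for xv, av in zip(rx, ra))

def bvd(x, k, l, iters=200, seed=0):
    """Approximate a non-negative n x m matrix X by R (n x k), B (k x l), C (l x m).

    Assumed update rules (not confirmed by the abstract):
      R <- R * (X (BC)^T)   / (R (BC)(BC)^T)
      B <- B * (R^T X C^T)  / (R^T R B C C^T)
      C <- C * ((RB)^T X)   / ((RB)^T (RB) C)
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])

    def positive(rows, cols):
        # Strictly positive start so no entry is stuck at zero from the outset.
        return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

    r, b, c = positive(n, k), positive(k, l), positive(l, m)
    errors = [frob_error(x, r, b, c)]
    for _ in range(iters):
        bc = matmul(b, c)
        bct = transpose(bc)
        r = mult_update(r, matmul(x, bct), matmul(r, matmul(bc, bct)))

        rt, ct = transpose(r), transpose(c)
        b = mult_update(b,
                        matmul(matmul(rt, x), ct),
                        matmul(matmul(matmul(rt, r), b), matmul(c, ct)))

        rb = matmul(r, b)
        rbt = transpose(rb)
        c = mult_update(c, matmul(rbt, x), matmul(matmul(rbt, rb), c))

        errors.append(frob_error(x, r, b, c))
    return r, b, c, errors

# Toy matrix with two row blocks and two column blocks.
x = [[5, 5, 0, 0],
     [5, 5, 0, 0],
     [0, 0, 3, 3],
     [0, 0, 3, 3]]
r, b, c, errors = bvd(x, k=2, l=2)
print("initial error:", errors[0], "final error:", errors[-1])
```

One common way to read hard clusters out of such a factorization, again an assumption rather than something stated in the abstract, is to assign each row to the column of R holding its largest entry and each column to the row of C holding its largest entry.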
