DOI: 10.1145/1015330.1015414
Article

Solving cluster ensemble problems by bipartite graph partitioning

Published: 04 July 2004

ABSTRACT

A critical problem in cluster ensemble research is how to combine multiple clusterings to yield a final superior clustering result. Leveraging advanced graph partitioning techniques, we solve this problem by reducing it to a graph partitioning problem. We introduce a new reduction method that constructs a bipartite graph from a given cluster ensemble. The resulting graph models both instances and clusters of the ensemble simultaneously as vertices in the graph. Our approach retains all of the information provided by a given ensemble, allowing the similarity among instances and the similarity among clusters to be considered collectively in forming the final clustering. Further, the resulting graph partitioning problem can be solved efficiently. We empirically evaluate the proposed approach against two commonly used graph formulations and show that it is more robust and achieves comparable or better performance in comparison to its competitors.
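The reduction described in the abstract can be illustrated with a minimal sketch. It assumes the ensemble is given as a list of label vectors (one per base clustering) and that an edge joins an instance vertex to each cluster vertex it belongs to; the abstract does not spell out the edge definition, so that rule, the vertex naming, and the function name `build_bipartite_graph` are assumptions made for illustration only. The partitioning step itself is left out: the resulting graph would be handed to a graph partitioner of the reader's choice.

```python
from typing import Hashable, List, Sequence, Tuple

# Hypothetical vertex naming used only in this sketch:
#   ("inst", i)            -> the i-th instance
#   ("clus", r, label)     -> cluster `label` of the r-th base clustering
Vertex = Tuple
Edge = Tuple[Vertex, Vertex]


def build_bipartite_graph(
    ensemble: Sequence[Sequence[Hashable]],
) -> Tuple[List[Vertex], List[Vertex], List[Edge]]:
    """Build a bipartite graph with instances on one side and clusters on the other.

    `ensemble[r][i]` is the label that base clustering r assigns to instance i.
    Assumption: an edge connects instance i to every cluster that contains it.
    """
    if not ensemble:
        raise ValueError("ensemble must contain at least one clustering")
    n = len(ensemble[0])
    if any(len(labels) != n for labels in ensemble):
        raise ValueError("all clusterings must label the same instances")

    instance_vertices: List[Vertex] = [("inst", i) for i in range(n)]
    cluster_vertices: List[Vertex] = []
    edges: List[Edge] = []

    for r, labels in enumerate(ensemble):
        # One vertex per distinct cluster in this base clustering,
        # listed in order of first appearance.
        seen = []
        for label in labels:
            if label not in seen:
                seen.append(label)
        cluster_vertices.extend(("clus", r, label) for label in seen)

        # One edge per (instance, cluster it belongs to) pair.
        for i, label in enumerate(labels):
            edges.append((("inst", i), ("clus", r, label)))

    return instance_vertices, cluster_vertices, edges


if __name__ == "__main__":
    # Two base clusterings of four instances (toy data, not from the paper).
    toy_ensemble = [[0, 0, 1, 1], ["a", "b", "b", "b"]]
    inst, clus, edges = build_bipartite_graph(toy_ensemble)
    print(len(inst), "instance vertices,", len(clus), "cluster vertices,", len(edges), "edges")
```

Because every instance lies in exactly one cluster per base clustering, each instance vertex has degree equal to the number of clusterings, and no edge joins two instances or two clusters, which is what makes the graph bipartite under these assumptions.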

      • Published in

        ICML '04: Proceedings of the twenty-first international conference on Machine learning
        July 2004
        934 pages
        ISBN: 1581138385
        DOI: 10.1145/1015330
        • Conference Chair: Carla Brodley

        Copyright © 2004 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 140 of 548 submissions, 26%
