ABSTRACT
A critical problem in cluster ensemble research is how to combine multiple clusterings to yield a superior final clustering. We solve this problem by reducing it to a graph partitioning problem, which lets us leverage advanced graph partitioning techniques. We introduce a new reduction method that constructs a bipartite graph from a given cluster ensemble. The resulting graph models both the instances and the clusters of the ensemble as vertices. Our approach retains all of the information provided by the ensemble, so the similarity among instances and the similarity among clusters are considered jointly in forming the final clustering. Further, the resulting graph partitioning problem can be solved efficiently. We empirically evaluate the proposed approach against two commonly used graph formulations and show that it is more robust and achieves comparable or better performance.
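As an illustration of the reduction described above, the following is a minimal sketch of how a bipartite graph might be built from a cluster ensemble. The abstract does not specify edge weights or data layout, so the unit-weight membership edges, the function name `build_bipartite_graph`, and the toy ensemble are all assumptions made for illustration. The partitioning step itself is left to whichever graph partitioner is used.

```python
def build_bipartite_graph(ensemble):
    """Build a bipartite graph from a cluster ensemble (illustrative sketch).

    ensemble: list of clusterings; each clustering is a list of labels,
    one label per instance, and all clusterings cover the same instances.
    Returns (n_instances, cluster_ids, edges). Instance vertices are numbered
    0..n-1 and cluster vertices n..n+k-1; every edge joins one of each.
    """
    n = len(ensemble[0])
    cluster_ids = []   # one vertex per (clustering index, label) pair
    index_of = {}
    edges = []
    for r, labels in enumerate(ensemble):
        if len(labels) != n:
            raise ValueError("every clustering must label every instance")
        for i, label in enumerate(labels):
            key = (r, label)
            if key not in index_of:
                index_of[key] = len(cluster_ids)
                cluster_ids.append(key)
            # Assumption: an unweighted edge records that instance i
            # belongs to cluster `key`.
            edges.append((i, n + index_of[key]))
    return n, cluster_ids, edges


# Hypothetical toy ensemble: two clusterings of four instances.
ensemble = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
n_instances, cluster_ids, edges = build_bipartite_graph(ensemble)
print(n_instances, len(cluster_ids), len(edges))
```

Because each instance contributes exactly one edge per clustering, the ensemble can be recovered from the graph, which is one way to read the claim that no information is lost. Partitioning this graph assigns instances and clusters to groups at the same time.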
- Solving cluster ensemble problems by bipartite graph partitioning