Abstract
Clustering in graphs aims to group vertices with similar patterns of connections. Applications include discovering communities and latent structures in graphs. Many algorithms have been proposed to find graph clusterings, but an open problem is the need for suitable comparison measures to quantitatively validate these algorithms, perform consensus clustering, and track evolving (graph) clusters across time. To date, most comparison measures have focused on comparing the vertex groupings and completely ignore differences in the structural approximations of the clusterings, which can lead to counter-intuitive comparisons. In this paper, we propose new measures that account for differences in the approximations. We focus on comparison measures for two important graph clustering approaches, community detection and blockmodelling, and propose comparison measures that work for weighted (and unweighted) graphs.
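The counter-intuitive behaviour of grouping-only comparisons can be illustrated with a small sketch. It is not one of the measures proposed in the paper: it assumes the Rand index as a stand-in for a membership-based measure, and a simple table of mean edge weights between clusters as a stand-in for the structural approximation. The toy graph, the labelings, and the helper names `rand_index` and `block_densities` are all hypothetical.

```python
from itertools import combinations

def rand_index(a, b):
    """Fraction of vertex pairs on which two labelings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def block_densities(adj, labels):
    """Mean edge weight between each ordered pair of clusters,
    ignoring self-loops. An illustrative structural summary only."""
    n = len(labels)
    clusters = sorted(set(labels))
    dens = {}
    for r in clusters:
        for c in clusters:
            cells = [adj[i][j] for i in range(n) for j in range(n)
                     if i != j and labels[i] == r and labels[j] == c]
            dens[(r, c)] = sum(cells) / len(cells) if cells else 0.0
    return dens

# Toy graph (assumed): two triangles {0,1,2} and {3,4,5}, with
# vertex 2 also linked to every vertex of the second triangle.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5),
         (2, 3), (2, 4), (2, 5)]
adj = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

reference = [0, 0, 0, 1, 1, 1]
moved_v2 = [0, 0, 1, 1, 1, 1]   # vertex 2 joins the second cluster
moved_v0 = [1, 0, 0, 1, 1, 1]   # vertex 0 joins the second cluster

ri_v2 = rand_index(reference, moved_v2)
ri_v0 = rand_index(reference, moved_v0)
dens_v2 = block_densities(adj, moved_v2)
dens_v0 = block_densities(adj, moved_v0)

print(ri_v2, ri_v0)
print(dens_v2[(1, 1)], dens_v2[(0, 1)])
print(dens_v0[(1, 1)], dens_v0[(0, 1)])
```

Under these assumptions, both alternative labelings are equally far from the reference according to the Rand index (10 of 15 pairs agree in each case), yet they describe the graph very differently: moving vertex 2 yields a fully connected second cluster (density 1.0) with sparse links between clusters (0.25), while moving vertex 0 yields a half-empty second cluster (0.5) with dense links between clusters (0.625). A measure that looks only at vertex groupings cannot separate the two cases.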
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chan, J. et al. (2014). Structure-Aware Distance Measures for Comparing Clusterings in Graphs. In: Tseng, V.S., Ho, T.B., Zhou, ZH., Chen, A.L.P., Kao, HY. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2014. Lecture Notes in Computer Science, vol 8443. Springer, Cham. https://doi.org/10.1007/978-3-319-06608-0_30
DOI: https://doi.org/10.1007/978-3-319-06608-0_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06607-3
Online ISBN: 978-3-319-06608-0
eBook Packages: Computer Science (R0)