Structure-Aware Distance Measures for Comparing Clusterings in Graphs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8443)

Abstract

Clustering in graphs aims to group vertices with similar patterns of connections. Applications include discovering communities and latent structures in graphs. Many algorithms have been proposed to find graph clusterings, but an open problem is the lack of suitable comparison measures to quantitatively validate these algorithms, perform consensus clustering, and track evolving (graph) clusters across time. To date, most comparison measures have focused on comparing the vertex groupings and completely ignore differences in the structural approximations of the clusterings, which can lead to counter-intuitive comparisons. In this paper, we propose new measures that account for differences in the approximations. We focus on comparison measures for two important graph clustering approaches, community detection and blockmodelling, and propose comparison measures that work for both weighted and unweighted graphs.
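
The counter-intuitive behaviour mentioned in the abstract can be illustrated with a small sketch. It does not implement the measures proposed in the paper: the Rand index stands in for a membership-only comparison, and a count of inter-cluster edges stands in for a crude structural summary. The toy graph and the names `rand_index`, `cut_edges`, `moved_bridge`, and `moved_interior` are all assumptions made for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of vertex pairs on which two clusterings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def cut_edges(edges, labels):
    """Number of edges whose endpoints lie in different clusters."""
    return sum(labels[u] != labels[v] for u, v in edges)


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

reference = [0, 0, 0, 1, 1, 1]       # one cluster per triangle
moved_bridge = [0, 0, 1, 1, 1, 1]    # vertex 2 (adjacent to the other triangle) moved
moved_interior = [1, 0, 0, 1, 1, 1]  # vertex 0 (no edge to the other triangle) moved

# The membership-only comparison cannot tell the two alternatives apart ...
print(rand_index(reference, moved_bridge), rand_index(reference, moved_interior))

# ... although they separate the graph's edges differently.
print(cut_edges(edges, moved_bridge), cut_edges(edges, moved_interior))
```

Both alternatives move exactly one vertex out of a three-vertex cluster, so any measure that looks only at the vertex groupings scores them identically against the reference. The edge counts differ because only one of the moved vertices has a connection to its new cluster, which is the kind of structural difference a membership-only measure cannot register.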

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Chan, J. et al. (2014). Structure-Aware Distance Measures for Comparing Clusterings in Graphs. In: Tseng, V.S., Ho, T.B., Zhou, ZH., Chen, A.L.P., Kao, HY. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2014. Lecture Notes in Computer Science, vol 8443. Springer, Cham. https://doi.org/10.1007/978-3-319-06608-0_30

  • DOI: https://doi.org/10.1007/978-3-319-06608-0_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06607-3

  • Online ISBN: 978-3-319-06608-0

  • eBook Packages: Computer Science, Computer Science (R0)
