DOI: 10.1145/1644893.1644930
research-article

Mining communities in networks: a solution for consistency and its evaluation

Published: 04 November 2009

ABSTRACT

Online social networks pose significant challenges to computer scientists, physicists, and sociologists alike because of their massive size, fast evolution, and uncharted potential for social computing. One problem that has particularly interested us is community identification. Many algorithms based on various metrics have been proposed for identifying communities in networks [18, 24], but few scale to very large networks. Three recent community identification algorithms, namely CNM [16], Wakita [59], and Louvain [10], stand out for their scalability to a few million nodes. All of them use modularity as the optimization metric. However, all three produce inconsistent communities whenever the order in which nodes are fed to the algorithm changes.
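
For readers unfamiliar with the metric, the modularity that these algorithms optimize is usually written in the standard Newman-Girvan form below. The abstract does not spell out the formula, so the notation is an assumption: $A_{ij}$ is the adjacency (or link-weight) matrix, $k_i = \sum_j A_{ij}$ is the degree of node $i$, $m = \tfrac{1}{2}\sum_{ij} A_{ij}$ is the total number (or weight) of links, and $c_i$ is the community assigned to node $i$.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Greedy optimizers of $Q$ merge or move nodes in the order they are visited, which is one way to see why a different node ordering can lead to a different local optimum and therefore a different partition.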

We propose two quantitative metrics to represent the level of consistency across multiple runs of an algorithm: pairwise membership probability and consistency. Based on these two metrics, we propose a solution that improves consistency without compromising modularity. We demonstrate that our solution, which uses pairwise membership probabilities as link weights, generates consistent communities within six or fewer cycles for most networks. However, our iterative, pairwise-membership-reinforcing approach does not converge for the Flickr, Orkut, and Cyworld networks as well as it does for the other networks. Our approach is empirically driven and has yet to be shown analytically to produce consistent output. We leave further investigation of the topological structure and its impact on consistency as future work.
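
The reinforcing loop described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not define the consistency formula, so `consistency_score` uses one plausible entropy-based summary, and `detect` is a hypothetical callable standing in for any modularity-based algorithm (such as CNM, Wakita, or Louvain) that accepts link weights and a seed controlling node order. All function and parameter names are invented for illustration; only the default of six cycles echoes the abstract.

```python
from math import log2


def pairwise_membership(edges, partitions):
    """For each link, the fraction of runs in which both endpoints share a community."""
    runs = len(partitions)
    return {
        (u, v): sum(p[u] == p[v] for p in partitions) / runs
        for u, v in edges
    }


def binary_entropy(p):
    """Entropy in bits of a yes/no outcome with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def consistency_score(probabilities):
    """ASSUMED summary: 1 minus the mean entropy; 1.0 means every run agrees."""
    if not probabilities:
        return 1.0
    mean_h = sum(binary_entropy(p) for p in probabilities.values()) / len(probabilities)
    return 1.0 - mean_h


def reinforce(edges, detect, runs=10, max_cycles=6):
    """Feed pairwise membership probabilities back in as link weights.

    `detect(weights, seed)` is a hypothetical callable that returns a
    dict mapping each node to a community label.
    """
    weights = {e: 1.0 for e in edges}  # start from an unweighted graph
    for cycle in range(1, max_cycles + 1):
        partitions = [detect(weights, seed) for seed in range(runs)]
        weights = pairwise_membership(edges, partitions)
        # Stop once every link is either always inside or always between communities.
        if all(p in (0.0, 1.0) for p in weights.values()):
            break
    return weights, cycle


# Toy data: four runs over a four-node path graph, with made-up labels.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
partitions = [
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 5, "b": 5, "c": 5, "d": 7},
    {"a": 1, "b": 1, "c": 2, "d": 2},
    {"a": 0, "b": 0, "c": 1, "d": 1},
]
weights = pairwise_membership(edges, partitions)
print(weights)
print(round(consistency_score(weights), 3))
```

Community labels are compared only within a single run, so the arbitrary renumbering of communities between runs does not affect the probabilities.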

To evaluate the quality of clustering, we examined 3 of the 48 communities identified in the AS graph. Surprisingly, all three have hierarchical, geographical, or topological interpretations for their groupings. Our preliminary evaluation of the quality of communities is promising. We plan to conduct a more thorough evaluation of the communities and to study network structures and their evolution using our approach.

References

  1. AS Ranking, CAIDA. http://as-rank.caida.org/.
  2. CIDR report. http://www.cidr-report.org/as2.0/.
  3. RIPE Data Search. http://www.db.ripe.net/whois.
  4. Y.-Y. Ahn et al. Analysis of topological characteristics of huge online social networking services. In WWW '07, pages 835--844, New York, NY, USA, 2007. ACM.
  5. R. Albert et al. Internet: Diameter of the world-wide web. Nature, 401(6749):130--131, 1999.
  6. D. Alderson et al. Understanding Internet topology: principles, models, and validation. IEEE/ACM Trans. Netw., 13(6):1205--1218, 2005.
  7. N. A. Alves. Unveiling community structures in weighted networks. Phys. Rev. E, 76(3):036101, 2007.
  8. A. Arenas et al. Synchronization reveals topological scales in complex networks. Phys. Rev. Lett., 96(11):114102, 2006.
  9. L. Backstrom et al. Group formation in large social networks: membership, growth, and evolution. In KDD '06, pages 44--54, New York, NY, USA, 2006. ACM.
  10. V. D. Blondel et al. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008 (12pp), 2008.
  11. U. Brandes and T. Erlebach. Network Analysis: Methodological Foundations. Springer, March 2005.
  12. H. Chang et al. Towards capturing representative AS-level Internet topologies. Computer Networks, 44(6):737--755, 2004.
  13. D. M. Chavis and A. Wandersman. Sense of community in the urban environment: a catalyst for participation and community development. American Journal of Community Psychology, 18:55--81, 2002.
  14. Q. Chen et al. The origin of power-laws in Internet topologies revisited. In IEEE INFOCOM, volume 2, pages 608--617. IEEE, 2002.
  15. A. Clauset. Finding local community structure in networks. Phys. Rev. E, 72(2):026132, Aug 2005.
  16. A. Clauset, M. E. J. Newman, and C. Moore. Finding community structure in very large networks. Phys. Rev. E, 70:066111, 2004.
  17. Cyworld. http://www.cyworld.com.
  18. L. Danon et al. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 2005(09):P09008+, September 2005.
  19. I. Derenyi et al. Clique percolation in random networks. Phys. Rev. Lett., 94(16), 2005.
  20. Y.-H. Eom et al. Evolution of weighted scale-free networks in empirical data. Phys. Rev. E, 77(5):056105, 2008.
  21. M. Faloutsos et al. On power-law relationships of the Internet topology. In SIGCOMM '99, pages 251--262, New York, NY, USA, 1999. ACM.
  22. G. W. Flake et al. Self-organization and identification of web communities. IEEE Computer, 35:66--71, 2002.
  23. S. Fortunato and M. Barthélemy. Resolution limit in community detection. Proc. Natl. Acad. Sci. U.S.A., 104(1):36--41, 2007.
  24. S. Fortunato and C. Castellano. Community structure in graphs, http://arxiv.org/abs/0712.2716, Dec 2007.
  25. M. Girvan and M. E. Newman. Community structure in social and biological networks. Proc. Natl. Acad. Sci. U.S.A., 99(12):7821--7826, June 2002.
  26. R. Guimerà et al. Modularity from fluctuations in random graphs and complex networks. Phys. Rev. E, 70(2):025101, Aug 2004.
  27. R. Guimera et al. The worldwide air transportation network: Anomalous centrality, community structure, and cities' global roles. Proc. Natl. Acad. Sci. U.S.A., 102(22):7794--7799, 2005.
  28. R. Guimera et al. Classes of complex networks defined by role-to-role connectivity profiles. Nature Physics, 3(1):63--69, January 2007.
  29. R. Guimera and L. A. N. Luis. Functional cartography of complex metabolic networks. Nature, 433(7028):895--900, February 2005.
  30. M. Gustafsson et al. Comparison and validation of community structures in complex networks. Physica A: Statistical Mechanics and its Applications, 367:559--576, July 2006.
  31. H. Jeong et al. Lethality and centrality in protein networks. Nature, 411(6833):41--42, May 2001.
  32. D. Kempe et al. Maximizing the spread of influence through a social network. In KDD '03, pages 137--146, New York, NY, USA, 2003. ACM.
  33. B. Krishnamurthy and J. Wang. Topology modeling via cluster graphs. In IMW '01: Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement, pages 19--23, New York, NY, USA, 2001. ACM.
  34. Y. I. Leon-Suematsu and K. Yuta. A framework for fast community extraction of large-scale networks. In WWW '08, pages 1215--1216, New York, NY, USA, 2008. ACM.
  35. J. Leskovec et al. Statistical properties of community structure in large social and information networks. In WWW '08, pages 695--704, New York, NY, USA, 2008. ACM.
  36. L. Li et al. A first-principles approach to understanding the Internet's router-level topology. SIGCOMM Comput. Commun. Rev., 34(4):3--14, 2004.
  37. D. Liben-Nowell and J. Kleinberg. The link prediction problem for social networks. In CIKM '03, pages 556--559, New York, NY, USA, 2003. ACM.
  38. Z. Liu and B. Hu. Epidemic spreading in community networks. Europhys. Lett., 72(2), 2005.
  39. B. Long et al. Community learning by graph approximation. In ICDM '07, pages 232--241, Washington, DC, USA, 2007. IEEE Computer Society.
  40. S. Lozano et al. Mesoscopic structure conditions the emergence of cooperation on social networks. PLoS ONE, 3(4):e1892, 04 2008.
  41. D. W. Mcmillan and D. M. Chavis. Sense of community: A definition and theory. Journal of Community Psychology, 14(1):6--23, 1986.
  42. A. Mislove. http://socialnetworks.mpi-sws.org.
  43. A. Mislove et al. Growth of the flickr social network. In Proceedings of the 1st ACM SIGCOMM Workshop on Social Networks (WOSN'08), August 2008.
  44. B. A. Nardi et al. Why we blog. Commun. ACM, 47(12):41--46, 2004.
  45. M. E. J. Newman. Fast algorithm for detecting community structure in networks. Phys. Rev. E, 69:066133, 2004.
  46. M. E. J. Newman. Modularity and community structure in networks. Proc. Natl. Acad. Sci. U.S.A., 103:8577, 2006.
  47. M. E. J. Newman and M. Girvan. Finding and evaluating community structure in networks. Phys. Rev. E, 69(2):026113, Feb 2004.
  48. R. Oliveira et al. Observing the evolution of Internet AS topology. In SIGCOMM '07, pages 313--324, New York, NY, USA, 2007. ACM.
  49. R. Oliveira et al. Quantifying the completeness of the observed Internet AS-level structure. Technical Report TR 080026, UCLA CS Department, September 2008.
  50. G. Palla et al. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814--818, 2005.
  51. G. Palla et al. Quantifying social group evolution. Nature, 446:664--667, April 2007.
  52. N. Pathak et al. Social topic models for community extraction. In The 2nd SNA-KDD Workshop'08, August 2008.
  53. F. Radicchi et al. Defining and identifying communities in networks. Proc. Natl. Acad. Sci. U.S.A., 101(9):2658--2663, 2004.
  54. M. Rosvall and C. T. Bergstrom. An information-theoretic framework for resolving community structure in complex networks. Proc. Natl. Acad. Sci. U.S.A., 104(18):7327--7331, 2007.
  55. M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. U.S.A., 105(4):1118--1123, 2008.
  56. B. Saha and L. Getoor. Group proximity measure for recommending groups in online social networks. In The 2nd SNA-KDD Workshop'08. ACM, August 2008.
  57. S. Tauro et al. A simple conceptual model for the Internet topology. In GLOBECOM '01, volume 3, pages 1667--1671, 2001.
  58. B. Viswanath et al. On the evolution of user interaction in facebook. In Proceedings of the 2nd ACM SIGCOMM Workshop on Social Networks (WOSN'09), August 2009.
  59. K. Wakita and T. Tsurumi. Finding community structure in mega-scale social networks. CoRR, abs/cs/0702048, 2007.
  60. S. Wasserman et al. Social Network Analysis: Methods and Applications. Cambridge University Press, November 1994.
  61. Wikipedia. http://www.wikipedia.org.
  62. W. W. Zachary. An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33:452--473, 1977.

    • Published in

      IMC '09: Proceedings of the 9th ACM SIGCOMM conference on Internet measurement
      November 2009
      468 pages
      ISBN: 9781605587714
      DOI: 10.1145/1644893

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 277 of 1,083 submissions, 26%
