ABSTRACT
Online social networks pose significant challenges to computer scientists, physicists, and sociologists alike because of their massive size, fast evolution, and uncharted potential for social computing. One particular problem that has interested us is community identification. Many algorithms based on various metrics have been proposed for identifying communities in networks [18, 24], but few scale to very large networks. Three recent community identification algorithms, namely CNM [16], Wakita [59], and Louvain [10], stand out for their scalability to a few million nodes. All of them use modularity as the optimization metric. However, all three produce inconsistent communities whenever the order in which nodes are presented to the algorithm changes.
We propose two quantitative metrics to represent the level of consistency across multiple runs of an algorithm: pairwise membership probability and consistency. Based on these two metrics, we propose a solution that improves consistency without compromising modularity. We demonstrate that our solution, which uses pairwise membership probabilities as link weights, generates consistent communities within six or fewer cycles for most networks. However, our iterative, pairwise-membership-reinforcing approach does not converge as well for the Flickr, Orkut, and Cyworld networks as it does for the rest of the networks. Our approach is empirically driven and has yet to be shown analytically to produce consistent output. We leave further investigation into the topological structure and its impact on consistency as future work.
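The abstract does not spell out how the pairwise membership probability is computed. As a minimal sketch, assume it is the fraction of runs in which the two endpoints of a link are assigned to the same community, and that this fraction becomes the link's weight in the next cycle. The function name `pairwise_membership`, the dictionary-based partition format, and the toy data below are all hypothetical and only illustrate that reading.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Node = Hashable
Edge = Tuple[Node, Node]
Partition = Dict[Node, int]  # node -> community label from one run


def pairwise_membership(partitions: List[Partition],
                        edges: Iterable[Edge]) -> Dict[Edge, float]:
    """Assumed definition: for each edge, the fraction of runs in which
    both endpoints were placed in the same community."""
    if not partitions:
        raise ValueError("need at least one partition")
    runs = len(partitions)
    return {
        (u, v): sum(p[u] == p[v] for p in partitions) / runs
        for (u, v) in edges
    }


# Three hypothetical runs over a 4-node path graph a-b-c-d.
# Community labels are only compared within a run, so they need not
# match across runs (run 3 uses labels 5 and 7).
edges = [("a", "b"), ("b", "c"), ("c", "d")]
runs = [
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 0, "d": 1},
    {"a": 5, "b": 5, "c": 7, "d": 7},
]

# Under the assumption above, these values would serve as link weights
# for the next cycle of community identification.
weights = pairwise_membership(runs, edges)
```

In this toy input, link a-b is grouped together in every run, so it keeps full weight, while b-c and c-d are grouped together in only some runs and receive fractional weights. Feeding such weights back into a weighted modularity optimizer and repeating is one way to read the iterative cycles mentioned above.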
To evaluate the quality of clustering, we have looked at three of the 48 communities identified in the AS graph. Surprisingly, all have hierarchical, geographical, or topological interpretations for their groupings. Our preliminary evaluation of the quality of communities is promising. We plan to conduct a more thorough evaluation of the communities and to study network structures and their evolution using our approach.
- AS Ranking, CAIDA. http://as-rank.caida.org/.
- CIDR report. http://www.cidr-report.org/as2.0/.
- RIPE Data Search. http://www.db.ripe.net/whois.
- Y.-Y. Ahn et al. Analysis of topological characteristics of huge online social networking services. In WWW '07, pages 835--844, New York, NY, USA, 2007. ACM.
- R. Albert et al. Internet: Diameter of the world-wide web. Nature, 401(6749):130--131, 1999.
- D. Alderson et al. Understanding Internet topology: principles, models, and validation. IEEE/ACM Trans. Netw., 13(6):1205--1218, 2005.
- N. A. Alves. Unveiling community structures in weighted networks. Phys. Rev. E, 76(3):036101, 2007.
- A. Arenas et al. Synchronization reveals topological scales in complex networks. Phys. Rev. Lett., 96(11):114102, 2006.
- L. Backstrom et al. Group formation in large social networks: membership, growth, and evolution. In KDD '06, pages 44--54, New York, NY, USA, 2006. ACM.
- V. D. Blondel et al. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008 (12pp), 2008.
- U. Brandes and T. Erlebach. Network Analysis: Methodological Foundations. Springer, March 2005.
- H. Chang et al. Towards capturing representative AS-level Internet topologies. Computer Networks, 44(6):737--755, 2004.
- D. M. Chavis and A. Wandersman. Sense of community in the urban environment: a catalyst for participation and community development. American Journal of Community Psychology, 18:55--81, 2002.
- Q. Chen et al. The origin of power-laws in Internet topologies revisited. In IEEE INFOCOM, volume 2, pages 608--617. IEEE, 2002.
- A. Clauset. Finding local community structure in networks. Phys. Rev. E, 72(2):026132, Aug 2005.
- A. Clauset, M. E. J. Newman, and C. Moore. Finding community structure in very large networks. Phys. Rev. E, 70:066111, 2004.
- Cyworld. http://www.cyworld.com.
- L. Danon et al. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 2005(09):P09008+, September 2005.
- I. Derenyi et al. Clique percolation in random networks. Phys. Rev. Lett., 94(16), 2005.
- Y.-H. Eom et al. Evolution of weighted scale-free networks in empirical data. Phys. Rev. E, 77(5):056105, 2008.
- M. Faloutsos et al. On power-law relationships of the Internet topology. In SIGCOMM '99, pages 251--262, New York, NY, USA, 1999. ACM.
- G. W. Flake et al. Self-organization and identification of web communities. IEEE Computer, 35:66--71, 2002.
- S. Fortunato and M. Barthélemy. Resolution limit in community detection. Proc. Natl. Acad. Sci. U.S.A., 104(1):36--41, 2007.
- S. Fortunato and C. Castellano. Community structure in graphs, http://arxiv.org/abs/0712.2716, Dec 2007.
- M. Girvan and M. E. Newman. Community structure in social and biological networks. Proc. Natl. Acad. Sci. U.S.A., 99(12):7821--7826, June 2002.
- R. Guimerà et al. Modularity from fluctuations in random graphs and complex networks. Phys. Rev. E, 70(2):025101, Aug 2004.
- R. Guimerà et al. The worldwide air transportation network: Anomalous centrality, community structure, and cities' global roles. Proc. Natl. Acad. Sci. U.S.A., 102(22):7794--7799, 2005.
- R. Guimerà et al. Classes of complex networks defined by role-to-role connectivity profiles. Nature Physics, 3(1):63--69, January 2007.
- R. Guimerà and L. A. N. Amaral. Functional cartography of complex metabolic networks. Nature, 433(7028):895--900, February 2005.
- M. Gustafsson et al. Comparison and validation of community structures in complex networks. Physica A: Statistical Mechanics and its Applications, 367:559--576, July 2006.
- H. Jeong et al. Lethality and centrality in protein networks. Nature, 411(6833):41--42, May 2001.
- D. Kempe et al. Maximizing the spread of influence through a social network. In KDD '03, pages 137--146, New York, NY, USA, 2003. ACM.
- B. Krishnamurthy and J. Wang. Topology modeling via cluster graphs. In IMW '01: Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement, pages 19--23, New York, NY, USA, 2001. ACM.
- Y. I. Leon-Suematsu and K. Yuta. A framework for fast community extraction of large-scale networks. In WWW '08, pages 1215--1216, New York, NY, USA, 2008. ACM.
- J. Leskovec et al. Statistical properties of community structure in large social and information networks. In WWW '08, pages 695--704, New York, NY, USA, 2008. ACM.
- L. Li et al. A first-principles approach to understanding the Internet's router-level topology. SIGCOMM Comput. Commun. Rev., 34(4):3--14, 2004.
- D. Liben-Nowell and J. Kleinberg. The link prediction problem for social networks. In CIKM '03, pages 556--559, New York, NY, USA, 2003. ACM.
- Z. Liu and B. Hu. Epidemic spreading in community networks. Europhys. Lett., 72(2), 2005.
- B. Long et al. Community learning by graph approximation. In ICDM '07, pages 232--241, Washington, DC, USA, 2007. IEEE Computer Society.
- S. Lozano et al. Mesoscopic structure conditions the emergence of cooperation on social networks. PLoS ONE, 3(4):e1892, 04 2008.
- D. W. McMillan and D. M. Chavis. Sense of community: A definition and theory. Journal of Community Psychology, 14(1):6--23, 1986.
- A. Mislove. http://socialnetworks.mpi-sws.org.
- A. Mislove et al. Growth of the Flickr social network. In Proceedings of the 1st ACM SIGCOMM Workshop on Social Networks (WOSN'08), August 2008.
- B. A. Nardi et al. Why we blog. Commun. ACM, 47(12):41--46, 2004.
- M. E. J. Newman. Fast algorithm for detecting community structure in networks. Phys. Rev. E, 69:066133, 2004.
- M. E. J. Newman. Modularity and community structure in networks. Proc. Natl. Acad. Sci. U.S.A., 103:8577, 2006.
- M. E. J. Newman and M. Girvan. Finding and evaluating community structure in networks. Phys. Rev. E, 69(2):026113, Feb 2004.
- R. Oliveira et al. Observing the evolution of Internet AS topology. In SIGCOMM '07, pages 313--324, New York, NY, USA, 2007. ACM.
- R. Oliveira et al. Quantifying the completeness of the observed Internet AS-level structure. Technical Report TR 080026, UCLA CS Department, September 2008.
- G. Palla et al. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814--818, 2005.
- G. Palla et al. Quantifying social group evolution. Nature, 446:664--667, April 2007.
- N. Pathak et al. Social topic models for community extraction. In The 2nd SNA-KDD Workshop'08, August 2008.
- F. Radicchi et al. Defining and identifying communities in networks. Proc. Natl. Acad. Sci. U.S.A., 101(9):2658--2663, 2004.
- M. Rosvall and C. T. Bergstrom. An information-theoretic framework for resolving community structure in complex networks. Proc. Natl. Acad. Sci. U.S.A., 104(18):7327--7331, 2007.
- M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. U.S.A., 105(4):1118--1123, 2008.
- B. Saha and L. Getoor. Group proximity measure for recommending groups in online social networks. In The 2nd SNA-KDD Workshop'08. ACM, August 2008.
- S. Tauro et al. A simple conceptual model for the Internet topology. In GLOBECOM '01, volume 3, pages 1667--1671, 2001.
- B. Viswanath et al. On the evolution of user interaction in Facebook. In Proceedings of the 2nd ACM SIGCOMM Workshop on Social Networks (WOSN'09), August 2009.
- K. Wakita and T. Tsurumi. Finding community structure in mega-scale social networks. CoRR, abs/cs/0702048, 2007.
- S. Wasserman et al. Social Network Analysis: Methods and Applications. Cambridge University Press, November 1994.
- Wikipedia. http://www.wikipedia.org.
- W. W. Zachary. An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33:452--473, 1977.
Index Terms
- Mining communities in networks: a solution for consistency and its evaluation