Abstract
Exponential random graph models (ERGMs) are a well-established family of statistical models for analyzing social networks. Computational complexity has so far limited the appeal of ERGMs for the analysis of large social networks. Efficient computational methods are highly desirable in order to extend the empirical scope of ERGMs. In this paper we report results of a research project on the development of snowball sampling methods for ERGMs. We propose an auxiliary parameter Markov chain Monte Carlo (MCMC) algorithm for sampling from the relevant probability distributions. The method is designed to decrease the number of allowed network states without worsening the mixing of the Markov chains, and suggests a new approach for the development of MCMC samplers for ERGMs. We demonstrate the method on both simulated and empirical network data and show that it reduces CPU time for parameter estimation by an order of magnitude compared to current MCMC methods.
References
Borgatti, S.P., Mehra, A., Brass, D.J., Labianca, G.: Network analysis in the social sciences. Science 323, 892–895 (2009)
Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge (1994)
Jackson, M.O.: Social and Economic Networks. Princeton University Press, Princeton (2008)
Friedkin, N.E.: A Structural Theory of Social Influence. Cambridge University Press, Cambridge (2006)
Ward, M.D., Stovel, K., Sacks, A.: Network analysis and political science. Annu. Rev. Polit. Sci. 14, 245–264 (2011)
Hollway, J., Koskinen, J.: Multilevel embeddedness: the case of the global fisheries governance complex. Soc. Netw. 44, 281–294 (2016)
Christakis, N.A., Fowler, J.H.: The spread of obesity in a large social network over 32 years. N. Engl. J. Med. 357(4), 370–379 (2007)
Kretzschmar, M., Morris, M.: Measures of concurrency in networks and the spread of infectious disease. Math. Biosci. 133(2), 165–195 (1996)
Rolls, D.A., Wang, P., Jenkinson, R., Pattison, P.E., Robins, G.L., Sacks-Davis, R., Daraganova, G., Hellard, M., McBryde, E.: Modelling a disease-relevant contact network of people who inject drugs. Soc. Netw. 35(4), 699–710 (2013)
Girvan, M., Newman, M.E.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. 99(12), 7821–7826 (2002)
Newman, M.E., Park, J.: Why social networks are different from other types of networks. Phys. Rev. E 68(3), 036122 (2003)
Newman, M.E., Watts, D.J., Strogatz, S.H.: Random graph models of social networks. Proc. Natl. Acad. Sci. 99(suppl 1), 2566–2572 (2002)
Schweitzer, F., Fagiolo, G., Sornette, D., Vega-Redondo, F., Vespignani, A., White, D.R.: Economic networks: the new challenges. Science 325(5939), 422 (2009)
Snijders, T.A.: Statistical models for social networks. Annu. Rev. Sociol. 37, 131–153 (2011)
Frank, O., Strauss, D.: Markov graphs. J. Am. Stat. Assoc. 81(395), 832–842 (1986)
Lusher, D., Koskinen, J., Robins, G.: Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications. Cambridge University Press, Cambridge (2012)
Snijders, T.A., Pattison, P.E., Robins, G.L., Handcock, M.S.: New specifications for exponential random graph models. Sociol. Methodol. 36(1), 99–153 (2006)
Snijders, T.A.: Markov chain Monte Carlo estimation of exponential random graph models. J. Soc. Struct. 3(2), 1–40 (2002)
Hummel, R.M., Hunter, D.R., Handcock, M.S.: Improving simulation-based algorithms for fitting ERGMs. J. Comput. Gr. Stat. 21(4), 920–939 (2012)
Caimo, A., Friel, N.: Bayesian inference for exponential random graph models. Soc. Netw. 33(1), 41–55 (2011)
Jin, I.H., Yuan, Y., Liang, F.: Bayesian analysis for exponential random graph models using the adaptive exchange sampler. Stat. Interface 6(4), 559 (2013)
Wang, J., Atchadé, Y.F.: Approximate Bayesian computation for exponential random graph models for large social networks. Commun. Stat. Simul. Comput. 43(2), 359–377 (2014)
Handcock, M.S., Hunter, D.R., Butts, C.T., Goodreau, S.M., Morris, M.: statnet: Software tools for the representation, visualization, analysis and simulation of network data. J. Stat. Softw. 24(1), 1548 (2008)
Wang, P., Robins, G., Pattison, P.: PNet: Program for the Estimation and Simulation of p* Exponential Random Graph Models, User Manual. Department of Psychology, University of Melbourne, Melbourne (2006)
Mira, A.: Ordering and improving the performance of Monte Carlo Markov chains. Stat. Sci. 16, 340–350 (2001)
Sengupta, B., Friston, K.J., Penny, W.D.: Gradient-free MCMC methods for dynamic causal modelling. NeuroImage 112, 375–381 (2015)
Hastings, W.K.: Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1), 97–109 (1970)
Swendsen, R.H., Wang, J.-S.: Replica Monte Carlo simulation of spin-glasses. Phys. Rev. Lett. 57(21), 2607 (1986)
Barkema, G., Newman, M.: New Monte Carlo algorithms for classical spin systems. arXiv preprint cond-mat/9703179 (1997)
Swendsen, R.H., Wang, J.-S.: Nonuniversal critical dynamics in Monte Carlo simulations. Phys. Rev. Lett. 58(2), 86 (1987)
Fischer, R., Leitão, J.C., Peixoto, T.P., Altmann, E.G.: Sampling motif-constrained ensembles of networks. Phys. Rev. Lett. 115(18), 188701 (2015)
Pakman, A., Paninski, L.: Auxiliary-variable exact Hamiltonian Monte Carlo samplers for binary distributions. In: Advances in Neural Information Processing Systems, pp. 1–9 (2013)
Hunter, D.R., Handcock, M.S.: Inference in curved exponential family models for networks. J. Comput. Gr. Stat. 15(3), 565–583 (2006)
Hunter, D.R.: Curved exponential family models for social networks. Soc. Netw. 29(2), 216–230 (2007)
Tierney, L.: Markov chains for exploring posterior distributions. Ann. Stat. 22, 1701–1728 (1994)
Hunter, D.R., Handcock, M.S., Butts, C.T., Goodreau, S.M., Morris, M.: ergm: a package to fit, simulate and diagnose exponential-family models for networks. J. Stat. Softw. 24(3), nihpa54860 (2008)
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E.: Equation of state calculations by fast computing machines. J. Chem. Phys. 21(6), 1087–1092 (1953)
McAllister, R.R., McCrea, R., Lubell, M.N.: Policy networks, stakeholder interactions and climate adaptation in the region of South East Queensland, Australia. Reg. Environ. Change 14(2), 527–539 (2014)
Niekamp, A.-M., Mercken, L.A., Hoebe, C.J., Dukers-Muijrers, N.H.: A sexual affiliation network of swingers, heterosexuals practicing risk behaviours that potentiate the spread of sexually transmitted infections: a two-mode approach. Soc. Netw. 35(2), 223–236 (2013)
Morris, M., Handcock, M.S., Hunter, D.R.: Specification of exponential-family random graph models: terms and computational aspects. J. Stat. Softw. 24(4), 1548 (2008)
Cowles, M.K., Carlin, B.P.: Markov chain Monte Carlo convergence diagnostics: a comparative review. J. Am. Stat. Assoc. 91(434), 883–904 (1996)
Plummer, M., Best, N., Cowles, K., Vines, K.: CODA: convergence diagnosis and output analysis for MCMC. R News 6(1), 7–11 (2006)
Stivala, A.D., Koskinen, J.H., Rolls, D.A., Wang, P., Robins, G.L.: Snowball sampling for estimating exponential random graph models for large networks. Soc. Netw. 47, 167–188 (2016). doi:10.1016/j.socnet.2015.11.003
Pattison, P.E., Robins, G.L., Snijders, T.A., Wang, P.: Conditional estimation of exponential random graph models from snowball sampling designs. J. Math. Psychol. 57(6), 284–296 (2013)
Efron, B.: Better bootstrap confidence intervals. J. Am. Stat. Assoc. 82(397), 171–185 (1987)
Newman, M.E.: The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. 98(2), 404–409 (2001)
Roberts, G.O., Gelman, A., Gilks, W.R.: Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Prob. 7(1), 110–120 (1997)
Iwashyna, T.J., Christie, J.D., Moody, J., Kahn, J.M., Asch, D.A.: The structure of critical care transfer networks. Med. Care 47(7), 787 (2009)
Lomi, A., Pallotti, F.: Relational collaboration among spatial multipoint competitors. Soc. Netw. 34(1), 101–111 (2012)
Haario, H., Laine, M., Mira, A., Saksman, E.: DRAM: efficient adaptive MCMC. Stat. Comput. 16(4), 339–354 (2006)
Acknowledgments
This work was funded by PASC project “Snowball sampling and conditional estimation for exponential random graph models for large networks in high performance computing” and was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID c09. This research was also supported by a Victorian Life Sciences Computation Initiative (VLSCI) grant number VR0261 on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian Government, Australia.
Additional information
Generous support from the Swiss National Platform of Advanced Scientific Computing (PASC) is gratefully acknowledged.
Appendix: IFD Sampler Algorithm
Let RN be a uniform random number between 0 and 1, and let m, M and K be constants.
The proposed IFD sampler proceeds as follows.
1. Initialization: \(t=0\).
2. Initialization: \(N_{A}=0\); \(N_{D}=0\). Increment \(t\).
3. While \(N_{D}+N_{A}<m\):
3.1. Increment \(N_{A}\). Choose a null dyad (\(x_{ij}=0\)) uniformly at random. Using (19), calculate the probability \(P\) of toggling its value. If \(P\ge RN\), toggle the value of \(x_{ij}\) and go to step 3.2. If \(P<RN\), go to step 3.1.
3.2. Increment \(N_{D}\). Choose a non-null dyad (\(x_{ij}=1\)) uniformly at random. Using (19), calculate the probability \(P\) of toggling its value. If \(P\ge RN\), toggle the value of \(x_{ij}\) and go to step 3.1. If \(P<RN\), go to step 3.2.
4. Update the auxiliary parameter: \(V_{t}=V_{t-1}-K\cdot \mathrm{sgn}(N_{D}-N_{A})\,(N_{D}-N_{A})^{2}\).
5. Check whether (15) holds. If \(N_{D}\approx N_{A}\), then (15) is satisfied. If \(\left| N_{D}-N_{A}\right| /(N_{D}+N_{A})>0.8\), then a larger value of K may be required.
6. If \(t<M\), go to step 2.
Here M is the minimum number of steps required to reach the stationary distribution [18]. The suggested value of m is 100. The value of K should be small; we suggest \(K=10^{-5}\). If this value is too small, this is detected in step 5 of the above algorithm and K is increased. Although any initial value \(V_{0}\) of the auxiliary parameter may be used, we used a value that satisfies (15). Such a value is easily estimated in a pre-computing step before the MCMC simulation, by running a small number of iterations (\(M=10\)) of the above algorithm without modifying x (the “toggle \(x_{ij}\) value” instruction is not executed in the pre-computing step).
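The control flow of the steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `ifd_sampler`, `toggle_prob` and `edge_only_prob` are hypothetical; `toggle_prob` is a caller-supplied stand-in for the toggle probability of (19), which is not reproduced in this appendix, and the edge-only form used in the usage lines is a toy choice made only so that the example runs. The sketch further assumes an undirected sparse graph, a fresh RN for every attempt, and that each outer iteration enters step 3 at sub-step 3.1.

```python
import math
import random


def ifd_sampler(n, edges, toggle_prob, V0, m=100, M=200, K=1e-5,
                toggle=True, seed=0):
    """Run steps 1-6 on an undirected graph with nodes 0..n-1.

    Returns the final edge set, the final auxiliary parameter V, and the
    per-iteration imbalance |N_D - N_A| / (N_D + N_A) checked in step 5.
    """
    if n < 2 or m < 1:
        raise ValueError("need n >= 2 and m >= 1")
    rng = random.Random(seed)
    x = set(edges)                      # edges stored as (i, j) with i < j
    max_edges = n * (n - 1) // 2
    V = V0                              # step 1 (t = 0)
    imbalance = []

    for _t in range(1, M + 1):          # steps 2 and 6: iterate t = 1..M
        N_A = N_D = 0
        adding = True                   # assumption: step 3 starts at 3.1
        while N_A + N_D < m:            # step 3
            if adding:                  # step 3.1: propose a null dyad
                N_A += 1
                if len(x) == max_edges:
                    continue            # no null dyad exists; treat as rejected
                while True:             # rejection sampling, fine when sparse
                    i, j = sorted(rng.sample(range(n), 2))
                    if (i, j) not in x:
                        break
            else:                       # step 3.2: propose a non-null dyad
                N_D += 1
                if not x:
                    continue            # no tie exists; treat as rejected
                i, j = rng.choice(sorted(x))
            P = toggle_prob(x, i, j, V, adding)
            if P >= rng.random():       # fresh RN per attempt (assumption)
                if toggle:              # skipped in the pre-computing step
                    if adding:
                        x.add((i, j))
                    else:
                        x.discard((i, j))
                adding = not adding     # switch between 3.1 and 3.2
        d = N_D - N_A
        V -= K * math.copysign(d * d, d)        # step 4: sgn(d) * d^2
        imbalance.append(abs(d) / (N_D + N_A))  # step 5 diagnostic
    return x, V, imbalance


def edge_only_prob(theta):
    """Toy toggle probability (NOT Eq. (19)): a single edge parameter."""
    def prob(x, i, j, V, adding):
        s = theta + V if adding else -(theta + V)
        return math.exp(min(0.0, s))    # same as min(1, exp(s)), no overflow
    return prob


prob = edge_only_prob(theta=-2.0)
start = {(0, 1), (2, 3)}
# Pre-computing step: M = 10 iterations without modifying x, to estimate V_0.
_, V0, _ = ifd_sampler(30, start, prob, V0=0.0, M=10, toggle=False)
graph, V, imbalance = ifd_sampler(30, start, prob, V0=V0)
```

The returned `imbalance` list holds the step 5 ratio for each iteration, so a caller can look for values above 0.8 and rerun with a larger K; the sketch does not adjust K itself because the text does not specify how much to increase it. Sorting the edge set before each draw keeps runs reproducible for a given seed but costs time proportional to the number of ties, so a sampler for large networks would more likely keep an indexable edge list.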
Cite this article
Byshkin, M., Stivala, A., Mira, A. et al. Auxiliary Parameter MCMC for Exponential Random Graph Models. J Stat Phys 165, 740–754 (2016). https://doi.org/10.1007/s10955-016-1650-5
DOI: https://doi.org/10.1007/s10955-016-1650-5