An introduction to exponential random graph (p*) models for social networks
Why model social networks?
There are many well-known techniques that measure properties of a network, of the nodes, or of subsets of nodes (e.g., density, centrality and cohesive subsets). These techniques serve valuable purposes in describing and understanding network features that might bear on particular research questions. Why, then, might we want to go beyond these techniques and search for a well-fitting model of an observed social network, and in particular a statistical model? Reasons for doing so include the
The logic behind p* models for social networks: an outline
We describe as the observed network the network data the researcher has collected and is interested in modeling. The observed network is regarded as one realization from a set of possible networks with similar important characteristics (at the very least, the same number of actors), that is, as the outcome of some (unknown) stochastic process. In other words, the observed network is seen as one particular pattern of ties out of a large set of possible patterns. In general, we do not know what
The general form of the exponential random graph model: dependence assumptions and parameter constraints
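Exponential random graph models have the following form:

$$\Pr(Y = y) = \frac{1}{\kappa} \exp\Big\{ \sum_{A} \eta_A \, g_A(y) \Big\} \qquad (1)$$

where (i) the summation is over all configurations $A$; (ii) $\eta_A$ is the parameter corresponding to the configuration $A$ (and is non-zero only if all pairs of variables in $A$ are assumed to be conditionally dependent); (iii) $g_A(y)$ is the network statistic corresponding to configuration $A$, with $g_A(y) = 1$ if the configuration is observed in the network $y$ and $g_A(y) = 0$ otherwise; (iv) $\kappa$ is a normalizing constant that makes (1) a proper probability distribution.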
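To make the general form concrete, the following is a minimal sketch that evaluates it by brute force on a three-node non-directed network, assuming the standard form Pr(Y = y) = (1/κ) exp{Σ_A η_A g_A(y)}. The function name `ergm_distribution`, the choice of configurations (three single edges and one triangle) and the parameter values are illustrative assumptions, not taken from the article.

```python
from itertools import combinations, product
from math import exp


def ergm_distribution(n, configurations, eta):
    """Return {tie pattern: probability} over all non-directed graphs on n nodes.

    configurations: list of tuples of dyads, e.g. ((0, 1),) or ((0, 1), (1, 2)).
    eta: list of parameters, one per configuration (hypothetical values).
    """
    dyads = list(combinations(range(n), 2))
    weights = {}
    for ties in product([0, 1], repeat=len(dyads)):
        y = dict(zip(dyads, ties))
        # g_A(y) is 1 only if every tie in configuration A is present in y.
        score = sum(e * all(y[d] for d in A) for A, e in zip(configurations, eta))
        weights[ties] = exp(score)
    kappa = sum(weights.values())  # normalizing constant: sum over all graphs
    return {ties: w / kappa for ties, w in weights.items()}


# Illustrative setup: one configuration per possible edge, plus the triangle.
dyads3 = list(combinations(range(3), 2))  # [(0, 1), (0, 2), (1, 2)]
configs = [(d,) for d in dyads3] + [tuple(dyads3)]
eta = [-1.0, -1.0, -1.0, 2.0]  # assumed values, chosen only for the demo
dist = ergm_distribution(3, configs, eta)
```

Here `dist` maps each of the 8 possible tie patterns to its probability. A negative edge parameter makes ties costly, while the positive triangle parameter raises the relative probability of the complete graph.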
Bernoulli graphs: the simplest dependence assumption
Bernoulli random graph distributions are generated when we assume that edges are independent, for instance if they occur randomly according to a fixed probability α (see Erdős and Rényi, 1959; Frank and Nowicki, 1993). The dependence assumption is simple in this case: all possible distinct ties are independent of one another. We noted above that the only configurations relevant to the model are those in which all possible ties in the configuration are conditionally dependent on each other. When
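The independence assumption can be checked numerically. The sketch below compares the probability of a graph under independent ties with probability α against an exponential form with a single edge parameter. The symbols `theta` (edge parameter), `num_edges` and `num_dyads` are assumed notation for this illustration. The identity θ = log(α / (1 − α)) with κ = (1 + e^θ)^M is a standard result for independent ties, not a quotation from the article.

```python
from math import exp, log, isclose


def bernoulli_prob(num_edges, num_dyads, alpha):
    """Probability of one specific graph when each tie occurs independently
    with probability alpha."""
    return alpha ** num_edges * (1 - alpha) ** (num_dyads - num_edges)


def ergm_edge_prob(num_edges, num_dyads, theta):
    """The same probability written as exp(theta * L) / kappa,
    where kappa = (1 + exp(theta)) ** M for M independent dyads."""
    kappa = (1 + exp(theta)) ** num_dyads
    return exp(theta * num_edges) / kappa


alpha = 0.3                       # assumed tie probability for the demo
theta = log(alpha / (1 - alpha))  # log-odds of a tie
M = 6                             # number of dyads among 4 nodes

# The two forms agree for every possible edge count.
agree = all(
    isclose(bernoulli_prob(L, M, alpha), ergm_edge_prob(L, M, theta))
    for L in range(M + 1)
)
```

In this special case the normalizing constant has a closed form, which is why Bernoulli graphs are easy to work with and models with dependent ties are not.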
Estimation
Anderson et al. (1999) in their p* primer used pseudo-likelihood estimation introduced by Strauss and Ikeda (1990) in order to estimate the parameters of Markov models. We now know that, depending on the data, there may be serious problems with pseudo-likelihood estimates for these models. But for Markov random graph models, standard maximum likelihood estimation is not tractable for any but very small networks, because of the difficulties in calculating the normalizing constant in Eq. (1).
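The scale of the problem is easy to illustrate. The normalizing constant sums over every possible non-directed graph on n nodes, and there are 2^(n(n−1)/2) of them. The helper name `num_undirected_graphs` below is an assumption made for this sketch.

```python
def num_undirected_graphs(n):
    """Number of distinct non-directed graphs on n labelled nodes:
    each of the n * (n - 1) / 2 dyads is either present or absent."""
    return 2 ** (n * (n - 1) // 2)


# Number of terms in the normalizing constant for a few network sizes.
sizes = {n: num_undirected_graphs(n) for n in (3, 5, 10, 16)}
```

For n = 16 the sum would have 2^120 terms, roughly 1.3 × 10^36, so direct evaluation is out of reach even for a network of that modest size.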
A short example: a Markov random graph model for the Medici business network
Other papers in this special edition provide examples of fitting exponential random graph models to data, so here we present a very short example. We fit a Markov random graph model for the well-known non-directed network of business connections among 16 Florentine families, available in UCINET 5 (Borgatti et al., 1999). (For a fuller description of the context of the data, see Padgett and Ansell, 1993.) The model includes edge, two-star, three-star and triangle parameters as in Eq. (4). This
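As a rough illustration of the statistics that edge, two-star, three-star and triangle parameters attach to, the sketch below counts those configurations in a small non-directed graph. The function name `markov_statistics` and the four-node toy edge list are assumptions for this example. The toy graph is not the Florentine families data, and no fitted values are implied.

```python
from itertools import combinations
from math import comb


def markov_statistics(n, edges):
    """Count edges, two-stars, three-stars and triangles in a non-directed
    graph on nodes 0..n-1 given as a list of (i, j) pairs."""
    edge_set = {frozenset(e) for e in edges}
    degree = [0] * n
    for e in edge_set:
        for v in e:
            degree[v] += 1
    # A triangle is a triple of nodes with all three ties present.
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edge_set
    )
    return {
        "edges": len(edge_set),
        # A k-star is a node together with k of its ties.
        "two_stars": sum(comb(d, 2) for d in degree),
        "three_stars": sum(comb(d, 3) for d in degree),
        "triangles": triangles,
    }


# Toy graph (hypothetical): a triangle on nodes 0, 1, 2 plus a tie from 0 to 3.
toy_edges = [(0, 1), (0, 2), (1, 2), (0, 3)]
stats = markov_statistics(4, toy_edges)
```

On the toy graph this yields 4 edges, 5 two-stars, 1 three-star and 1 triangle. Star counts follow directly from the degree sequence, whereas triangles require checking triples of nodes.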
Conclusion
This article provides an introductory exposition of the formulation and application of exponential random graph models for social networks. We have concentrated on presenting the underlying logic and derivation of these models. Given the limitations of space, we have given only summary attention to more recent developments, which are discussed in other papers in this special edition.
Recent work on the Markov random graph models of Frank and Strauss (1986) shows that they may be inadequate
Acknowledgements
We thank an anonymous reviewer for helpful comments in improving earlier versions of the paper. This research was assisted by grants from the Australian Research Council.
References (39)
- et al. A p* primer: logit models for social networks. Social Networks (1999).
- et al. Multiplexity, generalized exchange and cooperation in organizations. Social Networks (1999).
- et al. Position in formal structure, personal characteristics and choices of advisors in a law firm: a logistic regression model for dyadic network data. Social Networks (1997).
- et al. Network models for social selection processes. Social Networks (2001).
- et al. Models for social networks with missing data. Social Networks (2004).
- Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society, Series B (1974).
- Statistical analysis of non-lattice data. The Statistician (1975).
- et al. UCINET 5 for Windows: Software for Social Network Analysis (1999).
- et al. Testing multi-theoretical multilevel hypotheses about organizational networks: an analytic framework and empirical example. Academy of Management Journal (2006).
- et al. On random graphs. I. Publicationes Mathematicae (Debrecen) (1959).