Elsevier

Social Networks

Volume 35, Issue 2, May 2013, Pages 211-222
Exponential random graph model specifications for bipartite networks—A dependence hierarchy

https://doi.org/10.1016/j.socnet.2011.12.004

Abstract

In this paper, we review the development of dependence structures for exponential random graph models for bipartite networks, and propose a hierarchy of dependence structures within which different dependence assumptions may be located. Based on this hierarchy, we propose a new set of model specifications by including bipartite graph configurations involving more than four nodes. We discuss the theoretical significance of the various effects that the extended models afford, and illustrate application of this hierarchy of models to several bipartite networks related to the political mobilization in Brazil in the early 1990s (Mische, 2007).

Introduction

Exponential random graph models (ERGMs), also known as p* models, were introduced by Frank and Strauss (1986) and Wasserman and Pattison (1996) and allow global network structure to be understood by focusing on local processes. Such models assign probabilities to networks based on local network configurations that represent the local network processes. The local network effects included in a particular model specification are based on dependence assumptions between network tie variables. From the Bernoulli random graph models (Erdös and Rényi, 1960) to the recent social circuit models (Snijders et al., 2006, Robins et al., 2007a, Robins et al., 2007b), the ERGM specifications for one-mode networks have been enriched by various forms of dependence assumptions and the inferred endogenous network effects. The tie dependence assumptions not only generate the model specifications but also provide guidance on the interpretation of estimated effects.
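For orientation, the general ERGM form can be written as follows. This is the standard textbook expression rather than a formula quoted from this excerpt, and the symbols θ_Q, z_Q and κ are notation assumed here: z_Q(x) counts configurations of type Q in the network x, θ_Q is the corresponding parameter, and κ(θ) normalises over all possible networks x' on the same node set.

```latex
\Pr(X = x) = \frac{1}{\kappa(\theta)} \exp\Big( \sum_{Q} \theta_Q \, z_Q(x) \Big),
\qquad
\kappa(\theta) = \sum_{x'} \exp\Big( \sum_{Q} \theta_Q \, z_Q(x') \Big)
```

Which configurations Q have non-zero parameters is what the dependence assumption determines.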

Based on an analogous set of dependence assumptions, ERGM specifications for bipartite networks were developed by Skvoretz and Faust (1999), Pattison and Robins (2002, 2004), Agneessens and Roose (2008) and Wang et al. (2009). Bipartite ERGMs with exogenous effects were also proposed by Agneessens et al. (2004) and Agneessens and Roose (2008). MCMC maximum likelihood estimation methods (e.g. Snijders, 2002) provide us with more reliable model estimates than the older pseudolikelihood approaches (Frank and Strauss, 1986, Strauss and Ikeda, 1990; see also van Duijn et al., 2009, for a comparison), so that these models can be estimated in a principled way. Wang et al. (2009) introduced an ERGM specification for bipartite networks which adapted the social circuit model for one-mode networks (Snijders et al., 2006), and applied a geometric weighting on related graph statistics to alleviate the issue of model degeneracy (Handcock, 2003).

In this paper, we review and discuss the development of dependence structures for exponential random graph (p*) models for bipartite networks, and propose a hierarchy of dependence structures within which different dependence assumptions may be located. The motivation for developing such a hierarchy is twofold: first, the current ERGM specifications have difficulty converging for some observed networks; second, not all converged models provide a good fit to the observed network. These two points suggest the need for more elaborate model specifications which may include more complicated graph statistics. This raises the question of how potential ERGM specifications may be developed in a systematic manner. The proposed hierarchy is based on graph theoretical properties, and aims to provide a theoretical framework to extend the current set of dependence hypotheses for ERGMs, such that new configurations can be introduced to the model specifications in a principled way. It is not intended to solve the model degeneracy problem with ERGMs, or to provide general rules for parameter selection in specific models. However, such issues may be alleviated by using the specifications we introduce based on the proposed hierarchy.

We generate the hierarchy by a generalized dependence assumption governed by two factors: the form of proximity between a potentially dependent pair of ties; and the graph theoretical distance with which that proximity is described. This hierarchy can be used to guide the exploration and development of models for observed bipartite network structures. As dependence becomes more complex within the hierarchy, more relaxed conditions govern the pairs of network ties that are considered conditionally dependent, and more complex network effects are introduced into ERGMs for bipartite networks. As a result, we propose a systematically elaborated set of model specifications which extend the models discussed by Wang et al. (2009) and include bipartite graph statistics involving more than four nodes. In particular, we focus on the new edge-cycle network configuration which represents an interaction between network closure and activity. Similar geometric weighting techniques are used for definitions of the higher order configurations to assist model convergence in MCMC estimation. We discuss the theoretical significance of the various effects that the extended models afford using simulations, and illustrate application of this hierarchy of models to bipartite networks related to the political mobilization in Brazil in the early 1990s (Mische, 2007). The simulation and modelling examples are conducted using the BPNet programme as an extension to PNet (Wang et al., 2006).
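To give a concrete sense of what geometric weighting does, here is a minimal Python sketch of an alternating k-star statistic in the style of Snijders et al. (2006), applied to the degrees of one node set. The function names and the default weight lam = 2 are illustrative assumptions. The exact weighted definitions used for the higher order configurations in this paper are not shown in this excerpt, so this is not a reproduction of them.

```python
from math import comb

def alternating_k_stars(degrees, lam=2.0):
    """Sum over k >= 2 of (-1)^k * S_k / lam^(k-2), where S_k is the k-star count.

    A node with degree d is the centre of comb(d, k) k-stars, so the sum is
    taken node by node. Larger lam gives more weight to high-order stars.
    """
    total = 0.0
    for d in degrees:
        for k in range(2, d + 1):
            total += (-1) ** k * comb(d, k) / lam ** (k - 2)
    return total

def alternating_k_stars_closed_form(degrees, lam=2.0):
    """Equivalent closed form obtained from the binomial theorem."""
    return lam ** 2 * sum((1 - 1 / lam) ** d + d / lam - 1 for d in degrees)

# Hypothetical degrees for three nodes in one node set.
example_degrees = [2, 3, 2]
direct = alternating_k_stars(example_degrees)
closed = alternating_k_stars_closed_form(example_degrees)
```

The alternating signs and the down-weighting by powers of lam keep the contribution of very high degree nodes bounded, which is the property that helps with convergence.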

Section snippets

Bipartite networks and exponential random graph models

A bipartite network has two distinct sets of nodes, and ties are only defined between nodes from different sets. We label the two sets of nodes as A and B. With n nodes in set A, and m nodes in set B, we can represent an (n, m) bipartite network by an n by m matrix x. If there is a network tie between node i in A and node j in B, then the cell entry x_ij = 1; otherwise x_ij = 0. As ties are only defined between the two sets of nodes, bipartite networks have the feature that cycles of odd lengths do not occur.
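As a minimal sketch of this matrix representation, the Python below stores a small bipartite network as an n by m list of lists and computes a few basic statistics: the degrees of both node sets, the number of ties, and the number of four-cycles, the shortest cycle a bipartite network can contain. The toy data and all function names are illustrative assumptions and do not come from the paper or from BPNet.

```python
from math import comb

# Hypothetical toy network: n = 3 nodes in set A (rows), m = 4 nodes in set B (columns).
x = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]

def degrees(x):
    """Return (degrees of A nodes, degrees of B nodes) as row and column sums."""
    deg_a = [sum(row) for row in x]
    deg_b = [sum(col) for col in zip(*x)]
    return deg_a, deg_b

def edge_count(x):
    """Total number of ties in the network."""
    return sum(sum(row) for row in x)

def four_cycle_count(x):
    """Count four-cycles.

    Two A nodes that share s neighbours in B close comb(s, 2) four-cycles,
    and each four-cycle contains exactly one pair of A nodes.
    """
    total = 0
    for i in range(len(x)):
        for k in range(i + 1, len(x)):
            shared = sum(a * b for a, b in zip(x[i], x[k]))
            total += comb(shared, 2)
    return total

deg_a, deg_b = degrees(x)
```

Rows correspond to set A and columns to set B, so a tie between two nodes of the same set cannot even be written down in this representation.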

A hierarchy of dependence assumptions

From the Erdös–Rényi, or Bernoulli random graph models (Erdös and Rényi, 1960) to the current social circuit models (Pattison and Robins, 2002, Pattison and Robins, 2004, Snijders et al., 2006, Agneessens and Roose, 2008, Wang et al., 2009) various dependence assumptions provide the theoretical background for the extension of ERGM specifications. In this section, we focus on the development of network tie dependence assumptions building on the set of realisation-dependent model forms described

Configurations added by the three-path model D1

In the model D1, two tie-variables Xij and Xkl are conditionally independent unless their one-neighbourhoods overlap, or N1({i, j}) ∩ N1({k, l}) ≠ ∅. Therefore, at least one tie should be present between nodes i and k, or j and l. All configurations in the D1 model have the property that every pair of edges lies on a path of length 3 or less. Hence configurations include the three-path (L3) configuration itself, as well as the simplest hub-connectors with geodesic distance between hubs equal to 1,
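The three-path count can be sketched in Python as follows, assuming the network is stored as an n by m 0/1 matrix with rows for set A and columns for set B. The function name is hypothetical, and the count is of three-paths as subgraphs, including those embedded in four-cycles; the exact statistic used in BPNet may be defined differently.

```python
def three_path_count(x):
    """Count paths of three ties (b - a - b' - a') in a bipartite 0/1 matrix.

    Each three-path has exactly one middle tie (i, j). Its two end ties are
    another tie of i and another tie of j, which gives
    (deg_a[i] - 1) * (deg_b[j] - 1) paths per middle tie. The end nodes lie
    in different node sets, so they can never coincide and no correction
    for triangles is needed.
    """
    deg_a = [sum(row) for row in x]
    deg_b = [sum(col) for col in zip(*x)]
    total = 0
    for i, row in enumerate(x):
        for j, tie in enumerate(row):
            if tie:
                total += (deg_a[i] - 1) * (deg_b[j] - 1)
    return total

# Hypothetical example: a single path b0 - a0 ... is too short, so use
# the path a0 - b0 - a1 - b1, which contains exactly one three-path.
single_path = [
    [1, 0],
    [1, 1],
]
paths = three_path_count(single_path)
```

A complete network on two A nodes and two B nodes is a four-cycle, and the same function counts four three-paths in it, one for each choice of middle tie.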

Modelling examples

To illustrate the importance of the proposed edge-cycle model specification, we use a dataset on student activists in the period of political mobilization in Brazil from mid 1991 to the end of 1992. The mobilization led to the impeachment of former Brazilian President Collor de Mello (see Mische, 2007, Mische et al., 1999). The dataset is a tripartite network describing the interlocking relationships between three sets of nodes: youth leaders, the political organizations of which they are

Youth by event network

The youth by event network records the 14 youth activists’ attendance at the 49 events. The data is shown in Fig. 9 where the youth activists are represented by circles, and the events are represented by squares. From the visualization we can see there was a group of events that attracted most of the youth as displayed in the centre of the visualization, and the rest of the events shown on the fringe of the figure only attracted certain youth. We will check whether this observation is

Organization by event network

The organization by event network records the 49 events at which the 23 organizations were formally represented by at least one member activist. Fig. 12 depicts the network where organizations are represented by triangles and events by squares. The degree distribution for organizations is quite dispersed, as shown in Fig. 13. Six organizations were represented at more than 23 events, with a maximum of 34 events, and the remaining organizations had 16 or fewer related events. The

Conclusion and discussion

From the Bernoulli random graph models to the social circuit models (Pattison and Robins, 2004, Snijders et al., 2006), the development of exponential random graph models has relied on progressively less restrictive dependence assumptions between network tie-variables. The specific realisation-dependent models for unipartite networks proposed by Snijders et al. (2006) led to the current bipartite model specification by Wang et al. (2009). These advances in ERGM specifications have made model

Acknowledgement

We thank Ann Mische for providing the data set used in this paper.

References (30)

  • O. Frank et al. Markov graphs. Journal of the American Statistical Association (1986)
  • M.S. Handcock. Assessing degeneracy in statistical models of social networks, working paper no. 39. Center for... (2003)
  • M.S. Handcock et al. statnet: software tools for the representation, visualization, analysis and simulation of network data. Journal of Statistical Software (2008)
  • D.R. Hunter et al. Goodness of fit of social network models. Journal of the American Statistical Association (2008)
  • A. Mische. Partisan Publics: Communication and Contention Across Brazilian Youth Activist Networks (2007)