Elsevier

Social Networks

Volume 52, January 2018, Pages 180-191

Change we can believe in: Comparing longitudinal network models on consistency, interpretability and predictive power

https://doi.org/10.1016/j.socnet.2017.08.001

Highlights

  • We compare auto-regressive and process-based network models, using the TERGM and the SAOM as examples.

  • The TERGM has no consistent interpretation on the tie-level and on network change.

  • TERGM parameters strongly depend on time between measurements.

  • The SAOM suffers from neither of these limitations.

  • Both models perform poorly in out-of-sample prediction of individual ties.

Abstract

While several models for analysing longitudinal network data have been proposed, their main differences, especially regarding the treatment of time, have not been discussed extensively in the literature. However, differences in the treatment of time strongly impact the conclusions that can be drawn from data. In this article we compare auto-regressive network models, using the example of TERGMs – a temporal extension of ERGMs – and process-based models, using SAOMs as an example. We conclude that the TERGM has, in contrast to the ERGM, no consistent interpretation in terms of tie-level probabilities, and no consistent interpretation in terms of processes of network change. Further, parameters in the TERGM are strongly dependent on the interval length between two time-points. Neither limitation applies to process-based network models such as the SAOM. Finally, both compared models perform poorly in out-of-sample prediction compared to trivial predictive models.

Introduction

The study of social networks is increasingly concerned with modelling network change over time, as longitudinal analysis is usually better equipped for finding explanations and testing theories about the evolution of networks as well as the impact their structure has on constituent nodes (e.g. Steglich et al., 2010). Network analysis over time commonly uses network panel data: a network structure among the same set of nodes that is observed at two or more time points. At the time of writing (early 2017), several statistical approaches are available to analyse such data sets. The most widely used are the stochastic actor-oriented model (SAOM; Snijders et al., 2010b) and several extensions to the exponential random graph model (ERGM; Lusher et al., 2013). These models and variations may appear almost indistinguishable to scientists interested in applying inferential methods to network panel data. However, they rest on quite different statistical assumptions that strongly affect the kind of inference one can draw from the estimated model parameters and, thus, the kind of questions that can be answered with each method.

While statistical models can be compared on many dimensions, we mainly focus in this article on differences in how they treat time. In particular, we discuss the difference between discrete-time, auto-regressive models and continuous-time, process-based models. Due to its increased use (e.g. in McFarland et al., 2014) and recent claims about its advantage relative to other models for network panel data (Desmarais and Cranmer, 2012; Leifeld and Cranmer, 2016), we choose the TERGM (or temporal ERGM) to represent auto-regressive models in this comparison. The continuous-time model we discuss for comparison is the SAOM. Note that for ERGMs both continuous-time and auto-regressive extensions have been proposed – we focus on the latter group. The purpose is to compare the principles of auto-regressive and continuous-time network models and not the relative merits of either particular model – the two cases can be seen as representatives of their respective model classes. This article highlights, by way of illustration, the most important differences in assumptions and their interpretive implications between these approaches, and thus facilitates the applied researcher’s decision about which to use in their own research.
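To make the distinction between the two treatments of time concrete, the following is a minimal sketch in Python. It is a toy illustration and not part of the article: it models a single tie as a two-state continuous-time Markov chain, which is neither a TERGM nor a SAOM, and the rates `rate_on` and `rate_off` as well as all function names are hypothetical. It shows that, even when the underlying continuous-time process is held fixed, the coefficient that a discrete-time auto-regressive (logistic) description would attach to the lagged tie changes with the interval between observations.

```python
import math


def transition_probs(rate_on, rate_off, dt):
    """Two-state continuous-time Markov chain for a single tie (toy assumption).

    rate_on:  rate at which an absent tie is created
    rate_off: rate at which an existing tie is dissolved
    dt:       time between two observations

    Returns (P(tie at t+dt | no tie at t), P(tie at t+dt | tie at t)).
    """
    total = rate_on + rate_off
    stationary = rate_on / total          # long-run probability that the tie exists
    decay = math.exp(-total * dt)         # how much of the initial state is "remembered"
    p_from_absent = stationary * (1.0 - decay)
    p_from_present = stationary + (1.0 - stationary) * decay
    return p_from_absent, p_from_present


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def lag_coefficient(rate_on, rate_off, dt):
    """Log odds ratio of the tie at t+dt given the tie at t.

    This is the coefficient a logistic auto-regression on the lagged tie
    would recover for this toy process observed at interval dt.
    """
    p_from_absent, p_from_present = transition_probs(rate_on, rate_off, dt)
    return logit(p_from_present) - logit(p_from_absent)


if __name__ == "__main__":
    # Same process (same rates), different observation intervals.
    for dt in (0.5, 1.0, 2.0, 4.0):
        print(f"interval {dt}: lag coefficient {lag_coefficient(0.2, 0.8, dt):.3f}")
```

In this toy setting the rates describe the process independently of when it is observed, whereas the lag coefficient shrinks towards zero as the interval grows. Whether and how this carries over to full network models is the subject of the comparison in Section 4.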

When comparing statistical models it is tempting to ask which model is “better”. However, “better” implicates at least two quite different dimensions: explanation and prediction. On the one hand, it has been argued that accurate prediction is a chief criterion of a “good” model (Friedman, 1953; Jasso, 1988). Intuitively a “good” model should be able to extrapolate accurately into the future, which can be tested for a single dataset by simple out-of-sample prediction. At the same time, the criterion for what should be predicted correctly in a model with dependent data (such as networks) is not trivial, as a network is more than just a series of independent tie observations: it also comprises the structures that these ties form (see discussion in Section 5).

On the other hand, it has been argued that the endeavour of social science is not to predict, but to explain and understand the world (Hedström, 2005; Elster, 2007). Models with absurd assumptions or intractable algorithms can generate fairly accurate predictions, but teach us little about the world. Social mechanisms, by contrast, can help us explain the social world and inform our understanding of our own and others’ behaviour, but their concatenation in complex ways means that only in the simplest of systems can we expect this to result in accurate prediction at a micro-level. Indeed, even models with poor predictive power can generate valuable insights (see also Epstein, 2008). In this line of reasoning, a good model is characterised by reasonable assumptions, as well as by clear interpretability of parameters in light of social mechanisms.

In this paper, we do not necessarily advocate for one or the other position, but investigate how the different assumptions of the models make them applicable to different questions and thus to different empirical problems. As such, we elaborate on what conclusions can be drawn from estimated parameters using the SAOM or the TERGM.

The remainder of the article is organised as follows. We first introduce the two different longitudinal/temporal network models (Section 2), and highlight their main features from a statistical point of view. The first main distinguishing feature of the models that is discussed concerns whether they are actor-oriented or tie-oriented (Section 3). Subsequently, the treatment of time is examined. The focus is on the interpretation of parameters and model consistency with regard to the differences between auto-regressive and process-based modelling (Section 4). The different treatment of time, and how it influences parameters, is shown in an empirical example. Finally, we demonstrate that both models perform poorly in out-of-sample prediction (Section 5) across two datasets, suggesting that we need to be careful as to the purposes of longitudinal network research.
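As an illustration of what out-of-sample tie prediction against a trivial predictive model can look like, the following is a minimal sketch in Python. It is not the authors' evaluation procedure: the choice of a persistence forecast ("the next wave equals the current wave") as the trivial model, the F1 score as the metric, the 0.5 cutoff, and all function names and toy matrices are assumptions made for this example only.

```python
def dyads(n):
    """All ordered pairs (i, j) with i != j in a directed network of n nodes."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]


def f1_score(predicted, observed):
    """F1 score of predicted ties against observed ties (0/1 adjacency matrices)."""
    tp = fp = fn = 0
    for i, j in dyads(len(observed)):
        if predicted[i][j] and observed[i][j]:
            tp += 1          # tie predicted and observed
        elif predicted[i][j]:
            fp += 1          # tie predicted but not observed
        elif observed[i][j]:
            fn += 1          # tie observed but not predicted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def persistence_forecast(previous):
    """Trivial forecast (assumption): the network does not change between waves."""
    return [row[:] for row in previous]


def threshold_forecast(probabilities, cutoff=0.5):
    """Turn model-based tie probabilities into a 0/1 forecast (cutoff is an assumption)."""
    return [[1 if p >= cutoff else 0 for p in row] for row in probabilities]


if __name__ == "__main__":
    # Hypothetical toy data: two observed waves of a 4-node directed network.
    wave1 = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 0, 0, 1],
             [0, 0, 1, 0]]
    wave2 = [[0, 1, 1, 0],
             [1, 0, 1, 0],
             [0, 0, 0, 1],
             [1, 0, 0, 0]]
    baseline = f1_score(persistence_forecast(wave1), wave2)
    print(f"persistence baseline F1: {baseline:.3f}")
    # A fitted model's tie probabilities for wave 2 would be scored the same way:
    # f1_score(threshold_forecast(model_probabilities), wave2)
```

Scoring a fitted model and the persistence forecast with the same function gives a simple reference point: a model that does not beat the persistence score adds little for tie-level prediction, whatever its explanatory merits.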

Section snippets

The models

A social network needs to be understood as a system of interdependent units. Whether one is interested in the details of network dependencies or just needs to control for them, research on networks requires statistical tools that can adequately deal with this challenge. The model families that most explicitly deal with dependencies for such inferential-statistical analysis of social network data are exponential random graph models (ERGMs; Frank and Strauss, 1986, Pattison and Wasserman, 1999,

Actor vs. tie based modelling

The principal difference between any model from the ERGM family and the SAOM family discussed in the literature is that the former is “tie-oriented” while the latter is “actor-oriented” (Block et al., 2016). In a very general sense, this means that the locus of modelling differs between the models. The former models whether a tie is likely to exist depending on how it is embedded in substructures in the network. This is reflected by the common interpretation of ERGM parameters, which provide

Process-based vs. auto-regressive modelling

The more important division among the discussed longitudinal network models for the comparison at hand is how they treat time. As argued in Section 2, continuous-time and discrete-time models differ fundamentally: the former models a process whereas the latter models a cross-sectional observation using a previous time-point as a predictor. The conceptualisation of time upon which a model is based strongly impacts how parameters can be interpreted and, accordingly, which kind of research

Model performance for tie prediction

After outlining issues of time-dependence of parameters that affect interpretation in the previous section, we now turn to discussing predictive power of the two models under analysis. We first discuss prediction of out-of-sample dependence structures more generally with the conclusion that a model specification should be available that gives a reasonable out-of-sample fit for either model. Then we test recent claims that TERGM provides greater predictive power on the tie-level and thus should

Discussion and conclusion

Several approaches to analysing longitudinal network data have been proposed in the literature on statistical network modelling recently. In this article, we compared two of these models that are being applied by practical researchers – the SAOM and the TERGM. The former is a process-based, continuous-time model in which dependence unfolds over time, while in the latter, network dependence is modelled within an observation and decoupled from time. This means that the SAOM, and other

Acknowledgements

The authors would like to thank the network groups at Nuffield College, Groningen University, ETH Zürich and University of Melbourne for their valuable comments and feedback to this work.

References (49)

  • M.A.J. Duijn et al.

    p2: a random effects model with covariates for directed graphs

    Stat. Neerlandica

    (2004)
  • J. Elster

    Explaining Social Behavior: More Nuts and Bolts for the Social Sciences

    (2007)
  • J.M. Epstein

    Why model?

    J. Artif. Soc. Social Simul.

    (2008)
  • O. Frank et al.

    Markov graphs

    J. Am. Stat. Assoc.

    (1986)
  • M. Friedman

    The methodology of positive economics

  • T. Gneiting et al.

    Probabilistic forecasts, calibration and sharpness

    J. R. Stat. Soc. Ser. B: Stat. Methodol.

    (2007)
  • T.M. Hamill

    Interpretation of rank histograms for verifying ensemble forecasts

    Mon. Weather Rev.

    (2001)
  • S. Hanneke et al.

    Discrete temporal models of social networks

    Electron. J. Stat.

    (2010)
  • P. Hedström

    Dissecting the Social: On the Principles of Analytical Sociology

    (2005)
  • P.W. Holland et al.

    A dynamic model for social networks

    J. Math. Sociol.

    (1977)
  • D.R. Hunter et al.

    Inference in curved exponential family models for networks

    J. Comput. Graphical Stat.

    (2006)
  • D.R. Hunter et al.

    Goodness of fit of social network models

    J. Am. Stat. Assoc.

    (2008)
  • D.R. Hunter et al.

    ergm: a package to fit, simulate and diagnose exponential-family models for networks

    J. Stat. Softw.

    (2008)
  • G. Jasso

    Principles of theoretical analysis

    Sociol. Theory

    (1988)