The articles in this bundle are all associated with the notion of interaction and represent the genesis of the subject of graphical models in its modern form, the origins of these being traceable back to Gibbs [11] and Wright [30] and earlier.

Around 1976, Terry was fascinated by the notion of conditional independence, along the lines later published in Dawid [6, 7]. In 1976, Terry invited me to Perth, where we ran a daily research seminar on the similarities and differences between Statistics and Statistical Mechanics. In particular, we wondered how the notions of interaction represented in linear models, in multi-dimensional contingency tables, and in stochastic models for particle systems were related to one another; we also wanted to understand how these concepts relate to conditional independence.

As we discovered that these were all essentially the same concepts, the similarity being obscured by very different traditions of notation, the term graphical model was coined. Our findings, also obtained in collaboration with John Darroch, were collected in Darroch et al. [4], and later expanded and published in Speed [24], Darroch et al. [5], and Darroch and Speed [3] as well as Lauritzen et al. [19] and to some extent Speed [25], the latter giving an overview of a number of different variants and proofs of what has become known as the Hammersley–Clifford theorem [14, 2].

Of these articles, Darroch et al. [5] rather quickly had a seminal impact and a small community of researchers in the area of graphical models gradually emerged. In a certain sense, the article does not contain much formally new material (if any at all), but for the first time a simple, visual description and interpretation of the class of log-linear models [12, 13], which otherwise could seem obscure, was available. The interpretation of a subclass of the models in terms of conditional independence had an immediate intuitive appeal. In addition, the article identified and emphasized models represented by chordal or triangulated graphs as those where estimation and other issues had a particularly simple solution, the combinatorial theory of these graphs being further studied in Lauritzen et al. [19].

Darroch and Speed [3] studied the notion of interaction from an algebraic point of view, in terms of fundamental decompositions of the linear space of functions on a product of finite sets. Indeed, the article essentially, if implicitly, uses the decomposition of this space into irreducible components that are stable under a product of symmetric groups [9], and thus gives an elegant algebraic perspective on the Hammersley–Clifford theorem.

Towards the end of 1976, Terry serendipitously came across Wermuth [29], which identified that a completely analogous theory could be developed for the Gaussian case, with chordal graphs playing essentially the same role as in the case of log-linear models; indeed, Dempster [8] had developed the basic computational and statistical theory for these under the name of models for covariance selection. This fact and the corresponding interpretation were emphasized and discussed in Darroch et al. [4] as well as in Speed [24, 25], but otherwise received relatively little attention at the time. Gaussian graphical models have had a remarkable renaissance in connection with the modern analysis of high-dimensional data, for example concerning gene expression [10, 23]. Out of this early work with Gaussian graphical models also grew the article by Speed and Kiiveri [26], which describes and unifies a class of iterative algorithms for fitting Gaussian graphical models, special cases of which had previously been considered by, e.g., Dempster [8]. Essentially, there are two fundamental types: one starts from the estimate under no restrictions and iteratively ensures that the restrictions of the model are satisfied; the other starts from a trivial estimator and iteratively ensures that the likelihood equations are satisfied. The article elegantly shows that an abundance of hybrids of these algorithms can be constructed and gives a unified proof of their convergence.
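
As a rough illustration of the second type of algorithm (start from a trivial estimator and cycle until the likelihood equations hold), the following Python sketch applies iterative proportional scaling to the cliques of the graph. It is not taken from Speed and Kiiveri [26]; the function name `fit_ggm_ips`, the tolerance, and the example matrix are assumptions made for illustration only. It assumes that the graph is given by its cliques and that the sample covariance matrix `S` is positive definite.

```python
import numpy as np

def fit_ggm_ips(S, cliques, tol=1e-10, max_iter=1000):
    """Hypothetical sketch: fit a Gaussian graphical model by
    iterative proportional scaling over the cliques of the graph.

    S       : sample covariance matrix (assumed positive definite)
    cliques : list of index lists, one per clique of the graph
    Returns the fitted concentration (inverse covariance) matrix K.
    """
    p = S.shape[0]
    # Trivial starting estimate: identity concentration matrix,
    # which already has zeros for every missing edge.
    K = np.eye(p)
    for _ in range(max_iter):
        K_old = K.copy()
        for c in cliques:
            idx = np.ix_(c, c)
            Sigma = np.linalg.inv(K)
            # Adjust only the clique block of K so that the fitted
            # covariance agrees with S on this clique.
            K[idx] += np.linalg.inv(S[idx]) - np.linalg.inv(Sigma[idx])
        if np.max(np.abs(K - K_old)) < tol:
            break
    return K

# Example (made-up numbers): the chain 0 - 1 - 2, with cliques {0,1}, {1,2}.
S = np.array([[2.0, 0.8, 0.3],
              [0.8, 1.5, 0.5],
              [0.3, 0.5, 1.0]])
K_hat = fit_ggm_ips(S, [[0, 1], [1, 2]])
Sigma_hat = np.linalg.inv(K_hat)
```

In this sketch the entry of `K_hat` for the missing edge between variables 0 and 2 is never touched and so stays zero, while `Sigma_hat` agrees with `S` on each clique; these two conditions are the likelihood equations for the model. The first type of algorithm would instead start from `S` itself and cycle through the missing edges.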

The last two articles [16, 17] represent the genesis of what today is probably the most prolific and well-known type of graphical model: those based on directed acyclic graphs, which admit interpretation in causal terms similar to that of structural equation models [1]. At the time when these articles appeared they were (undeservedly) largely ignored by both the statistical and structural equation communities. Graphical models based on directed acyclic graphs, now mostly known as Bayesian networks [21], have an unquestionable prominence in the current scientific literature, but the surge of interest in these models was generated in particular by the prolific research activities in computer science, where work such as Lauritzen and Spiegelhalter [18], Pearl [22], Spirtes et al. [27], Heckerman et al. [15], and Pearl [20] established these models as objects worthy of intense study. In retrospect, it is clear that the global Markov property defined in Kiiveri et al. [17] was not the optimal one, as there are independence relations true in any Bayesian network that cannot be derived from it. Fundamentally, however, this article establishes the correct class of directed Markov models for the first time and thus yields a conditional independence perspective on structural equation models, as later elaborated, for example, by Spirtes et al. [28].