Order-invariant prior specification in Bayesian factor analysis
Introduction
Let $Y = (Y_1, \dots, Y_p)^\top$ be a $p$-vector of observed random variables, which for simplicity we take to be centered. Let $Z$ be a standard normal $m$-vector of latent factors, with $Z \sim N_m(0, I_m)$. The factor analysis model postulates that
$$Y = \Lambda Z + \epsilon, \qquad (1.1)$$
where $\Lambda = (\lambda_{ij}) \in \mathbb{R}^{p \times m}$ is an unknown loading matrix, and $\epsilon$ is a $p$-vector of normally distributed error terms that are independent of $Z$. The error terms are assumed to be mutually independent with $\epsilon \sim N_p(0, \Psi)$, where $\Psi = \operatorname{diag}(\psi_1, \dots, \psi_p)$ comprises unknown positive variances that are also known as uniquenesses. This model with an unrestricted loading matrix is sometimes referred to as exploratory factor analysis, in contrast to confirmatory factor analysis, which refers to situations in which some collection of entries of $\Lambda$ is modeled as zero.
Integrating out the latent factors in (1.1), the observed random vector $Y$ is seen to follow a centered multivariate normal distribution with covariance matrix
$$\Sigma = \Lambda \Lambda^\top + \Psi. \qquad (1.2)$$
As discussed in detail in Anderson and Rubin (1956), $\Sigma$ determines the unrestricted loading matrix $\Lambda$ only up to orthogonal rotation. Indeed, $(\Lambda Q)(\Lambda Q)^\top = \Lambda \Lambda^\top$ for any orthogonal $m \times m$ matrix $Q$. More details on factor analysis can be found, for instance, in Bartholomew et al. (2011), Drton et al. (2007), and Mulaik (2010).
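The rotation invariance is easy to verify numerically. The sketch below (with an arbitrary loading matrix, diagonal matrix of uniquenesses, and random orthogonal matrix, all hypothetical choices of ours) confirms that the covariance matrix in (1.2) is unchanged when the loadings are rotated:

```python
import numpy as np

rng = np.random.default_rng(0)
p, m = 5, 2

# Hypothetical loading matrix and uniquenesses, for illustration only.
Lambda = rng.normal(size=(p, m))
Psi = np.diag(rng.uniform(0.5, 1.5, size=p))

# Covariance implied by the factor model: Sigma = Lambda Lambda' + Psi.
Sigma = Lambda @ Lambda.T + Psi

# A random orthogonal m x m matrix Q (orthogonal factor of a Gaussian matrix).
Q, _ = np.linalg.qr(rng.normal(size=(m, m)))

# Rotating the loadings leaves Sigma unchanged: (Lambda Q)(Lambda Q)' = Lambda Lambda'.
Sigma_rotated = (Lambda @ Q) @ (Lambda @ Q).T + Psi
print(np.allclose(Sigma, Sigma_rotated))  # True
```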
In this paper, we are concerned with Bayesian inference in (exploratory) factor analysis. In Bayesian computation, it is convenient to impose an identifiability constraint on the loading matrix $\Lambda$. A common choice is to restrict $\Lambda$ to be lower triangular with nonnegative diagonal entries, that is, $\lambda_{ij} = 0$ for $j > i$ and $\lambda_{ii} \ge 0$ for $i = 1, \dots, m$; see Geweke and Zhou (1996), Aguilar and West (2000), Lopes and West (2004), and Chapter 12 in Congdon (2001). Under these constraints, a full rank matrix $\Lambda$ is uniquely determined by $\Lambda \Lambda^\top$. In the papers just referenced, and also in the software implementation provided by Martin et al. (2011), a default prior on the lower triangular loading matrix has all its non-zero entries independent with
$$\lambda_{ij} \sim N(0, C_0) \text{ for } i > j, \qquad \lambda_{ii} \sim N^+(0, C_0) \text{ for } i = 1, \dots, m. \qquad (1.3)$$
Here, $N^+(0, C_0)$ denotes a truncated normal distribution on $(0, \infty)$, i.e., the conditional distribution of $X$ given $X > 0$ for $X \sim N(0, C_0)$. The variance $C_0$ is a hyperparameter. The prior distribution for the uniquenesses has $\psi_1, \dots, \psi_p$ independent of $\Lambda$ and also mutually independent with Inverse Gamma distribution, $\psi_j \sim \mathrm{IG}(\nu/2, \nu s^2/2)$, for hyperparameters $\nu, s^2 > 0$. Equivalently, $\nu s^2/\psi_j$ is chi-square distributed with $\nu$ degrees of freedom; compare Eqn. (26) in Geweke and Zhou (1996).
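A draw from this default prior can be sketched as follows. The hyperparameter names (C0 for the prior variance, nu and s2 for the uniquenesses) and their values are our illustrative choices, not taken from the paper:

```python
import numpy as np
from scipy.stats import truncnorm, invgamma

rng = np.random.default_rng(1)
p, m = 5, 2
C0 = 1.0            # prior variance hyperparameter (name and value assumed)
nu, s2 = 2.2, 0.5   # hypothetical hyperparameters for the uniquenesses

# Lower-triangular loading matrix: free entries N(0, C0) below the diagonal,
# N(0, C0) truncated to (0, inf) on the diagonal, zeros above the diagonal.
Lambda = np.zeros((p, m))
for i in range(p):
    for j in range(min(i + 1, m)):
        if i == j:
            Lambda[i, j] = truncnorm.rvs(0, np.inf, scale=np.sqrt(C0), random_state=rng)
        else:
            Lambda[i, j] = rng.normal(scale=np.sqrt(C0))

# Uniquenesses: psi_j ~ Inverse-Gamma(nu/2, nu*s2/2), i.e. nu*s2/psi_j ~ chi^2_nu.
psi = invgamma.rvs(a=nu / 2, scale=nu * s2 / 2, size=p, random_state=rng)

print(np.triu(Lambda, 1).max() == 0.0)         # upper triangle is zero: True
print(bool((np.diag(Lambda[:m]) > 0).all()))   # diagonal entries positive: True
```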
In the approach that was just described, and that will be the focus of this paper, the prior for the loadings is derived from centered normal distributions with common variance. While this can be restrictive, it is a frequently used default, possibly due to a lack of prior information that warrants non-zero means or unequal variances; see e.g. Section 6 in Ansari et al. (2002). This said, many generalizations have been discussed. For instance, Ghosh and Dunson (2009), Bhattacharya and Dunson (2011), and Conti et al. (2014) present methods to capture sparsity in loading matrices. Other extensions consider $t$-distributed latent factors (Ando, 2009), nonparametric Bayes techniques (Paisley and Carin, 2009), and problems with temporal dependence, see e.g., Nakajima and West (2013) and Zhou et al. (2014).
As discussed in Lopes and West (2004, Sect. 6), the prior specification in (1.3) is such that the induced prior on $\Lambda \Lambda^\top$ depends on the way the variables and the associated rows of the loading matrix are ordered. Indeed, a priori, the scaled $j$-th diagonal entry $\frac{1}{C_0} \sum_{l=1}^{\min(j,m)} \lambda_{jl}^2$ of $\Lambda \Lambda^\top$ follows a chi-square distribution with $\min(j, m)$ degrees of freedom, which depends on the row index $j$. Consequently, the implied prior, and also the posterior distribution, for the covariance matrix $\Sigma$ from (1.2) is not invariant under permutations of the variables.
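A small Monte Carlo check makes this order dependence concrete: under the lower-triangular prior, row $j$ of the loading matrix has $\min(j, m)$ free entries, so the prior mean of the $j$-th diagonal entry of $\Lambda \Lambda^\top$ grows with $j$ until it plateaus at $m$. The dimensions and prior variance below are illustrative choices of ours:

```python
import numpy as np

rng = np.random.default_rng(2)
p, m, C0, N = 4, 2, 1.0, 200_000

# Monte Carlo draws of the squared row norms of Lambda, i.e. diag(Lambda Lambda'),
# under the lower-triangular prior.
sq_norms = np.zeros((N, p))
for j in range(p):
    k = min(j + 1, m)  # number of free entries in row j+1
    # The diagonal entry is half-normal, but its *square* has the same
    # distribution as the square of a N(0, C0) draw, so plain normals suffice.
    draws = rng.normal(scale=np.sqrt(C0), size=(N, k))
    sq_norms[:, j] = (draws ** 2).sum(axis=1)

# E[sum_l lambda_{jl}^2] = C0 * min(j, m) depends on the row index j, so
# permuting the variables changes the implied prior on Sigma.
print(np.round(sq_norms.mean(axis=0)))  # approx [1. 2. 2. 2.]
```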
In this paper we propose a modification of the prior distribution for $\Lambda$ that maintains the convenience of computing with an identifiable lower triangular loading matrix, all the while making the prior distributions of $\Lambda \Lambda^\top$ and $\Sigma$ invariant under reordering of the variables. Our proposal, described in Section 2, merely changes the prior distributions of the diagonal entries $\lambda_{ii}$ in (1.3), which will be taken from a slightly more general family than the truncated normal. The details of a Gibbs sampler to draw from the resulting posterior are given in Section 3. Numerical examples are shown in Section 4. We conclude with a discussion in Section 5, where we emphasize in particular that the role of lower-triangular loading matrices is a computational one; other ways of mapping the covariance matrix to a (unique) loading matrix can be considered when defining a target of inference.
Order-invariant prior distribution
Without any identifiability constraints, the loading matrix $\Lambda$ takes its values in all of $\mathbb{R}^{p \times m}$. A natural default prior would then be to take all entries $\lambda_{ij}$, $1 \le i \le p$ and $1 \le j \le m$, to be independent $N(0, C_0)$ random variables; we write $\Lambda \sim N_{p \times m}(0, C_0)$. The spherical normal distribution is clearly invariant under permutation of the rows of the matrix. Hence, the induced prior distribution of $\Lambda \Lambda^\top$ and of the covariance matrix $\Sigma$ from (1.2) is invariant under simultaneous permutation of its rows and columns.
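Any full-rank unconstrained loading matrix can be rotated to the identifiable lower-triangular representative with nonnegative diagonal. The following QR-based construction is our own sketch of such a mapping, not necessarily the one used in the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
p, m = 5, 2

# Unconstrained draw: all p*m entries i.i.d. N(0, C0) with C0 = 1 (illustrative).
Lambda = rng.normal(size=(p, m))

# Rotate Lambda so that its top m x m block becomes lower triangular:
# if Lambda[:m].T = Q R, then Lambda[:m] @ Q = R.T is lower triangular.
Qmat, _ = np.linalg.qr(Lambda[:m].T)
L = Lambda @ Qmat
signs = np.sign(np.diag(L[:m]))
L = L * signs  # flip column signs so the diagonal entries are nonnegative

# The rotation leaves Lambda Lambda' (and hence Sigma) unchanged.
print(np.allclose(L @ L.T, Lambda @ Lambda.T))  # True
print(np.allclose(np.triu(L, 1), 0))            # True
print(bool((np.diag(L[:m]) >= 0).all()))        # True
```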
Gibbs sampler
Consider an actual inferential setting in which we observe a sample that comprises $n$ independent random vectors $Y_1, \dots, Y_n$ drawn from a distribution in the $m$-factor model. Let $Y$ be the $n \times p$ matrix with the vectors $Y_i$ as rows. Let $Z$ be an associated $n \times m$ matrix whose rows are independent vectors of latent factors. The factor analysis model dictates that $Y = Z \Lambda^\top + E$, where $E$ is an $n \times p$ matrix of stochastic errors. The pairs $(Z_i, E_i)$ for $i = 1, \dots, n$ are independent, and in each pair
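One standard ingredient of such a sampler is the conjugate normal full conditional for the latent factors: given $(\Lambda, \Psi)$, each row $Z_i$ is drawn from $N(V \Lambda^\top \Psi^{-1} Y_i, V)$ with $V = (I_m + \Lambda^\top \Psi^{-1} \Lambda)^{-1}$. The sketch below is this generic conjugate step, not the paper's exact implementation:

```python
import numpy as np

def sample_factors(Y, Lambda, psi, rng):
    """One Gibbs update for the latent factor matrix Z given (Lambda, psi).

    Standard conjugate step in a normal factor model: each row Z_i is drawn
    from N(V Lambda' Psi^{-1} Y_i, V) with V = (I + Lambda' Psi^{-1} Lambda)^{-1}.
    """
    n, p = Y.shape
    m = Lambda.shape[1]
    A = Lambda.T / psi                    # Lambda' Psi^{-1} (psi holds the p variances)
    V = np.linalg.inv(np.eye(m) + A @ Lambda)
    mean = Y @ A.T @ V.T                  # n x m matrix of conditional means
    chol = np.linalg.cholesky(V)
    return mean + rng.normal(size=(n, m)) @ chol.T

# Illustrative use with hypothetical parameter values.
rng = np.random.default_rng(4)
n, p, m = 100, 5, 2
Lambda = rng.normal(size=(p, m))
psi = rng.uniform(0.5, 1.5, size=p)
Y = rng.normal(size=(n, m)) @ Lambda.T + rng.normal(size=(n, p)) * np.sqrt(psi)
Z = sample_factors(Y, Lambda, psi, rng)
print(Z.shape)  # (100, 2)
```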
Numerical experiments
We illustrate the use of the two different priors, obtained from (1.3) and (2.1), respectively, on a simulated dataset that involves $p$ variables and is of size $n$. The data are drawn from the factor distribution given by the following loading matrix $\Lambda$ and uniquenesses $\psi_1, \dots, \psi_p$ that we generated randomly:
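A data-generating step of this kind can be sketched as follows; the dimensions and the randomly drawn loadings and uniquenesses below are placeholders of ours, not the values used in the paper:

```python
import numpy as np

rng = np.random.default_rng(5)
p, m, n = 7, 2, 100  # illustrative dimensions only

# Randomly generated "true" loading matrix and uniquenesses.
Lambda = rng.normal(size=(p, m))
psi = rng.uniform(0.2, 1.0, size=p)

# Draw n observations: Y_i = Lambda Z_i + eps_i, Z_i ~ N(0, I), eps_i ~ N(0, Psi).
Z = rng.normal(size=(n, m))
Y = Z @ Lambda.T + rng.normal(size=(n, p)) * np.sqrt(psi)

# For large n, the sample covariance approaches Sigma = Lambda Lambda' + Psi.
print(Y.shape)  # (100, 7)
```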
Conclusion
In Bayesian inference in exploratory factor analysis, priors are often specified via a lower triangular loading matrix whose entries are assumed to be independent normal or truncated normal. We propose a modification of this approach, replacing the truncated normal priors by other distributions such that the induced prior on the covariance matrix $\Sigma$ in (1.2) is invariant under reordering of the considered variables. Specifically, the prior distribution of $\Sigma$ is equal to that obtained when
Acknowledgments
This work was supported by the U.S. National Science Foundation (DMS-1305154) and by the University of Washington Royalty Research Fund.
References (22)
T. Ando, Bayesian factor analysis with fat-tailed factors and its exact marginal likelihood, J. Multivariate Anal. (2009)
G. Conti et al., Bayesian exploratory factor analysis, J. Econometrics (2014)
X. Zhou et al., Bayesian forecasting and portfolio decisions using dynamic dependent sparse factor models, Int. J. Forecast. (2014)
O. Aguilar, M. West, Bayesian dynamic factor models and variance matrix discounting for portfolio allocation, J. Bus. Econom. Statist. (2000)
T.W. Anderson, H. Rubin, Statistical inference in factor analysis (1956)
A. Ansari et al., Heterogeneous factor analysis models: a Bayesian approach, Psychometrika (2002)
A. Bhattacharya, D.B. Dunson, Sparse Bayesian infinite factor models, Biometrika (2011)
M. Drton et al., Algebraic factor analysis: tetrads, pentads and beyond, Probab. Theory Related Fields (2007)
M. Finegold, M. Drton, Robust graphical modeling with classical and alternative t-distributions, Ann. Appl. Stat. (2011)