Abstract
In some multivariate problems with missing data, pairs of variables exist that are never observed together. For example, some modern biological tools can produce data of this form. As a result of this structure, the covariance matrix is only partially identifiable, and point estimation requires that identifying assumptions be made. These assumptions can introduce an unknown and potentially large bias into the inference. This paper presents a method based on semidefinite programming for automatically quantifying this potential bias by computing the range of possible equal-likelihood inferred values for convex functions of the covariance matrix. We focus on the bias of missing value imputation via conditional expectation and show that our method can give an accurate assessment of the true error in cases where estimates based on sampling uncertainty alone are overly optimistic.
References
Banerjee O, El Ghaoui L, d’Aspremont A, Natsoulis G (2006) Convex optimization techniques for fitting sparse Gaussian graphical models. In: ACM International Conference Proceeding Series, vol 148, pp 89–96
Beale E, Little R (1975) Missing values in multivariate analysis. J Roy Stat Soc Ser B 37(1):129–145
Beck A, Teboulle M (2010) Gradient-based algorithms with applications to signal recovery problems. In: Palomar DP, Eldar YC (eds) Convex optimization in signal processing and communications. Cambridge University Press, Cambridge, pp 42–88
Bendall SC, Nolan GP, Roederer M, Chattopadhyay PK (2012) A deep profiler’s guide to cytometry. Trends Immunol 33(7):323–332
Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, New York
Chattopadhyay P, Price D, Harper T, Betts M, Yu J, Gostick E, Perfetto S, Goepfert P, Koup R, De Rosa S et al (2006) Quantum dot semiconductor nanocrystals for immunophenotyping by polychromatic flow cytometry. Nat Med 12(8):972–977
Hartley H, Hocking R (1971) The analysis of incomplete data. Biometrics 27(4):783–823
Little R, Rubin D (1987) Statistical analysis with missing data. Wiley, Hoboken
Moriarity C, Scheuren F (2001) Statistical matching: a paradigm for assessing the uncertainty in the procedure. J Off Stat 17(3):407–422
Newell EW, Sigal N, Bendall SC, Nolan GP, Davis MM (2012) Cytometry by time-of-flight shows combinatorial cytokine expression and virus-specific cell niches within a continuum of CD8+ T cell phenotypes. Immunity 36(1):142–152. doi:10.1016/j.immuni.2012.01.002
Rubin DB (1986) Statistical matching using file concatenation with adjusted weights and multiple imputations. J Bus Econ Stat 4(1):87–94
Acknowledgments
The authors would like to thank Noah Simon for discussion of optimization methods and Jacob Bien for other helpful discussions. We would also like to thank Sean Bendall and Erin Simons for first posing the problem to us, and Evan Newell for providing us with the mass cytometry data. MGG is supported by a National Science Foundation GRFP Fellowship. SSSO is a Taub fellow and is supported by US National Institutes of Health (NIH) (U19 AI057229). RT was supported by NSF grant DMS-9971405 and NIH grant N01-HV-28183.
Appendix: Algorithmic details
From Sect. 3.2, the inner optimization problem we wish to solve is
This optimization problem, with fixed \(t\), can be solved by generalized gradient descent. The gradient of the objective is
Since we initialize with \(\varSigma = \hat{\varSigma } \in \mathcal S \), every iterate should remain in \(\mathcal S \). We therefore project the gradient onto the linear space \(\{\varSigma : \varSigma _{ij} = \hat{\varSigma }_{ij} \forall (i,j) \in \bigcup _k \left( J_k\times J_k\right) \}\). This is equivalent to holding the coordinates \((i,j) \in \bigcup _k \left( J_k\times J_k\right) \) fixed and taking gradient steps only in the remaining directions.
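In NumPy terms, this projection simply zeroes the gradient at the identified coordinates before each step. The following is a minimal sketch; the dimension and the observed index blocks \(J_1, J_2\) are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical setup: a 4x4 covariance with observed blocks J1, J2.
# Coordinates in the union of J_k x J_k are identified and held fixed.
fixed_mask = np.zeros((4, 4), dtype=bool)
J1, J2 = [0, 1], [1, 2]
for J in (J1, J2):
    fixed_mask[np.ix_(J, J)] = True

def project_gradient(grad, fixed_mask):
    """Project the gradient onto the linear space S by zeroing the
    entries at identified coordinates, so steps never change them."""
    g = grad.copy()
    g[fixed_mask] = 0.0
    return g
```

A gradient step \(\varSigma - \delta \, P(\nabla f)\) with this projected gradient then leaves the fixed entries of \(\hat{\varSigma }\) untouched by construction.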
With a step size of \(\delta \), the update becomes
Using warm starts, we repeatedly solve this problem for an increasing sequence of \(t\) values to obtain the final solution. This barrier method is discussed in Boyd and Vandenberghe (2004), which recommends increasing \(t\) by a factor of \(\mu \) (around 10–20) at each outer-loop iteration. More details of the method can be found in Section 11.3.1 of that book.
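The outer-loop structure can be sketched generically. The toy problem below (a scalar quadratic with a log-barrier for the constraint \(x \ge 1\)) and the crude fixed-iteration inner solver are illustrative assumptions, not the paper's actual objective; the point is the warm-start pattern and the \(t \leftarrow \mu t\) schedule, with the standard \(m/t\) suboptimality bound (here \(m = 1\) constraint) as stopping rule:

```python
def barrier_method(f_grad, barrier_grad, x0, t0=1.0, mu=15.0, eps=1e-8,
                   inner_iters=500):
    """Barrier method with warm starts: each inner solve starts at the
    previous solution, and t grows by a factor mu per outer iteration."""
    x, t = x0, t0
    while 1.0 / t > eps:                 # m/t duality-gap bound, m = 1 here
        for _ in range(inner_iters):     # crude inner gradient descent
            x = x - (0.4 / t) * (t * f_grad(x) + barrier_grad(x))
        x_warm = x                       # warm start for the next, larger t
        x, t = x_warm, t * mu
    return x

# toy instance: minimize (x - 2)^2 subject to x >= 1, barrier -log(x - 1)
sol = barrier_method(lambda x: 2.0 * (x - 2.0),
                     lambda x: -1.0 / (x - 1.0), x0=1.5)
```

The inner step size is scaled by \(1/t\) because the smooth part of the centering objective has curvature proportional to \(t\); in the actual algorithm this role is played by the backtracking search of Algorithm 2.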
We also include acceleration, as in Banerjee et al. (2006) and Beck and Teboulle (2010), among others. This is shown in Fig. 5 to give practically significant improvements in algorithm timings. The final algorithm is shown in Algorithm 1.
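The acceleration referred to is the Nesterov-style momentum of Beck and Teboulle (2010): the gradient step is taken from an extrapolated point rather than the previous iterate. A minimal sketch, using the common \((k-1)/(k+2)\) momentum weight and a hypothetical quadratic test problem:

```python
import numpy as np

def accelerated_descent(grad, x0, step, iters):
    """Accelerated gradient descent: step from the extrapolated point y,
    then extrapolate past the new iterate with a growing momentum weight."""
    x, y, k = x0.copy(), x0.copy(), 1
    for _ in range(iters):
        x_new = y - step * grad(y)
        y = x_new + (k - 1) / (k + 2) * (x_new - x)  # momentum extrapolation
        x, k = x_new, k + 1
    return x

# illustrative quadratic: f(x) = 0.5 x^T A x, minimized at the origin
A = np.diag([1.0, 10.0])
sol = accelerated_descent(lambda x: A @ x, np.array([1.0, 1.0]),
                          step=0.09, iters=500)
```

In the covariance problem the same extrapolation is applied to the matrix iterates, combined with the projection and feasibility checks described above.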
Some care needs to be taken in selecting the step size \(\delta \). We choose it by backtracking, shrinking \(\delta \) until the step decreases the objective, remains inside the positive semidefinite cone, and satisfies the majorization requirements of generalized gradient descent (Beck and Teboulle 2010). This sub-algorithm is shown as Algorithm 2.
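The two checks can be sketched as follows. This is an assumed, simplified form of such a backtracking search, shown on a generic matrix objective (here \(-\log\det X + \mathrm{tr}\,X\), chosen only for illustration); the sufficient-decrease test \(f(X - \delta G) \le f(X) - \tfrac{\delta }{2}\Vert G\Vert _F^2\) is the standard majorization condition for a gradient step of size \(\delta \):

```python
import numpy as np

def backtrack_step(objective, grad, X, delta0=1.0, beta=0.5, max_halvings=50):
    """Halve delta until X - delta*G is positive definite and satisfies
    the sufficient-decrease (majorization) condition."""
    G, f0, delta = grad(X), objective(X), delta0
    for _ in range(max_halvings):
        X_new = X - delta * G
        if np.all(np.linalg.eigvalsh(X_new) > 0):          # stay in PD cone
            if objective(X_new) <= f0 - 0.5 * delta * np.sum(G * G):
                return X_new, delta                        # step accepted
        delta *= beta                                      # shrink and retry
    return X, 0.0                                          # no admissible step

# illustrative objective: -log det X + tr X, minimized at the identity
obj = lambda X: -np.linalg.slogdet(X)[1] + np.trace(X)
grd = lambda X: -np.linalg.inv(X) + np.eye(2)
X_new, delta = backtrack_step(obj, grd, 2.0 * np.eye(2))
```

In the actual Algorithm 2 the gradient is additionally projected as described above, so the backtracked step also stays in \(\mathcal S \).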
Cite this article
G’Sell, M.G., Shen-Orr, S.S. & Tibshirani, R. Sensitivity analysis for inference with partially identifiable covariance matrices. Comput Stat 29, 529–546 (2014). https://doi.org/10.1007/s00180-013-0451-4