
Sensitivity analysis for inference with partially identifiable covariance matrices


Abstract

In some multivariate problems with missing data, pairs of variables exist that are never observed together. For example, some modern biological tools can produce data of this form. As a result of this structure, the covariance matrix is only partially identifiable, and point estimation requires that identifying assumptions be made. These assumptions can introduce an unknown and potentially large bias into the inference. This paper presents a method based on semidefinite programming for automatically quantifying this potential bias by computing the range of possible equal-likelihood inferred values for convex functions of the covariance matrix. We focus on the bias of missing value imputation via conditional expectation and show that our method can give an accurate assessment of the true error in cases where estimates based on sampling uncertainty alone are overly optimistic.


References

  • Banerjee O, El Ghaoui L, d'Aspremont A, Natsoulis G (2006) Convex optimization techniques for fitting sparse Gaussian graphical models. In: ACM International Conference Proceeding Series. Citeseer, vol 148, pp 89–96

  • Beale E, Little R (1975) Missing values in multivariate analysis. J Roy Stat Soc Ser B 37(1):129–145

  • Beck A, Teboulle M (2010) Gradient-based algorithms with applications to signal recovery problems. In: Palomar DP, Eldar YC (eds) Convex optimization in signal processing and communications. Cambridge University Press, Cambridge, pp 42–88

  • Bendall SC, Nolan GP, Roederer M, Chattopadhyay PK (2012) A deep profiler's guide to cytometry. Trends Immunol 33(7):323–332

  • Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, New York

  • Chattopadhyay P, Price D, Harper T, Betts M, Yu J, Gostick E, Perfetto S, Goepfert P, Koup R, De Rosa S et al (2006) Quantum dot semiconductor nanocrystals for immunophenotyping by polychromatic flow cytometry. Nat Med 12(8):972–977

  • Hartley H, Hocking R (1971) The analysis of incomplete data. Biometrics 27(4):783–823

  • Little R, Rubin D (1987) Statistical analysis with missing data. Wiley, Hoboken

  • Moriarity C, Scheuren F (2001) Statistical matching: a paradigm for assessing the uncertainty in the procedure. J Off Stat 17(3):407–422

  • Newell EW, Sigal N, Bendall SC, Nolan GP, Davis MM (2012) Cytometry by time-of-flight shows combinatorial cytokine expression and virus-specific cell niches within a continuum of CD8+ T cell phenotypes. Immunity 36(1):142–152. doi:10.1016/j.immuni.2012.01.002

  • Rubin DB (1986) Statistical matching using file concatenation with adjusted weights and multiple imputations. J Bus Econ Stat 4(1):87–94


Acknowledgments

The authors would like to thank Noah Simon for discussion of optimization methods and Jacob Bien for other helpful discussions. We would also like to thank Sean Bendall and Erin Simons for first posing the problem to us, and Evan Newell for providing us with the mass cytometry data. MGG is supported by a National Science Foundation GRFP Fellowship. SSSO is a Taub fellow and is supported by US National Institutes of Health (NIH) (U19 AI057229). RT was supported by NSF grant DMS-9971405 and NIH grant N01-HV-28183.

Author information


Corresponding author

Correspondence to Max Grazier G’Sell.

Appendix: Algorithmic details

From Sect. 3.2, the inner optimization problem we wish to solve is

$$\begin{aligned}&\underset{\varSigma }{\mathrm{minimize}} \;\; -\frac{1}{t}\log |\varSigma | + \varSigma _{j,J_k}\hat{\varSigma }_{J_k,J_k}^{-1}(x_{J_k}-\hat{\mu }_{J_k})\\&\text {subject to } \;\; \varSigma _{ab} = \hat{\varSigma }_{ab} \text { for } (a,b)\in \bigcup _{\ell } \left( J_{\ell }\times J_{\ell }\right) . \end{aligned}$$

This optimization problem, with fixed \(t\), can be solved by generalized gradient descent. The gradient of the objective is

$$\begin{aligned} \nabla (\mathrm{objective})&= -\frac{1}{t} \varSigma ^{-1} + C \\ C_{ab}&= \left\{ \begin{array}{ll} \left( \hat{\varSigma }_{J_k,J_k}^{-1}(x_{J_k}-\hat{\mu }_{J_k})\right) [b] &{} \quad a=j \text { and } b\in J_k\\ \left( \hat{\varSigma }_{J_k,J_k}^{-1}(x_{J_k}-\hat{\mu }_{J_k})\right) [a] &{} \quad a\in J_k \text { and } b=j\\ 0 &{} \quad \text {otherwise} \end{array}\right. \end{aligned}$$

If we initialize with \(\varSigma = \hat{\varSigma } \in \mathcal S \), we want every subsequent iterate to remain within \(\mathcal S \). Therefore, we project the gradient onto the linear space \(\{\varSigma : \varSigma _{ij} = \hat{\varSigma }_{ij} \ \forall (i,j) \in \bigcup _k \left( J_k\times J_k\right) \}\). This is equivalent to holding the coordinates \((i,j) \in \bigcup _k \left( J_k\times J_k\right) \) fixed and taking gradient steps only in the remaining directions.

With a step size of \(\delta \), the update at inner iteration \(m\) (we write the iteration counter as \(m\) to avoid a clash with the barrier parameter \(t\)) becomes

$$\begin{aligned} \begin{array}{ll} \varSigma ^{(m+1)}_{ab} = \varSigma ^{(m)}_{ab} + \delta \left( \frac{1}{t} \left( \varSigma ^{(m)}\right) ^{-1}_{ab} - C_{ab}\right) &{} \quad (a,b) \notin \bigcup _{\ell } \left( J_{\ell }\times J_{\ell }\right) \\ \varSigma ^{(m+1)}_{ab} = \varSigma ^{(m)}_{ab} &{} \quad (a,b) \in \bigcup _{\ell } \left( J_{\ell }\times J_{\ell }\right) \end{array} \end{aligned}$$
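
As a concrete illustration, here is a minimal NumPy sketch of one such projected gradient step. The function name, the boolean matrix free_mask (true exactly on the coordinates outside \(\bigcup _{\ell } \left( J_{\ell }\times J_{\ell }\right) \)), and the assumption that \(j \notin J_k\) are ours for illustration; the fixed step size delta stands in for the backtracking rule described below.

    import numpy as np

    def projected_gradient_step(Sigma, Sigma_hat, mu_hat, x, j, J_k, free_mask, t, delta):
        """One generalized gradient step on the barrier objective, taken only in the
        coordinates left free by the observed blocks (the fixed entries are untouched)."""
        # w = Sigma_hat[J_k, J_k]^{-1} (x[J_k] - mu_hat[J_k]); constant throughout the solve
        w = np.linalg.solve(Sigma_hat[np.ix_(J_k, J_k)], x[J_k] - mu_hat[J_k])

        # Linear term C of the gradient: nonzero only in row j and column j over J_k
        C = np.zeros_like(Sigma)
        C[j, J_k] = w
        C[J_k, j] = w

        grad = -(1.0 / t) * np.linalg.inv(Sigma) + C

        # Step in the negative gradient direction, zeroed on the fixed coordinates
        return Sigma - delta * np.where(free_mask, grad, 0.0)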

Using warm starts, we repeatedly solve this problem with increasing \(t\) to obtain the final solution. This barrier method is discussed in Boyd and Vandenberghe (2004), which recommends increasing \(t\) by a factor of \(\mu \) (around 10–20) at each outer-loop iteration. More details of the method can be found in Section 11.3.1 of that book.
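
The outer loop might be organised as in the following sketch, which reuses projected_gradient_step (and the NumPy import) from the sketch above; the defaults for t0, mu, the inner iteration count, and the fixed step size are placeholders rather than the values used in the paper.

    def barrier_solve(Sigma_hat, mu_hat, x, j, J_k, free_mask,
                      t0=1.0, mu=15.0, t_max=1e6, inner_iters=200, delta=1e-3):
        """Barrier method with warm starts: solve the inner problem for increasing t,
        starting each solve at the previous solution (Boyd and Vandenberghe 2004, Sect. 11.3.1)."""
        Sigma = Sigma_hat.copy()          # feasible start, since Sigma_hat lies in S
        t = t0
        while t < t_max:
            for _ in range(inner_iters):  # inner solve at the current barrier level
                Sigma = projected_gradient_step(Sigma, Sigma_hat, mu_hat, x,
                                                j, J_k, free_mask, t, delta)
            t *= mu                       # increase t by a factor of mu (around 10-20)
        return Sigma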

We also include acceleration, as in Banerjee et al. (2006) and Beck and Teboulle (2010), among others; Fig. 5 shows that this gives practically significant improvements in run time. The final algorithm is shown in Algorithm 1.

[Algorithm 1]
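
To show where the acceleration enters, the inner solve can be wrapped with a Nesterov/FISTA-style extrapolation step in the spirit of Beck and Teboulle (2010). This is a sketch of the idea only, not a transcription of Algorithm 1; it again reuses projected_gradient_step and the NumPy import from the first sketch.

    def accelerated_inner_solve(Sigma0, Sigma_hat, mu_hat, x, j, J_k, free_mask,
                                t, delta, iters=200):
        """Accelerated generalized gradient descent: each step is taken from an
        extrapolated point Theta rather than from the previous iterate."""
        Sigma = Sigma0.copy()
        Theta = Sigma0.copy()
        s = 1.0
        for _ in range(iters):
            Sigma_new = projected_gradient_step(Theta, Sigma_hat, mu_hat, x,
                                                j, J_k, free_mask, t, delta)
            s_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * s * s))
            # Momentum extrapolation in the direction of the last step
            Theta = Sigma_new + ((s - 1.0) / s_new) * (Sigma_new - Sigma)
            Sigma, s = Sigma_new, s_new
        return Sigma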

Some care needs to be taken in selecting the step size \(\delta \). We choose it by backtracking so that each step decreases the objective, keeps the iterate inside the positive semidefinite cone, and satisfies the majorization requirement of generalized gradient descent (Beck and Teboulle 2010). This sub-algorithm is shown as Algorithm 2.

[Algorithm 2]
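
A hedged sketch of this backtracking rule is below. The objective helper, the sufficient-decrease constant, and the halving factor beta are our illustrative choices; the slogdet sign check doubles as the cone test, since a candidate is rejected whenever log|Σ| is undefined.

    def barrier_objective(S, Sigma_hat, mu_hat, x, j, J_k, t):
        """-(1/t) log|S| + S[j, J_k] Sigma_hat[J_k, J_k]^{-1} (x[J_k] - mu_hat[J_k]);
        returns +inf outside the positive definite cone."""
        sign, logdet = np.linalg.slogdet(S)
        if sign <= 0:
            return np.inf
        w = np.linalg.solve(Sigma_hat[np.ix_(J_k, J_k)], x[J_k] - mu_hat[J_k])
        return -(1.0 / t) * logdet + S[j, J_k] @ w

    def backtracking_step(Sigma, grad, free_mask, obj, delta=1.0, beta=0.5, max_halvings=50):
        """Halve delta until the projected step decreases the objective, keeps the
        iterate positive definite, and satisfies the quadratic majorization bound."""
        f0 = obj(Sigma)
        g = np.where(free_mask, grad, 0.0)        # gradient projected onto the free coordinates
        for _ in range(max_halvings):
            candidate = Sigma - delta * g
            if obj(candidate) <= f0 - 0.5 * delta * np.sum(g * g):
                return candidate, delta
            delta *= beta
        return Sigma, delta                       # no acceptable step found; keep the iterate

Here obj would be supplied as, for example, lambda S: barrier_objective(S, Sigma_hat, mu_hat, x, j, J_k, t), and grad as the unprojected gradient computed in the first sketch.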


Cite this article

G’Sell, M.G., Shen-Orr, S.S. & Tibshirani, R. Sensitivity analysis for inference with partially identifiable covariance matrices. Comput Stat 29, 529–546 (2014). https://doi.org/10.1007/s00180-013-0451-4
