A Monte Carlo evaluation of three methods to detect local dependence in binary data latent class models

  • Regular Article
Advances in Data Analysis and Classification

Abstract

Binary data latent class analysis is a form of model-based clustering applied in a wide range of fields. A central assumption of this model is that of conditional independence of responses given latent class membership, often referred to as the “local independence” assumption. The results of latent class analysis may be severely biased when this crucial assumption is violated; investigating the degree to which bivariate relationships between observed variables fit this hypothesis therefore provides vital information. This article evaluates three methods of doing so. The first is the commonly applied method of referring the so-called “bivariate residuals” to a Chi-square distribution. We also introduce two alternative methods that are novel to the investigation of local dependence in latent class analysis: bootstrapping the bivariate residuals, and the asymptotic score test or “modification index”. Our Monte Carlo simulation indicates that the latter two methods perform adequately, while the first method does not perform as intended.
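To make the first method concrete, the following is a minimal sketch of a bivariate residual for one pair of binary items. The abstract does not state the formula, so the sketch assumes the usual Pearson form: the observed 2×2 table for the pair is compared with the table expected under a fitted latent class model with local independence. The function names and the parameter layout (`class_weights`, `item_probs`) are hypothetical and are not taken from the article or from any software package.

```python
from itertools import product


def expected_pair_table(class_weights, item_probs, j, k, n):
    """Expected 2x2 counts for items j and k under local independence.

    class_weights[c] : P(class = c)                (assumed layout)
    item_probs[c][i] : P(item i = 1 | class = c)   (assumed layout)
    """
    table = [[0.0, 0.0], [0.0, 0.0]]
    for w, probs in zip(class_weights, item_probs):
        pj, pk = probs[j], probs[k]
        for a, b in product((0, 1), repeat=2):
            p_a = pj if a == 1 else 1.0 - pj
            p_b = pk if b == 1 else 1.0 - pk
            # Within a class the two responses are independent,
            # so the joint probability is the product p_a * p_b.
            table[a][b] += n * w * p_a * p_b
    return table


def observed_pair_table(data, j, k):
    """Observed 2x2 counts for items j and k; data is a list of 0/1 rows."""
    table = [[0, 0], [0, 0]]
    for row in data:
        table[row[j]][row[k]] += 1
    return table


def bivariate_residual(data, class_weights, item_probs, j, k):
    """Pearson statistic comparing observed and model-expected pair tables."""
    n = len(data)
    obs = observed_pair_table(data, j, k)
    exp = expected_pair_table(class_weights, item_probs, j, k, n)
    return sum(
        (obs[a][b] - exp[a][b]) ** 2 / exp[a][b]
        for a, b in product((0, 1), repeat=2)
    )


# Toy illustration with made-up parameters (not estimates from real data):
# a single class in which both items have probability 0.5.
weights = [1.0]
probs = [[0.5, 0.5]]
independent_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
dependent_data = [[0, 0]] * 4 + [[1, 1]] * 4

bvr_independent = bivariate_residual(independent_data, weights, probs, 0, 1)
bvr_dependent = bivariate_residual(dependent_data, weights, probs, 0, 1)
```

In practice the parameters would be estimates from a fitted model. As the abstract reports, referring this statistic directly to a chi-square distribution did not perform as intended in the simulation, so the value is better treated as an input to a bootstrap reference distribution than compared with a fixed chi-square critical value.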

Author information

Corresponding author

Correspondence to Daniel L. Oberski.

About this article

Cite this article

Oberski, D.L., van Kollenburg, G.H. & Vermunt, J.K. A Monte Carlo evaluation of three methods to detect local dependence in binary data latent class models. Adv Data Anal Classif 7, 267–279 (2013). https://doi.org/10.1007/s11634-013-0146-2

  • DOI: https://doi.org/10.1007/s11634-013-0146-2
