Efficient ways to impute incomplete panel data

  • Original Paper
AStA Advances in Statistical Analysis

Abstract

We find that the multiple imputation procedures currently implemented in major statistical packages, and thus available to the vast majority of data analysts, are limited in their ability to handle incomplete panel data. We review various missing data methods that we deem useful for the analysis of incomplete panel data and discuss how some of the shortcomings of existing procedures can be overcome. In a simulation study based on real panel data, we illustrate the quality of these procedures and outline fruitful avenues for future research.

Corresponding author

Correspondence to Kristian Kleinke.

About this article

Cite this article

Kleinke, K., Stemmler, M., Reinecke, J. et al. Efficient ways to impute incomplete panel data. AStA Adv Stat Anal 95, 351–373 (2011). https://doi.org/10.1007/s10182-011-0179-9

  • DOI: https://doi.org/10.1007/s10182-011-0179-9
