Reconstruction of missing data in multivariate processes with applications to causality analysis

International Journal of Advances in Engineering Sciences and Applied Mathematics

Abstract

Recovery of missing observations in time-series has been studied for about a century, giving rise to two broad classes of methods: those that reconstruct the data and those that directly estimate the statistical properties of the data, both developed largely for univariate processes. In this work, we present a data reconstruction technique for multivariate processes. The proposed method is developed in the framework of sparse optimization and adopts a parametric approach using vector auto-regressive (VAR) models, so that both temporal and spatial correlations can be exploited for efficient data recovery. The primary purpose of recovering the missing data in this work is to develop a directed graphical (network) representation of the multivariate process under study. Existing methods for data-driven network reconstruction assume that data are available at regular intervals. In this respect, the proposed method offers an effective methodology for reconstructing weighted causal networks from data with missing observations. The scope of this work is restricted to linear, jointly stationary multivariate processes that can be suitably represented by VAR models of finite order, and to randomly missing observations. Simulation studies on different data generating processes with varying proportions of missing observations illustrate the efficacy of the proposed method in recovering the multivariate signals and thereby reconstructing weighted causal networks.
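To make the idea of VAR-based recovery concrete, the following is a minimal illustrative sketch and not the sparse-optimization method of the article. It assumes a bivariate VAR(1) process with a hypothetical coefficient matrix `A_true`, removes 20% of the samples at random, and fills them by plain alternating least squares: fit the VAR coefficients on the current filled signal, then choose the missing entries that minimize the one-step prediction residuals. All function names (`simulate_var1`, `fit_var1`, `reconstruct`) and parameter values are assumptions introduced for illustration.

```python
import numpy as np

def simulate_var1(A, n_samples, rng):
    """Simulate x[t] = A @ x[t-1] + e[t] with unit-variance Gaussian noise."""
    n = A.shape[0]
    X = np.zeros((n_samples, n))
    for t in range(1, n_samples):
        X[t] = A @ X[t - 1] + rng.standard_normal(n)
    return X

def fit_var1(X):
    """Least-squares estimate of A in x[t] = A @ x[t-1] + e[t]."""
    coef, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
    return coef.T

def reconstruct(X_obs, mask, n_iter=20):
    """Fill entries where mask is False by alternating least squares."""
    N, n = X_obs.shape
    X = X_obs.copy()
    t = np.arange(N)
    # Initial guess: linear interpolation within each channel.
    for j in range(n):
        obs = mask[:, j]
        X[~obs, j] = np.interp(t[~obs], t[obs], X_obs[obs, j])
    miss = ~mask.reshape(-1)          # flattened index of missing entries
    E = np.eye(N)
    for _ in range(n_iter):
        A = fit_var1(X)
        # D maps the stacked signal to residuals x[t] - A @ x[t-1], t = 1..N-1.
        D = np.kron(E[1:], np.eye(n)) - np.kron(E[:-1], A)
        x = X.reshape(-1)
        # Minimize ||D_miss x_miss + D_obs x_obs||^2 over the missing entries.
        rhs = -D[:, ~miss] @ x[~miss]
        x_m, *_ = np.linalg.lstsq(D[:, miss], rhs, rcond=None)
        x[miss] = x_m
        X = x.reshape(N, n)
    return X, fit_var1(X)

# Hypothetical example: variable 1 drives variable 2, but not the reverse.
rng = np.random.default_rng(0)
A_true = np.array([[0.8, 0.0], [0.5, 0.6]])
X_full = simulate_var1(A_true, 300, rng)
mask = rng.random(X_full.shape) > 0.2        # True where a sample is observed
X_obs = np.where(mask, X_full, np.nan)

X_rec, A_hat = reconstruct(X_obs, mask)
rmse = np.sqrt(np.mean((X_rec[~mask] - X_full[~mask]) ** 2))
print("RMSE on missing entries:", rmse)
print("Estimated VAR(1) coefficients:\n", A_hat)
```

In a VAR model, a nonzero off-diagonal coefficient `A_hat[i, j]` indicates that variable `j` helps predict variable `i`, which is the sense in which a weighted directed network can be read off the fitted model. The sketch uses dense matrices for clarity and would need a sparse or recursive formulation for long records or higher model orders.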

Author information

Corresponding author

Correspondence to Arun K. Tangirala.

Cite this article

Agarwal, P., Tangirala, A.K. Reconstruction of missing data in multivariate processes with applications to causality analysis. Int J Adv Eng Sci Appl Math 9, 196–213 (2017). https://doi.org/10.1007/s12572-017-0198-1

  • DOI: https://doi.org/10.1007/s12572-017-0198-1
