Abstract
Copula modeling for serial dependence has been extensively discussed in the time series context. However, fitting copula-based Markov models to serially dependent survival data is challenging because of complex censoring mechanisms. The purpose of this paper is to develop likelihood-based methods for fitting a copula-based Markov chain model to serially dependent event times that are dependently censored by a terminal event, such as death. We propose a novel copula-based Markov chain model for describing serial dependence in recurrent event times, and we apply a second copula model to handle dependent censoring. Because the likelihood function involving the two copulas is complex, we propose a two-stage estimation method under Weibull distributions for fitting the survival data. The asymptotic normality of the proposed estimator is established through the theory of estimating functions. We propose a jackknife method for interval estimation, which is shown to be consistent. To select suitable copulas for a given dataset, we propose a model selection method based on the second-stage likelihood. We conduct simulation studies to assess the performance of the proposed methods. For illustration, we analyze survival data from colorectal cancer patients. We implement the proposed methods in our R package “Copula.Markov.survival”, which is available on CRAN (https://cran.r-project.org/).
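To make the serial-dependence idea concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a Clayton copula applied on the survival-function scale and a Weibull survival function S(t) = exp(-scale * t^shape); the copula family, the parameterization, and all function names are illustrative assumptions, and censoring by the terminal event is not simulated.

```python
import math
import random


def clayton_conditional_cdf(u, u_prev, theta):
    """P(U_j <= u | U_{j-1} = u_prev) under a Clayton copula, theta > 0."""
    base = u_prev ** (-theta) + u ** (-theta) - 1.0
    return u_prev ** (-(theta + 1.0)) * base ** (-(theta + 1.0) / theta)


def clayton_conditional_inverse(v, u_prev, theta):
    """Solve clayton_conditional_cdf(u, u_prev, theta) = v for u."""
    inner = (v ** (-theta / (theta + 1.0)) - 1.0) * u_prev ** (-theta) + 1.0
    return inner ** (-1.0 / theta)


def simulate_event_times(n_events, theta, scale, shape, seed=0):
    """Simulate one subject's serially dependent Weibull event times.

    Assumption: consecutive survival probabilities S(T_{j-1}) and S(T_j)
    are linked by a Clayton copula, which makes the sequence a Markov chain.
    """
    rng = random.Random(seed)
    times = []
    surv = None
    for j in range(n_events):
        v = 1.0 - rng.random()  # uniform on (0, 1], avoids v == 0
        if j == 0:
            surv = v  # first event: marginal draw only
        else:
            surv = clayton_conditional_inverse(v, surv, theta)
        # Invert S(t) = exp(-scale * t**shape) at the drawn probability.
        times.append((-math.log(surv) / scale) ** (1.0 / shape))
    return times


if __name__ == "__main__":
    print(simulate_event_times(n_events=5, theta=2.0, scale=0.5, shape=1.5, seed=1))
```

Larger values of `theta` give stronger positive dependence between consecutive event times; as `theta` approaches zero the times become nearly independent.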
Change history
15 January 2021
A Correction to this paper has been published: https://doi.org/10.1007/s42081-020-00099-4
Acknowledgements
The authors kindly thank the Editor-in-Chief (Prof. Aoshima), the coordinating editor (Prof. Ha), and two anonymous reviewers for their helpful comments, which improved the manuscript. The authors also thank Prof. Chyong-Mei Chen, who gave us suggestions in the initial stage of this work. The research of T. Emura is funded by a grant from the Ministry of Science and Technology of Taiwan (MOST 107-2118-M-008-003-MY3).
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
The original online version of this article was revised: “There are several errors in the supplementary material that were caused during production process”.
Electronic supplementary material
Below is the link to the electronic supplementary material.
42081_2020_87_MOESM2_ESM.pdf
A PDF file including the following: S1: Model checking based on the Weibull plot. S2: Model selection consistency. S3: Comparison between the one-stage and two-stage methods. S4: Computer codes for the adequacy checking of the colorectal cancer data (PDF 200 kb)
Appendices
Appendix 1: partial derivatives
The 1st-stage transformed log-likelihood is
Its partial derivatives are
Define the notation \(w_{i} = \exp (\tilde{\lambda })\,(T_{i}^{*})^{\exp (\tilde{\nu }_{2})}\), \(u_{ij} = \exp (\tilde{r})\,T_{ij}^{\exp (\tilde{\nu }_{1})}\),
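Both quantities have the form exp(a) t^{exp(b)}. Assuming the Weibull survival function is parameterized as S(t) = exp(-r t^ν) with r = exp(r̃) and ν = exp(ν̃) (this parameterization is an assumption inferred from the notation), each quantity is the cumulative hazard at the observed time. The log-transformed parameters can then take any real value while r and ν stay positive. A small sketch with hypothetical function names:

```python
import math


def weibull_cum_hazard(t, log_scale, log_shape):
    """Return exp(log_scale) * t ** exp(log_shape).

    With log_scale = lambda-tilde and log_shape = nu2-tilde this is w_i;
    with log_scale = r-tilde and log_shape = nu1-tilde it is u_ij.
    """
    return math.exp(log_scale) * t ** math.exp(log_shape)


def weibull_survival(t, log_scale, log_shape):
    """Survival probability exp(-cumulative hazard), assuming S(t) = exp(-r t^nu)."""
    return math.exp(-weibull_cum_hazard(t, log_scale, log_shape))


if __name__ == "__main__":
    # Scale 0.5 and shape 2, passed on the log scale, evaluated at t = 2.
    print(weibull_cum_hazard(2.0, math.log(0.5), math.log(2.0)))
    print(weibull_survival(2.0, math.log(0.5), math.log(2.0)))
```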
The 2nd-stage transformed log-likelihood is
Its partial derivatives are
Appendix 2: sketch of the proof of Theorem 1
We first define
By a Taylor series expansion, the estimating function \({\mathbf{g}}(\tilde{\varTheta }) = \sum\nolimits_{i = 1}^{N} {{\mathbf{g}}_{i} (\tilde{\varTheta })}\) can be expanded around the true parameter value \(\tilde{\varTheta }_{0} = (\tilde{r}_{0} ,\,\tilde{\nu }_{1,0} ,\,\tilde{\theta }_{0} ,\,\tilde{\alpha }_{0} ,\,\tilde{\lambda }_{0} ,\,\tilde{\nu }_{2,0} )\), such that
We plug the MLE \(\hat{\tilde{\varTheta }}\) into the above equation so that the left-hand side becomes zero. It follows that
Since \(||\,\hat{\tilde{\varTheta }} - \tilde{\varTheta }_{0} \,|| = o_{P} (1)\), the term \(O_{P} (N^{1/2} ||\,\hat{\tilde{\varTheta }} - \tilde{\varTheta }_{0} \,||^{2} )\) is negligible relative to the left-hand side. The weak law of large numbers implies
Under some regularity conditions,
where \(f(\,{\mathbf{x}}_{i} |\tilde{\varTheta }\,) = f(\,t_{i1} ,\; \ldots ,\;t_{{in_{i} }} ,\;\delta_{i1} ,\; \ldots ,\;\delta_{{in_{i} }} ,\;t_{i}^{*} ,\,\delta_{i}^{*} |\tilde{\varTheta }\,) = L_{i}\) is the density function for subject \(i\).
Also, by the central limit theorem,
Finally, by Slutsky’s theorem,
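The steps of this appendix follow the standard estimating-function argument. As a hedged summary only, the chain can be sketched as below. The matrices \(A\) and \(B\) are notation introduced here for illustration, defined as \(A = -E[\partial {\mathbf{g}}_{i}(\tilde{\varTheta}_{0})/\partial \tilde{\varTheta}^{\mathsf T}]\) and \(B = E[{\mathbf{g}}_{i}(\tilde{\varTheta}_{0}){\mathbf{g}}_{i}(\tilde{\varTheta}_{0})^{\mathsf T}]\), assuming independent and identically distributed subjects with \(E[{\mathbf{g}}_{i}(\tilde{\varTheta}_{0})] = {\mathbf{0}}\). The authors' exact displays may differ.

```latex
\begin{align*}
\mathbf{0} = N^{-1/2}\,\mathbf{g}(\hat{\tilde{\varTheta}})
  &= N^{-1/2}\,\mathbf{g}(\tilde{\varTheta}_0)
   + \left[ N^{-1}\,\frac{\partial \mathbf{g}(\tilde{\varTheta}_0)}
                         {\partial \tilde{\varTheta}^{\mathsf T}} \right]
     N^{1/2}\bigl(\hat{\tilde{\varTheta}} - \tilde{\varTheta}_0\bigr)
   + O_P\!\left( N^{1/2}\,\bigl\|\hat{\tilde{\varTheta}} - \tilde{\varTheta}_0\bigr\|^{2} \right), \\
-N^{-1}\,\frac{\partial \mathbf{g}(\tilde{\varTheta}_0)}{\partial \tilde{\varTheta}^{\mathsf T}}
  &\xrightarrow{\;P\;} A, \qquad
N^{-1/2}\,\mathbf{g}(\tilde{\varTheta}_0) \xrightarrow{\;d\;} N(\mathbf{0},\, B), \\
N^{1/2}\bigl(\hat{\tilde{\varTheta}} - \tilde{\varTheta}_0\bigr)
  &\xrightarrow{\;d\;} N\!\left(\mathbf{0},\; A^{-1} B\,(A^{-1})^{\mathsf T}\right).
\end{align*}
```

The sandwich form \(A^{-1} B (A^{-1})^{\mathsf T}\) is kept general because, for a two-stage estimating function, \(A\) need not be symmetric.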
Appendix 3: sketch of the proof of Theorem 2
Below, we give the outline for the proof of the consistency by showing the approximation
We substitute \(\hat{\tilde{\varTheta }}\) by \(\hat{\tilde{\varTheta }}^{(\, - k\,)}\) in Eq. (3), such that
It follows that
Also by Eq. (3), we have
After eliminating \(\tilde{\varTheta }_{0}\) from Eqs. (4) and (5),
Finally, we verify the desired results:
□
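A delete-one-subject jackknife of the kind analyzed above can be sketched as follows. This is a generic illustration rather than the authors' code: the centering at the mean of the leave-one-out estimates and the (N - 1)/N scaling are one common convention, assumed here, and `estimator` is a hypothetical stand-in for the two-stage fitting routine.

```python
import math


def jackknife_cov(subjects, estimator):
    """Delete-one jackknife covariance matrix for a vector-valued estimator.

    subjects  : list with one entry per subject
    estimator : function mapping a list of subjects to a list of floats
    """
    n = len(subjects)
    if n < 2:
        raise ValueError("the jackknife needs at least two subjects")
    # Re-estimate the parameters with subject k removed, for every k.
    loo = [estimator(subjects[:k] + subjects[k + 1:]) for k in range(n)]
    p = len(loo[0])
    center = [sum(est[a] for est in loo) / n for a in range(p)]
    factor = (n - 1) / n
    return [
        [factor * sum((est[a] - center[a]) * (est[b] - center[b]) for est in loo)
         for b in range(p)]
        for a in range(p)
    ]


def wald_interval(estimate, variance, z=1.96):
    """Normal-approximation interval: estimate +/- z * sqrt(variance)."""
    half = z * math.sqrt(variance)
    return (estimate - half, estimate + half)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    mean = lambda xs: [sum(xs) / len(xs)]
    cov = jackknife_cov(data, mean)
    print(cov[0][0], wald_interval(mean(data)[0], cov[0][0]))
```

For the sample mean this construction reproduces the usual variance estimate s²/N exactly, which gives a quick sanity check before applying it to a model-based estimator.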
Cite this article
Huang, XW., Wang, W. & Emura, T. A copula-based Markov chain model for serially dependent event times with a dependent terminal event. Jpn J Stat Data Sci 4, 917–951 (2021). https://doi.org/10.1007/s42081-020-00087-8
DOI: https://doi.org/10.1007/s42081-020-00087-8