Modeling of semi-competing risks by means of first passage times of a stochastic process

Lifetime Data Analysis

Abstract

In semi-competing risks one considers a terminal event, such as death of a person, and a non-terminal event, such as disease recurrence. We present a model where the time to the terminal event is the first passage time to a fixed level c in a stochastic process, while the time to the non-terminal event is represented by the first passage time of the same process to a stochastic threshold S, assumed to be independent of the stochastic process. In order to be explicit, we let the stochastic process be a gamma process, but other processes with independent increments may alternatively be used. For semi-competing risks this appears to be a new modeling approach, being an alternative to traditional approaches based on illness-death models and copula models. In this paper we consider a fully parametric approach. The likelihood function is derived and statistical inference in the model is illustrated on both simulated and real data.
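The model described above can be illustrated by a small simulation. The following is only a sketch under stated assumptions: the gamma process is simulated on a time grid with a power-law shape function v(t) = alpha * t**beta and unit scale, the threshold S is drawn from an exponential distribution chosen purely for illustration, and the function name `simulate_subject` and all parameter values are hypothetical. First passage times are resolved only up to the grid width `dt`.

```python
import random

def simulate_subject(alpha, beta, c, s, dt, t_max, rng):
    """Simulate one subject on a time grid and return (t_s, t_c).

    D(t) is a gamma process with assumed shape function v(t) = alpha * t**beta
    and unit scale. t_c is the first grid time with D >= c (terminal event),
    t_s the first grid time with D >= s (non-terminal event). A value is None
    if the level is not reached before the path is stopped.
    """
    d_level = 0.0
    v_prev = 0.0
    t_s = None
    t_c = None
    for k in range(1, int(round(t_max / dt)) + 1):
        t = k * dt
        v_now = alpha * t ** beta
        shape = v_now - v_prev
        if shape > 0.0:
            # Independent increment: D(t) - D(t - dt) ~ gamma(shape, 1).
            d_level += rng.gammavariate(shape, 1.0)
        v_prev = v_now
        if t_s is None and d_level >= s:
            t_s = t
        if d_level >= c:
            t_c = t
            break  # the terminal event stops the observation
    return t_s, t_c

rng = random.Random(2018)
c = 3.0
for _ in range(5):
    s = rng.expovariate(0.5)  # hypothetical threshold distribution (mean 2)
    t_s, t_c = simulate_subject(1.0, 1.2, c, s, 0.05, 50.0, rng)
    # The non-terminal event is observed only if it strictly precedes death.
    observed = t_s is not None and (t_c is None or t_s < t_c)
    print(f"S={s:.2f}  T_S={t_s}  T_c={t_c}  non-terminal observed: {observed}")
```

Because the gamma process moves by jumps, a single increment can carry the path past both S and c, in which case the sketch returns t_s equal to t_c and the non-terminal event is not counted as observed.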

References

  • Aalen OO, Gjessing HK (2001) Understanding the shape of the hazard rate: a process point of view. Stat Sci 16(1):1–22

  • Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer Science & Business Media, New York

  • Bagdonavicius V, Nikulin M (2001) Estimation in degradation models with explanatory variables. Lifetime Data Anal 7:85–103

  • Borgan O (1998) Aalen-Johansen estimator. In: Armitage P, Colton T (eds) Encyclopedia of biostatistics, vol 1. Wiley, Chichester, pp 5–10

  • Casella G, Berger R (2002) Statistical inference, 2nd edn. Duxbury, Pacific Grove

  • Christen JA, Ruggeri F, Villa E (2011) Utility based maintenance analysis using a random sign censoring model. Reliab Eng Syst Safe 96(3):425–431

  • Clayton DG (1978) A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65:141–151

  • Cooke RM (1993) The total time on test statistic and age-dependent censoring. Stat Probab Lett 18:307–312

  • Cooke RM, Bedford T (2002) Reliability databases in perspective. IEEE Trans Reliab 51(3):294–310

  • Copeland EA, Biggs JC, Thompson JM, Crilley P, Szer J, Klein JP, Kapoor N, Avalos BR, Cunningham I, Atkinson K, Downs K, Harmon GS, Daly MB, Brodsky I, Bulova SI, Tutschka PJ (1991) Treatment for acute myelocytic leukemia with allogeneic bone marrow transplantation following preparation with BuCy2. Blood 78:838–843

  • Esary JD, Proschan F, Walkup DW (1967) Association of random variables, with applications. Ann Math Stat 38(5):1466–1474

  • Fine JP, Jiang H, Chappell R (2001) On semi-competing risks data. Biometrika 88:907–919

  • Fix E, Neyman J (1951) A simple stochastic model of recovery, relapse, death and loss of patients. Human Biol 23:205–241

  • Horrocks J, Thompson ME (2004) Modeling event times with multiple outcomes using the Wiener process with drift. Lifetime Data Anal 10:29–49

  • Hsieh JJ, Wang W, Ding AA (2008) Regression analysis based on semicompeting risks data. J R Stat Soc Ser B 70:3–20

  • Kahle W, Mercier S, Paroissin C (2016) Degradation processes in reliability. Wiley, Hoboken

  • Klein JP, Moeschberger ML (1997) Survival analysis: techniques for censored and truncated data, 1st edn. Springer Science+Business Media, New York

  • Lawless J, Crowder M (2004) Covariates and random effects in a gamma process model with application to degradation and failure. Lifetime Data Anal 10:213–227

  • Lee MLT, Whitmore G (2006) Threshold regression for survival analysis: modeling event times by a stochastic process reaching a boundary. Stat Sci 21(4):501–513

  • Lindqvist BH (1988) Association of probability measures on partially ordered spaces. J Multivar Anal 26(2):111–132

  • Lindqvist BH, Skogsrud G (2008) Modeling of dependent competing risks by first passage times of Wiener processes. IIE Trans 41(1):72–80

  • Lindqvist BH, Støve B, Langseth H (2006) Modelling of dependence between critical failure and preventive maintenance: the repair alert model. J Stat Plan Inference 136(5):1701–1717

  • Meira-Machado L, de Uña-Álvarez J, Cadarso-Suárez C (2006) Nonparametric estimation of transition probabilities in a non-Markov illness-death model. Lifetime Data Anal 12(3):325–344

  • Park C, Padgett W (2005) Accelerated degradation models for failure based on geometric Brownian motion and gamma processes. Lifetime Data Anal 11:511–527

  • Paroissin C, Salami A (2014) Failure time of non homogeneous gamma process. Commun Stat Theory Methods 43(15):3148–3161

  • Peng L, Fine JP (2006) Regression modeling of semicompeting risks data. Biometrics 63:96–108

  • Putter H, Fiocco M, Geskus R (2007) Tutorial in biostatistics: competing risks and multi-state models. Stat Med 26(11):2389–2430

  • Temme N (1975) Uniform asymptotic expansions of the incomplete gamma functions and the incomplete beta function. Math Comput 29(132):1109–1114

  • van Noortwijk J (2009) A survey of the application of gamma processes in maintenance. Reliab Eng Syst Safe 94:2–21

  • Varadhan R, Xue QL, Bandeen-Roche K (2014) Semicompeting risks in ageing research: methods, issues and needs. Lifetime Data Anal 20:538–562

  • Whitmore GA (1986) First-passage-time models for duration data: regression structures and competing risks. J R Stat Soc Ser D (The Statistician) 35(2):207–219

  • Xu J, Kalbfleisch JD, Tai B (2010) Statistical analysis of illness-death processes and semicompeting risks data. Biometrics 66(3):716–725

Acknowledgements

The authors are grateful for valuable comments from two reviewers and an associate editor. In particular we thank the reviewer who pointed out that overshoot of the threshold needs to be taken into account for the gamma process.

Author information

Corresponding author

Correspondence to Bo Henry Lindqvist.

Appendix

1.1 Joint density of \((T_d,T_c)\) for \(d<c\)

We will first calculate the joint density of \((T_d,D(T_d))\). This is done by integration of the joint density of \((T_d,D(T_d^-),D(T_d))\), which is given in Kahle et al. (2016, Theorem 2.37). The result is, in our notation,

$$\begin{aligned} \tilde{f}(t,z)= & {} \frac{e^{-z}v'(t)}{\Gamma (v(t))} \int _0^d \frac{x^{v(t)-1}}{z-x} dx \\= & {} \frac{e^{-z}v'(t)}{\Gamma (v(t))} \cdot \frac{d^{v(t)}\left( v(t)d \; _2F_1(1,v(t)+1;v(t)+2;d/z) + v(t)z+z \right) }{v(t)(v(t)+1)z^2} \end{aligned}$$
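The closed form of the inner integral can be checked numerically. The sketch below uses only the Python standard library: the hypergeometric function is summed from its defining power series (valid here because d/z < 1), and the integral is approximated by composite Simpson quadrature, which assumes v(t) ≥ 1 so that the integrand is bounded at 0. All function names are hypothetical and the parameter values are arbitrary.

```python
import math

def hyp2f1_series(a, b, c, w, tol=1e-15, max_terms=100000):
    """Gauss hypergeometric series 2F1(a, b; c; w), valid for |w| < 1."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * w
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def inner_integral_closed(v, d, z):
    """Closed form of int_0^d x**(v-1) / (z - x) dx for z > d."""
    f = hyp2f1_series(1.0, v + 1.0, v + 2.0, d / z)
    return d ** v * (v * d * f + v * z + z) / (v * (v + 1.0) * z ** 2)

def inner_integral_simpson(v, d, z, n=20000):
    """Composite Simpson approximation of the same integral (assumes v >= 1)."""
    h = d / n
    g = lambda x: x ** (v - 1.0) / (z - x)
    total = g(0.0) + g(d)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0

def f_tilde(z, v, v_prime, d):
    """Joint density of (T_d, D(T_d)) at (t, z), given v = v(t), v_prime = v'(t)."""
    return math.exp(-z) * v_prime / math.gamma(v) * inner_integral_closed(v, d, z)

v, d, z = 2.5, 1.0, 2.0  # arbitrary illustrative values with z > d
print(inner_integral_closed(v, d, z), inner_integral_simpson(v, d, z))
```

For v(t) = 1 the integral reduces to log(z/(z - d)), which gives a second, independent check of the closed form.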

From this we get the joint density of \((T_d,T_c)\) for \(d < c\),

$$\begin{aligned} f(t_1,t_2;v(t_1),v(t_2),d,c)dt_1dt_2 = & {} P(t_1 \le T_d \le t_1 + dt_1, t_2 \le T_c \le t_2 + dt_2) \\ = & {} \int _{z=d}^c P(t_1 \le T_d \le t_1 + dt_1, z \le D(T_d) \le z +dz, t_2 \le T_c \le t_2 + dt_2) \\ = & {} \left[ \int _{z=d}^c \tilde{f}(t_1,z) P(t_2 \le T_c \le t_2 + dt_2|T_d=t_1, D(T_d)=z) dz \right] dt_1\\ = & {} \left[ \int _{z=d}^c \tilde{f}(t_1,z) f(t_2 -t_1; v(t_2) - v(t_1),c -z)dz \right] dt_1 dt_2 \end{aligned}$$

There is, however, also a possibility that the process crosses both d and c at the same time, giving a tie between \(T_d\) and \(T_c\). In this case, the relevant density is

$$\begin{aligned} f(t_1,t_1;v(t_1),v(t_1),d,c)dt_1 = & {} P(t_1 \le T_d \le t_1 + dt_1, t_1 \le T_c \le t_1 + dt_1) \\ = & {} P(t_1 \le T_d \le t_1 + dt_1, D(T_d)>c) \\ = & {} \left[ \int _c^\infty \tilde{f}(t_1,z)dz \right] dt_1 \end{aligned}$$

In our computer calculations we have used a simplifying approximation to the joint density of \((T_d,T_c)\), which ignores overshoot by assuming that \(D(T_d)=d\) and \(D(T_c)=c\). In this case we have, for \(d<c\),

$$\begin{aligned} f(t_1,t_2;v(t_1),v(t_2),d,c)dt_1dt_2 = & {} P(t_1 \le T_d \le t_1 + dt_1, t_2 \le T_c \le t_2 + dt_2) \\ = & {} P(t_1 \le T_d \le t_1 + dt_1)P(t_2 \le T_c \le t_2 + dt_2|T_d = t_1)\\ = & {} f(t_1; v(t_1), d)f(t_2 -t_1; v(t_2) - v(t_1),c -d)dt_1dt_2 \end{aligned}$$

1.2 Identifiability of the model

We first prove identifiability of the parameters \(c,\alpha ,\beta \) from the distribution of X. Note that we have \(X=T_c\). Thus, from (4) we have,

$$\begin{aligned} P(X>t) = P(T_c > t) = \gamma (v(t),c)/\Gamma (v(t)) \end{aligned}$$
(11)

where \(\gamma (a,c) = \int _0^c z^{a-1}e^{-z}dz\) is the lower incomplete gamma function. We first show, as a digression, that if c is unknown, then the function v(t) is not nonparametrically identifiable. This follows since, for any given \(c>0\), the right-hand side of (11) can be solved for v(t) at each fixed t. To see this, note from (11) that \(P(X>t)\) equals \(P(W<c)\) where \(W \sim \) gamma(v(t), 1). Since the gamma distribution is stochastically increasing in the shape parameter, here v(t), we may always adjust v(t) to obtain any given value of \(P(W<c)\) in (0, 1).
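The non-identifiability argument can be made concrete numerically. The sketch below, using only the Python standard library, evaluates the right-hand side of (11) by the power series of the lower incomplete gamma function and then solves for the shape value by bisection. The function names, the target survival curve and the two values of c are hypothetical choices for illustration only.

```python
import math

def reg_lower_gamma(a, c, tol=1e-15, max_iter=100000):
    """gamma(a, c) / Gamma(a), i.e. P(W < c) for W ~ gamma(a, 1), by power series."""
    if c <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while term > tol * total and n < max_iter:
        n += 1
        term *= c / (a + n)
        total += term
    return total * math.exp(a * math.log(c) - c - math.lgamma(a))

def solve_shape(p, c, lo=1e-8, hi=1e3):
    """Find v with gamma(v, c)/Gamma(v) = p by bisection (decreasing in v)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(mid, c) > p:
            lo = mid  # survival still too high: the shape must grow
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical target survival curve; any curve with values in (0, 1) would do.
target = lambda t: math.exp(-(t / 3.0) ** 2)

for t in (1.0, 2.0, 4.0):
    p = target(t)
    v_small_c = solve_shape(p, 2.0)
    v_large_c = solve_shape(p, 5.0)
    print(f"t={t}: P(X>t)={p:.4f}  v(t) for c=2: {v_small_c:.4f}  for c=5: {v_large_c:.4f}")
```

Both values of c reproduce the same survival function, each with its own v(t), which is the sense in which v(t) cannot be recovered without further structure.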

Thus suppose instead that (10) holds. Now we use a result from Temme (1975) to see that as \(a \rightarrow \infty \) we have

$$\begin{aligned} \gamma (a,c)/\Gamma (a) \sim \frac{c^a e^{-c}}{\Gamma (1+a)} \sim (2\pi a)^{-1/2} e^{a-c} \left( \frac{c}{a}\right) ^{a} . \end{aligned}$$

Here the last expression is obtained by using Stirling’s formula. Letting \(a=\alpha t^\beta \) and taking the logarithm we get for large t,

$$\begin{aligned} \log P(X>t)\sim & {} -(1/2) \log 2\pi -(1/2) \log \alpha -(1/2) \beta \log t + \alpha t^\beta - c + \alpha t^\beta \log c \nonumber \\- & {} \alpha t^\beta \log \alpha - \alpha t^\beta \beta \log t \end{aligned}$$
(12)

Suppose now there is another combination of \(c, \alpha , \beta \), denoted \(c^*,\alpha ^*,\beta ^*\), for which the same \(P(X>t)\) is obtained for all t. Letting \(t \rightarrow \infty \), the dominant term in (12) is \(-\alpha \beta t^\beta \log t\), which hence must agree asymptotically for the two parametrizations. This implies \(\beta =\beta ^*\) and then \(\alpha \beta =\alpha ^*\beta ^*\), hence also \(\alpha =\alpha ^*\). Finally, with v(t) thus determined, the right-hand side of (11) is strictly increasing in c, which implies \(c=c^*\), and we are done.
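As a numerical sanity check of the expansion (12), the following sketch compares the exact value of log P(X > t) from (11) with the right-hand side of (12) for increasing t. It relies only on the Python standard library; the function names and the parameter values alpha = 0.5, beta = 1.2, c = 5 are arbitrary assumptions made for illustration.

```python
import math

def log_survival_exact(t, alpha, beta, c, tol=1e-15, max_iter=100000):
    """log of gamma(a, c)/Gamma(a) with a = alpha * t**beta, via the power series."""
    a = alpha * t ** beta
    term = 1.0 / a
    total = term
    n = 0
    while term > tol * total and n < max_iter:
        n += 1
        term *= c / (a + n)
        total += term
    # Work on the log scale to avoid underflow for large a.
    return math.log(total) + a * math.log(c) - c - math.lgamma(a)

def log_survival_asymptotic(t, alpha, beta, c):
    """Right-hand side of (12), written term by term."""
    a = alpha * t ** beta
    return (-0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(alpha)
            - 0.5 * beta * math.log(t) + a - c + a * math.log(c)
            - a * math.log(alpha) - a * beta * math.log(t))

alpha, beta, c = 0.5, 1.2, 5.0
for t in (10.0, 100.0, 1000.0):
    exact = log_survival_exact(t, alpha, beta, c)
    approx = log_survival_asymptotic(t, alpha, beta, c)
    print(f"t={t:7.1f}  exact={exact:.4f}  asymptotic={approx:.4f}  gap={exact - approx:.5f}")
```

The gap between the two expressions shrinks as t grows, while the term in t^beta log t accounts for most of the magnitude of both.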

For identifiability of the full threshold model, it remains to show that the distribution of S conditional on \(S<c\) is identifiable when c and the parameters of the process D(t) are given. We are in fact able to show that this distribution is nonparametrically identifiable for any given c and v(t), where we for simplicity assume v(t) to be strictly increasing and continuous, with \(v(0)=0\), \(v(\infty )=\infty \). Suppose first that \(c=\infty \), so that \(T_S\) is always observed. We will show that the distribution of \(T_S\) uniquely determines the distribution of S. Now

$$\begin{aligned} P(T_S>t) = P(S>D(t)) = E [P(S>D(t)|D(t))] = E[ \bar{F}_S(W) ] \end{aligned}$$
(13)

where \(W \sim \) gamma(v(t), 1) and \(\bar{F}_S(s)=P(S>s)\). Since this is to hold for all \(t>0\), and since the family of gamma(\(\theta ,1)\) distributions is complete (Casella and Berger 2002, Chapter 6.2), it follows that \(\bar{F}_S\) is uniquely determined, and hence that the distribution of \(T_S\) uniquely determines the distribution of S.

Next, for a given \(c < \infty \) we need to show that the (observable) distribution \(P(T_S>t|S<c)\) uniquely determines the distribution \(P(S>s|S<c)\). This follows directly from the above argument which had \(c=\infty \) by considering only distributions for S with support in (0, c).

Note finally that (13) is a result of interest in itself if the distribution of the threshold S is given and one wants the distribution of \(T_S\). Paroissin and Salami (2014) consider the cases where S is, respectively, exponentially and gamma distributed.
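Relation (13) is straightforward to evaluate by simulation. The sketch below estimates E[F̄_S(W)] by Monte Carlo for a given value of v(t) and, as an assumed example, takes S to be exponentially distributed with rate `lam`. In that case the expectation is the Laplace transform of the gamma(v(t), 1) distribution, (1 + lam)^(-v(t)), which serves as a check. The function names and numerical values are hypothetical.

```python
import math
import random

def surv_ts_monte_carlo(v_t, sf_s, n, rng):
    """Monte Carlo estimate of P(T_S > t) = E[sf_s(W)] with W ~ gamma(v_t, 1)."""
    total = 0.0
    for _ in range(n):
        total += sf_s(rng.gammavariate(v_t, 1.0))
    return total / n

def surv_ts_exponential(v_t, lam):
    """Exact P(T_S > t) when S ~ exponential(rate lam): E[exp(-lam * W)]."""
    return (1.0 + lam) ** (-v_t)

rng = random.Random(13)
lam = 0.5                           # assumed rate of the exponential threshold
sf_s = lambda s: math.exp(-lam * s)  # survival function of S
for v_t in (0.5, 2.0, 5.0):
    mc = surv_ts_monte_carlo(v_t, sf_s, 50000, rng)
    print(f"v(t)={v_t}: Monte Carlo {mc:.4f}, closed form {surv_ts_exponential(v_t, lam):.4f}")
```

Any other survival function for S can be passed as `sf_s`; only the closed-form comparison is specific to the exponential case.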

1.3 Nonparametric estimation of crude quantities

Consider competing risks with latent variables X and Z. Suppose that n units are observed, either until (independent) right censoring or until time \(T = \min (X,Z)\), whichever comes first. Let \(t_{1}< \cdots < t_{k}\) be the sorted event times, i.e., the observations of T. Let further \(\hat{S}(\cdot )\) be the Kaplan–Meier estimator of the survival function of T. Then the so-called Aalen–Johansen estimators of the sub-distribution functions are (Borgan 1998):

$$\begin{aligned} \hat{F}^{*}_{X}(t)= & {} \sum _{i; t_i \le t} \hat{S}(t_i^-)\frac{\delta _{iX}}{n_i},\nonumber \\ \hat{F}^{*}_{Z}(t)= & {} \sum _{i; t_i \le t} \hat{S}(t_i^-)\frac{\delta _{iZ}}{n_i}. \end{aligned}$$
(14)

Here \(n_i\) is the number at risk at time \(t_{i}\) while \(\delta _{iX} = 1\) (\(\delta _{iZ} = 1\)) if the observation at time \(t_i\) is an X (Z). The natural estimates of the conditional sub-distribution functions \(\tilde{F}_X(t) \) and \(\tilde{F}_Z(t)\) are hence

$$\begin{aligned} \hat{\tilde{F}}_X(t) = \frac{\hat{F}_X^{*}(t)}{\hat{F}_X^{*}(\infty )} \quad \text {and} \quad \hat{\tilde{F}}_Z(t) = \frac{\hat{F}_Z^{*}(t)}{\hat{F}_Z^{*}(\infty )}. \end{aligned}$$
(15)

With the same notation we have nonparametric estimates of the cumulative cause-specific hazard functions for the two risks given by

$$\begin{aligned} \hat{\Lambda }^{*}_{X}(t)= & {} \sum _{i; t_i \le t}\frac{\delta _{iX}}{n_i},\nonumber \\ \hat{\Lambda }^{*}_{Z}(t)= & {} \sum _{i; t_i \le t}\frac{\delta _{iZ}}{n_i}. \end{aligned}$$
(16)
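The estimators (14)–(16) can be sketched in a few lines of Python. The version below is a minimal illustration, not the authors' code: it assumes distinct observation times, codes censored observations with cause `None`, and weights each event by the value of the Kaplan–Meier estimator just before the event time, which is the usual convention for the Aalen–Johansen estimator. The function name and the toy data are hypothetical.

```python
def crude_estimates(data):
    """Aalen-Johansen and cumulative cause-specific hazard estimates.

    data: iterable of (time, cause) with cause "X", "Z", or None (censored).
    Returns one row (time, F_X, F_Z, Lambda_X, Lambda_Z) per event time.
    Assumes that all observation times are distinct.
    """
    obs = sorted(data, key=lambda row: row[0])
    n_at_risk = len(obs)
    surv_prev = 1.0  # Kaplan-Meier estimate just before the current time
    sub_dist = {"X": 0.0, "Z": 0.0}
    cum_haz = {"X": 0.0, "Z": 0.0}
    rows = []
    for time, cause in obs:
        if cause is not None:
            sub_dist[cause] += surv_prev / n_at_risk   # increment as in (14)
            cum_haz[cause] += 1.0 / n_at_risk          # increment as in (16)
            surv_prev *= 1.0 - 1.0 / n_at_risk         # Kaplan-Meier update
            rows.append((time, sub_dist["X"], sub_dist["Z"],
                         cum_haz["X"], cum_haz["Z"]))
        n_at_risk -= 1  # the unit leaves the risk set, event or censoring
    return rows

# Toy data: five units, one of them censored at time 3.
toy = [(1.0, "X"), (2.0, "Z"), (3.0, None), (4.0, "X"), (5.0, "Z")]
rows = crude_estimates(toy)
f_x_inf, f_z_inf = rows[-1][1], rows[-1][2]
for time, f_x, f_z, l_x, l_z in rows:
    # Conditional sub-distribution estimates as in (15).
    print(time, f_x / f_x_inf, f_z / f_z_inf, l_x, l_z)
```

With this convention the two sub-distribution estimates add up to one minus the Kaplan–Meier estimate at every time point, which is a convenient consistency check.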

Cite this article

Sildnes, B., Lindqvist, B.H. Modeling of semi-competing risks by means of first passage times of a stochastic process. Lifetime Data Anal 24, 153–175 (2018). https://doi.org/10.1007/s10985-017-9399-y

  • DOI: https://doi.org/10.1007/s10985-017-9399-y
