Modeling of semi-competing risks by means of first passage times of a stochastic process

Sildnes, Beate; Lindqvist, Bo Henry

doi:10.1007/s10985-017-9399-y

Published: 22 July 2017

Lifetime Data Analysis, volume 24, pages 153–175 (2018)

Beate Sildnes & Bo Henry Lindqvist

Abstract

In semi-competing risks one considers a terminal event, such as death of a person, and a non-terminal event, such as disease recurrence. We present a model where the time to the terminal event is the first passage time to a fixed level c in a stochastic process, while the time to the non-terminal event is represented by the first passage time of the same process to a stochastic threshold S, assumed to be independent of the stochastic process. In order to be explicit, we let the stochastic process be a gamma process, but other processes with independent increments may alternatively be used. For semi-competing risks this appears to be a new modeling approach, being an alternative to traditional approaches based on illness-death models and copula models. In this paper we consider a fully parametric approach. The likelihood function is derived and statistical inference in the model is illustrated on both simulated and real data.

References

Aalen OO, Gjessing HK (2001) Understanding the shape of the hazard rate: a process point of view. Stat Sci 16(1):1–22
Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer Science & Business Media, New York
Bagdonavicius V, Nikulin M (2001) Estimation in degradation models with explanatory variables. Lifetime Data Anal 7:85–103
Borgan O (1998) Aalen-Johansen estimator. In: Armitage P, Colton T (eds) Encyclopedia of biostatistics, vol 1. Wiley, Chichester, pp 5–10
Casella G, Berger R (2002) Statistical inference, 2nd edn. Duxbury, Pacific Grove
Christen JA, Ruggeri F, Villa E (2011) Utility based maintenance analysis using a random sign censoring model. Reliab Eng Syst Safe 96(3):425–431
Clayton DG (1978) A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65:141–151
Cooke RM (1993) The total time on test statistic and age-dependent censoring. Stat Probab Lett 18:307–312
Cooke RM, Bedford T (2002) Reliability databases in perspective. IEEE Trans Reliab 51(3):294–310
Copelan EA, Biggs JC, Thompson JM, Crilley P, Szer J, Klein JP, Kapoor N, Avalos BR, Cunningham I, Atkinson K, Downs K, Harmon GS, Daly MB, Brodsky I, Bulova SI, Tutschka PJ (1991) Treatment for acute myelocytic leukemia with allogeneic bone marrow transplantation following preparation with BuCy2. Blood 78:838–843
Esary JD, Proschan F, Walkup DW et al (1967) Association of random variables, with applications. Annal Math Stat 38(5):1466–1474
Fine JP, Jiang H, Chappell R (2001) On semi-competing risks data. Biometrika 88:907–919
Fix E, Neyman J (1951) A simple stochastic model of recovery, relapse, death and loss of patients. Human Biol 23:205–241
Horrocks J, Thompson ME (2004) Modeling event times with multiple outcomes using the Wiener process with drift. Lifetime Data Anal 10:29–49
Hsieh JJ, Wang W, Ding AA (2008) Regression analysis based on semicompeting risks data. J R Stat Soc Ser B 70:3–20
Kahle W, Mercier S, Paroissin C (2016) Degradation processes in reliability. Wiley, Hoboken
Klein JP, Moeschberger ML (1997) Survival analysis: techniques for censored and truncated data, 1st edn. Springer Science+Business Media, New York
Lawless J, Crowder M (2004) Covariates and random effects in a gamma process model with application to degradation and failure. Lifetime Data Anal 10:213–227
Lee MLT, Whitmore G (2006) Threshold regression for survival analysis: modeling event times by a stochastic process reaching a boundary. Stat Sci 21(4):501–513
Lindqvist BH (1988) Association of probability measures on partially ordered spaces. J Multivar Anal 26(2):111–132
Lindqvist BH, Skogsrud G (2008) Modeling of dependent competing risks by first passage times of Wiener processes. IIE Trans 41(1):72–80
Lindqvist BH, Støve B, Langseth H (2006) Modelling of dependence between critical failure and preventive maintenance: the repair alert model. J Stat Plan Inference 136(5):1701–1717
Meira-Machado L, de Uña-Álvarez J, Cadarso-Suárez C (2006) Nonparametric estimation of transition probabilities in a non-Markov illness-death model. Lifetime Data Anal 12(3):325–344
Park C, Padgett W (2005) Accelerated degradation models for failure based on geometric Brownian motion and gamma processes. Lifetime Data Anal 11:511–527
Paroissin C, Salami A (2014) Failure time of non homogeneous gamma process. Commun Stat Theory Methods 43(15):3148–3161
Peng L, Fine JP (2006) Regression modeling of semicompeting risks data. Biometrics 63:96–108
Putter H, Fiocco M, Geskus R (2007) Tutorial in biostatistics: competing risks and multi-state models. Stat Med 26(11):2389–2430
Temme N (1975) Uniform asymptotic expansions of the incomplete gamma functions and the incomplete beta function. Math Comput 29(132):1109–1114
van Noortwijk J (2009) A survey of the application of gamma processes in maintenance. Reliab Eng Syst Safe 94:2–21
Varadhan R, Xue QL, Bandeen-Roche K (2014) Semicompeting risks in ageing research: methods, issues and needs. Lifetime Data Anal 20:538–562
Whitmore GA (1986) First-passage-time models for duration data: regression structures and competing risks. J R Stat Soc Ser D (The Statistician) 35(2):207–219
Xu J, Kalbfleisch JD, Tai B (2010) Statistical analysis of illness-death processes and semicompeting risks data. Biometrics 66(3):716–725

Acknowledgements

The authors are grateful for valuable comments from two reviewers and an associate editor. In particular we thank the reviewer who pointed out that overshoot of the threshold needs to be taken into account for the gamma process.

Author information

Beate Sildnes
Present address: BearingPoint, Tjuvholmen allé 3, 0252, Oslo, Norway

Authors and Affiliations

Department of Mathematical Sciences, Norwegian University of Science and Technology, 7491, Trondheim, Norway
Beate Sildnes & Bo Henry Lindqvist

Corresponding author

Correspondence to Bo Henry Lindqvist.

Appendix

1.1 Joint density of $(T_d,T_c)$ for $d<c$

We will first calculate the joint density of $(T_d,D(T_d))$. This is done by integration of the joint density of $(T_d,D(T_d^-),D(T_d))$, which is given in Kahle et al. (2016, Theorem 2.37). The result is, in our notation,

$$\begin{aligned} \tilde{f}(t,z)= & {} \frac{e^{-z}v'(t)}{\Gamma (v(t))} \int _0^d \frac{x^{v(t)-1}}{z-x} dx \\= & {} \frac{e^{-z}v'(t)}{\Gamma (v(t))} \cdot \frac{d^{v(t)}\left( v(t)d \; _2F_1(1,v(t)+1;v(t)+2;d/z) + v(t)z+z \right) }{v(t)(v(t)+1)z^2} \end{aligned}$$
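The hypergeometric closed form of the inner integral can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the series used for $_2F_1(1,b;b+1;x)$ is the standard power series, which converges slowly when $d/z$ is close to 1.

```python
import math

def hyp2f1_1_b(b, x, terms=400):
    """Power series for 2F1(1, b; b+1; x) = sum_k b/(b+k) * x**k, for 0 <= x < 1."""
    return sum(b / (b + k) * x**k for k in range(terms))

def inner_integral_closed(v, d, z):
    """Closed form of int_0^d x**(v-1)/(z-x) dx as displayed above (requires z > d)."""
    hyp = hyp2f1_1_b(v + 1.0, d / z)
    return d**v * (v * d * hyp + v * z + z) / (v * (v + 1.0) * z**2)

def inner_integral_numeric(v, d, z, n=20000):
    """Midpoint rule after substituting x = d * u**(1/v), so the integrand is bounded."""
    h = 1.0 / n
    total = sum(1.0 / (z - d * ((i + 0.5) * h) ** (1.0 / v)) for i in range(n))
    return d**v / v * total * h

if __name__ == "__main__":
    # Illustrative values only: v stands for v(t) at some fixed t.
    for v in (0.7, 2.0, 5.0):
        print(v, inner_integral_closed(v, 1.0, 1.5), inner_integral_numeric(v, 1.0, 1.5))
```

For $v(t)=2$, $d=1$, $z=2$ the integral can also be done by hand and equals $2\log 2 - 1$, which gives an independent check of the closed form.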

From this we get the joint density of $(T_d,T_c)$ for $d < c$,

$$\begin{aligned} & f(t_1,t_2;v(t_1),v(t_2),d,c)\,dt_1dt_2 \\&\quad = P(t_1 \le T_d \le t_1 + dt_1,\; t_2 \le T_c \le t_2 + dt_2) \\&\quad = \int _{z=d}^c P(t_1 \le T_d \le t_1 + dt_1,\; z \le D(T_d) \le z +dz,\; t_2 \le T_c \le t_2 + dt_2) \\&\quad = \left[ \int _{z=d}^c \tilde{f}(t_1,z)\, P(t_2 \le T_c \le t_2 + dt_2 \mid T_d=t_1,\, D(T_d)=z)\, dz \right] dt_1\\&\quad = \left[ \int _{z=d}^c \tilde{f}(t_1,z)\, f(t_2 -t_1; v(t_2) - v(t_1),c -z)\,dz \right] dt_1 dt_2 \end{aligned}$$

There is, however, also a possibility that the process crosses both d and c at the same time, giving a tie between $T_d$ and $T_c$. In this case, the relevant density is

$$\begin{aligned} & f(t_1,t_1;v(t_1),v(t_1),d,c)\,dt_1 \\&\quad = P(t_1 \le T_d \le t_1 + dt_1,\; t_1 \le T_c \le t_1 + dt_1) \\&\quad = P(t_1 \le T_d \le t_1 + dt_1,\; D(T_d)>c) \\&\quad = \left[ \int _c^\infty \tilde{f}(t_1,z)\,dz \right] dt_1 \end{aligned}$$
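Overshoot and ties can be illustrated by simulating the gamma process on a time grid. This is a rough sketch under stated assumptions, not the authors' code: the shape function is taken as $v(t)=\alpha t^\beta$, the scale is 1, and the grid step `dt`, the horizon `t_max` and all function names are hypothetical. Because crossings are only detected at grid points, the simulated tie frequency somewhat overstates the continuous-time one.

```python
import random

def simulate_passages(alpha, beta, d, c, t_max=20.0, dt=0.01, rng=None):
    """Simulate one gamma-process path on a grid and return
    (T_d, T_c, D(T_d)); T_c is None if level c is not reached before t_max."""
    rng = rng or random.Random()
    level, t = 0.0, 0.0
    t_d = level_at_t_d = None
    while t < t_max:
        # Independent increment: gamma with shape v(t + dt) - v(t) and scale 1.
        shape = alpha * ((t + dt) ** beta - t ** beta)
        level += rng.gammavariate(shape, 1.0)
        t += dt
        if t_d is None and level > d:
            t_d, level_at_t_d = t, level   # first grid time above d, with overshoot
        if level > c:
            return t_d, t, level_at_t_d
    return t_d, None, level_at_t_d

if __name__ == "__main__":
    rng = random.Random(2017)
    paths = [simulate_passages(1.0, 1.0, 1.0, 2.0, rng=rng) for _ in range(500)]
    ties = sum(1 for t_d, t_c, _ in paths if t_c is not None and t_d == t_c)
    print("share of paths with T_d = T_c on the grid:", ties / len(paths))
```

Every simulated path has $D(T_d) > d$, and a path has $T_d = T_c$ exactly when the jump that carries the process past d also carries it past c.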

In our computer calculations we have used a simplifying approximation to the joint density of $(T_d,T_c)$, which ignores overshoot by assuming that $D(T_d)=d$ and $D(T_c)=c$. In this case we have, for $d<c$,

$$\begin{aligned} & f(t_1,t_2;v(t_1),v(t_2),d,c)\,dt_1dt_2 \\&\quad = P(t_1 \le T_d \le t_1 + dt_1,\; t_2 \le T_c \le t_2 + dt_2) \\&\quad = P(t_1 \le T_d \le t_1 + dt_1)\,P(t_2 \le T_c \le t_2 + dt_2 \mid T_d = t_1)\\&\quad = f(t_1; v(t_1), d)\,f(t_2 -t_1; v(t_2) - v(t_1),c -d)\,dt_1dt_2 \end{aligned}$$
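The no-overshoot approximation can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: it takes $v(t)=\alpha t^\beta$, uses the survival function $P(T_c>t)=\gamma (v(t),c)/\Gamma (v(t))$, and obtains each first-passage density by a central finite difference of that survival function. All function names and the step `h` are hypothetical.

```python
import math

def reg_lower_gamma(a, x, tol=1e-14):
    """gamma(a, x) / Gamma(a) via the power series of the lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    if a <= 0.0:
        return 1.0  # a gamma variable with shape 0 is degenerate at 0
    term = total = 1.0
    k = 0
    while term > tol * total:
        k += 1
        term *= x / (a + k)
        total += term
    return math.exp(a * math.log(x) - x - math.lgamma(a + 1.0)) * total

def passage_density(s, shape_increment, level, h=1e-5):
    """Density at elapsed time s > h of the first passage of `level`, where
    shape_increment(s) is the growth of the shape function over elapsed time s."""
    surv = lambda u: reg_lower_gamma(shape_increment(max(u, 0.0)), level)
    return (surv(s - h) - surv(s + h)) / (2.0 * h)

def joint_density_approx(t1, t2, alpha, beta, d, c):
    """Approximate joint density of (T_d, T_c) at (t1, t2), t1 < t2, ignoring overshoot."""
    v = lambda t: alpha * t**beta
    f1 = passage_density(t1, v, d)                                  # T_d at t1
    f2 = passage_density(t2 - t1, lambda s: v(t1 + s) - v(t1), c - d)  # restart at level d
    return f1 * f2

if __name__ == "__main__":
    print(joint_density_approx(1.0, 2.5, alpha=1.0, beta=1.0, d=1.0, c=2.0))
```

The second factor treats the process as restarting from level d at time $t_1$, which is exactly where the approximation departs from the exact density with its integral over the overshoot value z.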

1.2 Identifiability of the model

We first prove identifiability of the parameters $c,\alpha ,\beta $ from the distribution of X. Note that we have $X=T_c$. Thus, from (4) we have,

$$\begin{aligned} P(X>t) = P(T_c > t) = \gamma (v(t),c)/\Gamma (v(t)) \end{aligned}$$

(11)

where $\gamma (a,c) = \int _0^c z^{a-1}e^{-z}dz$ is the lower incomplete gamma function. We first show, as a digression, that if c is unknown, then the function v(t) is not nonparametrically identifiable. This follows since in the right-hand side of (11) we may, for any given $c>0$, solve for v(t) for each fixed t. To see this, note from (11) that $P(X>t)$ equals $P(W<c)$ where $W \sim \text{gamma}(v(t), 1)$. Since the gamma distribution is stochastically increasing in the shape parameter, here v(t), we may always adjust v(t) to obtain a given value of $P(W<c)$.
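The digression can be made concrete: for any target survival probability and any level c there is a shape value that reproduces it. The sketch below, with hypothetical function names and a plain bisection, finds that shape; it relies only on the monotonicity in the shape parameter noted above.

```python
import math

def reg_lower_gamma(a, x, tol=1e-14):
    """gamma(a, x) / Gamma(a), i.e. P(W <= x) for W ~ gamma(a, 1), by power series."""
    if x <= 0.0:
        return 0.0
    if a <= 0.0:
        return 1.0
    term = total = 1.0
    k = 0
    while term > tol * total:
        k += 1
        term *= x / (a + k)
        total += term
    return math.exp(a * math.log(x) - x - math.lgamma(a + 1.0)) * total

def shape_matching_survival(p, c, lo=1e-9, hi=1e3, iters=200):
    """Find the shape a with gamma(a, c)/Gamma(a) = p, for 0 < p < 1.
    The left side decreases from 1 to 0 as a grows, so bisection applies."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(mid, c) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Illustrative "true" model: v(t) = t**1.5 with c = 1; alternative level c* = 2.
    for t in (0.5, 1.0, 2.0):
        p = reg_lower_gamma(t**1.5, 1.0)
        print(t, t**1.5, shape_matching_survival(p, 2.0))
```

The two columns printed for each t are different shape functions that give the same $P(X>t)$ under different values of c, which is the non-identifiability claimed in the text.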

Thus suppose instead that (10) holds. Now we use a result from Temme (1975) to see that as $a \rightarrow \infty $ we have

$$\begin{aligned} \gamma (a,c)/\Gamma (a) \sim \frac{c^a e^{-c}}{\Gamma (1+a)} \sim (2\pi a)^{-1/2} e^{a-c} \left( \frac{c}{a}\right) ^{a} . \end{aligned}$$

Here the last expression is obtained by using Stirling’s formula. Letting $a=\alpha t^\beta $ and taking the logarithm we get for large t,

$$\begin{aligned} \log P(X>t)\sim & {} -(1/2) \log 2\pi -(1/2) \log \alpha -(1/2) \beta \log t + \alpha t^\beta - c + \alpha t^\beta \log c \nonumber \\- & {} \alpha t^\beta \log \alpha - \alpha t^\beta \beta \log t \end{aligned}$$

(12)
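As a sanity check on (12), the exact value of $\log P(X>t)$ from (11) can be compared with the expansion for increasing t. This is a small sketch with hypothetical function names and illustrative parameter values; the exact value is computed on the log scale to avoid underflow.

```python
import math

def log_survival_exact(t, alpha, beta, c):
    """log of gamma(a, c)/Gamma(a) with a = alpha * t**beta, via the power series."""
    a = alpha * t**beta
    term = total = 1.0
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= c / (a + k)
        total += term
    return a * math.log(c) - c - math.lgamma(a + 1.0) + math.log(total)

def log_survival_asymptotic(t, alpha, beta, c):
    """Right-hand side of (12)."""
    a = alpha * t**beta
    return (-0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(alpha)
            - 0.5 * beta * math.log(t) + a - c + a * math.log(c)
            - a * math.log(alpha) - a * beta * math.log(t))

if __name__ == "__main__":
    for t in (1.0, 10.0, 100.0):
        exact = log_survival_exact(t, 1.0, 1.5, 2.0)
        approx = log_survival_asymptotic(t, 1.0, 1.5, 2.0)
        print(t, exact, approx, exact - approx)
```

The difference between the two columns shrinks towards zero as t grows, consistent with the asymptotic equivalence used in the identifiability argument.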

Suppose now there is another combination of $c, \alpha , \beta $, denoted $c^*,\alpha ^*,\beta ^*$, for which the same $P(X>t)$ is obtained for all t. Letting $t \rightarrow \infty $, the dominant term in (12) is $-\alpha \beta t^\beta \log t$, which must therefore agree for the two parametrizations. This implies $\beta =\beta ^*$ and hence also $\alpha =\alpha ^*$. Finally, since for fixed v(t) the right-hand side of (11) is strictly increasing in c, it follows that $c=c^*$, and we are done.

For identifiability of the full threshold model, it remains to show that the distribution of S conditional on $S<c$ is identifiable when c and the parameters of the process D(t) are given. We are in fact able to show that this distribution is nonparametrically identifiable for any given v(t) and c, where v(t) is for simplicity assumed to be strictly increasing and continuous, with $v(0)=0$ and $v(\infty )=\infty $. Suppose first that $c=\infty $, so that $T_S$ is always observed. We will show that the distribution of $T_S$ uniquely determines the distribution of S. Now

$$\begin{aligned} P(T_S>t) = P(S>D(t)) = E [P(S>D(t)|D(t))] = E[ \bar{F}_S(W) ] \end{aligned}$$

(13)

where $W \sim \text{gamma}(v(t), 1)$ and $\bar{F}_S(s)=P(S>s)$. Since this is to hold for all $t>0$, and since the family of gamma$(\theta ,1)$ distributions, $\theta >0$, is a complete family, it follows that $\bar{F}_S$ is uniquely determined (Casella and Berger 2002, Chapter 6.2). Hence the distribution of $T_S$ uniquely determines the distribution of S.

Next, for a given $c < \infty $ we need to show that the (observable) distribution $P(T_S>t|S<c)$ uniquely determines the distribution $P(S>s|S<c)$. This follows directly from the above argument which had $c=\infty $ by considering only distributions for S with support in (0, c).

Note finally that (13) is a result of interest in itself if the distribution of the threshold S is given and one wants the distribution of $T_S$. Paroissin and Salami (2014) consider the cases where S is, respectively, exponentially and gamma distributed.
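For an exponentially distributed threshold, (13) has a simple closed form that can be checked by simulation. The sketch below assumes S is exponential with rate $\lambda$, so that $\bar{F}_S(s)=e^{-\lambda s}$ and $E[e^{-\lambda W}]=(1+\lambda )^{-v(t)}$ by the Laplace transform of the gamma distribution. The function names, the rate and the sample size are hypothetical choices for illustration.

```python
import math
import random

def survival_T_S_monte_carlo(v_t, surv_S, n=100_000, seed=0):
    """Estimate P(T_S > t) = E[ F_S_bar(W) ] with W ~ gamma(v(t), 1), as in (13)."""
    rng = random.Random(seed)
    return sum(surv_S(rng.gammavariate(v_t, 1.0)) for _ in range(n)) / n

def survival_T_S_exponential(v_t, rate):
    """Closed form of (13) when S is exponential with the given rate."""
    return (1.0 + rate) ** (-v_t)

if __name__ == "__main__":
    rate, v_t = 0.7, 2.0   # v_t stands for v(t) at some fixed t
    estimate = survival_T_S_monte_carlo(v_t, lambda s: math.exp(-rate * s))
    print(estimate, survival_T_S_exponential(v_t, rate))
```

Replacing `surv_S` by another survival function gives a Monte Carlo value of $P(T_S>t)$ for thresholds without a closed form.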

1.3 Nonparametric estimation of crude quantities

Consider competing risks with latent variables X and Z. Suppose that n units are observed, either until (independent) right censoring or until time $T = \min (X,Z)$, whichever comes first. Let $t_{1}< \cdots < t_{k}$ be the sorted event times, i.e., the observations of T. Let further $\hat{S}(\cdot )$ be the Kaplan–Meier estimator of the survival function of T. Then the so-called Aalen–Johansen estimators of the sub-distribution functions are (Borgan 1998):

$$\begin{aligned} \hat{F}^{*}_{X}(t)= & {} \sum _{i; t_i \le t} \hat{S}(t_i)\frac{\delta _{iX}}{n_i},\nonumber \\ \hat{F}^{*}_{Z}(t)= & {} \sum _{i; t_i \le t} \hat{S}(t_i)\frac{\delta _{iZ}}{n_i}. \end{aligned}$$

(14)

Here $n_i$ is the number at risk at time $t_{i}$, while $\delta _{iX} = 1$ ($\delta _{iZ} = 1$) if the observation at time $t_i$ is an X (Z), and 0 otherwise. The natural estimates of the conditional sub-distribution functions $\tilde{F}_X(t) $ and $\tilde{F}_Z(t)$ are hence

$$\begin{aligned} \hat{\tilde{F}}_X(t) = \frac{\hat{F}_X^{*}(t)}{\hat{F}_X^{*}(\infty )} \quad \text {and} \quad \hat{\tilde{F}}_Z(t) = \frac{\hat{F}_Z^{*}(t)}{\hat{F}_Z^{*}(\infty )}. \end{aligned}$$

(15)

With the same notation we have nonparametric estimates of the cumulative cause-specific hazard functions for the two risks given by

$$\begin{aligned} \hat{\Lambda }^{*}_{X}(t)= & {} \sum _{i; t_i \le t}\frac{\delta _{iX}}{n_i},\nonumber \\ \hat{\Lambda }^{*}_{Z}(t)= & {} \sum _{i; t_i \le t}\frac{\delta _{iZ}}{n_i}. \end{aligned}$$

(16)
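The estimators (14)–(16) can be computed in a single pass over the sorted observations. The following is a minimal sketch, not the authors' code. It assumes distinct observation times, and it evaluates the Kaplan–Meier estimate just before each event time, which is the usual convention for the Aalen–Johansen estimator and makes $\hat{F}^{*}_{X}+\hat{F}^{*}_{Z}=1-\hat{S}$; this reading of $\hat{S}(t_i)$ in (14) is an assumption. The function names and the toy data are hypothetical.

```python
def crude_estimates(times, causes):
    """times: observed times (assumed distinct); causes: 'X', 'Z', or None if censored.
    Returns one tuple (t_i, F_X, F_Z, Lambda_X, Lambda_Z, S) per event time."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n_at_risk = len(times)
    surv = 1.0                       # Kaplan-Meier value just before the current time
    F = {"X": 0.0, "Z": 0.0}         # sub-distribution functions, eq. (14)
    L = {"X": 0.0, "Z": 0.0}         # cumulative cause-specific hazards, eq. (16)
    steps = []
    for i in order:
        cause = causes[i]
        if cause is not None:
            F[cause] += surv / n_at_risk
            L[cause] += 1.0 / n_at_risk
            surv *= 1.0 - 1.0 / n_at_risk
            steps.append((times[i], F["X"], F["Z"], L["X"], L["Z"], surv))
        n_at_risk -= 1               # events and censorings both leave the risk set
    return steps

def conditional(steps, index):
    """Eq. (15): divide a sub-distribution function (index 1 for X, 2 for Z) by its
    final value; assumes at least one event of that type was observed."""
    final = steps[-1][index]
    return [(s[0], s[index] / final) for s in steps]

if __name__ == "__main__":
    steps = crude_estimates([1, 2, 3, 4, 5], ["X", "Z", None, "X", "Z"])
    for row in steps:
        print(row)
    print(conditional(steps, 1))
```

With tied event times the increments at a common time would need to be pooled before updating the Kaplan–Meier factor; the sketch leaves that case out.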

About this article

Cite this article

Sildnes, B., Lindqvist, B.H. Modeling of semi-competing risks by means of first passage times of a stochastic process. Lifetime Data Anal 24, 153–175 (2018). https://doi.org/10.1007/s10985-017-9399-y

Received: 03 June 2016
Accepted: 14 July 2017
Published: 22 July 2017
Issue Date: January 2018
DOI: https://doi.org/10.1007/s10985-017-9399-y
