Abstract
Ridder and Woutersen (Ridder, G., and T. Woutersen. 2003. “The Singularity of the Efficiency Bound of the Mixed Proportional Hazard Model.” Econometrica 71: 1579–1589) have shown that under a weak condition on the baseline hazard, there exist root-N consistent estimators of the parameters in a semiparametric Mixed Proportional Hazard model with a parametric baseline hazard and unspecified distribution of the unobserved heterogeneity. We extend the linear rank estimator (LRE) of Tsiatis (Tsiatis, A. A. 1990. “Estimating Regression Parameters using Linear Rank Tests for Censored Data.” Annals of Statistics 18: 354–372) and Robins and Tsiatis (Robins, J. M., and A. A. Tsiatis. 1992. “Semiparametric Estimation of an Accelerated Failure Time Model with Time-Dependent Covariates.” Biometrika 79: 311–319) to this class of models. The optimal LRE is a two-step estimator. We propose a simple one-step estimator that is close to optimal if there is no unobserved heterogeneity. The efficiency gain associated with the optimal LRE increases with the degree of unobserved heterogeneity.
We thank Kei Hirano and Nicole Lott for very helpful comments. We also thank seminar participants at the University of Western Ontario, and the Netherlands Interdisciplinary Demographic Institute. This paper replaces the paper Method of Moments Estimation of Duration Models with Exogenous Regressors (2003). Financial support from NORFACE research programme on Migration in Europe – Social, Economic, Cultural and Policy Dynamics is gratefully acknowledged.
- 1
Horowitz (2001, theorem 2.2) averages gn (Xi); the STATA program on our website is sufficiently fast to apply the bootstrap to most survey datasets.
- 2
Brent’s method combines the bisection method, the secant method and inverse quadratic interpolation. The idea is to use the secant method or inverse quadratic interpolation if possible, because they converge faster, but to fall back to the more robust bisection method if necessary. The secant method can be thought of as a finite difference approximation of the Newton-Raphson method. The Powell method extends the Brent method by searching in a specific direction, rather than changing one parameter at a time.
- 3
See http://publ.nidi.nl/output/other/LRE.zip for the program and http://publ.nidi.nl/output/other/LRE_help.pdf for the help file.
- 4
In the MLE for models with duration dependence, we do not need the standard identification restriction that the unobserved heterogeneity term has mean one because the baseline hazard is normalized to be equal to 1 in the first interval.
- 5
The Gâteaux derivative is a directional derivative: for a point x, a direction a, and η>0, df(x, a)=limη↓0[{f(x+aη)–f(x)}/η].
- 6
Our calculations were done in Gauss 6.0 on 3 parallel computers: a Pentium 2.1 PC, a Pentium 2.8 PC and a Pentium 2.0 laptop. The calculations took about 9 weeks of CPU time.
- 7
The LRE with a duration dependence on 10 intervals for a sample size of 500 did not converge in seven of the experiments. The average is therefore based on 93 experiments instead of 100.
- 8
- 9
The Doob-Meyer decomposition theorem is a theorem in stochastic calculus stating the conditions under which a submartingale may be decomposed in a unique way as the sum of a martingale and a continuous increasing process, see Meyer (1963) and Protter (2005).
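The safeguarded root search described in note 2 can be sketched in a few lines. The Python function below is a simplified illustration under stated assumptions, not the full Brent algorithm and not the authors' own program: it tries a secant step through the bracket endpoints and falls back to bisection whenever that step fails to halve the bracket, and it omits inverse quadratic interpolation. The name `safeguarded_root` and the cubic test function are hypothetical.

```python
def safeguarded_root(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0.0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")

    def shrink(x, lo, f_lo, hi, f_hi):
        # Keep the part of the bracket on which f still changes sign.
        f_x = f(x)
        if f_x == 0.0:
            return x, f_x, x, f_x
        if f_lo * f_x < 0.0:
            return lo, f_lo, x, f_x
        return x, f_x, hi, f_hi

    for _ in range(max_iter):
        width = hi - lo
        if width < tol:
            break
        # Fast step: secant line through the two bracket endpoints.
        x = hi - f_hi * (hi - lo) / (f_hi - f_lo)
        lo, f_lo, hi, f_hi = shrink(x, lo, f_lo, hi, f_hi)
        # Robust fallback: bisect if the secant step did not halve the bracket.
        if hi - lo > 0.5 * width:
            lo, f_lo, hi, f_hi = shrink(0.5 * (lo + hi), lo, f_lo, hi, f_hi)
    return 0.5 * (lo + hi)


# Hypothetical example: the real root of x^3 - 2x - 5 lies between 2 and 3.
print(safeguarded_root(lambda x: x**3 - 2.0 * x - 5.0, 2.0, 3.0))
```

Because every pass through the loop at least halves the bracket, the sketch keeps the worst-case guarantee of bisection while still benefiting from the faster secant step when the function is close to linear.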
Appendix A: Additional tables
| Estimation method | Parameter | Sample size 500 | Sample size 1000 | Sample size 5000 |
|---|---|---|---|---|
| MLE no hetero | α2 | –0.0480* | –0.0319* | –0.0095* |
| | | (0.0150) | (0.0103) | (0.0042) |
| | α3 | –0.0082 | –0.0127 | –0.0094* |
| | | (0.0132) | (0.0088) | (0.0041) |
| | α4 | –0.0149 | –0.0102 | –0.0079 |
| | | (0.0127) | (0.0089) | (0.0046) |
| MLE 2 points | α2 | 0.0282 | 0.0257 | 0.0140* |
| | | (0.0194) | (0.0158) | (0.0053) |
| | α3 | 0.1131* | 0.0713* | 0.0257* |
| | | (0.0237) | (0.0175) | (0.0064) |
| | α4 | 0.1480* | 0.1013* | 0.0438* |
| | | (0.0273) | (0.0213) | (0.0076) |
| NPMLE | α2 | 0.0785* | 0.0495* | 0.0211* |
| | | (0.0210) | (0.0152) | (0.0050) |
| | α3 | 0.2011* | 0.1027* | 0.0389* |
| | | (0.0275) | (0.0183) | (0.0059) |
| | α4 | 0.2835* | 0.1782* | 0.0612* |
| | | (0.0339) | (0.0228) | (0.0079) |
| LRE | α2 | –0.0333 | –0.0234 | –0.0074 |
| | | (0.0230) | (0.0184) | (0.0066) |
| | α3 | 0.0391 | 0.0158 | –0.0087 |
| | | (0.0306) | (0.0224) | (0.0093) |
| | α4 | 0.0536 | 0.0264 | –0.0109 |
| | | (0.0383) | (0.0287) | (0.0128) |
*p<0.05
| Parameter | Sample size 500 | Sample size 1000 | Sample size 5000 | Sample size 500 | Sample size 1000 | Sample size 5000 |
|---|---|---|---|---|---|---|
| | MLE no hetero | | | MLE 2 points | | |
| α2 | –0.0240 | –0.0098 | 0.0068 | 0.0704* | 0.0498* | 0.0464* |
| | (0.0216) | (0.0153) | (0.0063) | (0.0230) | (0.0176) | (0.0080) |
| α3 | –0.0162 | –0.0089 | –0.0090 | 0.1096* | 0.0740* | 0.0420* |
| | (0.0241) | (0.0157) | (0.0061) | (0.0283) | (0.0195) | (0.0086) |
| α4 | –0.0609* | –0.0378* | –0.0069 | 0.0958* | 0.0627* | 0.0590* |
| | (0.0207) | (0.0135) | (0.0054) | (0.0273) | (0.0204) | (0.0098) |
| α5 | 0.0073 | –0.0035 | –0.0115 | 0.1991* | 0.1229* | 0.0690* |
| | (0.0206) | (0.0144) | (0.0069) | (0.0305) | (0.0231) | (0.0117) |
| α6 | –0.0097 | –0.0024 | –0.0059 | 0.1986* | 0.1348* | 0.0766* |
| | (0.0207) | (0.0127) | (0.0067) | (0.0340) | (0.0226) | (0.0123) |
| α7 | –0.0593* | –0.0464* | –0.0074 | 0.1617* | 0.0971* | 0.0823* |
| | (0.0226) | (0.0154) | (0.0072) | (0.0364) | (0.0269) | (0.0135) |
| α8 | –0.0144 | –0.0130 | –0.0023 | 0.2161* | 0.1491* | 0.0963* |
| | (0.0204) | (0.0151) | (0.0070) | (0.0360) | (0.0277) | (0.0141) |
| α9 | –0.0209 | –0.0076 | –0.0120 | 0.2309* | 0.1616* | 0.0964* |
| | (0.0243) | (0.0149) | (0.0075) | (0.0388) | (0.0284) | (0.0137) |
| α10 | –0.0383 | –0.0217 | –0.0078 | 0.2324* | 0.1658* | 0.1068* |
| | (0.0206) | (0.0153) | (0.0071) | (0.0379) | (0.0287) | (0.0154) |
| | NPMLE | | | LRE | | |
| α2 | 0.1790* | 0.1157* | 0.0703* | –0.0648* | –0.0460* | 0.0088 |
| | (0.0267) | (0.0184) | (0.0088) | (0.0298) | (0.0221) | (0.0106) |
| α3 | 0.3039* | 0.1880* | 0.0871* | –0.0784 | –0.0664* | –0.0070 |
| | (0.0397) | (0.0239) | (0.0099) | (0.0446) | (0.0315) | (0.0136) |
| α4 | 0.3730* | 0.2298* | 0.1181* | –0.1236* | –0.0942* | –0.0041 |
| | (0.0466) | (0.0298) | (0.0120) | (0.0514) | (0.0387) | (0.0166) |
| α5 | 0.5390* | 0.3248* | 0.1372* | –0.0554 | –0.0605 | –0.0093 |
| | (0.0554) | (0.0343) | (0.0146) | (0.0599) | (0.0443) | (0.0203) |
| α6 | 0.5848* | 0.3649* | 0.1573* | –0.0716 | –0.0617 | –0.0050 |
| | (0.0583) | (0.0383) | (0.0151) | (0.0646) | (0.0496) | (0.0220) |
| α7 | 0.5910* | 0.3554* | 0.1692* | –0.1230 | –0.1079* | –0.0078 |
| | (0.0646) | (0.0413) | (0.0170) | (0.0698) | (0.0530) | (0.0245) |
| α8 | 0.6916* | 0.4232* | 0.1884* | –0.0844 | –0.0792 | –0.0042 |
| | (0.0678) | (0.0429) | (0.0179) | (0.0782) | (0.0570) | (0.0258) |
| α9 | 0.7346* | 0.4594* | 0.1918* | –0.0921 | –0.0819 | –0.0157 |
| | (0.0734) | (0.0441) | (0.0191) | (0.0782) | (0.0578) | (0.0278) |
| α10 | 0.7758* | 0.48169* | 0.2123* | –0.1230 | –0.1038 | –0.0117 |
| | (0.0736) | (0.0486) | (0.0209) | (0.0803) | (0.0637) | (0.0309) |
For the sample size of 500, the results are based on 93 experiments because the estimation procedure did not converge in seven experiments. *p<0.05.
| Duration dependence | Estimation method | Parameter | Bias | Std error | RMSE |
|---|---|---|---|---|---|
| Positive duration dependence | MLE gamma | α2 | 0.0069 | 0.0096 | 0.0118 |
| | | α3 | –0.0149 | 0.0206 | 0.0255 |
| | NPMLE | α2 | 0.0205 | 0.0157 | 0.0258 |
| | | α3 | 0.0091 | 0.0283 | 0.0298 |
| | LRE | α2 | –0.0130 | 0.0200 | 0.0238 |
| | | α3 | –0.0645 | 0.0329 | 0.0724 |
| | LRE-opt | α2 | –0.0134 | 0.0195 | 0.0236 |
| | | α3 | –0.0533 | 0.0327 | 0.0625 |
| Negative duration dependence | MLE gamma | α2 | 0.0211 | 0.0111 | 0.0239 |
| | | α3 | 0.0553* | 0.0229 | 0.0598 |
| | NPMLE | α2 | 0.0345* | 0.0174 | 0.0386 |
| | | α3 | 0.1079* | 0.0310 | 0.1123 |
| | LRE | α2 | 0.0369* | 0.0179 | 0.0410 |
| | | α3 | 0.0643* | 0.0315 | 0.0716 |
| | LRE-opt | α2 | 0.0358* | 0.0178 | 0.0400 |
| | | α3 | 0.0627* | 0.0314 | 0.0701 |
| U-shaped duration dependence | MLE gamma | α2 | –0.0009 | 0.0097 | 0.0097 |
| | | α3 | –0.0338* | 0.0173 | 0.0379 |
| | NPMLE | α2 | 0.0385* | 0.0155 | 0.0416 |
| | | α3 | 0.0149 | 0.0251 | 0.0292 |
| | LRE | α2 | 0.0334 | 0.0186 | 0.0383 |
| | | α3 | –0.0215 | 0.0271 | 0.0346 |
| | LRE-opt | α2 | 0.0261 | 0.0183 | 0.0319 |
| | | α3 | –0.0247 | 0.0263 | 0.0361 |
| Inverse U duration dependence | MLE gamma | α2 | 0.0102 | 0.0104 | 0.0146 |
| | | α3 | –0.0047 | 0.0232 | 0.0237 |
| | NPMLE | α2 | 0.0232 | 0.0140 | 0.0271 |
| | | α3 | 0.0327 | 0.0295 | 0.0440 |
| | LRE | α2 | 0.0335 | 0.0183 | 0.0381 |
| | | α3 | 0.0400 | 0.0336 | 0.0522 |
| | LRE-opt | α2 | 0.0321 | 0.0182 | 0.0369 |
| | | α3 | 0.0344 | 0.0336 | 0.0481 |
For each DGP (gamma mixture) 100 simulations with 1000 observations each. *p<0.05
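The three reported columns in the table above appear to be linked by the usual decomposition RMSE² = bias² + (std error)², and the asterisks appear to mark a bias that exceeds roughly 1.96 standard errors in absolute value. Both readings are our inference from the numbers, not statements in the source, and they hold only up to rounding of the printed values. A minimal Python check, with hypothetical function names:

```python
import math


def rmse(bias, std_error):
    """Root mean squared error implied by a bias and a standard error."""
    return math.sqrt(bias**2 + std_error**2)


def significant_at_5pct(bias, std_error):
    """Two-sided normal test of zero bias at the 5% level (assumed criterion)."""
    return abs(bias / std_error) > 1.96


# Positive duration dependence, MLE gamma, alpha_2: reported RMSE 0.0118.
print(round(rmse(0.0069, 0.0096), 4))

# Negative duration dependence, MLE gamma: alpha_3 is starred, alpha_2 is not.
print(significant_at_5pct(0.0553, 0.0229), significant_at_5pct(0.0211, 0.0111))
```

The same check can be applied row by row to spot transcription errors in any of the three columns.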
| Duration dependence | Estimation method | Parameter | Bias | Std error | RMSE |
|---|---|---|---|---|---|
| Positive duration dependence | MLE gamma | α2 | 0.0010 | 0.0135 | 0.0135 |
| | | α3 | –0.0267 | 0.0269 | 0.0379 |
| | NPMLE | α2 | 0.0120 | 0.0177 | 0.0213 |
| | | α3 | –0.0204 | 0.0310 | 0.0371 |
| | LRE | α2 | –0.0148 | 0.0199 | 0.0248 |
| | | α3 | –0.0656* | 0.0329 | 0.0734 |
| | LRE-opt | α2 | –0.0138 | 0.0199 | 0.0242 |
| | | α3 | –0.0599 | 0.0328 | 0.0683 |
| Negative duration dependence | MLE gamma | α2 | 0.0347* | 0.0131 | 0.0371 |
| | | α3 | 0.0633* | 0.0277 | 0.0691 |
| | NPMLE | α2 | 0.0417* | 0.0184 | 0.0456 |
| | | α3 | 0.0898* | 0.0325 | 0.0956 |
| | LRE | α2 | 0.0378* | 0.0182 | 0.0420 |
| | | α3 | 0.0539 | 0.0329 | 0.0631 |
| | LRE-opt | α2 | 0.0375* | 0.0181 | 0.0416 |
| | | α3 | 0.0501 | 0.0327 | 0.0598 |
| U-shaped duration dependence | MLE gamma | α2 | 0.0052 | 0.0133 | 0.0143 |
| | | α3 | –0.0269 | 0.0225 | 0.0350 |
| | NPMLE | α2 | 0.0308 | 0.0173 | 0.0353 |
| | | α3 | –0.0159 | 0.0292 | 0.0333 |
| | LRE | α2 | 0.0266 | 0.0184 | 0.0323 |
| | | α3 | –0.0321 | 0.0254 | 0.0410 |
| | LRE-opt | α2 | 0.0263 | 0.0182 | 0.0320 |
| | | α3 | –0.0315 | 0.0253 | 0.0404 |
| Inverse U duration dependence | MLE gamma | α2 | 0.0137 | 0.0123 | 0.0184 |
| | | α3 | –0.0030 | 0.0263 | 0.0264 |
| | NPMLE | α2 | 0.0183 | 0.0149 | 0.0236 |
| | | α3 | 0.0283 | 0.0305 | 0.0416 |
| | LRE | α2 | 0.0340 | 0.0185 | 0.0387 |
| | | α3 | 0.0360 | 0.0335 | 0.0491 |
| | LRE-opt | α2 | 0.0313 | 0.0183 | 0.0363 |
| | | α3 | 0.0290 | 0.0333 | 0.0441 |
For each DGP (gamma mixture) 100 simulations with 1000 observations each. *p<0.05
Appendix B: Proofs and Technical Details
Technical Details Section 2: A Counting Process Approach
The counting process approach is a very useful framework for analyzing duration data since an indicator can be used to denote whether a transition happened or not. Andersen et al. (1993) have provided an excellent survey of counting processes. Less technical surveys have been given by Klein and Moeschberger (1997), Therneau and Grambsch (2000), and Aalen et al. (2009). The main advantage of this framework is that it allows us to express the duration distribution as a regression model with an error term that is a martingale difference. Regression models with martingale difference errors are the basis for inference in time series models with dependent observations. Hence, it is not surprising that inference is much simplified by using a similar representation in duration models.
To start the discussion, we first introduce some notation. A counting process {N(t)|t≥0} is a stochastic process describing the number of events in the interval [0, t] as time proceeds. The process contains only jumps of size +1. For single duration data, the event can only occur once because the units are observed until the event occurs. Therefore we introduce the observation indicator Y(t)=I(T≥t) that is equal to one if the unit is under observation at time t and zero after the event has occurred. The counting process is governed by its random intensity process, Y(t)κ(t), where κ(t) is the hazard in (2). If we consider a small interval (t–dt, t] of length dt, then Y(t)κ(t)dt is the conditional probability that the increment dN(t)=N(t)–N(t–) jumps in that interval given all that has happened until just before t. By specifying the intensity as the product of this observation indicator and the hazard rate, we effectively limit the number of occurrences of the event to one. It is essential that the observation indicator only depends on events up to time t.
Usually we do not observe T directly. Instead we observe
with g a known function and C a random vector. The most common example is right censoring, where g(T, C)=min(T, C). By defining the observation indicator as the product of the indicator I(t≤T) and, if necessary, an indicator of the observation plan, we capture when a unit is at risk for the event. In the case of right censoring Y(t)=I(t≤T)I(t≤C), and in all cases of interest we have Y(t)=I(t≤T)IA(t) with A a random set that may depend on random variables. We assume that C and T are conditionally independent given X. The history up to and including t, Yh(t), is assumed to be a left continuous function of t. The history of the whole process also includes the history of the covariate process, Xh(t), and V. Thus, we have
The sample paths of the conditioning variables should be up to t–, but because these paths are left continuous we can take them up to t. A fundamental result in the theory of counting processes, the Doob-Meyer decomposition,9 allows us to write
where M(t), t≥0 is a martingale with conditional mean and variance given by
The (conditional) mean and variance of the counting process are equal, so the disturbances in (B.2) are heteroscedastic. The probability in (B.1) is zero if the unit is no longer under observation. A counting process can be considered as a sequence of Bernoulli experiments because, if dt is small, (B.3) and (6) give the mean and variance of a Bernoulli random variable. The relation between the counting process and the sequence of Bernoulli experiments given in (B.2) can be considered as a regression model with an additive error that is a martingale difference. This equation resembles a time-series regression model. The Doob-Meyer decomposition is very helpful for deriving the distribution of the estimators because the asymptotic behavior of partial sums of martingales is well known.
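The decomposition of the counting process into a compensator and a martingale can be illustrated by simulation. The Python sketch below assumes a constant hazard and a fixed censoring time, both chosen only for illustration; it computes M(t) = N(t) minus the integral of Y(s)κ over [0, t] for each simulated unit and reports the sample mean of M(t), its sample variance, and the mean of the compensator. Under these assumptions the mean should be close to zero and the variance close to the mean compensator. All function names and parameter values are hypothetical.

```python
import random


def residual_and_compensator(t, duration, censor, hazard):
    """Return M(t) and the compensator for one unit with a constant hazard.

    N(t) counts an observed event by time t; the compensator is the integral
    of Y(s) * hazard over [0, t] with Y(s) = I(s <= T) * I(s <= C).
    """
    event_by_t = duration <= censor and duration <= t
    counting = 1.0 if event_by_t else 0.0
    compensator = hazard * min(duration, censor, t)
    return counting - compensator, compensator


def simulate(n_units=20000, hazard=0.5, censor=3.0, t=2.0, seed=0):
    """Monte Carlo mean and variance of M(t) and the mean compensator."""
    rng = random.Random(seed)
    residuals, compensators = [], []
    for _ in range(n_units):
        duration = rng.expovariate(hazard)  # exponential duration, rate = hazard
        m, a = residual_and_compensator(t, duration, censor, hazard)
        residuals.append(m)
        compensators.append(a)
    mean_m = sum(residuals) / n_units
    var_m = sum((m - mean_m) ** 2 for m in residuals) / n_units
    mean_a = sum(compensators) / n_units
    return mean_m, var_m, mean_a


print(simulate())
```

The near-zero mean of M(t) is what makes the regression interpretation work: the compensator plays the role of the regression function and the martingale increment that of the error term.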
Technical Details Section 3: Assumptions 1–4
To simplify the expressions, we use the notation hi(t, θ)=hi(t, Xh,i(t), θ).
The conditional distribution of T given X(‧) and V has hazard rate
with X(‧) a K covariate bounded stochastic process that is independent of V and such that if the probability of the event
some set S with positive measure and for some constants c1, c2, then c1=c2=0. For the baseline hazard, 0<limt↓0λ(t, α0)<∞. For the covariate process X(t), t≥0, we assume that the sample paths are piecewise constant, i.e., their derivative with respect to t is 0 almost everywhere, and left continuous. The hazard that is not conditional on V is
The observation process is Y(t), t≥0 with Y(t)=I(t≤T)I(t≤C) and we assume
The support of C is bounded.
The parameter vector θ=(β′, α′)′ is an M vector with β a K vector and α an L vector. The parameter space Θ is convex. The baseline hazard λ(t, α)>0 and is twice differentiable and the second derivative is bounded in α (in the parameter space) and t.
The weight function
is an M vector of bounded and left continuous functions. If
then there are functions μ(u, θ) (an M vector), Vβ(u, s, θ) (an M×K matrix), and Vα(u, s, θ) (an M×L matrix) such that
and
and
Define
We assume that the M×M matrix [B(θ0) A(θ0)] is nonsingular.
The restriction on the baseline hazard in Assumption A1 ensures identification (see Section 3) and guarantees that the semiparametric information bound is nonsingular (see below). Assumption A2 states that the covariates and the observation indicator are predetermined. Assumption A4 is about smoothness: if one censors all the data at u=τ+ψ, then the expressions in equations (30) and (31) do not change as ψ varies. The derivation of the asymptotic distribution of the LR estimator follows the proof in Tsiatis (1990). Tsiatis requires that the density of U0 is bounded. For the MPH model, this density is
If E(V)=∞, this density is not bounded at u0=0. Inspection of Tsiatis’ proof shows that this does not change the result, and we do not need to impose the restriction that E(V) is finite. The transformed durations are observed up to τ with τ<∞ such that for some ψ,η>0
Pr[min (U0, C) > τ+ψ]≥η.
In the MPH model, this is just an assumption on the distribution of C because for U0 it is satisfied for all τ<∞.
Technical Details Section 4: Lemmas 2–3
Lemma 2: If the derivative of κ is bounded on [0, τ] then for ε>0 with
and
we have
for u1, u2 with 0<u1<u2<τ.
If Yh,N(t) is bounded away from zero on [0, τ] for large N, then (B.14) and (B.15) imply that if $b_N = N^{-c}$ for
then
Note that the uniform convergence holds on a compact subset of [0, τ]. Although this can be generalized to uniform convergence on [0, τ], the variable kernels that are needed for this generalization complicate the asymptotic analysis. In practice, estimation of the hazard is inaccurate near the endpoints, and it may be preferable to exclude observations that are close to the endpoints. Note that the observations near the endpoints are used in the estimation of the hazard. Also, a bandwidth proportional to $N^{-1/5}$ and
satisfies all the assumptions of this paper.
We do not observe the transformed duration
but rather an estimate of this transformed duration, and hence we consider the kernel estimator
Lemma 3: The kernel K is positive and bounded on [–1, 1] (and zero elsewhere) and satisfies a Lipschitz condition on this interval. The covariate process X(t) is bounded on [0, τ] and so is
for all α in an open neighborhood of α0. Moreover
uniformly for 0≤u≤τ, θ∈N(θ0) and H has derivatives that are bounded for 0≤u≤τ, θ∈N(θ0). Then for ε>0 such that
we have
Proof: See below.
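A kernel hazard estimator of the kind treated in Lemmas 2 and 3 can be sketched as follows. Since the display defining the estimator is not reproduced here, the Python code assumes the kernel-smoothed Nelson-Aalen form associated with Ramlau-Hansen (1983): each observed event contributes a kernel weight divided by the number of units still at risk. The Epanechnikov kernel, the absence of ties, and the use of exponential durations as stand-ins for the transformed durations are all illustrative assumptions; only the bandwidth rate N^(-1/5) is taken from the text.

```python
import random


def epanechnikov(z):
    """Kernel supported on [-1, 1]; an assumed choice, bounded and Lipschitz."""
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0


def kernel_hazard(u, times, events, bandwidth):
    """Kernel-smoothed Nelson-Aalen hazard estimate at u (no ties assumed).

    times  : observed (possibly censored) durations
    events : True if the duration ended in an event, False if censored
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    n = len(times)
    total = 0.0
    for rank, i in enumerate(order):
        at_risk = n - rank  # units whose observed time is >= times[i]
        if events[i]:
            total += epanechnikov((u - times[i]) / bandwidth) / at_risk
    return total / bandwidth


# Illustration: unit-exponential durations have a constant hazard of 1.
rng = random.Random(1)
n_obs = 5000
durations = [rng.expovariate(1.0) for _ in range(n_obs)]
observed = [True] * n_obs
b_n = n_obs ** (-0.2)  # bandwidth proportional to N^(-1/5)
print(kernel_hazard(0.5, durations, observed, b_n))
```

Evaluating the estimator only at interior points, as in the example, sidesteps the boundary inaccuracy that the discussion of Lemma 2 warns about.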
Note that the conditions on bN are determined in Lemma 2 and that a bandwidth proportional to $N^{-1/5}$ and
satisfies all the assumptions of this paper. The fact that we use estimated transformed durations does not change the restrictions on the bandwidth choice.
At this point we consider the condition in (B.18) more closely. With
if the duration T is (right) censored at C, Y(t)=I(T≥t)I(C≥t), so
YU(u, θ)=I(h(T, θ)≥u)‧I(h(C, θ)≥u).
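The at-risk indicator on the transformed time scale can be made concrete with a small sketch. The Python code below assumes, for illustration only, that h(t, θ) is the integrated hazard exp(βx) times the integral of λ(s, α) over [0, t], with a single time-constant covariate x and a piecewise-constant baseline hazard; the definition of h is not reproduced in this appendix, so this functional form, the cut points, and the hazard levels are all hypothetical.

```python
import math


def transform(t, x, beta, cutpoints, levels):
    """Assumed h(t, theta): exp(beta * x) times the integrated baseline hazard.

    The baseline equals levels[k] on [cutpoints[k], cutpoints[k + 1]) and
    levels[-1] beyond the last cut point.
    """
    integral = 0.0
    for k, level in enumerate(levels):
        start = cutpoints[k]
        if t <= start:
            break
        end = cutpoints[k + 1] if k + 1 < len(cutpoints) else math.inf
        integral += level * (min(t, end) - start)
    return math.exp(beta * x) * integral


def at_risk_transformed(u, duration, censor, x, beta, cutpoints, levels):
    """Y_U(u, theta) = I(h(T, theta) >= u) * I(h(C, theta) >= u)."""
    h_t = transform(duration, x, beta, cutpoints, levels)
    h_c = transform(censor, x, beta, cutpoints, levels)
    return h_t >= u and h_c >= u


# Hypothetical baseline: level 1.0 on [0, 1), 1.5 on [1, 2), 0.5 afterwards.
cuts = [0.0, 1.0, 2.0]
lams = [1.0, 1.5, 0.5]
print(transform(2.5, 0.0, 0.7, cuts, lams))
print(at_risk_transformed(1.5, 2.5, 1.5, 0.0, 0.7, cuts, lams))
```

Because the assumed h is strictly increasing in t whenever the baseline hazard is positive, a unit is at risk at u on the transformed scale exactly when it is at risk at the corresponding point on the original scale.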
If the censoring time and the duration are conditionally independent given the history up to t, i.e.,
then
If N(θ0) is an open neighborhood of θ0, Xi and Ci are i.i.d., and
then
and by the uniform law of large numbers
uniformly for θ∈N(θ0) and 0≤u≤τ. Because by (B.23) the limit is bounded away from zero, we have
uniformly for θ∈N(θ0) and 0≤u≤τ with
Because h(T,θ0)=U0, (B.19) holds for θ=θ0 if κ0(u) is bounded for 0≤u≤τ. From the expression for κU (u, θ) in (9), a sufficient condition for κU(u, θ) to be bounded for all θ in a neighborhood of θ0 and 0≤t≤τ is that λ(t, α)>0 for all t and on a neighborhood of α0. In the same way, (B.20) holds if the hazard of C is bounded and λ(t, α) is bounded away from zero in a neighborhood around α0.
Proof of Lemma 1
is a linearization of
Because SN(θ) is not continuous in θ, it is not possible to linearize this function by a first order Taylor series expansion. Instead we linearize the hazard rate of the transformed durations U(θ). From (4) and (5) we obtain
This relates the hazard of the distribution of U(θ) to that of U0
Because h(h–1(u, θ), θ)=u, we have
The derivatives of κU(u, θ) with respect to θ are
where the last equality follows from a change of variables in the integral. In the same way, we obtain with a change of variable in the integral
The proof consists of checking the conditions for asymptotic linearity of SN(θ) in Tsiatis (1990) and a computation of the coefficients in the linear approximation. In Tsiatis’ proof the covariate in the estimating equation is Xi. We have
and hence the requirement that this is a vector of bounded functions. The equations (9), (10) and (11) are stability conditions [see also Andersen et al. (1993)]. Instead of a mean and variance condition as in Tsiatis (1990), we have a mean and two covariance conditions. Note that by setting s=u, we obtain conditions for uniform convergence to Vα(u, u) and Vβ(u, u). The final condition for linearization is that for u≤τ
The assumptions that λ(t,α) is bounded away from zero for all t≥0 and α in the parameter space, that
for all t≥0 and α in the parameter space, and that X(t) is bounded, imply that the second derivative of κU(u, θ) with respect to θ is bounded for all u≤τ and θ∈Θ. This is sufficient for (B.31) if the parameter space is convex.
Next we linearize SN(θ). Because
we have if |θ–θ0| is small
The second term is after substitution of (B.29), and (B.30)
The normalized vectors of coefficients converge to (B.12) and (B.13) if (B.10) and (11) hold. This proves the lemma.
Proof of Theorem 1
By van der Vaart (1998) Theorem 5.45, we have from Lemma 1
with M0 the martingale associated with the counting process N0 for U0. By the central limit theorem for integrals of predetermined functions with respect to a martingale [see, e.g., Andersen et al. (1993)], the sum on the right-hand side converges to a normal distribution with the variance matrix in (24).
Proof of Lemmas 2 and 3
We have
We first consider the second term. Because K is Lipschitz this is bounded by
Moreover by the mean value theorem, we have that for some intermediate
Because Xi(t) is bounded on [0, τ] and so is
for all α in an open neighborhood of α0, (B.36) is bounded by
and substitution in (B.35) gives the upper bound
Because the estimator
is consistent, the upper bound converges to 0 in probability if
Next we consider the first term in (B.34). By subtraction and addition of expected values, this term is bounded by
The first and second terms converge to 0 in probability if
Because of (B.18) the final term converges in probability to
This expression is bounded (both H and K are bounded) by
The first term goes to 0 in probability if
and the second if
This completes the proof.
References
Aalen, O. O., O. Borgan, and H. K. Gjessing. 2009. Survival and Event History Analysis. New York: Springer Verlag. doi:10.1007/978-0-387-68560-1
Amemiya, T. 1974. “The Nonlinear Two-Stage Least-Squares Estimator.” Journal of Econometrics 2: 105–110. doi:10.1016/0304-4076(74)90033-5
Amemiya, T. 1985. “Instrumental Variable Estimation for the Nonlinear Errors-in-Variables Model.” Journal of Econometrics 28: 273–289. doi:10.1016/0304-4076(85)90001-6
Andersen, P. K., O. Borgan, R. D. Gill, and N. Keiding. 1993. Statistical Models Based on Counting Processes. New York: Springer Verlag. doi:10.1007/978-1-4612-4348-9
Baker, M., and A. Melino. 2000. “Duration Dependence and Nonparametric Heterogeneity: A Monte Carlo Study.” Journal of Econometrics 96: 357–393. doi:10.1016/S0304-4076(99)00064-0
Bearse, P., J. Canals-Cerda, and P. Rilstone. 2007. “Efficient Semiparametric Estimation of Duration Models with Unobserved Heterogeneity.” Econometric Theory 23: 281–308. doi:10.1017/S0266466607070120
Bijwaard, G. E. 2009. “Instrumental Variable Estimation for Duration Data.” In Causal Analysis in Population Studies: Concepts, Methods, Applications, edited by H. Engelhardt, H.-P. Kohler, and A. Fürnkranz-Prskawetz, 111–148. New York: Springer Verlag. doi:10.1007/978-1-4020-9967-0_6
Bijwaard, G. E. 2010. “Immigrant Migration Dynamics Model for The Netherlands.” Journal of Population Economics 23: 1213–1247. doi:10.1007/s00148-008-0228-1
Bijwaard, G. E., and G. Ridder. 2005. “Correcting for Selective Compliance in a Re–employment Bonus Experiment.” Journal of Econometrics 125: 77–111. doi:10.1016/j.jeconom.2004.04.004
Bijwaard, G. E., C. Schluter, and J. Wahba. 2013. “The Impact of Labour Market Dynamics on the Return–Migration of Immigrants.” Review of Economics & Statistics, forthcoming. doi:10.1162/REST_a_00389
Chen, S. 2002. “Rank Estimation of Transformation Models.” Econometrica 70: 1683–1697. doi:10.1111/1468-0262.00347
Chiappori, P. A., and B. Salanie. 2000. “Testing for Asymmetric Information in Insurance Markets.” Journal of Political Economy 108: 56–78. doi:10.1086/262111
Cox, D. R., and D. Oakes. 1984. Analysis of Survival Data. London: Chapman and Hall.
Elbers, C., and G. Ridder. 1982. “True and Spurious Duration Dependence: The Identifiability of the Proportional Hazard Model.” Review of Economic Studies 49: 403–410. doi:10.2307/2297364
Feller, W. 1971. An Introduction to Probability Theory and its Applications. 3rd ed. John Wiley and Sons.
Hahn, J. 1994. “The Efficiency Bound of the Mixed Proportional Hazard Model.” Review of Economic Studies 61: 607–629. doi:10.2307/2297911
Han, A. K. 1987. “Non–parametric Analysis of a Generalized Regression Model: The Maximum Rank Correlation Estimator.” Journal of Econometrics 35: 303–316. doi:10.1016/0304-4076(87)90030-3
Hausman, J. A., and T. Woutersen. 2005. “Estimating a Semi–Parametric Duration Model without Specifying Heterogeneity.” CeMMAP, working paper, CWP11/05.
Heckman, J. J. 1991. “Identifying the Hand of the Past: Distinguishing State Dependence from Heterogeneity.” American Economic Review 81: 75–79.
Heckman, J. J., and B. Singer. 1984a. “Econometric Duration Analysis.” Journal of Econometrics 24: 63–132. doi:10.1016/0304-4076(84)90075-7
Heckman, J. J., and B. Singer. 1984b. “A Method for Minimizing the Impact of Distributional Assumptions in Econometric Models for Duration Data.” Econometrica 52: 271–320. doi:10.2307/1911491
Honoré, B. E. 1990. “Simple Estimation of a Duration Model with Unobserved Heterogeneity.” Econometrica 58: 453–473. doi:10.2307/2938211
Horowitz, J. L. 1996. “Semiparametric Estimation of a Regression Model with an Unknown Transformation of the Dependent Variable.” Econometrica 64: 103–137. doi:10.2307/2171926
Horowitz, J. L. 1999. “Semiparametric Estimation of a Proportional Hazard Model with Unobserved Heterogeneity.” Econometrica 67: 1001–1018. doi:10.1111/1468-0262.00068
Horowitz, J. L. 2001. “The Bootstrap.” In Handbook of Econometrics, Vol. 5, edited by J. J. Heckman and E. Leamer. Amsterdam: North-Holland.
Khan, S. 2001. “Two Stage Rank Estimation of Quantile Index Models.” Journal of Econometrics 100: 319–355. doi:10.1016/S0304-4076(00)00040-3
Khan, S., and E. Tamer. 2007. “Partial Rank Estimation of Duration Models with General Forms of Censoring.” Journal of Econometrics 136: 251–280. doi:10.1016/j.jeconom.2006.03.003
Klein, J. P., and M. L. Moeschberger. 1997. Survival Analysis: Techniques for Censored and Truncated Data. New York: Springer Verlag.
Lai, T. L., and Z. Ying. 1991. “Rank Regression Methods for Left–Truncated and Right-Censored Data.” Annals of Statistics 19: 531–556. doi:10.1214/aos/1176348110
Lancaster, T. 1976. “Redundancy, Unemployment and Manpower Policy: A Comment.” Economic Journal 86: 335–338. doi:10.2307/2230754
Lancaster, T. 1979. “Econometric Methods for the Duration of Unemployment.” Econometrica 47: 939–956. doi:10.2307/1914140
Lin, D. Y., and Z. Ying. 1995. “Semiparametric Inference for the Accelerated Life Model with Time-Dependent Covariates.” Journal of Statistical Planning and Inference 44: 47–63. doi:10.1016/0378-3758(94)00039-X
Lindsay, B. G. 1983. “The Geometry of Mixture Likelihoods: A General Theory.” Annals of Statistics 11: 86–94. doi:10.1214/aos/1176346059
Manton, K. G., E. Stallard, and J. W. Vaupel. 1981. “Methods for the Mortality Experience of Heterogeneous Populations.” Demography 18: 389–410. doi:10.2307/2061005
Meyer, P. 1963. “Decomposition of Supermartingales: The Uniqueness Theorem.” Illinois Journal of Mathematics 7: 1–17. doi:10.1215/ijm/1255637477
Newey, W. K., and D. McFadden. 1994. “Large Sample Estimation and Hypothesis Testing.” In Handbook of Econometrics, Vol. 4, edited by R. F. Engle and D. McFadden. Amsterdam: North-Holland. doi:10.1016/S1573-4412(05)80005-4
Powell, M. J. D. 1964. “An Efficient Method for Finding the Minimum of a Function of Several Variables without Calculating Derivatives.” The Computer Journal 7: 155–162. doi:10.1093/comjnl/7.2.155
Prentice, R. L. 1978. “Linear Rank Tests with Right Censored Data.” Biometrika 65: 167–179. doi:10.1093/biomet/65.1.167
Press, W. H., B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. 1986. Numerical Recipes: The Art of Scientific Computing. Cambridge: Cambridge University Press. doi:10.1016/S0003-2670(00)82860-3
Protter, P. 2005. Stochastic Integration and Differential Equations. New York: Springer Verlag, 107–113.
Ramlau-Hansen, H. 1983. “Smoothing Counting Process Intensities by Means of Kernel Functions.” Annals of Statistics 11: 453–466. doi:10.1214/aos/1176346152
Ridder, G., and T. Woutersen. 2003. “The Singularity of the Efficiency Bound of the Mixed Proportional Hazard Model.” Econometrica 71: 1579–1589. doi:10.1111/1468-0262.00460
Robins, J. M., and A. A. Tsiatis. 1992. “Semiparametric Estimation of an Accelerated Failure Time Model with Time-Dependent Covariates.” Biometrika 79: 311–319.
Sherman, R. P. 1993. “The Limiting Distribution of the Maximum Rank Correlation Estimator.” Econometrica 61: 123–137. doi:10.2307/2951780
Therneau, T., and P. Grambsch. 2000. Modeling Survival Data: Extending the Cox Model. New York: Springer Verlag. doi:10.1007/978-1-4757-3294-8
Tsiatis, A. A. 1990. “Estimating Regression Parameters using Linear Rank Tests for Censored Data.” Annals of Statistics 18: 354–372. doi:10.1214/aos/1176347504
van der Vaart, A. W. 1998. Asymptotic Statistics. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511802256
Wooldridge, J. M. 2005. “Unobserved Heterogeneity and Estimation of Average Partial Effects.” In Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, edited by D. W. K. Andrews and J. H. Stock, 27–55. Cambridge University Press. doi:10.1017/CBO9780511614491.004
Woutersen, T. 2000. Consistent Estimators for Panel Duration Data with Endogenous Censoring and Endogenous Regressors. Dissertation Brown University.
©2013 by Walter de Gruyter Berlin Boston