Variable selection in the accelerated failure time model via the bridge method

Lifetime Data Analysis

Abstract

In high-throughput genomic studies, an important goal is to identify a small number of genomic markers that are associated with the development and progression of diseases. A representative example is microarray prognostic studies, where the goal is to identify genes whose expression levels are associated with disease-free or overall survival. Because of the high dimensionality of gene expression data, standard survival analysis techniques cannot be directly applied. In addition, among the thousands of genes surveyed, only a subset is disease-associated, so gene selection is needed along with estimation. In this article, we model the relationship between gene expression and survival using the accelerated failure time (AFT) model. We use bridge penalization for regularized estimation and gene selection. An efficient iterative computational algorithm is proposed. Tuning parameters are selected using V-fold cross validation. We use a resampling method to evaluate the prediction performance of the bridge estimator and the relative stability of the identified genes. We show that the proposed bridge estimator is selection consistent under appropriate conditions. Analysis of two lymphoma prognostic studies suggests that the bridge estimator can identify a small number of genes and can have better prediction performance than the Lasso.
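The abstract does not spell out the estimation criterion, so the following is only a minimal sketch of what a bridge-penalized AFT fit could look like. It assumes a weighted least squares loss on log survival times with Kaplan-Meier (Stute-type) weights to handle right censoring, plus a bridge penalty of the form lam * sum(|beta_j| ** gamma) with 0 < gamma < 1. The function names `kaplan_meier_weights` and `bridge_objective`, and the parameters `lam` and `gamma`, are hypothetical and chosen for illustration; the sketch evaluates the criterion and does not reproduce the iterative algorithm proposed in the article.

```python
import numpy as np


def kaplan_meier_weights(time, delta):
    """Kaplan-Meier (Stute-type) weights, returned in the original data order.

    time  : observed times (event or censoring), e.g. on the log scale
    delta : 1 if the event was observed, 0 if the observation is censored
    Assumption: these weights are used to form a weighted least squares loss.
    """
    time = np.asarray(time, dtype=float)
    delta = np.asarray(delta, dtype=float)
    n = len(time)
    # Sort by time; at tied times, events come before censored observations.
    order = np.lexsort((-delta, time))
    d = delta[order]
    ranks = np.arange(1, n + 1)
    # Per-observation factor ((n - j) / (n - j + 1)) ** d_j in sorted order.
    factors = ((n - ranks) / (n - ranks + 1.0)) ** d
    # Product over all earlier observations (empty product = 1 for the first).
    prior_product = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    w_sorted = d / (n - ranks + 1.0) * prior_product
    weights = np.empty(n)
    weights[order] = w_sorted
    return weights


def bridge_objective(beta, X, y, weights, lam, gamma):
    """Weighted least squares loss plus the bridge penalty (assumed form)."""
    resid = y - X @ beta
    loss = np.sum(weights * resid ** 2)
    penalty = lam * np.sum(np.abs(beta) ** gamma)
    return loss + penalty


if __name__ == "__main__":
    # Toy data: three subjects, the second one censored.
    t = np.array([1.0, 2.0, 3.0])
    status = np.array([1, 0, 1])
    w = kaplan_meier_weights(t, status)
    X = np.eye(3)
    print(w, bridge_objective(np.zeros(3), X, t, w, lam=1.0, gamma=0.5))
```

Censored observations receive zero weight, and their mass is redistributed to later events, so the weights equal the jumps of the Kaplan-Meier estimator. With gamma = 1 the penalty reduces to the Lasso penalty, which is the comparison the abstract draws.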

Author information

Corresponding author

Correspondence to Jian Huang.

About this article

Cite this article

Huang, J., Ma, S. Variable selection in the accelerated failure time model via the bridge method. Lifetime Data Anal 16, 176–195 (2010). https://doi.org/10.1007/s10985-009-9144-2

  • DOI: https://doi.org/10.1007/s10985-009-9144-2
