
Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data.

Richard M Simon, Jyothi Subramanian, Ming-Chung Li, Supriya Menezes.

Abstract

Developments in whole genome biotechnology have stimulated statistical focus on prediction methods. We review here methodology for classifying patients into survival risk groups and for using cross-validation to evaluate such classifications. Measures of discrimination for survival risk models include separation of survival curves, time-dependent ROC curves and Harrell's concordance index. For high-dimensional data applications, however, computing these measures as re-substitution statistics on the same data used for model development results in highly biased estimates. Most developments in methodology for survival risk modeling with high-dimensional data have utilized separate test data sets for model evaluation. Cross-validation has sometimes been used for optimization of tuning parameters. In many applications, however, the data available are too limited for effective division into training and test sets and consequently authors have often either reported re-substitution statistics or analyzed their data using binary classification methods in order to utilize familiar cross-validation. In this article we have tried to indicate how to utilize cross-validation for the evaluation of survival risk models; specifically how to compute cross-validated estimates of survival distributions for predicted risk groups and how to compute cross-validated time-dependent ROC curves. We have also discussed evaluation of the statistical significance of a survival risk model and evaluation of whether high-dimensional genomic data adds predictive accuracy to a model based on standard covariates alone.
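The cross-validated survival curves mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each case is assigned to a risk group by a model built without that case, and that Kaplan-Meier curves are then computed from the pooled held-out assignments. The scoring rule (features ranked by Harrell's concordance index, a signed sum of the top features, a cutoff at the training median) is a simple stand-in chosen to keep the example dependency-free. The function names, the toy data and the crude censoring scheme are all hypothetical.

```python
import math
import random
from statistics import median

def concordance(scores, times, events):
    """Harrell's C: share of usable pairs in which the higher score fails first."""
    num = den = 0.0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an observed event
        for j in range(len(times)):
            if times[i] < times[j]:
                den += 1
                if scores[i] > scores[j]:
                    num += 1
                elif scores[i] == scores[j]:
                    num += 0.5
    return num / den if den else 0.5

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time."""
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

def cv_risk_groups(X, times, events, k=5, n_features=2, seed=0):
    """Assign every case to 'high' or 'low' risk using a model that never saw it."""
    n = len(times)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    groups = [None] * n
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        t_tr = [times[i] for i in train]
        e_tr = [events[i] for i in train]
        # Feature selection is repeated inside every fold, on training cases only;
        # selecting features once on the full data would leak the held-out outcomes.
        weights = {j: concordance([X[i][j] for i in train], t_tr, e_tr) - 0.5
                   for j in range(len(X[0]))}
        top = sorted(weights, key=lambda j: abs(weights[j]), reverse=True)[:n_features]
        score = lambda row: sum((1 if weights[j] > 0 else -1) * row[j] for j in top)
        cut = median(score(X[i]) for i in train)  # threshold also comes from training data
        for i in test:
            groups[i] = "high" if score(X[i]) > cut else "low"
    return groups

# Toy data (hypothetical): 60 cases, 20 features, only feature 0 raises the hazard.
rng = random.Random(1)
n, p = 60, 20
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
times = [rng.expovariate(math.exp(row[0])) for row in X]
events = [rng.random() < 0.8 for _ in range(n)]  # crude censoring flag, for illustration only

groups = cv_risk_groups(X, times, events)
for g in ("low", "high"):
    members = [i for i in range(n) if groups[i] == g]
    curve = kaplan_meier([times[i] for i in members], [events[i] for i in members])
    print(g, len(members), curve[:3])
```

Because every group label comes from a fold in which that case was held out, the two pooled curves are not re-substitution estimates. The same held-out scores could also be passed to `concordance` to obtain a cross-validated discrimination measure.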

Year:  2011        PMID: 21324971      PMCID: PMC3105299          DOI: 10.1093/bib/bbr001

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


Cited by (80 in total)

1.  Patient subgroup identification for clinical drug development.

Authors:  Xin Huang; Yan Sun; Paul Trow; Saptarshi Chatterjee; Arunava Chakravartty; Lu Tian; Viswanath Devanarayan
Journal:  Stat Med       Date:  2017-02-01       Impact factor: 2.373

2.  Stage III Non-Small Cell Lung Cancer: Prognostic Value of FDG PET Quantitative Imaging Features Combined with Clinical Prognostic Factors.

Authors:  David V Fried; Osama Mawlawi; Lifei Zhang; Xenia Fave; Shouhao Zhou; Geoffrey Ibbott; Zhongxing Liao; Laurence E Court
Journal:  Radiology       Date:  2015-07-15       Impact factor: 11.105

3.  Automated identification of stratifying signatures in cellular subpopulations.

Authors:  Robert V Bruggner; Bernd Bodenmiller; David L Dill; Robert J Tibshirani; Garry P Nolan
Journal:  Proc Natl Acad Sci U S A       Date:  2014-06-16       Impact factor: 11.205

4.  A prognostic model of Alzheimer's disease relying on multiple longitudinal measures and time-to-event data.

Authors:  Kan Li; Richard O'Brien; Michael Lutz; Sheng Luo
Journal:  Alzheimers Dement       Date:  2018-01-04       Impact factor: 21.566

5.  Clinical utility of measuring Epstein-Barr virus-specific cell-mediated immunity after HSCT in addition to virological monitoring: results from a prospective study.

Authors:  Angela Chiereghin; Giulia Piccirilli; Tamara Belotti; Arcangelo Prete; Clara Bertuzzi; Dino Gibertoni; Liliana Gabrielli; Gabriele Turello; Eva Caterina Borgatti; Francesco Barbato; Mariarosaria Sessa; Mario Arpinati; Francesca Bonifazi; Tiziana Lazzarotto
Journal:  Med Microbiol Immunol       Date:  2019-07-09       Impact factor: 3.402

6.  Peripheral Neutrophil to Lymphocyte Ratio Improves Prognostication in Colon Cancer.

Authors:  Shahrooz Rashtak; Xiaoyang Ruan; Brooke R Druliner; Hongfang Liu; Terry Therneau; Mohamad Mouchli; Lisa A Boardman
Journal:  Clin Colorectal Cancer       Date:  2017-01-25       Impact factor: 4.481

7.  Clustering approach to identify intratumour heterogeneity combining FDG PET and diffusion-weighted MRI in lung adenocarcinoma.

Authors:  Jonghoon Kim; Seong-Yoon Ryu; Seung-Hak Lee; Ho Yun Lee; Hyunjin Park
Journal:  Eur Radiol       Date:  2018-06-19       Impact factor: 5.315

8.  Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes.

Authors:  Herbert Pang; Sin-Ho Jung
Journal:  Genet Epidemiol       Date:  2013-03-07       Impact factor: 2.135

9.  CT texture features of liver parenchyma for predicting development of metastatic disease and overall survival in patients with colorectal cancer.

Authors:  Scott J Lee; Ryan Zea; David H Kim; Meghan G Lubner; Dustin A Deming; Perry J Pickhardt
Journal:  Eur Radiol       Date:  2017-11-21       Impact factor: 5.315

10.  Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods.

Authors:  Jean-Eudes Dazard; Michael Choe; Michael LeBlanc; J Sunil Rao
Journal:  Proc Am Stat Assoc       Date:  2014-08