
High-dimensional Cox models: the choice of penalty as part of the model building process.

Axel Benner, Manuela Zucknick, Thomas Hielscher, Carina Ittrich, Ulrich Mansmann.

Abstract

The Cox proportional hazards regression model is the most popular approach to modeling covariate information for survival times. In this context, the development of high-dimensional models, where the number of covariates is much larger than the number of observations (p >> n), is an ongoing challenge. A practicable approach in such situations is to use ridge-penalized Cox regression. Besides finding the best prediction rule, one is often interested in determining the subset of covariates that is most important for prognosis. This could be a gene set in the biostatistical analysis of microarray data. Covariate selection can then, for example, be done by L1-penalized Cox regression using the lasso (Tibshirani (1997). Statistics in Medicine 16, 385-395). Several approaches beyond the lasso that incorporate covariate selection have been developed in recent years. These include modifications of the lasso as well as nonconvex variants such as smoothly clipped absolute deviation (SCAD) (Fan and Li (2001). Journal of the American Statistical Association 96, 1348-1360; Fan and Li (2002). The Annals of Statistics 30, 74-99). The purpose of this article is to integrate them into the model building process in practice when analyzing high-dimensional data with the Cox proportional hazards model. To evaluate penalized regression models beyond the lasso, we included SCAD variants and the adaptive lasso (Zou (2006). Journal of the American Statistical Association 101, 1418-1429). We compare them with "standard" applications such as ridge regression, the lasso, and the elastic net. Predictive accuracy, features of variable selection, and estimation bias are studied to assess the practical use of these methods. We observed that the performance of SCAD and the adaptive lasso is highly dependent on nontrivial preselection procedures. A practical solution to this problem does not yet exist. Since there is a high risk of missing relevant covariates when SCAD or the adaptive lasso is applied after an inappropriate initial selection step, we recommend staying with the lasso or the elastic net in actual data applications. However, given the promising results for truly sparse models, we see some advantage in SCAD and the adaptive lasso if better preselection procedures become available. This requires further methodological research.
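To make the penalties named in the abstract concrete, the following is a minimal sketch of the five penalty terms and of the Cox negative log partial likelihood they are added to. It is not the authors' implementation. The function names, the elastic net parameterization (mixing weight `alpha`), the SCAD default `a = 3.7`, the adaptive lasso exponent `gamma`, and the assumption of no tied event times are all illustrative choices, not details taken from the article.

```python
import math

def ridge(beta, lam):
    """Ridge (L2) penalty: lam * sum of squared coefficients."""
    return lam * sum(b * b for b in beta)

def lasso(beta, lam):
    """Lasso (L1) penalty: lam * sum of absolute coefficients."""
    return lam * sum(abs(b) for b in beta)

def elastic_net(beta, lam, alpha=0.5):
    """Elastic net in one common parameterization (assumed here):
    lam * (alpha * L1 + (1 - alpha) / 2 * L2)."""
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)

def scad(beta, lam, a=3.7):
    """SCAD penalty (Fan and Li): linear near zero, quadratic blend,
    then constant, so large coefficients are not shrunk further."""
    total = 0.0
    for b in beta:
        t = abs(b)
        if t <= lam:
            total += lam * t
        elif t <= a * lam:
            total += (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
        else:
            total += lam * lam * (a + 1) / 2
    return total

def adaptive_lasso(beta, lam, beta_init, gamma=1.0):
    """Adaptive lasso: L1 penalty with weights 1 / |initial estimate|^gamma.
    A zero initial estimate forces the coefficient to stay at zero."""
    total = 0.0
    for b, b0 in zip(beta, beta_init):
        if b0 == 0:
            if b != 0:
                return math.inf
            continue
        total += abs(b) / abs(b0) ** gamma
    return lam * total

def neg_log_partial_likelihood(beta, X, time, event):
    """Cox negative log partial likelihood, assuming no tied event times.
    X is a list of covariate rows; event[i] is 1 for an event, 0 if censored."""
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    nll = 0.0
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if d_i:
            # Risk set: subjects still under observation at time t_i.
            risk = sum(math.exp(eta[j]) for j, t_j in enumerate(time) if t_j >= t_i)
            nll -= eta[i] - math.log(risk)
    return nll

def penalized_objective(beta, X, time, event, penalty, **kwargs):
    """Objective minimized by a penalized Cox fit: loss plus penalty."""
    return neg_log_partial_likelihood(beta, X, time, event) + penalty(beta, **kwargs)
```

For example, `penalized_objective(beta, X, time, event, scad, lam=0.1)` evaluates the SCAD-penalized objective at a candidate `beta`. The dependence of the adaptive lasso on `beta_init` illustrates why the abstract stresses preselection: a covariate dropped or set to zero in the initial step cannot re-enter the model.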


Year:  2010        PMID: 20166132     DOI: 10.1002/bimj.200900064

Source DB:  PubMed          Journal:  Biom J        ISSN: 0323-3847            Impact factor:   2.207


  19 in total

1.  Voxelwise gene-wide association study (vGeneWAS): multivariate gene-based association testing in 731 elderly subjects.

Authors:  Derrek P Hibar; Jason L Stein; Omid Kohannim; Neda Jahanshad; Andrew J Saykin; Li Shen; Sungeun Kim; Nathan Pankratz; Tatiana Foroud; Matthew J Huentelman; Steven G Potkin; Clifford R Jack; Michael W Weiner; Arthur W Toga; Paul M Thompson
Journal:  Neuroimage       Date:  2011-04-08       Impact factor: 6.556

2.  A 13-gene signature prognostic of HPV-negative OSCC: discovery and external validation.

Authors:  Pawadee Lohavanichbutr; Eduardo Méndez; F Christopher Holsinger; Tessa C Rue; Yuzheng Zhang; John Houck; Melissa P Upton; Neal Futran; Stephen M Schwartz; Pei Wang; Chu Chen
Journal:  Clin Cancer Res       Date:  2013-01-14       Impact factor: 12.531

3.  Pulse Wave Velocity and Machine Learning to Predict Cardiovascular Outcomes in Prediabetic and Diabetic Populations.

Authors:  Rafael Garcia-Carretero; Luis Vigil-Medina; Oscar Barquero-Perez; Javier Ramos-Lopez
Journal:  J Med Syst       Date:  2019-12-09       Impact factor: 4.460

4.  Gene Selection using a High-Dimensional Regression Model with Microarrays in Cancer Prognostic Studies.

Authors:  Shuhei Kaneko; Akihiro Hirakawa; Chikuma Hamada
Journal:  Cancer Inform       Date:  2012-02-27

5.  Extreme learning machine Cox model for high-dimensional survival analysis.

Authors:  Hong Wang; Gang Li
Journal:  Stat Med       Date:  2019-01-10       Impact factor: 2.497

6.  The Current and Future Use of Ridge Regression for Prediction in Quantitative Genetics. (Review)

Authors:  Ronald de Vlaming; Patrick J F Groenen
Journal:  Biomed Res Int       Date:  2015-07-26       Impact factor: 3.411

7.  Applying stability selection to consistently estimate sparse principal components in high-dimensional molecular data.

Authors:  Martin Sill; Maral Saadati; Axel Benner
Journal:  Bioinformatics       Date:  2015-04-10       Impact factor: 6.937

8.  Review and evaluation of penalised regression methods for risk prediction in low-dimensional data with few events. (Review)

Authors:  Menelaos Pavlou; Gareth Ambler; Shaun Seaman; Maria De Iorio; Rumana Z Omar
Journal:  Stat Med       Date:  2015-10-29       Impact factor: 2.373

9.  Identifying miRNA-mRNA Integration Set Associated With Survival Time.

Authors:  Yongkang Kim; Sungyoung Lee; Jin-Young Jang; Seungyeoun Lee; Taesung Park
Journal:  Front Genet       Date:  2021-06-29       Impact factor: 4.599

10.  Survival analysis by penalized regression and matrix factorization.

Authors:  Yeuntyng Lai; Morihiro Hayashida; Tatsuya Akutsu
Journal:  ScientificWorldJournal       Date:  2013-04-23