Survival ensembles by the sum of pairwise differences with application to lung cancer microarray studies.

Brent A Johnson, Qi Long.

Abstract

Lung cancer is among the most common cancers in the United States, in terms of incidence and mortality. In 2009, it is estimated that more than 150,000 deaths will result from lung cancer alone. Genetic information is an extremely valuable data source in characterizing the personal nature of cancer. Over the past several years, investigators have conducted numerous association studies where intensive genetic data is collected on relatively few patients compared to the numbers of gene predictors, with one scientific goal being to identify genetic features associated with cancer recurrence or survival. In this note, we propose high-dimensional survival analysis through a new application of boosting, a powerful tool in machine learning. Our approach is based on an accelerated lifetime model and minimizing the sum of pairwise differences in residuals. We apply our method to a recent microarray study of lung adenocarcinoma and find that our ensemble is composed of 19 genes while a proportional hazards (PH) ensemble is composed of nine genes, a proper subset of the 19-gene panel. In one of our simulation scenarios, we demonstrate that PH boosting in a misspecified model tends to underfit and ignore moderately-sized covariate effects, on average. Diagnostic analyses suggest that the PH assumption is not satisfied in the microarray data and may explain, in part, the discrepancy in the sets of active coefficients. Our simulation studies and comparative data analyses demonstrate how statistical learning by PH models alone is insufficient.
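The abstract names the objective, the sum of pairwise differences in accelerated lifetime model residuals, but does not give its formula. As a rough illustration only, the sketch below assumes the Gehan-type form L(beta) = n^-2 * sum_i sum_j delta_i * max(e_j - e_i, 0), with residuals e_i = log(T_i) - x_i' beta and event indicator delta_i. The function name `gehan_loss` and its argument names are hypothetical, and the boosting procedure itself is not shown.

```python
import numpy as np

def gehan_loss(beta, X, log_time, event):
    """Assumed Gehan-type loss: event-weighted sum of pairwise residual differences."""
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    log_time = np.asarray(log_time, dtype=float)
    event = np.asarray(event, dtype=float)

    # Residuals under a linear model on the log-time scale: e_i = log T_i - x_i' beta
    resid = log_time - X @ beta
    # diff[i, j] = e_i - e_j for every ordered pair (i, j)
    diff = resid[:, None] - resid[None, :]
    # Negative part (e_i - e_j)^- = max(e_j - e_i, 0)
    neg_part = np.maximum(-diff, 0.0)
    # Only pairs whose first member i is an observed event (delta_i = 1) contribute
    n = resid.shape[0]
    return float((event[:, None] * neg_part).sum() / n**2)
```

For two subjects with log-times 1 and 2, a zero covariate effect, and only the first subject uncensored, `gehan_loss([0.0], [[0.0], [0.0]], [1.0, 2.0], [1, 0])` evaluates to 0.25. A boosting routine would reduce a loss of this kind by updating one coefficient at a time, which is how an ensemble can end up with a small set of active genes.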

Year:  2011        PMID: 21818245      PMCID: PMC3148798          DOI: 10.1214/10-AOAS426

Source DB:  PubMed          Journal:  Ann Appl Stat        ISSN: 1932-6157            Impact factor:   2.083


  5 in total

1.  A generalized Wilcoxon test for comparing arbitrarily singly-censored samples.

Authors:  E A Gehan
Journal:  Biometrika       Date:  1965-06       Impact factor: 2.445

2.  Penalized Estimating Functions and Variable Selection in Semiparametric Regression Models.

Authors:  Brent A Johnson; D Y Lin; Donglin Zeng
Journal:  J Am Stat Assoc       Date:  2008-06-01       Impact factor: 5.033

3.  Survival ensembles.

Authors:  Torsten Hothorn; Peter Bühlmann; Sandrine Dudoit; Annette Molinaro; Mark J van der Laan
Journal:  Biostatistics       Date:  2005-12-12       Impact factor: 5.899

4.  Boosting method for nonlinear transformation models with censored survival data.

Authors:  Wenbin Lu; Lexin Li
Journal:  Biostatistics       Date:  2008-03-15       Impact factor: 5.899

5.  Flexible boosting of accelerated failure time models.

Authors:  Matthias Schmid; Torsten Hothorn
Journal:  BMC Bioinformatics       Date:  2008-06-06       Impact factor: 3.169
