P J Newcombe, H Raza Ali, F M Blows, E Provenzano, P D Pharoah, C Caldas, S Richardson.
Abstract
As data-rich medical datasets are becoming routinely collected, there is a growing demand for regression methodology that facilitates variable selection over a large number of predictors. Bayesian variable selection algorithms offer an attractive solution, whereby a sparsity-inducing prior allows inclusion of sets of predictors simultaneously, leading to adjusted effect estimates and inference of which covariates are most important. We present a new implementation of Bayesian variable selection, based on a Reversible Jump MCMC algorithm, for survival analysis under the Weibull regression model. A realistic simulation study is presented comparing against an alternative LASSO-based variable selection strategy in datasets of up to 20,000 covariates. Across half the scenarios, our new method achieved identical sensitivity and specificity to the LASSO strategy, and a marginal improvement otherwise. Runtimes were comparable for both approaches, taking approximately a day for 20,000 covariates. Subsequently, we present a real data application in which 119 protein-based markers are explored for association with breast cancer survival in a case cohort of 2287 patients with oestrogen receptor-positive disease. Evidence was found for three independent prognostic tumour markers of survival, one of which is novel. Our new approach demonstrated the best specificity.
Keywords: Bayesian variable selection; MCMC; breast cancer; gene expression; penalised regression; reversible jump; stability selection; survival analysis
Year: 2016 PMID: 25193065 PMCID: PMC6055985 DOI: 10.1177/0962280214548748
Source DB: PubMed Journal: Stat Methods Med Res ISSN: 0962-2802 Impact factor: 3.021