Yan Li (1), Xinyan Zhang (1), Tomi Akinyemiju (2), Akinyemi I Ojesina (2), Jeff M Szychowski (1), Nianjun Liu (1), Bo Xu (3), Nengjun Yi (1)
1. Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL 35294, USA.
2. Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL 35294, USA.
3. Department of Oncology, Southern Research Institute, Birmingham, AL 35205, USA.
Abstract
MOTIVATION: Many traditional clinical prognostic factors for cancer have been known for years, but they usually provide poor survival prediction. Genomic information is now more readily available, which offers opportunities to build more accurate prognostic models. The challenge is how to integrate clinical and genomic information to improve survival prediction. The common approach of jointly analyzing all types of covariates in a single model may not improve prediction because of the increased model complexity, and it cannot be easily applied to different datasets.
RESULTS: We proposed a two-stage procedure to better combine different sources of information for survival prediction, and applied it to two cancer datasets: myelodysplastic syndromes (MDS) and ovarian cancer. Our analysis suggests that prediction performance differs substantially across data types, and that combining clinical, gene expression and mutation data using the two-stage procedure improves survival prediction, as measured by a higher concordance index and a lower prediction error.
AVAILABILITY AND IMPLEMENTATION: The two-stage procedure can be implemented with the BhGLM package, which is freely available at http://www.ssg.uab.edu/bhglm/.
CONTACT: nyi@uab.edu.
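The abstract evaluates prediction by the concordance index. As an illustration only, the following is a minimal Python sketch of Harrell's concordance index for right-censored survival data. It is not taken from the paper or from the BhGLM package; the function name, its arguments, and the toy data are assumptions made for this example, and pairs with tied survival times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times       -- observed follow-up times
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    (Hypothetical helper for illustration, not part of BhGLM.)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk for the subject who failed earlier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for this example
times = [1, 2, 3]
events = [1, 0, 1]
risk_scores = [0.5, 0.5, 0.2]
print(concordance_index(times, events, risk_scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so an improved concordance index means the combined model ranks patients by risk more accurately.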