Kevin He (1), Yanming Li (1), Ji Zhu (2), Hongliang Liu (3), Jeffrey E Lee (4), Christopher I Amos (5), Terry Hyslop (6), Jiashun Jin (7), Huazhen Lin (8), Qinyi Wei (3), Yi Li (1).

1. Department of Biostatistics and 2. Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109, USA.
3. Department of Medicine, Duke University School of Medicine and Duke Cancer Institute, Duke University Medical Center, Durham, NC 27710, USA.
4. Department of Surgical Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA.
5. Department of Community and Family Medicine, Geisel School of Medicine, Dartmouth College, Hanover, NH 03750, USA.
6. Department of Biostatistics and Bioinformatics, Duke University and Duke Clinical Research Institute, Durham, NC 27710, USA.
7. Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
8. Center of Statistical Research, School of Statistics, Southwestern University of Finance and Economics, Chengdu, Sichuan 611130, China.
Abstract
MOTIVATION: Technological advances that allow routine identification of high-dimensional risk factors have created strong demand for statistical techniques that make full use of these rich sources of information in genetics studies. With high-dimensional predictors, both variable selection for censored outcome data and control of false discoveries (i.e. inclusion of irrelevant variables) present serious challenges. This article develops a computationally feasible method based on boosting and stability selection. Specifically, we modified component-wise gradient boosting to improve its computational feasibility and introduced random permutation into stability selection to control false discoveries.

RESULTS: We have proposed a high-dimensional variable selection method that incorporates stability selection to control false discoveries. Comparisons between the proposed method and the commonly used univariate and Lasso approaches for variable selection show that the proposed method yields fewer false discoveries. The proposed method is applied to study the associations of 2339 common single-nucleotide polymorphisms (SNPs) with overall survival among cutaneous melanoma (CM) patients. The results confirm that BRCA2 pathway SNPs are likely to be associated with overall survival, as reported in the previous literature. Moreover, we have identified several new Fanconi anemia (FA) pathway SNPs that are likely to modulate survival of CM patients.

AVAILABILITY AND IMPLEMENTATION: The related source code and documents are freely available at https://sites.google.com/site/bestumich/issues.

CONTACT: yili@umich.edu.
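To make the combination of component-wise boosting, stability selection and random permutation concrete, the following is a minimal Python sketch and not the authors' released implementation. It assumes a Cox partial likelihood loss with Breslow-type handling of ties, half-sample subsampling for stability selection, and a selection-frequency threshold calibrated by permuting the outcomes; the abstract does not specify any of these details, and all function names and default parameters (`cox_negative_gradient`, `componentwise_boost`, `selection_frequencies`, `permutation_threshold`, `n_steps`, `nu`) are hypothetical.

```python
import numpy as np

def cox_negative_gradient(eta, time, event):
    """Negative gradient of the Cox partial log-likelihood w.r.t. eta."""
    w = np.exp(eta)
    # at_risk[j, k] = 1 if subject k is still at risk at time[j]
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    denom = at_risk @ w                 # risk-set sums of exp(eta)
    haz = (event / denom) @ at_risk     # Breslow cumulative hazard at time[i]
    return event - w * haz

def componentwise_boost(X, time, event, n_steps=20, nu=0.1):
    """Update one coefficient per step: the one that best fits the gradient."""
    X = X - X.mean(axis=0)
    col_ss = (X ** 2).sum(axis=0)
    col_ss = np.where(col_ss > 0, col_ss, np.inf)   # ignore constant columns
    beta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        u = cox_negative_gradient(X @ beta, time, event)
        coef = X.T @ u / col_ss         # least-squares fit of u on each column
        j = np.argmax(coef ** 2 * col_ss)   # largest reduction in residual SS
        beta[j] += nu * coef[j]
    return beta

def selection_frequencies(X, time, event, n_subsamples=20, rng=None, **boost_kw):
    """Fraction of random half-samples in which each variable is selected."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        beta = componentwise_boost(X[idx], time[idx], event[idx], **boost_kw)
        counts += beta != 0
    return counts / n_subsamples

def permutation_threshold(X, time, event, n_perm=5, rng=None, **kw):
    """Assumed calibration: average of the largest null selection frequency."""
    rng = np.random.default_rng(rng)
    null_max = []
    for _ in range(n_perm):
        perm = rng.permutation(len(time))   # break the X-outcome association
        freq = selection_frequencies(X, time[perm], event[perm], rng=rng, **kw)
        null_max.append(freq.max())
    return float(np.mean(null_max))
```

Under these assumptions, a variable would be reported when its selection frequency on the observed data exceeds the value returned by `permutation_threshold`. Other calibration rules (for example, a quantile of the null frequencies) would fit the same outline.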