Adel Boueiz1,2, Zhonghui Xu1, Yale Chang3, Aria Masoomi3, Andrew Gregory1, Sharon M Lutz4, Dandi Qiao1, James D Crapo5, Jennifer G Dy3, Edwin K Silverman1,2, Peter J Castaldi1,6.
1. Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States.
2. Pulmonary and Critical Care Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States.
3. Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, United States.
4. Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, Massachusetts, United States.
5. Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, Colorado, United States.
6. Division of General Medicine and Primary Care, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States.
Abstract
Background: The heterogeneous nature of chronic obstructive pulmonary disease (COPD) complicates the identification of predictors of disease progression. We aimed to improve the prediction of disease progression in COPD by using machine learning and incorporating a rich dataset of phenotypic features.
Methods: We included 4496 smokers with available data from their enrollment and 5-year follow-up visits in the COPD Genetic Epidemiology (COPDGene®) study. We constructed linear regression (LR) and supervised random forest models to predict 5-year progression in forced expiratory volume in 1 second (FEV1) from 46 baseline features. Using cross-validation, we randomly partitioned participants into training and testing samples. We also validated the results in the COPDGene 10-year follow-up visit.
Results: Predicting the change in FEV1 over time is more challenging than simply predicting the future absolute FEV1 level. For random forest, R-squared was 0.15 and the area under the receiver operating characteristic (ROC) curve for the prediction of participants in the top quartile of observed progression was 0.71 in the testing sample; the corresponding values in the validation sample were 0.10 and 0.70. Random forest provided slightly better performance than LR. The accuracy was best for Global initiative for chronic Obstructive Lung Disease (GOLD) grades 1-2 participants, and accurate prediction was harder to achieve in advanced stages of the disease. The predictive variables differed in their relative importance, both overall and across GOLD grades.
Conclusion: Random forest, along with deep phenotyping, predicts FEV1 progression with reasonable accuracy. There is significant room for improvement in future models. This prediction model facilitates the identification of smokers at increased risk for rapid disease progression. Such findings may be useful in the selection of patient populations for targeted clinical trials.
JCOPDF
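The two performance measures reported in the abstract, R-squared and the area under the ROC curve for identifying participants in the top quartile of observed progression, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the convention that progression is expressed as FEV1 decline in mL (larger means faster decline), and the eight data points are all assumptions made up for illustration.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def top_quartile_labels(observed):
    """Label 1 if the observed decline falls in the top quartile, else 0."""
    ranked = sorted(observed, reverse=True)
    n_top = max(1, len(observed) // 4)
    cutoff = ranked[n_top - 1]
    return [1 if o >= cutoff else 0 for o in observed]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical 5-year FEV1 declines in mL (illustrative values only).
observed = [400, 350, 300, 250, 200, 150, 100, 50]
predicted = [300, 220, 320, 200, 240, 180, 150, 120]

labels = top_quartile_labels(observed)   # fastest decliners are the positives
r2 = r_squared(observed, predicted)      # fit to the continuous outcome
auc = roc_auc(labels, predicted)         # ranking of the fastest decliners
print(f"R-squared = {r2:.2f}, AUC = {auc:.2f}")
```

The sketch separates the two questions the abstract distinguishes: R-squared measures how well the predicted change matches the observed change for every participant, while the AUC only asks whether the model ranks the fastest decliners above everyone else, which is why a model can have a modest R-squared and a noticeably better AUC.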