Jennifer N Fishe (1), Jiang Bian (2), Zhaoyi Chen (3), Hui Hu (3), Jae Min (3), Francois Modave (4), Mattia Prosperi (3).
1. Department of Emergency Medicine, University of Florida College of Medicine, Jacksonville, FL, USA.
2. Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
3. Department of Epidemiology, College of Medicine & College of Public Health and Health Professions, University of Florida, Gainesville, FL, USA.
4. Center for Health Outcomes and Informatics Research, Loyola University Chicago, Chicago, IL, USA.
Abstract
Objectives: To identify prodromal correlates of asthma as compared to chronic obstructive pulmonary disease and allied conditions (COPDAC) using a multi-domain analysis of socio-ecological, clinical, and demographic variables.
Methods: This is a retrospective case-risk-control study using data from Florida's statewide Healthcare Cost and Utilization Project (HCUP). Patients were divided into three groups: asthma, COPDAC (without asthma), and neither asthma nor COPDAC. To identify socio-ecological, demographic, and clinical predictors of asthma and COPDAC, we used univariate analysis, feature ranking by bootstrapped information gain ratio, multivariable logistic regression with LogitBoost selection, decision trees, and random forests.
Results: A total of 141,729 patients met inclusion criteria, of whom 56,052 were diagnosed with asthma, 85,677 with COPDAC, and 84,737 with neither asthma nor COPDAC. The multi-domain approach proved superior in distinguishing asthma versus COPDAC and non-asthma/non-COPDAC controls (area under the receiver operating characteristic curve (AUROC) 84%). The best single domain for distinguishing asthma from COPDAC without controls was prior clinical diagnoses (AUROC 82%). When variables from all domains were ranked, the most important predictors for asthma versus COPDAC and controls were primarily socio-ecological variables, whereas for asthma versus COPDAC without controls, demographic and clinical variables such as age, CCI, and prior clinical diagnoses scored better.
Conclusions: In this large statewide study using a machine learning approach, we found that a multi-domain approach combining demographic, clinical, and socio-ecological variables best predicted an asthma diagnosis. Future work should focus on integrating machine learning-generated predictive models into clinical practice to improve early detection of these common respiratory diseases.
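The feature ranking by bootstrapped information gain ratio named in the Methods can be illustrated with a minimal sketch. The abstract does not give implementation details, so the following are assumptions: features are categorical (or already discretized), the bootstrap score is the mean gain ratio over resamples drawn with replacement, and the function names `entropy`, `gain_ratio`, and `bootstrapped_gain_ratio` are hypothetical. The toy data is illustrative only and is not taken from the study.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the
    entropy of the feature itself (the split information)."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return (entropy(labels) - conditional) / split_info


def bootstrapped_gain_ratio(feature, labels, n_boot=200, seed=0):
    """Mean gain ratio over `n_boot` resamples drawn with replacement
    (assumed aggregation; the abstract does not specify one)."""
    rng = random.Random(seed)
    n = len(labels)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(gain_ratio([feature[i] for i in idx],
                                 [labels[i] for i in idx]))
    return sum(scores) / n_boot


# Toy example: one feature that tracks the label and one that does not.
labels = ["asthma"] * 20 + ["copdac"] * 20
informative = [0] * 20 + [1] * 20
uninformative = [i % 2 for i in range(40)]
ranking = sorted(
    {"informative": bootstrapped_gain_ratio(informative, labels),
     "uninformative": bootstrapped_gain_ratio(uninformative, labels)}.items(),
    key=lambda kv: kv[1],
    reverse=True,
)
```

Averaging over resamples is one way to make the ranking less sensitive to a particular sample; other summaries, such as the median or a rank-based aggregate, would fit the same description.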