Eric H Chou (1), Chih-Hung Wang (2,3), Yu-Lin Hsieh (4), Babak Namazi (5), Jon Wolfshohl (1), Toral Bhakta (1), Chu-Lin Tsai (2,3), Wan-Ching Lien (2,3), Ganesh Sankaranarayanan (6,7), Chien-Chang Lee (2,3), Tsung-Chien Lu (2,3)
1. Baylor Scott & White All Saints Medical Center, Department of Emergency Medicine, Fort Worth, Texas.
2. National Taiwan University Hospital, Department of Emergency Medicine, Taipei, Taiwan.
3. National Taiwan University, College of Medicine, Department of Emergency Medicine, Taipei, Taiwan.
4. Danbury Hospital, Department of Internal Medicine, Danbury, Connecticut.
5. Baylor Scott & White Research Institute, Dallas, Texas.
6. Baylor University Medical Center, Center for Evidence Based Simulation, Dallas, Texas.
7. Texas A&M Health Science Center, Department of Surgery, Dallas, Texas.
Abstract
INTRODUCTION: Within a few months, coronavirus disease 2019 (COVID-19) evolved into a pandemic causing millions of cases worldwide, but it remains challenging to diagnose the disease in a timely fashion in the emergency department (ED). In this study we aimed to construct machine-learning (ML) models to predict severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) infection based on the clinical features of patients visiting an ED during the early COVID-19 pandemic.
METHODS: We retrospectively collected the data of all patients who received reverse transcriptase polymerase chain reaction (RT-PCR) testing for SARS-CoV-2 at the ED of Baylor Scott & White All Saints Medical Center, Fort Worth, from February 23 to May 12, 2020. The variables collected included patient demographics, ED triage data, clinical symptoms, and past medical history. The primary outcome was a confirmed diagnosis of COVID-19 (or SARS-CoV-2 infection), defined as a positive RT-PCR test result for SARS-CoV-2, and was used as the label for ML tasks. We used univariate analyses for feature selection, and variables with P<0.1 were selected for model construction. Samples were split chronologically into training and testing cohorts at a 60:40 ratio. We tried various ML algorithms to construct the best predictive model, and we evaluated performance with the area under the receiver operating characteristic curve (AUC) in the testing cohort.
RESULTS: A total of 580 ED patients were tested for SARS-CoV-2 during the study period, and 98 (16.9%) were identified as having SARS-CoV-2 infection based on the RT-PCR results. Univariate analyses selected 21 features for model construction. Of the three ML methods assessed, random forest performed best (AUC 0.86), followed by gradient boosting (0.83) and extra trees classifier (0.82).
CONCLUSION: This study shows that it is feasible to use ML models as an initial screening tool for identifying patients with SARS-CoV-2 infection. Further validation will be necessary to determine how effectively this prediction model can be used prospectively in clinical practice.
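The evaluation design described in METHODS (a chronological 60:40 split, then AUC on the later cohort) can be illustrated with a minimal sketch. This is not the authors' code: the record fields (`visit_date`, `label`, `score`), the helper names, and the ten toy records are all hypothetical, and the scores stand in for the output of whatever classifier was fitted on the training cohort.

```python
from datetime import date


def chronological_split(records, train_percent=60):
    """Sort by visit date and cut once, so no test visit precedes a training visit."""
    ordered = sorted(records, key=lambda r: r["visit_date"])
    cut = len(ordered) * train_percent // 100  # integer arithmetic avoids float rounding
    return ordered[:cut], ordered[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (visit date, RT-PCR label, model score), deliberately unsorted.
raw = [
    (date(2020, 4, 22), 1, 0.90),
    (date(2020, 2, 23), 0, 0.10),
    (date(2020, 5, 12), 1, 0.40),
    (date(2020, 3, 10), 1, 0.70),
    (date(2020, 4, 5), 0, 0.50),
    (date(2020, 3, 2), 0, 0.30),
    (date(2020, 5, 1), 0, 0.40),
    (date(2020, 3, 18), 0, 0.20),
    (date(2020, 4, 14), 0, 0.20),
    (date(2020, 3, 27), 1, 0.60),
]
records = [{"visit_date": d, "label": y, "score": s} for d, y, s in raw]

train, test = chronological_split(records)
auc = roc_auc([r["label"] for r in test], [r["score"] for r in test])
print(f"train={len(train)} test={len(test)} test AUC = {auc:.3f}")
# prints "train=6 test=4 test AUC = 0.875"
```

Splitting by date rather than at random means the model is scored only on patients seen after every training patient, which is closer to how a screening tool would be used prospectively.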