Abdulrahman A Alrajhi, Osama A Alswailem, Ghassan Wali, Khalid Alnafee, Sarah AlGhamdi, Jhan Alarifi, Sarab AlMuhaideb, Hisham ElMoaqet, Ahmad AbuSalah.
Abstract
Clinicians urgently need reliable and stable tools to predict the severity of COVID-19 infection in hospitalized patients so that hospital resources and supplies can be used more effectively. Published COVID-19 guidelines are updated frequently, which limits their use as a stable go-to resource for informing clinical and operational decision-making. In addition, many patient-level COVID-19 severity prediction tools developed during the early stages of the pandemic failed to perform well in the hospital setting because of challenges that include data availability, model generalization, and clinical validation. This study describes the experience of a large tertiary hospital system network in the Middle East in developing a real-time severity prediction tool that helps clinicians match patients with the appropriate level of care, for better management of limited health care resources during COVID-19 surges. It also provides a new perspective on predicting patients' COVID-19 severity levels at the time of hospital admission, using comprehensive data collected in the hospital during the first year of the pandemic. Unlike many previous studies of a similar population in the region, this study evaluated 4 machine learning models using a large training data set of 1386 patients collected between March 2020 and April 2021. The study uses comprehensive patient-level COVID-19 clinical data from the hospital electronic medical records (EMR), vital sign monitoring devices, and polymerase chain reaction (PCR) machines. The data were collected, prepared, and leveraged by a panel of clinical and data experts to develop a multi-class, data-driven framework that predicts severity levels for COVID-19 infections at admission time. Finally, this study reports results from a prospective validation test conducted by clinical experts in the hospital.
The proposed prediction framework shows excellent performance in concurrent validation (n = 462 patients, March 2020-April 2021), with the highest discrimination obtained by the random forest classification model, which achieved macro- and micro-average areas under the receiver operating characteristic curve (AUC) of 0.83 and 0.87, respectively. The prospective validation conducted by clinical experts (n = 185 patients, April-May 2021) showed promising overall prediction performance, with a recall of 78.4-90.0% and a precision of 75.0-97.8% across the severity classes.
Keywords: COVID-19; applied artificial intelligence; decision support systems; hospital operations; severity prediction
Year: 2022 PMID: 35270653 PMCID: PMC8910504 DOI: 10.3390/ijerph19052958
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Details of the study cohort and data sets used for model development, testing, and validation.
Figure 2. Distribution of training data after merging classes A and B and oversampling minority classes.
Optimal hyperparameter settings for the different machine learning models considered in this study.
| Model | Hyperparameter | Best Selection |
|---|---|---|
| MLR | Reg. penalty | L2 Norm |
| MLR | Reg. Coeff. | 1 |
| RF | max_depth | 60 |
| RF | max_features | Auto |
| RF | n_estimators | 800 |
| XGBoost | learning_rate | 0.1 |
| XGBoost | max_depth | 9 |
| XGBoost | n_estimators | 100 |
| Extra Trees | max_depth | 100 |
| Extra Trees | n_estimators | 500 |
| Extra Trees | min_samples_split | 5 |
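As a rough illustration, the settings in this table can be written down as a plain configuration and handed to off-the-shelf estimators. The sketch below assumes scikit-learn and the `xgboost` package, which the record does not name; the parameter names in the table happen to match those libraries. Two mappings are assumptions: "Reg. Coeff. 1" is read as `C=1.0` (scikit-learn's inverse regularization strength, with L2 as its default penalty), and "Auto" is written as `"sqrt"`, which is what `max_features="auto"` meant for classifiers in older scikit-learn releases. `BEST_PARAMS` and `build_models` are hypothetical names.

```python
# Hyperparameters copied from the table; the library mapping is an assumption.
BEST_PARAMS = {
    # L2 is scikit-learn's default penalty, so only C is set explicitly.
    "MLR": {"C": 1.0},
    # "Auto" in the table is written as "sqrt" (its meaning for classifiers).
    "RF": {"max_depth": 60, "max_features": "sqrt", "n_estimators": 800},
    "XGBoost": {"learning_rate": 0.1, "max_depth": 9, "n_estimators": 100},
    "Extra Trees": {"max_depth": 100, "n_estimators": 500, "min_samples_split": 5},
}


def build_models(random_state=0):
    """Instantiate unfitted classifiers from BEST_PARAMS.

    Imports are done lazily so the configuration above can be inspected
    even when scikit-learn or xgboost is not installed.
    """
    from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from xgboost import XGBClassifier

    return {
        "MLR": LogisticRegression(**BEST_PARAMS["MLR"]),
        "RF": RandomForestClassifier(
            random_state=random_state, **BEST_PARAMS["RF"]
        ),
        "XGBoost": XGBClassifier(
            random_state=random_state, **BEST_PARAMS["XGBoost"]
        ),
        "Extra Trees": ExtraTreesClassifier(
            random_state=random_state, **BEST_PARAMS["Extra Trees"]
        ),
    }
```

Calling `build_models()` returns unfitted estimators; each would still need to be trained on the study's own feature matrix, which is not part of this record.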
Figure 3. … for different classification models over test patient data: (a) RF classifier, (b) Extra Trees classifier, (c) multinomial logistic regression, (d) XGB classifier.
Results for different classification models using test patient data (concurrent validation).
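| Model | AUC (class 1) | AUC (class 2) | AUC (class 3) | Micro-average AUC | Macro-average AUC |
|---|---|---|---|---|---|
| MLR | 0.83 | 0.66 | 0.78 | 0.82 | 0.76 |
| XGBoost | 0.82 | 0.72 | 0.88 | 0.85 | 0.81 |
| Extra Trees | 0.84 | 0.73 | 0.85 | 0.86 | 0.81 |
| RF | 0.86 | 0.75 | 0.88 | 0.87 | 0.83 |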
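The relationship between the columns of this table can be checked with a few lines of arithmetic. The sketch below assumes that the first three numeric columns are per-class AUC values and that the last two are the micro- and macro-average AUC, a reading supported by the abstract's figures for RF (macro 0.83, micro 0.87). Under that assumption, the macro average is the unweighted mean of the per-class values. The micro average cannot be recomputed from the table alone, since it pools individual predictions across classes. The names `rows` and `macro_average` are illustrative.

```python
from statistics import mean

# (per-class values, reported micro average, reported macro average),
# copied from the rows of the table above.
rows = {
    "MLR": ([0.83, 0.66, 0.78], 0.82, 0.76),
    "XGBoost": ([0.82, 0.72, 0.88], 0.85, 0.81),
    "Extra Trees": ([0.84, 0.73, 0.85], 0.86, 0.81),
    "RF": ([0.86, 0.75, 0.88], 0.87, 0.83),
}


def macro_average(per_class):
    """Unweighted mean of per-class scores: every class counts equally."""
    return mean(per_class)


for model, (per_class, micro_reported, macro_reported) in rows.items():
    macro = macro_average(per_class)
    print(
        f"{model}: recomputed macro = {macro:.2f}, "
        f"reported macro = {macro_reported:.2f}, "
        f"reported micro = {micro_reported:.2f}"
    )
```

Because the macro average weights every class equally, a weak class pulls it down more than it pulls down the micro average, which is consistent with the macro values sitting below the micro values in every row.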
Severity prediction performance over prospective validation data.
RF classification performance over the validation set:

| Severity class | Recall | Precision | F1-score |
|---|---|---|---|
| Stage | 90.0% | 75.0% | 81.8% |
| Stage | 78.4% | 69.0% | 73.4% |
| Stage | 78.9% | 97.8% | 87.4% |
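The three numeric columns in this table are consistent with recall, precision, and F1-score, in that order: the third value in each row matches the harmonic mean of the first two to within rounding of the published percentages. A minimal sketch of that check follows; the column reading is an assumption where the header is missing, and `f1_score` is a hypothetical helper name.

```python
def f1_score(recall, precision):
    """Harmonic mean of recall and precision, both given as fractions."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision, reported third column), one tuple per table row,
# converted from percentages to fractions.
rows = [
    (0.900, 0.750, 0.818),
    (0.784, 0.690, 0.734),
    (0.789, 0.978, 0.874),
]

for recall, precision, reported in rows:
    f1 = f1_score(recall, precision)
    # Small gaps are expected because the inputs are already rounded.
    print(f"recomputed F1 = {f1:.4f}, reported = {reported:.3f}")
```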
Figure 4. Feature importance in predicting COVID-19 severity: (a) standard RF feature importance and (b) permutation feature importance.