Yue Gao1,2, Xiaoming Xiong1,2, Xiaofei Jiao1,2, Yang Yu1,2, Jianhua Chi1,2, Wei Zhang1,2, Lingxi Chen3, Shuaicheng Li3, Qinglei Gao1,2.
Abstract
Corticosteroids have proved to be one of the few effective treatments for COVID-19 patients. However, not all patients are suitable for corticosteroid therapy. In this study, we aimed to develop a machine learning model to predict the response to corticosteroid therapy in COVID-19 patients. We retrospectively collected clinical data on 666 COVID-19 patients who received corticosteroid therapy between January 27, 2020, and March 30, 2020, at two hospitals in China. The response to corticosteroid therapy was evaluated by hospitalization time, oxygen supply duration, and patient outcomes. Least Absolute Shrinkage and Selection Operator (LASSO) was applied for feature selection. Five prediction models were fitted in the training cohort and assessed in an internal and an external validation dataset. Two of 36 candidate immune/inflammatory features (C reactive protein and lymphocyte percent) were ultimately used for model development. All five models displayed promising predictive performance. Notably, the ensemble model PRCTC (prediction of response to corticosteroid therapy in COVID-19 patients), derived from three prediction models, namely Gradient Boosted Decision Tree (GBDT), Neural Network (NN), and logistic regression (LR), achieved the best performance in predicting patients' response to corticosteroid therapy, with an area under the curve (AUC) of 0.810 (95% confidence interval [CI] 0.760-0.861) in the internal validation cohort and 0.845 (95% CI 0.779-0.911) in the external validation cohort. In conclusion, PRCTC, designed for universality and scalability, may provide tangible and prompt clinical decision support in the management of COVID-19 patients and could potentially be extended to predicting responses to other medications.
Keywords: C reactive protein; COVID-19; corticosteroid; lymphocyte percent; machine learning
Year: 2022 PMID: 35021153 PMCID: PMC8791209 DOI: 10.18632/aging.203819
Source DB: PubMed Journal: Aging (Albany NY) ISSN: 1945-4589 Impact factor: 5.682
Figure 1. Features selected by LASSO. (A) LASSO variable trace profiles of the ten features. The vertical dashed line shows the best lambda value (0.081), chosen by tenfold cross-validation. (B) Features with a zero coefficient at lambda = 0.081 (gray) were considered less crucial to the patient’s response to corticosteroid therapy and were removed by LASSO logistic regression analysis. Features with a positive coefficient (red) are regarded as positively associated with response to corticosteroid therapy; features with a negative coefficient (blue) are regarded as negatively associated with it. Abbreviations: LASSO, least absolute shrinkage and selection operator; IL-8, interleukin-8; IL-10, interleukin-10; IL-6, interleukin-6; IL-2R, interleukin-2 receptor; IL-1β, interleukin-1β; TNF-α, tumor necrosis factor α; PCT, procalcitonin; CRP, C reactive protein.
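The removal of features described in panel B follows from the L1 penalty, which shrinks weak coefficients to exactly zero. Below is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent. The toy data, the feature roles, the step size, and the iteration count are all assumptions for illustration, not the study's data or software. The penalty 0.081 is borrowed from the caption only as an example value, and the tenfold cross-validation used to choose it is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.5, n_iter=2000):
    """Minimize mean log-loss + lam * sum(|w|); the intercept is unpenalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        resid = [p - t for p, t in zip(probs, y)]
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(d)]
        grad_b = sum(resid) / n
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


# Hypothetical toy data: feature 0 tracks the label, feature 1 is pure noise
# (each (feature 0, label) pair appears once with noise = +1 and once with -1).
base = [(1.0, 1), (1.0, 1), (1.0, 1), (1.0, 0),
        (-1.0, 0), (-1.0, 0), (-1.0, 0), (-1.0, 1)]
X = [[x1, x2] for x1, _ in base for x2 in (1.0, -1.0)]
y = [label for _, label in base for _ in (1.0, -1.0)]

weights, intercept = fit_l1_logistic(X, y, lam=0.081)
print(weights)  # the noise feature's coefficient is exactly 0.0
```

In this toy setting the informative feature keeps a positive coefficient while the noise feature is dropped, which mirrors the gray versus colored bars in the figure.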
Baseline clinical characteristics of patients.
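| | All patients (n = 666) | Training cohort (n = 268) | Internal validation cohort (n = 267) | External validation cohort (n = 131) |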
| 64 (54–71) | 64 (54–71) | 64 (52.5–71) | 67 (59–75) | |
|
| ||||
| Female | 295 (44.29) | 123 (45.90) | 122 (45.69) | 50 (38.17) |
| Male | 371 (55.71) | 145 (54.10) | 145 (54.31) | 81 (61.83) |
|
| 296 (44.44) | 120 (44.78) | 109 (40.82) | 67 (51.15) |
|
| 75 (11.26) | 22 (8.21) | 33 (12.36) | 20 (15.27) |
|
| 128 (19.22) | 46 (17.16) | 57 (21.35) | 25 (19.08) |
|
| 11 (1.65) | 3 (1.12) | 4 (1.50) | 4 (3.05) |
|
| 13 (1.95) | 8 (2.99) | 3 (1.12) | 2 (1.53) |
|
| ||||
| General group | 323 (48.50) | 199 (74.25) | 89 (33.33) | 35 (26.72) |
| Severe and critical group | 343 (51.50) | 69 (25.75) | 178 (66.67) | 96 (73.28) |
|
| 587 (88.14) | 236 (88.06) | 246 (92.13) | 105 (80.15) |
|
| 499 (74.92) | 198 (73.88) | 195 (73.03) | 106 (80.92) |
|
| 365 (54.80) | 157 (58.58) | 141 (52.81) | 67 (51.15) |
|
| 268 (40.24) | 100 (37.31) | 102 (38.20) | 66 (50.38) |
|
| 272 (40.84) | 96 (35.82) | 130 (48.69) | 46 (35.11) |
|
| 179 (26.88) | 73 (27.24) | 81 (30.34) | 25 (19.08) |
|
| 143 (21.47) | 54 (20.15) | 64 (23.97) | 25 (19.08) |
| 22 (14–30) | 23 (16–32) | 21 (14–28) | 19 (11.5–27.5) | |
| 15 (6–23) | 15 (7–24) | 14 (6–23) | 11 (4–21) | |
|
| ||||
|
| 514 (82.24%) | 214 (85.26%) | 205 (81.67%) | 95 (77.24%) |
|
| 111 (17.76%) | 37 (14.74%) | 46 (18.33%) | 28 (22.76%) |
|
| ||||
| Discharge | 473 (71.02) | 203 (75.75) | 189 (70.79) | 81 (61.83) |
| Death | 193 (28.98) | 65 (24.25) | 78 (29.21) | 50 (38.17) |
| 8.20 (3.30–14.40) | 8.75 (4.05–15.03) | 7.70 (3.10–13.75) | 7.10 (2.93–15.98) | |
|
| ||||
| mg/l, median (IQR) | 79.35 (34.13–150.53) | 69.30 (31.00–126.20) | 87.30 (39.80–160.70) | 84.30 (26.58–151.65) |
|
| ||||
| Response | 260 (39.03) | 103 (38.43) | 108 (40.45) | 49 (37.40) |
| Non-response | 406 (60.96) | 165 (61.57) | 159 (59.55) | 82 (62.60) |
Abbreviations: IQR, interquartile range; CHD, coronary heart disease; COPD, chronic obstructive pulmonary disease; CKD, chronic kidney disease; CRP, C reactive protein.
Performance of the models in predicting response to corticosteroid therapy in different cohorts.
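| Model | AUC (95% CI) | Accuracy (95% CI) | SN (95% CI) | SP (95% CI) | PPV (95% CI) | NPV (95% CI) | Kappa | F1 score | |
| Internal validation cohort |||||||||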
| LR | 0.810 (0.759–0.861) | 0.670 (0.611–0.727) | 0.444 (0.349–0.543) | 0.824 (0.756–0.880) | 0.632 (0.513–0.739) | 0.686 (0.615–0.751) | 0.282 | 0.522 | 0.179 |
| SVM | 0.809 (0.758–0.859) | 0.685 (0.626–0.741) | 0.370 (0.279–0.469) | 0.899 (0.842–0.941) | 0.714 (0.578–0.827) | 0.678 (0.610–0.740) | 0.292 | 0.488 | 0.188 |
| GBDT | 0.803 (0.751–0.855) | 0.730 (0.673–0.783) | 0.694 (0.598–0.780) | 0.755 (0.680–0.819) | 0.658 (0.563–0.744) | 0.784 (0.711–0.847) | 0.445 | 0.676 | 0.180 |
| KNN | 0.784 (0.731–0.837) | 0.704 (0.645–0.758) | 0.519 (0.420–0.616) | 0.830 (0.763–0.885) | 0.675 (0.563–0.774) | 0.717 (0.647–0.781) | 0.362 | 0.586 | 0.192 |
| NN | 0.804 (0.753–0.854) | 0.700 (0.642–0.755) | 0.676 (0.579–0.763) | 0.717 (0.640–0.786) | 0.619 (0.525–0.707) | 0.765 (0.689–0.831) | 0.387 | 0.646 | 0.180 |
| PRCTC | 0.810 (0.760–0.861) | 0.738 (0.681–0.790) | 0.685 (0.589–0.771) | 0.774 (0.701–0.836) | 0.673 (0.577–0.759) | 0.783 (0.711–0.845) | 0.457 | 0.679 | 0.177 |
| External validation cohort |||||||||
| LR | 0.808 (0.734–0.882) | 0.725 (0.640–0.780) | 0.571 (0.422–0.712) | 0.817 (0.716–0.894) | 0.651 (0.491–0.790) | 0.761 (0.659–0.846) | 0.398 | 0.609 | 0.171 |
| SVM | 0.812 (0.739–0.885) | 0.687 (0.600–0.765) | 0.429 (0.288–0.578) | 0.842 (0.744–0.913) | 0.618 (0.436–0.778) | 0.711 (0.611–0.799) | 0.288 | 0.506 | 0.191 |
| GBDT | 0.842 (0.776–0.908) | 0.779 (0.698–0.847) | 0.776 (0.634–0.882) | 0.781 (0.675–0.864) | 0.679 (0.540–0.797) | 0.853 (0.753–0.924) | 0.541 | 0.724 | 0.157 |
| KNN | 0.787 (0.710–0.863) | 0.718 (0.632–0.793) | 0.551 (0.402–0.693) | 0.817 (0.716–0.894) | 0.643 (0.480–0.785) | 0.753 (0.650–0.838) | 0.379 | 0.593 | 0.183 |
| NN | 0.810 (0.736–0.883) | 0.733 (0.649–0.806) | 0.735 (0.589–0.851) | 0.732 (0.622–0.824) | 0.621 (0.484–0.745) | 0.822 (0.715–0.902) | 0.450 | 0.673 | 0.163 |
| PRCTC | 0.845 (0.779–0.911) | 0.771 (0.690–0.840) | 0.755 (0.611–0.867) | 0.781 (0.675–0.864) | 0.673 (0.533–0.793) | 0.842 (0.740–0.916) | 0.523 | 0.712 | 0.156 |
Abbreviations: AUC, area under the curve; LR, logistic regression; SVM, support vector machine; GBDT, gradient boosted decision tree; KNN, k-nearest neighbor; NN, neural network; PRCTC, prediction of response to corticosteroid therapy in COVID-19 patients; SN, sensitivity; SP, specificity; PPV, positive predictive value; NPV, negative predictive value; CI, confidence interval.
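The threshold-dependent columns of this table can all be derived from a 2×2 confusion matrix. The sketch below uses counts back-calculated for the first LR row. These counts are an assumption: the source reports only the rates, and the counts presume a cohort of 108 responders and 159 non-responders. With them, the function reproduces that row's 0.670, 0.444, 0.824, 0.632, 0.686, 0.282, and 0.522 as accuracy, sensitivity, specificity, PPV, NPV, Cohen's kappa, and F1 score, respectively.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard classification metrics from confusion-matrix counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
        "f1": f1,
    }


# Back-calculated (assumed) counts for the first LR row of the table.
lr_metrics = binary_metrics(tp=48, fn=60, fp=28, tn=131)
for name, value in lr_metrics.items():
    print(f"{name}: {value:.3f}")
```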
Figure 2. PRCTC achieved promising performance on the validation datasets. (A–C) ROC curves and AUCs of SVM, LR, GBDT, KNN, and NN in the training cohort, internal validation cohort, and external validation cohort, respectively. Abbreviations: PRCTC, prediction of response to corticosteroid therapy in COVID-19 patients; ROC, receiver operating characteristic curve; AUC, area under the curve; SVM, support vector machine; LR, logistic regression; GBDT, gradient boosted decision tree; KNN, k-nearest neighbor; NN, neural network.
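The AUC reported for each ROC curve equals the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder. Below is a minimal sketch of that computation, together with one simple way to combine three models' outputs by averaging their predicted probabilities. The averaging rule and all probabilities are assumptions for illustration, since the text here does not state how PRCTC combines GBDT, NN, and LR.

```python
def auc(labels, scores):
    """AUC as the fraction of (responder, non-responder) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_probabilities(*model_probs):
    """Unweighted mean of per-patient probabilities across models (assumed rule)."""
    return [sum(ps) / len(ps) for ps in zip(*model_probs)]


# Hypothetical predicted probabilities for four patients (1 = response).
labels = [1, 0, 1, 0]
gbdt_probs = [0.9, 0.4, 0.6, 0.2]
nn_probs = [0.8, 0.5, 0.7, 0.1]
lr_probs = [0.7, 0.3, 0.4, 0.5]

ensemble_probs = average_probabilities(gbdt_probs, nn_probs, lr_probs)
print(auc(labels, lr_probs))        # LR alone misranks one of four pairs
print(auc(labels, ensemble_probs))  # the averaged scores rank all pairs correctly
```

An averaged ensemble is not guaranteed to beat its members; in this toy example it happens to correct the single pair that LR misranks.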
Figure 3. Calibration curves of the PRCTC model in the validation cohorts. Calibration curves are shown for the internal validation cohort (A) and the external validation cohort (B). Triangles represent groups of observations; each group contained an average of 20 observations. The dashed line is the reference line. The vertical lines at the bottom show the distribution of predicted probabilities. The red curve is the fitted nonparametric calibration curve. The distributions of PRCTC predicted probabilities for ground-truth non-response and response patients are shown for the internal validation cohort (C) and the external validation cohort (D). Abbreviations: PRCTC, prediction of response to corticosteroid therapy in COVID-19 patients.
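The grouped points in a calibration plot of this kind can be computed by sorting patients by predicted probability, splitting them into consecutive groups, and comparing each group's mean prediction with its observed response rate. The sketch below is a minimal version under stated assumptions: the function name, the fixed group size, and the example data are hypothetical, and the nonparametric smoothing used for the red curve is not reproduced.

```python
def calibration_groups(labels, probs, group_size=20):
    """Return (mean predicted probability, observed response rate) per group.

    Patients are sorted by predicted probability and split into consecutive
    groups of `group_size`; the last group may be smaller.
    """
    pairs = sorted(zip(probs, labels))
    groups = []
    for start in range(0, len(pairs), group_size):
        chunk = pairs[start:start + group_size]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(t for _, t in chunk) / len(chunk)
        groups.append((mean_pred, observed))
    return groups


# Hypothetical example with four patients and groups of two.
example_labels = [0, 0, 1, 1]
example_probs = [0.1, 0.2, 0.8, 0.9]
example_groups = calibration_groups(example_labels, example_probs, group_size=2)
for mean_pred, observed in example_groups:
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}")
```

Points close to the diagonal reference line indicate that predicted probabilities match observed response rates.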