Richard Du, Efstratios D Tsougenis, Joshua W K Ho, Joyce K Y Chan, Keith W H Chiu, Benjamin X H Fang, Ming Yen Ng, Siu-Ting Leung, Christine S Y Lo, Ho-Yuen F Wong, Hiu-Yin S Lam, Long-Fung J Chiu, Tiffany Y So, Ka Tak Wong, Yiu Chung I Wong, Kevin Yu, Yiu-Cheong Yeung, Thomas Chik, Joanna W K Pang, Abraham Ka-Chung Wai, Michael D Kuo, Tina P W Lam, Pek-Lan Khong, Ngai-Tseung Cheung, Varut Vardhanabhuti.
Abstract
Triaging and prioritising patients for RT-PCR testing has been essential in the management of COVID-19 in resource-scarce countries. In this study, we applied machine learning (ML) to the detection of SARS-CoV-2 infection using basic laboratory markers. We performed statistical analysis and trained an ML model on a retrospective cohort of 5148 patients from 24 hospitals in Hong Kong to distinguish COVID-19 from other aetiologies of pneumonia. We validated the model on three temporal validation sets from different waves of infection in Hong Kong. For predicting SARS-CoV-2 infection, the ML model achieved high AUCs and specificity but low sensitivity in all three validation sets (AUC: 89.9–95.8%; Sensitivity: 55.5–77.8%; Specificity: 91.5–98.3%). When used in conjunction with radiologist interpretations of chest radiographs, sensitivity exceeded 90% while specificity remained moderate. Our study showed that a machine learning model based on readily available laboratory markers could achieve high accuracy in predicting SARS-CoV-2 infection.
Year: 2021 PMID: 34244563 PMCID: PMC8270945 DOI: 10.1038/s41598-021-93719-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Schematic showing the study design, with patient selection at each point of the study and a temporal representation of the training and validation sets in Hong Kong.
Baseline demographics and laboratory characteristics of the primary cohort.
| | COVID-19 | Other viral PNA | Bacterial PNA | Clinical PNA | Other infections | Other diseases |
|---|---|---|---|---|---|---|
| Total, n | 447 | 405 | 1515 | 1862 | 256 | 663 |
| Female, n (%) | 202 (45) | 192 (47) | 570 (38) | 865 (46) | 138 (54) | 276 (42) |
| Mean ± SD | 42 ± 17 | 53 ± 22 | 74 ± 17 | 73 ± 19 | 45 ± 21 | 61 ± 21 |
| Median (IQR) | 39 (28–57) | 52 (35–70) | 77 (65–87) | 79 (62–88) | 39 (29–59) | 64 (46–79) |
| 16–35 | 174 (39) | 101 (25) | 46 (3) | 117 (6) | 104 (41) | 105 (16) |
| 35–50 | 113 (25) | 87 (21) | 85 (6) | 147 (8) | 62 (24) | 82 (12) |
| 50–65 | 115 (26) | 83 (20) | 237 (16) | 281 (15) | 37 (14) | 152 (23) |
| > 65 | 45 (10) | 134 (33) | 1147 (76) | 1317 (71) | 53 (21) | 324 (49) |
| Missing, n (%) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Mean ± SD | 13.9 ± 1.4 | 12.9 ± 2.2 | 11.1 ± 2.4 | 11.4 ± 2.5 | 13.2 ± 2.0 | 12.0 ± 2.6 |
| Median (IQR) | 13.9 (13.0–15.0) | 13.2 (11.7–14.6) | 11.1 (9.3–12.8) | 11.7 (9.7–13.2) | 13.4 (12.1–14.5) | 12.4 (10.3–14.0) |
| Missing, n (%) | 0 (0) | 0 (0) | 0 (0) | 1 (0) | 0 (0) | 0 (0) |
| Mean ± SD | 0.4 ± 0.0 | 0.4 ± 0.1 | 0.3 ± 0.1 | 0.3 ± 0.1 | 0.4 ± 0.1 | 0.4 ± 0.1 |
| Median (IQR) | 0.4 (0.4–0.4) | 0.4 (0.4–0.4) | 0.3 (0.3–0.4) | 0.4 (0.3–0.4) | 0.4 (0.4–0.4) | 0.4 (0.3–0.4) |
| Missing, n (%) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Mean ± SD | 5.5 ± 1.9 | 8.8 ± 4.8 | 12.6 ± 12.9 | 11.3 ± 6.7 | 9.4 ± 5.5 | 11.5 ± 21.5 |
| Median (IQR) | 5.2 (4.3–6.4) | 7.7 (6.0–10.2) | 11.1 (7.5–15.4) | 9.8 (7.1–13.7) | 7.7 (6.3–10.5) | 9.0 (6.8–12.4) |
| Missing, n (%) | 4 (1) | 48 (12) | 97 (6) | 108 (6) | 17 (7) | 43 (6) |
| Mean ± SD | 1.3 ± 0.6 | 1.4 ± 0.8 | 1.1 ± 2.1 | 1.2 ± 1.5 | 1.7 ± 1.1 | 1.4 ± 2.5 |
| Median (IQR) | 1.3 (0.9–1.7) | 1.3 (0.8–1.8) | 0.9 (0.5–1.4) | 1.0 (0.6–1.5) | 1.6 (1.0–2.1) | 1.2 (0.7–1.7) |
| Missing, n (%) | 4 (1) | 48 (12) | 100 (7) | 108 (6) | 17 (7) | 44 (7) |
| Mean ± SD | 0.5 ± 0.2 | 0.6 ± 0.4 | 0.7 ± 1.0 | 0.8 ± 1.8 | 0.6 ± 0.4 | 0.7 ± 0.9 |
| Median (IQR) | 0.5 (0.3–0.6) | 0.6 (0.4–0.8) | 0.6 (0.4–0.9) | 0.6 (0.4–0.9) | 0.5 (0.3–0.7) | 0.6 (0.4–0.9) |
| Missing, n (%) | 4 (1) | 48 (12) | 97 (6) | 108 (6) | 17 (7) | 43 (6) |
| Mean ± SD | 3.6 ± 1.7 | 6.7 ± 4.8 | 10.4 ± 7.4 | 9.1 ± 5.5 | 7.1 ± 5.6 | 8.4 ± 5.7 |
| Median (IQR) | 3.3 (2.4–4.4) | 5.4 (3.9–7.8) | 8.9 (5.8–13.5) | 7.8 (5.1–11.6) | 5.2 (3.9–8.3) | 6.7 (4.5–10.6) |
| Missing, n (%) | 0 (0) | 0 (0) | 4 (0) | 2 (0) | 0 (0) | 0 (0) |
| Mean ± SD | 221.2 ± 74.1 | 233.3 ± 87.4 | 246.0 ± 117.7 | 241.2 ± 108.7 | 234.1 ± 81.2 | 255.4 ± 108.4 |
| Median (IQR) | 205.0 (171.0–259.0) | 223.0 (173.0–280.0) | 231.0 (169.5–307.0) | 224.5 (168.0–294.0) | 233.0 (179.0–281.2) | 236.0 (188.0–301.0) |
| Missing, n (%) | 48 (11) | 181 (45) | 610 (40) | 638 (34) | 122 (48) | 292 (44) |
| Mean ± SD | 1.9 ± 4.0 | 4.9 ± 7.7 | 9.9 ± 9.6 | 7.4 ± 8.1 | 5.7 ± 9.4 | 5.9 ± 8.0 |
| Median (IQR) | 0.4 (0.2–1.4) | 1.9 (0.4–6.0) | 7.1 (2.1–15.2) | 4.9 (1.1–11.0) | 0.7 (0.1–7.0) | 2.1 (0.3–8.8) |
| Missing, n (%) | 31 (7) | 250 (62) | 936 (62) | 1020 (55) | 146 (57) | 347 (52) |
| Mean ± SD | 208.9 ± 74.8 | 238.1 ± 124.7 | 371.9 ± 625.5 | 287.0 ± 344.9 | 198.5 ± 73.4 | 296.8 ± 500.6 |
| Median (IQR) | 186.5 (158.0–237.0) | 210.0 (165.0–253.2) | 252.6 (193.7–353.0) | 226.5 (184.0–290.0) | 177.0 (155.5–212.0) | 210.5 (178.0–293.2) |
SD standard deviation, IQR interquartile range, PNA pneumonia, WBC white blood cell, CRP C-reactive protein, LDH lactate dehydrogenase.
Figure 2. Box plots and pairwise Mann–Whitney U test summary for common blood laboratory markers. For each blood laboratory marker, the lower and upper bounds of the diagnostic reference range adopted in the local hospitals are given by the grey dotted lines. The effect size estimated by f is given in the table, and statistically significant comparisons are highlighted in orange. (a) Boxplot comparing white blood cell (WBC) counts across different disease groups (Kruskal–Wallis H: p < 0.001). (b) Boxplot comparing lymphocyte counts across different disease groups (Kruskal–Wallis H: p < 0.001). (c) Boxplot comparing platelet counts across different disease groups (Kruskal–Wallis H: p < 0.001). (d) Boxplot comparing C-reactive protein (CRP) levels across different disease groups (Kruskal–Wallis H: p < 0.001). (e) Boxplot comparing lactate dehydrogenase (LDH) levels across different disease groups (Kruskal–Wallis H: p < 0.001). (f) Boxplot comparing haemoglobin distributions across different disease groups (Kruskal–Wallis H: p < 0.001). PNA pneumonia.
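As a rough illustration of the pairwise comparison summarised in Figure 2, the sketch below computes a Mann–Whitney-style effect size between two groups. It assumes that f denotes the common-language effect size U/(n1·n2), which the caption does not define. The function name `effect_size_f` and the toy WBC values are hypothetical and are not study data.

```python
from itertools import product


def effect_size_f(x, y):
    """Share of (x, y) pairs in which x exceeds y, counting ties as half.

    Equals U / (len(x) * len(y)), where U is the Mann-Whitney statistic for x.
    Assumption: this is the 'f' referred to in the Figure 2 caption.
    """
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(x, y)
    )
    return wins / (len(x) * len(y))


# Toy WBC counts, invented for illustration only (not taken from the cohort).
covid_wbc = [5.2, 4.3, 6.4, 5.0]
bacterial_wbc = [11.1, 7.5, 15.4, 6.0]

f = effect_size_f(covid_wbc, bacterial_wbc)
print(f"f = {f:.4f}")
```

Under this definition, a value near 0.5 means the two groups overlap heavily, while values near 0 or 1 mean one group is almost always lower or higher than the other.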
Baseline demographics and laboratory and clinical characteristics of validation sets.
| | Validation set 1 | | Validation set 2ᵃ | | Validation set 3 | |
|---|---|---|---|---|---|---|
| | COVID-19 | Non COVID-19 | COVID-19 | Non COVID-19 | COVID-19 | Non COVID-19 |
| Total, n | 40 | 565 | 155 | 2966 | 27 | 355 |
| Female, n (%) | 25 (62) | 261 (46) | 84 (54) | 1420 (48) | 15 (56) | 181 (51) |
| Mean ± SD | 57 ± 19 | 67 ± 20 | 54 ± 17 | 67 ± 20 | 55 ± 19 | 72 ± 19 |
| Median (IQR) | 58 (46–70) | 70 (54–85) | 56 (40–66) | 71 (55–84) | 58 (43–66) | 78 (61–87) |
| 16–35 | 7 (18) | 48 (8) | 30 (19) | 267 (9) | 5 (19) | 20 (6) |
| 35–50 | 6 (15) | 74 (13) | 30 (19) | 328 (11) | 5 (19) | 36 (10) |
| 50–65 | 11 (28) | 105 (19) | 51 (33) | 590 (20) | 9 (33) | 48 (14) |
| > 65 | 16 (40) | 338 (60) | 44 (28) | 1781 (60) | 8 (30) | 251 (71) |
| Missing, n (%) | 0 (0) | 0 (0) | 5 (3) | 15 (1) | 1 (4) | 5 (1) |
| Mean ± SD | 12.7 ± 1.9 | 11.7 ± 2.5 | 13.5 ± 1.5 | 12.0 ± 2.5 | 13.2 ± 2.0 | 11.5 ± 2.6 |
| Median (IQR) | 12.8 (11.3–14.2) | 11.8 (9.9–13.3) | 13.4 (12.7–14.5) | 12.2 (10.4–13.7) | 13.5 (13.0–14.3) | 11.6 (9.8–13.4) |
| Missing, n (%) | 0 (0) | 0 (0) | 6 (4) | 119 (4) | 1 (4) | 5 (1) |
| Mean ± SD | 0.4 ± 0.1 | 0.4 ± 0.1 | 0.4 ± 0.0 | 0.4 ± 0.1 | 0.4 ± 0.1 | 0.3 ± 0.1 |
| Median (IQR) | 0.4 (0.3–0.4) | 0.4 (0.3–0.4) | 0.4 (0.4–0.4) | 0.4 (0.3–0.4) | 0.4 (0.4–0.4) | 0.3 (0.3–0.4) |
| Missing, n (%) | 0 (0) | 0 (0) | 6 (4) | 119 (4) | 1 (4) | 5 (1) |
| Mean ± SD | 6.7 ± 2.4 | 10.7 ± 6.0 | 5.3 ± 1.7 | 10.3 ± 6.0 | 5.0 ± 1.5 | 10.6 ± 5.7 |
| Median (IQR) | 6.2 (5.1–8.0) | 9.2 (6.7–13.0) | 5.0 (4.1–6.2) | 9.0 (6.6–12.5) | 4.7 (3.9–6.0) | 9.4 (7.0–13.0) |
| Missing, n (%) | 0 (0) | 0 (0) | 8 (5) | 618 (21) | 1 (4) | 6 (2) |
| Mean ± SD | 1.4 ± 0.6 | 1.2 ± 0.8 | 1.2 ± 0.5 | 1.4 ± 0.9 | 1.1 ± 0.5 | 1.3 ± 0.8 |
| Median (IQR) | 1.3 (0.9–1.7) | 1.0 (0.7–1.6) | 1.2 (0.9–1.6) | 1.2 (0.8–1.8) | 1.0 (0.8–1.3) | 1.2 (0.8–1.7) |
| Missing, n (%) | 0 (0) | 0 (0) | 8 (5) | 618 (21) | 1 (4) | 6 (2) |
| Mean ± SD | 0.5 ± 0.2 | 0.7 ± 0.8 | 0.5 ± 0.2 | 0.7 ± 0.5 | 0.6 ± 0.3 | 0.7 ± 0.4 |
| Median (IQR) | 0.5 (0.4–0.7) | 0.6 (0.4–0.9) | 0.5 (0.4–0.7) | 0.6 (0.4–0.8) | 0.5 (0.4–0.7) | 0.6 (0.5–0.9) |
| Missing, n (%) | 0 (0) | 0 (0) | 8 (5) | 617 (21) | 1 (4) | 5 (1) |
| Mean ± SD | 4.6 ± 2.4 | 8.5 ± 5.5 | 3.5 ± 1.6 | 8.0 ± 5.6 | 3.2 ± 1.2 | 8.2 ± 4.9 |
| Median (IQR) | 4.2 (2.9–5.2) | 7.0 (4.7–10.7) | 3.1 (2.3–4.2) | 6.6 (4.4–10.3) | 3.1 (2.2–3.7) | 7.0 (4.7–10.2) |
| Missing, n (%) | 0 (0) | 0 (0) | 6 (4) | 122 (4) | 1 (4) | 6 (2) |
| Mean ± SD | 255.0 ± 101.9 | 241.5 ± 105.6 | 198.2 ± 59.5 | 247.2 ± 100.2 | 202.7 ± 49.4 | 245.5 ± 107.2 |
| Median (IQR) | 238.0 (171.8–318.5) | 232.0 (172.0–297.0) | 193.0 (156.0–239.0) | 233.5 (183.0–294.0) | 187.0 (173.2–229.8) | 238.0 (184.0–291.0) |
| Missing, n (%) | 1 (2) | 223 (39) | 35 (23) | 1642 (55) | 8 (30) | 311 (88) |
| Mean ± SD | 4.1 ± 7.4 | 8.2 ± 9.6 | 2.3 ± 3.5 | 6.1 ± 7.3 | 1.9 ± 2.7 | 3.2 ± 3.8 |
| Median (IQR) | 0.7 (0.3–2.4) | 4.4 (0.9–11.9) | 0.7 (0.3–2.9) | 3.3 (0.6–8.8) | 0.5 (0.4–2.1) | 1.8 (0.5–4.3) |
| Missing, n (%) | 4 (10) | 365 (65) | 63 (41) | 2501 (84) | 4 (15) | 341 (96) |
| Mean ± SD | 245.3 ± 79.4 | 274.5 ± 145.4 | 227.4 ± 121.5 | 287.4 ± 339.4 | 251.3 ± 80.6 | 527.1 ± 769.1 |
| Median (IQR) | 228.0 (189.5–268.0) | 226.0 (185.8–322.8) | 190.2 (172.8–235.0) | 225.0 (182.0–290.0) | 231.0 (193.0–286.0) | 280.5 (195.2–442.0) |
| Travel, n (%) | 12 (30) | 133 (24) | N/A | N/A | 0 (0) | 2 (1) |
| Close contact, n (%) | 20 (50) | 14 (2) | N/A | N/A | 19 (70) | 6 (2) |
| Fever, n (%) | 19 (48) | 254 (45) | N/A | N/A | 15 (56) | 168 (47) |
| Cough, n (%) | 23 (58) | 295 (52) | N/A | N/A | 16 (59) | 92 (26) |
| URTI, n (%) | 14 (35) | 94 (17) | N/A | N/A | 14 (52) | 33 (9) |
| Shortness of breath, n (%) | 10 (25) | 236 (42) | N/A | N/A | 1 (4) | 122 (34) |
| Headache, n (%) | 2 (5) | 16 (3) | N/A | N/A | 4 (15) | 19 (5) |
| Myalgia, n (%) | 3 (8) | 9 (2) | N/A | N/A | 0 (0) | 85 (24) |
| Nausea & vomiting, n (%) | 1 (2) | 26 (5) | N/A | N/A | 0 (0) | 85 (24) |
| Diarrhoea, n (%) | 1 (2) | 20 (4) | N/A | N/A | 4 (15) | 36 (10) |
| Anosmia, n (%) | 0 (0) | 0 (0) | N/A | N/A | 5 (19) | 2 (1) |
| Fever (> 37.5 °C), n (%) | 9 (22) | 111 (20) | N/A | N/A | 20 (74) | 166 (47) |
| Requiring supplemental oxygen, n (%) | 4 (10) | 140 (25) | N/A | N/A | 2 (7) | 85 (24) |
SD standard deviation, IQR interquartile range, WBC white blood cell, CRP C-reactive protein, LDH lactate dehydrogenase.
ᵃ Clinical characteristics were not extracted for validation set 2.
COVID-19 discriminability of the machine learning model and comparison to clinical, radiologist consensus and combined model.
| | Positive/total, n | AUCᵃ, % (95% CI) | Accuracy, % (95% CI) | Sensitivity, % (95% CI) | Specificity, % (95% CI) | PPV, % (95% CI) | NPV, % (95% CI) |
|---|---|---|---|---|---|---|---|
| Validation set 1 | | | | | | | |
| ML model | 40/605 | 89.9 (85.9–93.9) | 89.3 (86.5–91.6) | 57.5 (40.9–73.0) | 91.5 (88.9–93.7) | 32.6 (22.8–42.3) | 97.9 (96.6–99.1) |
| Clinical model | 40/605 | N/A | 70.4 (66.6–74.0) | 30.0 (16.6–46.5) | 73.3 (69.4–76.9) | 7.4 (3.4–11.4) | 93.7 (91.4–95.9) |
| Radiologist consensus | 40/605 | N/A | 73.2 (69.5–76.7) | 55.0 (38.5–70.7) | 74.5 (70.7–78.1) | 13.3 (8.1–18.4) | 95.9 (94.0–97.8) |
| Radiologist + ML model | 40/605 | N/A | 68.4 (64.6–72.1) | 92.5 (79.6–98.4) | 66.7 (62.7–70.6) | 16.4 (11.6–21.3) | 99.2 (98.3–100.1) |
| Validation set 2 | | | | | | | |
| ML model | 155/3121 | 91.3 (89.2–93.3) | 93.0 (92.0–93.9) | 57.4 (49.2–65.3) | 94.8 (94.0–95.6) | 36.8 (30.7–42.9) | 97.7 (97.2–98.3) |
| Validation set 3 | | | | | | | |
| ML model | 27/382 | 95.8 (91.6–99.9) | 96.9 (94.6–98.4) | 77.8 (57.7–91.4) | 98.3 (96.4–99.4) | 77.8 (62.1–93.5) | 98.3 (97.0–99.7) |
| Clinical model | 27/382 | N/A | 67.2 (62.2–71.9) | 57.7 (36.9–76.6) | 67.9 (62.7–72.8) | 11.8 (6.2–17.4) | 95.6 (93.0–98.1) |
| Radiologist readᵇ | 27/382 | N/A | 92.3 (89.1–94.8) | 53.8 (33.4–73.4) | 95.1 (92.3–97.1) | 45.2 (27.6–62.7) | 96.5 (94.6–98.5) |
| Radiologist + ML model | 27/382 | N/A | 55.5 (50.3–60.6) | 92.3 (74.9–99.1) | 52.7 (47.3–58.1) | 12.7 (8.0–17.4) | 98.9 (97.4–100.4) |
AUC area under the curve, PPV positive predictive value, NPV negative predictive value, CI confidence intervals, ML machine learning model.
ᵃ AUC is not applicable to the clinical model, the radiologist reads, or the combined radiologist and ML model.
ᵇ Only one radiologist interpreted the chest radiographs for validation set 3.
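To make the relationship between the reported percentages concrete, the following sketch derives accuracy, sensitivity, specificity, PPV and NPV from a 2×2 confusion matrix. The counts used (TP = 21, FN = 6, FP = 6, TN = 349) are an assumption: they were back-calculated from the ML model row with 27/382 positives and are not reported in the source. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return accuracy, sensitivity, specificity, PPV and NPV as percentages."""
    total = tp + fn + fp + tn
    return {
        "accuracy": 100 * (tp + tn) / total,
        "sensitivity": 100 * tp / (tp + fn),  # true positives among all positives
        "specificity": 100 * tn / (tn + fp),  # true negatives among all negatives
        "ppv": 100 * tp / (tp + fp),          # positives among positive predictions
        "npv": 100 * tn / (tn + fn),          # negatives among negative predictions
    }


# Assumed counts, back-calculated from the reported percentages
# for the ML model row with 27 positives out of 382 patients.
metrics = diagnostic_metrics(tp=21, fn=6, fp=6, tn=349)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

Rounded to one decimal place, these assumed counts give the point estimates listed for that row; the confidence intervals are not reproduced here.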
Pneumonia subtype discriminability of the machine learning model.
| Disease | Positive/total, n | AUC, % (95% CI) | Accuracy, % (95% CI) | Sensitivity, % (95% CI) | Specificity, % (95% CI) |
|---|---|---|---|---|---|
| Bacterial PNA | 45/175 | 77.4 (70.2–84.5) | 55.4 (47.7–62.9) | 100.0 (92.1–100.0) | 40.0 (31.5–49.0) |
| Viral PNA | 49/175 | 62.9 (54.0–71.9) | 56.6 (48.9–64.0) | 67.3 (52.5–80.1) | 52.4 (43.3–61.3) |
| Non PNA | 62/175 | 62.5 (54.1–70.9) | 61.7 (54.1–68.9) | 38.7 (26.6–51.9) | 74.3 (65.3–82.1) |
AUC area under the curve, PNA pneumonia.
Figure 3. Case examples of human and machine learning model prediction. The cut-off threshold for the SHAP model is 0.48, meaning that if the model output value is above this, the prediction is positive. The relative contribution of each laboratory marker is shown in the individual SHAP value plot. (a) An elderly female with a positive prediction from chest X-ray (bilateral lower zones shadowing) and a positive prediction from laboratory markers (WBC: 5.29, lymphocytes: 1.09, LDH: 247, and CRP: 1.63). The ground-truth COVID-19 RT-PCR result is positive. (b) An elderly male with a negative prediction from chest X-ray (normal radiographic appearance) and a positive prediction from laboratory markers (LDH: 178, lymphocytes: 1.46, platelet: 146, and CRP: 1.033). The ground-truth COVID-19 RT-PCR result is positive.
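The decision logic behind these two cases can be sketched as follows. The 0.48 cut-off comes from the caption, but the model output values for cases (a) and (b) are not given, so the scores below are hypothetical. The OR rule for combining the radiograph read with the ML prediction is also an assumption: it is consistent with the higher sensitivity and lower specificity reported for the combined approach, but the combination rule is not stated in this text.

```python
THRESHOLD = 0.48  # cut-off quoted in the Figure 3 caption


def ml_positive(score, threshold=THRESHOLD):
    """ML prediction is positive when the model output is above the cut-off."""
    return score > threshold


def combined_positive(radiograph_positive, score):
    """Assumed OR rule: flag the patient if either the radiograph or the ML model is positive."""
    return radiograph_positive or ml_positive(score)


# Hypothetical model outputs; only the radiograph reads follow the caption.
cases = {
    "a": {"radiograph_positive": True, "score": 0.80},
    "b": {"radiograph_positive": False, "score": 0.60},
}

for label, case in cases.items():
    flagged = combined_positive(case["radiograph_positive"], case["score"])
    print(f"case ({label}): combined prediction positive = {flagged}")
```

Under this assumed rule, case (b) is still flagged despite a normal radiograph, because the laboratory-based prediction alone is positive.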