Vafa Bayat, Steven Phelps, Russell Ryono, Chong Lee, Hemal Parekh, Joel Mewton, Farshid Sedghi, Payam Etminani, Mark Holodniy.
Abstract
BACKGROUND: With the limited availability of testing for the presence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and concerns surrounding the accuracy of existing methods, other means of identifying patients are urgently needed. Previous studies showing a correlation between certain laboratory tests and diagnosis suggest an alternative method based on an ensemble of tests.
Keywords: human coronavirus; machine learning; polymerase chain reaction; viral pneumonia
Year: 2021 PMID: 32785701 PMCID: PMC7454351 DOI: 10.1093/cid/ciaa1175
Source DB: PubMed Journal: Clin Infect Dis ISSN: 1058-4838 Impact factor: 9.079
Table 1. Summary of the Numbers of Patient Encounter Test Results and Demographic Information Relating the Number and Median Ages Broken Out by Sex of Unique Patients and Tests for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Positive and Negative Patients
| Unique Patient Tests/Encounters | | | | |
|---|---|---|---|---|
| | Tested | SARS-CoV-2-positive | SARS-CoV-2-negative | SARS-CoV-2-positive and negative |
| Total | 92 254 | 7335 | 84 919 | N/A |
| Unique patients | | | | |
| | Tested | SARS-CoV-2 only positive | SARS-CoV-2 only negative | SARS-CoV-2 positive and negative |
| Male | 69 634 | 5003 | 63 841 | 790 |
| Female | 6357 | 437 | 5881 | 39 |
| Total | 75 991 | 5440 | 69 722 | 829 |
| Demographic information | ||||
| Mean age (male) | 65.82 | 65.72 | 65.79 | 68.80 |
| Mean age (female) | 52.77 | 52.62 | 52.76 | 56.26 |
| Mean age (all) | 64.73 | 64.67 | 64.69 | 68.21 |
Abbreviation: N/A, not applicable.
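The counts in the table above can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the published figures and derives the share of tests that were positive; the variable names are illustrative and do not come from the study.

```python
# Counts copied from the table above (tests/encounters and unique patients).
tests = {"total": 92254, "positive": 7335, "negative": 84919}
patients = {
    "only_positive": 5440,
    "only_negative": 69722,
    "positive_and_negative": 829,
    "total": 75991,
}

# Positive and negative tests should add up to the number of tests performed.
tests_consistent = tests["positive"] + tests["negative"] == tests["total"]

# The three patient groups are mutually exclusive, so they should sum to the total.
patients_consistent = (
    patients["only_positive"]
    + patients["only_negative"]
    + patients["positive_and_negative"]
) == patients["total"]

# Share of all tests that came back positive (a derived figure, not reported in the table).
test_positivity = tests["positive"] / tests["total"]

print(tests_consistent, patients_consistent, f"{test_positivity:.2%}")
```

Both checks pass, and roughly 8% of the tests were positive, which shows how imbalanced the two classes are.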
Table 2. Summary of the XGBoost Machine Learning Prediction Model Results
| | SARS-CoV-2 (+) vs SARS-CoV-2 (−) |
|---|---|
| | 15/20 |
| | 92 254 |
| | 75 991 |
| | 7335 |
| | 6269 |
| | 11 002 |
| | 22 763 |
| | 86.77 |
| | 82.39 |
| | 86.40 |
| | 35.30 |
| | 98.25 |
Abbreviation: SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
Table 3. Summary of the 20 Features Discussed in the Manuscript, Listed in Descending Order of Importance.
| Features | Variable Importance | Missingness-COVID-19 Tested Patients (Pre-filter) | Missingness-COVID-19 Tested Patients (Post-filter) | Normal Range | COVID-19 Range (Present Study) | COVID-19 Range (ref. [ | COVID-19 Range (ref. [ | COVID-19 With ARDS (ref. [ |
|---|---|---|---|---|---|---|---|---|
| | 100.0 | 0.917 | 0.811 | 12–300 | 680.0 ± 823.0 | | | |
| | 70.35 | 0.371 | 0.004 | 4–11 | 6.76 ± 3.26 | 7.0 ± 3.6 | 4.7 ± 1.2 | 7.2 ± 2.8 |
| | 59.70 | 0.456 | 0.015 | <0.5 | 0.07 ± 0.12 | | | |
| | 51.28 | 0.085 | 0.024 | 97–99 | 98.82 ± 1.12 | 98.2 | | |
| | 49.48 | 0.897 | 0.772 | <0.8 | 7.40 ± 7.27 | 9.75 ± 6.64 | 8.72 ± 1.73 | |
| | 44.28 | 0.921 | 0.811 | 60–100 | 311.10 ± 197.47 | 408.1 ± 231.0 | 483.0 ± 119 | |
| | 36.51 | 0.927 | 0.829 | <0.5 | 1.53 ± 2.26 | 4.0 ± 7.0 | | |
| | 26.69 | 0.472 | 0.027 | <0.3 | 0.02 ± 0.03 | | | |
| | 24.27 | 0.449 | 0.013 | 2–8 | 9.62 ± 4.32 | | | |
| | 19.94 | 0.541 | 0.046 | 0–35 | 41.75 ± 40.00 | 38.2 ± 24.6 | 25.5 ± 17.0 | |
| | 18.95 | 0.473 | 0.011 | 3.5–5.5 | 3.47 ± 0.65 | 3.44 ± 0.57 | 3.07 ± 0.27 | |
| | 16.52 | 0.357 | 0.000 | 38–48 (male) | 38.73 ± 6.30 | | | |
| | 14.72 | 0.846 | 0.668 | <100 | 379.48 ± 798.90 | | | |
| | 14.65 | 0.353 | 0.001 | 150–450 | 213.29 ± 88.2 | 162.7 ± 45 | 168 ± 39 | 166.5 ± 26 |
| | 14.29 | 0.512 | 0.012 | 36–92 | 82.04 ± 45.78 | | | |
| | 13.85 | 0.458 | 0.016 | <6 | 1.11 ± 1.67 | | | |
| | 13.39 | 0.451 | 0.017 | 0.88–4.0 | 5.52 ± 5.97 | | | |
| | 12.67 | 0.501 | 0.015 | 0.1–1.2 | 0.68 ± 0.43 | | | |
| | 12.19 | 0.361 | 0.000 | 28–32 | 29.34 ± 2.47 | | | |
| | 6.17 | 0.441 | 0.006 | 0.3–0.9 | 0.62 ± 0.33 | | | |
Variable importances range from 0 to 100 on a relative scale. Missingness in the COVID-19 tested cohort before and after application of the any-15-of-20 completeness requirement is shown. All feature data are listed as mean ± standard deviation. Information in the last 3 columns is taken from the referenced publications.
Abbreviations: ARDS, acute respiratory distress syndrome; AST, aspartate aminotransferase; BNP, B-type natriuretic peptide; COVID-19, coronavirus disease 2019; CRP, C-reactive protein; LDH, lactate dehydrogenase.
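The any-15-of-20 completeness requirement and the two missingness columns can be illustrated with a minimal sketch. This is not the authors' code. It assumes each record is a dictionary in which an unmeasured feature is `None`, and the feature names (`feature_01` to `feature_20`) and helper functions are hypothetical placeholders.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]

# Hypothetical placeholder names; the real feature names are not used here.
FEATURES = [f"feature_{i:02d}" for i in range(1, 21)]


def missingness(rows: List[Row], features: List[str]) -> Dict[str, float]:
    """Fraction of rows in which each feature is absent (None)."""
    if not rows:
        return {f: 0.0 for f in features}
    return {f: sum(r.get(f) is None for r in rows) / len(rows) for f in features}


def completeness_filter(rows: List[Row], features: List[str], min_present: int = 15) -> List[Row]:
    """Keep rows that have at least `min_present` of the listed features recorded."""
    return [
        r for r in rows
        if sum(r.get(f) is not None for f in features) >= min_present
    ]


def make_row(n_present: int) -> Row:
    """Toy record with the first `n_present` features filled and the rest missing."""
    return {f: (1.0 if i < n_present else None) for i, f in enumerate(FEATURES)}


rows = [make_row(20), make_row(15), make_row(14)]
kept = completeness_filter(rows, FEATURES)   # the 14-feature record is dropped
before = missingness(rows, FEATURES)         # analogous to the pre-filter column
after = missingness(kept, FEATURES)          # analogous to the post-filter column
```

On real data the filter lowers missingness mainly for features that tend to be ordered together. Features that are rarely ordered stay sparse even after filtering, which matches the rows above whose post-filter missingness remains high.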
Figure 1. Separation in score distributions for each feature. The histograms represent relative probability density. In case of multiple scores within the time window, the median value was used. Red indicates SARS-CoV-2 positive patients and blue SARS-CoV-2 negative patients. See Table 3 for the importance of these and other features. Abbreviations: AST, aspartate aminotransferase; BNP, B-type natriuretic peptide; COVID-19, coronavirus disease 2019; LDH, lactate dehydrogenase; MCH, mean corpuscular hemoglobin; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; WBC, white blood cell.
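The median rule in the caption can be sketched as follows, as one plausible reading rather than the study's actual pipeline. The sketch assumes the measurements have already been restricted to the relevant time window, whose definition is not given here. The tuple layout, patient identifiers and values are hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple

Measurement = Tuple[str, str, float]  # (patient_id, feature, value)


def collapse_to_median(measurements: Iterable[Measurement]) -> Dict[Tuple[str, str], float]:
    """Reduce repeated results per patient and feature to their median."""
    grouped: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for patient_id, feature, value in measurements:
        grouped[(patient_id, feature)].append(value)
    return {key: median(values) for key, values in grouped.items()}


# Toy data: patient p1 has three WBC results in the window, patient p2 has two.
window = [
    ("p1", "WBC", 6.0),
    ("p1", "WBC", 9.0),
    ("p1", "WBC", 7.0),
    ("p2", "WBC", 5.0),
    ("p2", "WBC", 8.0),
]
summary = collapse_to_median(window)
```

With an even number of results, `statistics.median` returns the mean of the two middle values, so `p2` collapses to 6.5. The median is less sensitive than the mean to a single outlying result.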