Lara Jehi, Xinge Ji, Alex Milinovich, Serpil Erzurum, Brian P Rubin, Steve Gordon, James B Young, Michael W Kattan.
Abstract
BACKGROUND: Coronavirus disease 2019 (COVID-19) is sweeping the globe. Despite multiple case series, actionable knowledge to tailor decision-making proactively is missing.

RESEARCH QUESTION: Can a statistical model accurately predict infection with COVID-19?

STUDY DESIGN AND METHODS: We developed a prospective registry of all patients tested for COVID-19 at Cleveland Clinic to create individualized risk prediction models. We focus here on the likelihood of a positive nasal or oropharyngeal COVID-19 test. A least absolute shrinkage and selection operator (LASSO) logistic regression algorithm was constructed that removed variables that were not contributing to the model's cross-validated concordance index. After external validation in a temporally and geographically distinct cohort, the statistical prediction model was illustrated as a nomogram and deployed in an online risk calculator.
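As a rough illustration of the modeling idea named in the methods, the sketch below fits an L1-penalized (LASSO) logistic regression by proximal gradient descent and scores it with a concordance index on held-out rows. It is not the authors' code: the data are synthetic, the penalty strength `lam`, step size, and iteration count are arbitrary assumptions, and a single train/holdout split stands in for full cross-validation.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    """Proximal step for the L1 penalty: shrink toward zero, stopping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def predict(intercept, beta, row):
    """Predicted probability of a positive outcome for one row of features."""
    return sigmoid(intercept + sum(b * x for b, x in zip(beta, row)))

def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent.
    The intercept is left unpenalized."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [predict(intercept, beta, row) - yi for row, yi in zip(X, y)]
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        intercept -= step * sum(resid) / n
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return intercept, beta

def concordance_index(scores, y):
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Synthetic data (not from the paper): only the first feature carries signal.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [1 if random.random() < sigmoid(2.0 * row[0] - 1.0) else 0 for row in X]
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

intercept, beta = fit_lasso_logistic(X_train, y_train, lam=0.05)
c_index = concordance_index([predict(intercept, beta, r) for r in X_test], y_test)
print("coefficients:", [round(b, 3) for b in beta], "held-out c-index:", round(c_index, 3))
```

The soft-thresholding step is what lets the penalty set a coefficient to exactly zero, which is how a LASSO fit drops variables. In practice the penalty strength would be chosen by comparing the cross-validated concordance index across several values of `lam`.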
Keywords: COVID-19; infectious disease; predictive modeling; testing
Year: 2020 PMID: 32533957 PMCID: PMC7286244 DOI: 10.1016/j.chest.2020.05.580
Source DB: PubMed Journal: Chest ISSN: 0012-3692 Impact factor: 9.410
Figure 1. Timeline shows the evolution of the clinical framework for COVID test ordering during the first 10 days of testing. The single asterisk indicates that patients were sent to the ED only if they needed evaluation of additional symptoms and not purely to obtain COVID testing. The double asterisk indicates that the guidelines to order COVID testing followed the Centers for Disease Control and Prevention recommendations. The main change in phase III was a better definition of high-risk categories, rather than reliance on “physician discretion.” Of note, only 6.7% were tested in phase I + phase II because of physician discretion alone, so that number was too small to perform any modeling work in that group. COVID = coronavirus disease 2019; OR = operating room; VV = virtual visit.
Figure 2. Proportion of COVID-19 negative tests being avoided (solid line, true negative rate) vs proportion of COVID-19 positive tests being identified (dashed line, true positive rate) at different nomogram-predicted probability cutoffs. For example, if a predicted probability of ≥0.60 was required before testing, nearly all negative cases would have been avoided, but approximately 95% of positive cases would have been missed. At a cutoff of 12.3%, the proportion of negative tests being avoided is equal to the proportion of positive tests being detected (intersection of the solid and dashed lines). The Table below shows the sensitivity, specificity, NPV, and PPV for this cutoff of 12.3%. For higher cutoffs, we illustrate how sensitivity decreases while specificity increases. NPV = negative predictive value; PPV = positive predictive value. See Figure 1 legend for expansion of other abbreviation.
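To make the cutoff trade-off concrete, here is a minimal sketch of how sensitivity, specificity, PPV, and NPV can be computed at a chosen predicted-probability cutoff. The function name `metrics_at_cutoff` and the six-patient toy data are hypothetical and are not taken from the study.

```python
def metrics_at_cutoff(probs, outcomes, cutoff):
    """Classify patients as 'test' when predicted probability >= cutoff,
    then summarize against the true outcome (1 = positive, 0 = negative)."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, outcomes):
        flagged = p >= cutoff
        if flagged and y == 1:
            tp += 1
        elif flagged and y == 0:
            fp += 1
        elif not flagged and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        # None marks an undefined ratio (empty denominator).
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Toy predicted probabilities and outcomes (illustrative only).
probs = [0.05, 0.10, 0.15, 0.20, 0.40, 0.70]
outcomes = [0, 0, 1, 0, 1, 1]

low = metrics_at_cutoff(probs, outcomes, 0.123)
high = metrics_at_cutoff(probs, outcomes, 0.60)
print(low)
print(high)
```

In this toy data, raising the cutoff from 0.123 to 0.60 raises specificity from 2/3 to 1 and lowers sensitivity from 1 to 1/3, which is the same direction of trade-off the legend describes.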
Baseline Demographic and Clinical Characteristics of 11,672 Patients Who Tested Positive vs Negative for COVID-19 in the Development Cohort in the Cleveland Clinic Health System Before April 2, 2020, and of a Validation Cohort of 2,295 Patients Tested in the Florida Cleveland Clinic Health System Between April 2 and 16, 2020
| Variable | Development Cohort: COVID-19 Negative | Development Cohort: COVID-19 Positive | P Value | Florida Validation Cohort: COVID-19 Negative | Florida Validation Cohort: COVID-19 Positive | P Value |
|---|---|---|---|---|---|---|
| No. | 10,854 | 818 | | 2,005 | 290 | |
| Physician discretion, No. (%) | 773 (99.3) | 6 (0.7) | < .001 | 580 (98.5) | 9 (1.5) | < .001 |
| Demographics | ||||||
| Race, No. (%) | | | < .001 | | | < .001 |
| Asian | 174 (98) | 9 (2) | | 46 (85.2) | 8 (14.8) | |
| Black | 2,138 (91.1) | 207 (8.9) | | 209 (79.8) | 53 (20.2) | |
| Other | 1,194 (92.1) | 102 (7.9) | | 369 (84.6) | 67 (15.4) | |
| White | 7,348 (93.6) | 500 (6.4) | | 1,381 (89.5) | 162 (10.5) | |
| Male (%) | 4,192 (91.0) | 415 (9.0) | < .001 | 831 (85.8) | 138 (14.2) | .055 |
| Ethnicity, No. (%) | | | < .001 | | | < .001 |
| Hispanic | 505 (91.3) | 48 (8.7) | | 529 (81.4) | 121 (18.6) | |
| Non-Hispanic | 9,608 (93.2) | 697 (6.8) | | 1,383 (89.6) | 160 (10.4) | |
| Unknown | 741 (91.0) | 73 (9.0) | | 93 (91.2) | 9 (8.8) | |
| Smoking, No. (%) | | | < .001 | | | < .001 |
| Current Smoker | 1,593 (97.7) | 37 (2.3) | | 67 (91.8) | 6 (8.2) | |
| Former Smoker | 2,692 (93.0) | 202 (7.0) | | 366 (81.3) | 84 (18.7) | |
| No | 5,141 (92.1) | 440 (7.9) | | 626 (87.4) | 90 (12.6) | |
| Unknown | 1,428 (91.1) | 139 (8.9) | | 946 (89.6) | 110 (10.4) | |
| Age, median [IQR], y | 46.89 [31.57-62.85] | 54.23 [38.81-65.94] | < .001 | 56.02 [41.95-67.52] | 51.60 [36.69-63.08] | < .001 |
| Exposure history: Yes, No. (%) | ||||||
| Exposed to COVID-19? | 1,510 (94.5) | 88 (5.5) | .013 | 492 (68.5) | 226 (31.5) | < .001 |
| Family member with COVID-19? | 911 (94.1) | 57 (5.9) | .174 | 467 (68.9) | 211 (31.1) | < .001 |
| Presenting symptoms: Yes, No. (%) | ||||||
| Cough? | 2,782 (95.5) | 130 (4.5) | < .001 | 609 (70.8) | 251 (29.2) | < .001 |
| Fever? | 1,918 (94.6) | 110 (5.4) | < .001 | 532 (69.9) | 229 (30.1) | < .001 |
| Fatigue? | 1,472 (94.4) | 87 (5.6) | < .001 | 406 (68.4) | 188 (31.6) | < .001 |
| Sputum production? | 929 (96.0) | 38 (4.0) | < .001 | 343 (68.2) | 160 (31.8) | < .001 |
| Flu-like symptoms? | 1,813 (94.3) | 108 (5.7) | .011 | 507 (70.7) | 210 (29.3) | < .001 |
| Shortness of breath? | 1,578 (96.0) | 64 (4.0) | < .001 | 462 (75.5) | 150 (24.5) | < .001 |
| Diarrhea? | 629 (95.0) | 33 (5.0) | .043 | 347 (69.5) | 152 (30.5) | < .001 |
| Loss of appetite? | 671 (93.4) | 47 (6.6) | .671 | 343 (67.0) | 169 (33.0) | < .001 |
| Vomiting? | 536 (97.1) | 16 (2.9) | < .001 | 309 (73.2) | 113 (26.8) | < .001 |
| Comorbidities | ||||||
| BMI, median [IQR], kg/m² | 28.46 [23.90-33.94] | 29.23 [25.86-33.78] | .001 | 27.60 [23.49-31.05] | 28.91 [24.81-33.60] | .037 |
| COPD/emphysema? Yes, No. (%) | 304 (96.2) | 12 (3.8) | .031 | 36 (94.7) | 2 (5.3) | .257 |
| Asthma? Yes, No. (%) | 2,761 (94.9) | 147 (5.1) | < .001 | 176 (91.7) | 16 (8.3) | .078 |
| Diabetes mellitus? Yes, No. (%) | 2,486 (93.0) | 188 (7.0) | .993 | 224 (86.2) | 36 (13.8) | .6 |
| Hypertension? Yes, No. (%) | 4,324 (92.7) | 342 (7.3) | .283 | 460 (86.3) | 73 (13.7) | .444 |
| Coronary artery disease? Yes, No. (%) | 1,325 (93.6) | 90 (6.4) | .336 | 141 (97.9) | 3 (2.1) | < .001 |
| Heart failure? Yes, No. (%) | 1,170 (94.7) | 66 (5.3) | .018 | 88 (96.7) | 3 (3.3) | .01 |
| Cancer? Yes, No. (%) | 1,616 (93.7) | 108 (6.3) | .208 | 245 (92.8) | 19 (7.2) | .006 |
| Transplantation history? Yes, No. (%) | 190 (96.4) | 7 (3.6) | .046 | 43 (95.6) | 2 (4.4) | .149 |
| Multiple sclerosis? Yes, No. (%) | 96 (91.4) | 9 (8.6) | .661 | 8 (88.9) | 1 (11.1) | 1 |
| Connective tissue disease? Yes, No. (%) | 3,505 (94.5) | 203 (5.5) | < .001 | 41 (89.1) | 5 (10.9) | .889 |
| Inflammatory bowel disease? Yes, No. (%) | 943 (95.6) | 45 (4.4) | .002 | 34 (81.0) | 8 (19.0) | .304 |
| Immunosuppressive disease? Yes, No. (%) | 1,557 (94.5) | 91 (5.5) | .012 | 163 (92.6) | 13 (7.4) | .039 |
| Vaccination history: Yes, No. (%) | ||||||
| Influenza vaccine? | 5,940 (93.9) | 384 (6.1) | < .001 | 328 (91.6) | 30 (8.4) | .011 |
| Pneumococcal polysaccharide vaccine? | 2,667 (95.2) | 135 (4.8) | < .001 | 115 (92.0) | 10 (8.0) | .143 |
| Laboratory findings on presentation | ||||||
| Pretesting platelets, median [IQR], k/uL | 245.00 [189.00-304.00] | 190.00 [154.00-241.50] | < .001 | 236.00 [180.00-304.00] | 213.50 [173.00-286.75] | .698 |
| Pretesting AST, median [IQR], U/L | 23.00 [17.00-34.00] | 32.00 [24.25-47.00] | < .001 | 22.00 [18.00-34.50] | 31.00 [21.00-53.25] | .146 |
| Pretesting BUN, median [IQR], mg/dL | 15.00 [11.00-23.00] | 14.00 [10.00-22.00] | .099 | 18.00 [13.00-27.25] | 12.00 [8.25-15.50] | .003 |
| Pretesting chloride, median [IQR], mmol/L | 101.00 [97.00-103.00] | 99.00 [96.00-102.00] | < .001 | 100.00 [96.00-102.00] | 97.50 [92.75-99.25] | .026 |
| Pretesting creatinine, median [IQR], mg/dL | 0.90 [0.71-1.21] | 1.01 [0.79-1.29] | < .001 | 0.94 [0.77-1.45] | 0.92 [0.87-1.03] | .677 |
| Pretesting hematocrit, median [IQR], % | 39.10 [34.20-43.00] | 40.60 [37.15-43.85] | < .001 | 36.80 [32.20-41.00] | 38.50 [36.02-43.20] | .221 |
| Pretesting potassium, median [IQR], mmol/L | 4.00 [3.80-4.40] | 4.00 [3.70-4.20] | < .001 | 4.10 [3.90-4.60] | 4.15 [3.90-4.35] | .808 |
| Home medications | ||||||
| Immunosuppressive treatment? Yes (%) | 423 (97.2) | 12 (2.8) | .001 | 97 (83.6) | 19 (16.4) | .271 |
| Nonsteroidal antiinflammatory drugs? Yes (%) | 3,084 (95.1) | 162 (5.0) | < .001 | 156 (94.0) | 10 (6.0) | .011 |
| Steroids? Yes (%) | 2,317 (95.5) | 109 (4.5) | < .001 | 135 (93.8) | 9 (6.2) | .024 |
| Carvedilol? Yes (%) | 333 (96.2) | 13 (3.8) | .022 | 27 (100.0) | 0 | .09 |
| ACE inhibitor? Yes (%) | 805 (93.3) | 58 (6.7) | .784 | 60 (89.6) | 7 (10.4) | .718 |
| ARB? Yes (%) | 585 (91.7) | 53 (8.3) | .214 | 78 (90.7) | 8 (9.3) | .434 |
| Melatonin? Yes (%) | 513 (97.0) | 16 (3.0) | < .001 | 18 (100.0) | 0 | .206 |
| Social influencers of health | ||||||
| Population/km | 3.06 [2.69-3.36] | 3.08 [2.72-3.37] | .24 | 3.20 [3.02-3.35] | 3.28 [3.12-3.42] | < .001 |
| Median income × $1000, median [IQR], $ | 55.61 [38.73-78.56] | 60.46 [42.77-84.24] | < .001 | 66.28 [53.41-89.11] | 59.07 [47.59-75.56] | < .001 |
| Population per housing unit, median [IQR], No. | 2.21 [1.88-2.56] | 2.25 [1.89-2.59] | .038 | 2.47 [1.83-2.87] | 2.61 [2.11-2.92] | .001 |
ACE = angiotensin converting enzyme; ARB = angiotensin receptor blocker; AST = aspartate aminotransferase; COVID-19 = coronavirus disease 2019; IQR = interquartile range.
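The percentages in the table appear to be row percentages: within each characteristic, the negative and positive counts are each divided by their sum. A small sketch of that arithmetic, using counts copied from the table (the helper name `row_percent` is hypothetical):

```python
def row_percent(negative, positive):
    """Return (% negative, % positive) within one table row, rounded to 1 decimal."""
    total = negative + positive
    return round(100 * negative / total, 1), round(100 * positive / total, 1)

# Counts taken from the table above.
white_dev = row_percent(7348, 500)      # White patients, development cohort
overall_dev = row_percent(10854, 818)   # whole development cohort
overall_fl = row_percent(2005, 290)     # whole Florida validation cohort

print(white_dev)    # (93.6, 6.4)
print(overall_dev)  # (93.0, 7.0)
print(overall_fl)   # (87.4, 12.6)
```

Read this way, the overall positivity rate was about 7.0% in the development cohort and about 12.6% in the Florida validation cohort, which is useful context when comparing a subgroup's positive percentage against its cohort.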
Figure 3. Calibration curves for the model predicting likelihood of a positive test. The x-axis displays the predicted probabilities generated by the statistical model, and the y-axis shows the fraction of the patients who were COVID-19 positive at the given predicted probability. The 45-degree line therefore indicates perfect calibration; for example, a predicted probability of 0.2 is associated with an actual observed proportion of 0.2. The solid blue line indicates the model’s relationship with the outcome. The closer the line is to the 45-degree line, the closer the model’s predicted probability is to the actual proportion. A, The calibration curve in the development cohort of 11,672 patients tested in Cleveland Clinic Health System before April 2, 2020. B, The calibration curve in the Florida Validation Cohort (2,295 patients tested in Cleveland Clinic Florida from April 2-16, 2020). As demonstrated, there is excellent correspondence between the predicted probability of a positive test and the observed frequency of COVID-19 positive tests in both cohorts. See Figure 1 legend for expansion of abbreviation.
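A calibration check of the kind this legend describes can be approximated by grouping patients into bins of predicted probability and comparing the mean prediction in each bin with the observed positive fraction. The sketch below uses equal-width bins, which is a simpler stand-in for the smooth curve in the figure; the function name and the toy data are assumptions for illustration.

```python
def calibration_bins(probs, outcomes, n_bins=5):
    """Group predictions into equal-width probability bins and return, for each
    non-empty bin, (mean predicted probability, observed positive fraction, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    summary = []
    for members in bins:
        if members:
            mean_pred = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            summary.append((mean_pred, observed, len(members)))
    return summary

# Toy data: ten patients predicted at 0.1 (one positive) and
# four patients predicted at 0.5 (two positive).
probs = [0.1] * 10 + [0.5] * 4
outcomes = [1] + [0] * 9 + [1, 1, 0, 0]

for mean_pred, observed, count in calibration_bins(probs, outcomes):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

Points where the mean prediction equals the observed fraction lie on the 45-degree line; the toy data were built so that both bins do.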
Figure 4. The graphic version of the model (A) and the corresponding online risk calculator (B). The example for both is a 60-year-old white male, former smoker, who presented with cough, fever, and a history of a known family member with COVID-19. He has coronary artery disease, did not receive vaccinations against influenza or pneumococcal pneumonia this year, and is only on melatonin to help with sleep. No laboratory tests were performed at the time of COVID-19 testing. His predicted risk of testing positive is 13.79%. If race is changed to black, with all other variables remaining constant, his predicted risk almost doubles, to 23.95%. ACE = angiotensin converting enzyme; ARB = angiotensin receptor blocker; AST = aspartate aminotransferase; NSAIDs = nonsteroidal antiinflammatory drugs. See Figure 1 legend for expansion of other abbreviation.
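The two predicted risks quoted in the Figure 4 example (13.79% and 23.95%) can be compared on both the risk scale and the odds scale. The sketch below assumes a plain logistic model in which changing one covariate shifts the log-odds by a constant; whether the published model includes interaction terms is not stated here, so the implied shift is illustrative only.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-z))

baseline = 0.1379  # predicted risk in the Figure 4 example
changed = 0.2395   # predicted risk after changing one covariate

risk_ratio = changed / baseline
log_odds_shift = logit(changed) - logit(baseline)
odds_ratio = math.exp(log_odds_shift)

print(f"risk ratio     {risk_ratio:.2f}")
print(f"odds ratio     {odds_ratio:.2f}")
print(f"log-odds shift {log_odds_shift:.3f}")

# Under the stated assumption, adding the same shift to any other baseline
# log-odds gives that patient's updated predicted risk.
recovered = inverse_logit(logit(baseline) + log_odds_shift)
```

With these two numbers the risk ratio is about 1.74 and the odds ratio about 1.97, so the size of the change depends on which scale is used to describe it.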