Dimitris Bertsimas, Leonard Boussioux, Ryan Cory-Wright, Arthur Delarue, Vassilis Digalakis, Alexandre Jacquillat, Driss Lahlou Kitane, Galit Lukin, Michael Li, Luca Mingardi, Omid Nohadani, Agni Orfanoudaki, Theodore Papalexopoulos, Ivan Paskov, Jean Pauphilet, Omar Skali Lami, Bartolomeo Stellato, Hamza Tazi Bouardi, Kimberly Villalobos Carballo, Holly Wiberg, Cynthia Zeng.
Abstract
The COVID-19 pandemic has created unprecedented challenges worldwide. Strained healthcare providers make difficult decisions on patient triage, treatment and care management on a daily basis. Policy makers have imposed social distancing measures to slow the disease, at a steep economic price. We design analytical tools to support these decisions and combat the pandemic. Specifically, we propose a comprehensive data-driven approach to understand the clinical characteristics of COVID-19, predict its mortality, forecast its evolution, and ultimately alleviate its impact. By leveraging cohort-level clinical data, patient-level hospital data, and census-level epidemiological data, we develop an integrated four-step approach, combining descriptive, predictive and prescriptive analytics. First, we aggregate hundreds of clinical studies into the most comprehensive database on COVID-19 to paint a new macroscopic picture of the disease. Second, we build personalized calculators to predict the risk of infection and mortality as a function of demographics, symptoms, comorbidities, and lab values. Third, we develop a novel epidemiological model to project the pandemic's spread and inform social distancing policies. Fourth, we propose an optimization model to re-allocate ventilators and alleviate shortages. Our results have been used at the clinical level by several hospitals to triage patients, guide care management, plan ICU capacity, and re-distribute ventilators. At the policy level, they are currently supporting safe back-to-work policies at a major institution and vaccine trial location planning at Janssen Pharmaceuticals, and have been integrated into the US Centers for Disease Control and Prevention's pandemic forecast.
Keywords: COVID-19; Epidemiological modeling; Machine learning; Optimization
Year: 2021 PMID: 33590417 PMCID: PMC7883965 DOI: 10.1007/s10729-020-09542-0
Source DB: PubMed Journal: Health Care Manag Sci ISSN: 1386-9620
Fig. 1 Overview of our end-to-end analytics approach. We leverage diverse data sources to inform a family of descriptive, predictive and prescriptive tools for clinical and policy decision-making support
Count and prevalence of symptoms among COVID-19 patients, in aggregate, broken down into mild/severe patients, and broken down per continent (Asia, Europe, North America)
| Symptom | All patients: Count | All patients: % | Mild: Count | Mild: % | Severe: Count | Severe: % | Asia: Count | Asia: % | Europe: Count | Europe: % | North America: Count | North America: % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cough | 94,950 | 52.8 | 6,833 | 63.0 | 5,803 | 50.4 | 14,034 | 56.2 | 78,430 | 52.2 | 1,113 | 63.6 |
| Fever | 95,870 | 48.1 | 6,864 | 79.3 | 6,077 | 76.7 | 14,750 | 76.6 | 78,450 | 43.5 | 1,481 | 41.3 |
| Short Breath | 17,290 | 33.7 | 6,006 | 16.1 | 5,373 | 60.7 | 11,330 | 19.7 | 3,512 | 69.9 | 1,111 | 49.2 |
| Fatigue | 11,560 | 31.4 | 5,313 | 35.3 | 1,989 | 40.6 | 11,320 | 30.8 | 226 | 64.2 | − | − |
| Sputum | 7,613 | 26.3 | 4,995 | 29.2 | 1,216 | 34.2 | 7,395 | 26.7 | − | − | 176 | 10.9 |
| Sore Throat | 83,170 | 22.2 | 3,513 | 14.2 | 921 | 8.2 | 6,013 | 10.4 | 75,235 | 22.9 | 550 | 9.8 |
| Myalgia | 12,150 | 17.5 | 4,455 | 16.4 | 1,643 | 19.1 | 8,517 | 15.5 | 1,633 | 33.5 | 755 | 25.3 |
| Elev. Resp. Rate | 7,376 | 16.4 | 527 | 9.7 | 642 | 38.4 | 1,257 | 14.6 | − | − | 6,117 | 16.8 |
| Anorexia | 3,928 | 15.8 | 1,641 | 14.2 | 808 | 15.4 | 3,566 | 13.8 | 312 | 40.5 | − | − |
| Headache | 11,430 | 15.7 | 5,068 | 12.2 | 1,541 | 8.6 | 7,929 | 9.9 | 1,633 | 27.2 | 551 | 8.7 |
| Nausea | 10,070 | 12.4 | 4,238 | 6.5 | 1,798 | 5.6 | 8,262 | 8.2 | 312 | 22.4 | 259 | 9.0 |
| Chest Pain | 3,303 | 11.3 | 767 | 12.2 | 588 | 19.6 | 2,984 | 12.2 | − | − | − | − |
| Diarrhea | 16,520 | 11.1 | 5,687 | 9.7 | 5,369 | 9.0 | 11,470 | 10.8 | 3,512 | 10.4 | 1,066 | 15.4 |
| Cong. Airway | 1,639 | 8.7 | 2,176 | 6.5 | 234 | 14.1 | 1,369 | 8.9 | − | − | 258 | 7.4 |
| Chills | 3,116 | 8.7 | 2,751 | 9.9 | 520 | 9.4 | 2,794 | 8.2 | − | − | 268 | 11.5 |
| Proj. Mortality | 111,700 | 11.7 | 7,428 | 0.4 | 9,146 | 74.0 | 12,820 | 16.7 | 79,750 | 9.9 | 19,060 | 15.8 |
Mild and severe patients only form a subset of the data, and so do patients from Asia, Europe and North America. A “-” indicates that fewer than patients in a subpopulation reported on this symptom
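The counts and prevalences in this table are aggregates over many cohort-level studies. The record does not state the aggregation rule, so the following is only a minimal sketch of one plausible rule, assuming each study's prevalence is weighted by its number of reporting patients. The function name `pooled_prevalence`, the `min_patients` cutoff value, and the example cohorts are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence, Tuple

def pooled_prevalence(
    cohorts: Sequence[Tuple[int, float]],
    min_patients: int,
) -> Tuple[int, Optional[float]]:
    """Pool (n_patients, prevalence_pct) pairs from several studies.

    Returns the total patient count and the sample-size-weighted
    prevalence in percent, or None when too few patients reported
    on the symptom (shown as a dash in the table).
    """
    total = sum(n for n, _ in cohorts)
    if total < min_patients:
        return total, None
    weighted = sum(n * pct for n, pct in cohorts) / total
    return total, weighted

# Hypothetical cohorts: 200 patients at 50% and 800 patients at 60%.
example = [(200, 50.0), (800, 60.0)]
print(pooled_prevalence(example, min_patients=50))
```

Weighting by cohort size keeps a small study from moving the pooled figure as much as a large one, which is why the "Count" column matters when reading the "%" column next to it.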
Comorbidities, demographics, average lab values, average length of stay and projected mortality among COVID-19 patients, in aggregate and broken down into mild/severe patients
| Feature | All Patients: No. Report. | All Patients: Avg. | Mild Patients: No. Report. | Mild Patients: Avg. | Severe Patients: No. Report. | Severe Patients: Avg. |
|---|---|---|---|---|---|---|
| Demographics | | | | | | |
| Male (%) | 131,200 | 53.0 | 9,570 | 48.8 | 10,120 | 68.7 |
| Age (years) | 119,000 | 51.3 | 8,022 | 46.1 | 9,685 | 68.2 |
| White/European (%) | 55,490 | 22.2 | 10,120 | 9.7 | 9,887 | 63.9 |
| African American (%) | 55,490 | 5.4 | 10,120 | 3.5 | 9,887 | 2.5 |
| Asian (%) | 55,320 | 51.3 | 10,300 | 80.2 | 9,933 | 31.2 |
| Hispanic/Latino | 50,630 | 19.9 | 8,017 | 0 | 9,107 | 0 |
| Multiple ethnicities/other | 55,190 | 3.6 | 10,120 | 6.9 | 9,887 | 2.7 |
| Comorbidities | | | | | | |
| Smoking history | 27,900 | 16.1 | 6,080 | 12.2 | 1,973 | 16.6 |
| Hypertension | 38,390 | 35.9 | 8,252 | 15.2 | 8,449 | 54.4 |
| Diabetes | 39,790 | 20.8 | 8,396 | 6.8 | 8,818 | 26.1 |
| Cardio Disease | 40,030 | 12.4 | 8,028 | 3.0 | 9,540 | 20.3 |
| COPD | 34,150 | 6.0 | 6,297 | 2.8 | 8,727 | 10.0 |
| Cancer | 29,170 | 7.2 | 6,259 | 3.2 | 8,355 | 12.9 |
| Liver Disease | 18,300 | 2.8 | 1,875 | 2.3 | 6,832 | 3.5 |
| Cerebrovascular | 6,830 | 9.8 | 3,245 | 2.7 | 1,360 | 24.8 |
| Kidney Disease | 35,500 | 5.7 | 6,152 | 1.2 | 8,139 | 10.8 |
| Lab values | | | | | | |
| WBC Count (10^9/L) | 19,970 | 6.41 | 5,403 | 5.07 | 2,305 | 6.80 |
| Neutrophil Count (10^9/L) | 12,500 | 4.72 | 2,236 | 5.12 | 1,410 | 5.78 |
| Platelet Count (10^9/L) | 12,125 | 195.7 | 5,165 | 184.0 | 2,105 | 170.4 |
| ALT (U/L) | 14,467 | 29.0 | 2,840 | 24.6 | 2,428 | 31.1 |
| AST (U/L) | 14,214 | 37.3 | 2,766 | 27.1 | 2,366 | 45.7 |
| BUN (mmol/L) | 4,822 | 5.22 | 1,700 | 4.18 | 1,138 | 6.86 |
| Creatinine ( | 8,504 | 63.08 | 2,529 | 66.0 | 2,454 | 56.4 |
| CRP Count (mg/L) | 17,090 | 76.5 | 2,573 | 18.9 | 2,339 | 94.1 |
| Interleukin-6 (pg/mL) | 2,582 | 24.57 | 1,127 | 4.17 | 552 | 38.63 |
| Procalcitonin (ng/mL) | 14,750 | 2.26 | 1,468 | 1.85 | 1,969 | 4.81 |
| D-Dimer (mg/L) | 13,330 | 38.81 | 2,478 | 8.04 | 2,401 | 165.9 |
| Length of Stay (days) | 16,010 | 10.7 | 4,131 | 14.0 | 5,642 | 7.97 |
| Proj. Mortality (%) | 111,700 | 11.7 | 7,428 | 0.4 | 9,146 | 74.0 |
Fig. 2 Impact of cohort characteristics on projected mortality, assessed at a cohort level. The size of each dot represents the number of patients in the cohort, and its color represents the nation the study was performed in. We only include studies reporting both discharged and deceased patients
Characteristics of study population for mortality prediction model
| Feature | All | Survivors | Non-Survivors | P-Value |
|---|---|---|---|---|
| Age | 68.0 (57.0-79.0) | 63.0 (54.0-74.0) | 81.0 (73.2-86.0) | 1.28E-185 |
| Female a | 1095.0 (38.7%) | 868.0 (40.9%) | 227.0 (31.9%) | 1.18E-05 |
| Heart Rate | 89.0 (79.0-101.0) | 90.0 (80.0-102.0) | 87.0 (78.0-100.0) | 1.29E-03 |
| Oxygen Saturation | 94.0 (90.0-96.0) | 94.4 (92.0-96.0) | 88.5 (80.0-93.6) | 3.16E-37 |
| Temperature (F) | 98.4 (97.5-99.7) | 98.4 (97.5-99.6) | 98.8 (97.7-100.0) | 2.42E-04 |
| Alanine Aminotransferase | 27.0 (17.0-44.0) | 27.8 (17.5-45.0) | 25.5 (16.0-41.0) | 3.77E-02 |
| Aspartate Aminotransferase | 36.0 (25.0-55.0) | 34.0 (24.4-51.0) | 45.0 (30.0-69.0) | 1.55E-11 |
| Blood Glucose | 118.0 (105.0-141.0) | 115.0 (103.4-133.0) | 134.0 (113.0-171.0) | 1.12E-22 |
| Blood Urea Nitrogen | 17.0 (12.6-25.2) | 15.0 (11.5-20.0) | 29.5 (20.3-47.2) | 1.02E-65 |
| C-Reactive Protein | 74.2 (29.1-149.5) | 58.6 (22.7-119.3) | 141.1 (72.0-223.1) | 4.76E-50 |
| Creatinine | 1.0 (0.8-1.2) | 0.9 (0.7-1.1) | 1.3 (1.0-1.8) | 2.84E-36 |
| Hemoglobin | 13.9 (12.7-15.0) | 14.0 (12.9-15.0) | 13.5 (12.0-14.7) | 9.11E-10 |
| Mean Corpuscular Volume | 87.8 (84.9-91.0) | 87.5 (84.7-90.4) | 89.3 (85.8-92.7) | 2.80E-08 |
| Platelets | 201.0 (156.0-263.0) | 206.0 (160.0-266.5) | 185.0 (141.0-246.8) | 6.62E-08 |
| Potassium | 4.1 (3.7-4.4) | 4.0 (3.7-4.4) | 4.1 (3.7-4.6) | 1.43E-04 |
| Prothrombin Time (INR) | 1.1 (1.0-1.2) | 1.1 (1.0-1.2) | 1.1 (1.0-1.3) | 3.20E-05 |
| Sodium | 137.1 (135.0-140.0) | 137.0 (135.0-139.4) | 138.0 (135.0-141.0) | 5.65E-08 |
| White Blood Cell Count | 6.8 (5.2-9.2) | 6.5 (5.0-8.7) | 8.0 (5.7-11.4) | 3.00E-15 |
| Cardiac dysrhythmias a | 200.0 (7.1%) | 127.0 (6.0%) | 73.0 (10.3%) | 6.50E-04 |
| Chronic kidney disease a | 65.0 (2.3%) | 33.0 (1.6%) | 32.0 (4.5%) | 3.67E-04 |
| Heart disease a | 125.0 (4.4%) | 80.0 (3.8%) | 45.0 (6.3%) | 1.10E-02 |
| Diabetes a | 345.0 (12.2%) | 234.0 (11.0%) | 111.0 (15.6%) | 2.73E-03 |
| Mortality a | 711 (25.1%) | 0 (0%) | 711 (100%) | – |
a Count (proportion) is reported for binary variables
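The table does not state the total cohort size directly, but each binary row gives a count together with a rounded proportion, which bounds it. The sketch below works through that arithmetic for the "Mortality" and "Female" rows. It assumes both proportions share the same denominator (no missing values for either variable), which the record does not confirm; the helper names are hypothetical.

```python
import math
from typing import Tuple

def implied_sample_sizes(count: int, pct: float, decimals: int = 1) -> Tuple[int, int]:
    """Integer range of n for which count / n is consistent with pct
    (in percent) after rounding to the given number of decimals."""
    half_step = 0.5 * 10 ** (-decimals)
    lowest = math.ceil(count / ((pct + half_step) / 100))
    highest = math.floor(count / ((pct - half_step) / 100))
    return lowest, highest

def intersect(a: Tuple[int, int], b: Tuple[int, int]) -> Tuple[int, int]:
    """Overlap of two closed integer ranges."""
    return max(a[0], b[0]), min(a[1], b[1])

# Values copied from the "All" column of the table above.
from_mortality = implied_sample_sizes(711, 25.1)
from_female = implied_sample_sizes(1095, 38.7)
print(from_mortality, from_female, intersect(from_mortality, from_female))
```

Intersecting the ranges from several binary rows narrows the estimate; a row whose range does not overlap the others would suggest missing values for that variable.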
Characteristics of study population for infection test prediction model
| Feature | All | No Infection | Infection | P-Value |
|---|---|---|---|---|
| Age | 63.0 (49.0-78.0) | 58.0 (42.0-78.0) | 66.0 (55.0-78.5) | 9.00E-28 |
| Female a | 1444.0 (46.1%) | 777.0 (52.7%) | 667.0 (40.2%) | 1.70E-12 |
| Heart Rate | 88.5 (78.0-100.5) | 89.0 (78.0-101.2) | 88.0 (78.2-100.0) | 6.13E-01 |
| Oxygen Saturation | 95.4 (91.8-97.0) | 96.5 (94.8-97.5) | 94.2 (89.6-96.4) | 1.68E-31 |
| Respiratory Frequency | 18.0 (16.0-19.0) | 18.0 (16.0-18.0) | 18.0 (16.0-20.0) | 9.64E-21 |
| Temperature | 98.3 (97.5-99.5) | 97.7 (97.2-98.7) | 99.0 (97.9-100.0) | 4.51E-80 |
| Alanine Aminotransferase | 22.0 (15.0-37.0) | 19.0 (13.0-30.0) | 27.0 (18.0-43.0) | 6.59E-09 |
| Aspartate Aminotransferase | 29.0 (21.0-47.0) | 23.0 (19.0-31.0) | 37.0 (26.0-57.0) | 1.20E-20 |
| Blood Urea Nitrogen | 17.0 (13.0-25.0) | 16.0 (12.0-22.0) | 18.0 (13.0-27.0) | 3.78E-05 |
| Calcium | 9.3 (8.9-9.7) | 9.6 (9.2-9.9) | 9.0 (8.7-9.4) | 1.90E-96 |
| C-Reactive Protein | 31.0 (3.4-107.6) | 4.7 (1.1-35.4) | 69.8 (23.2-152.3) | 1.28E-83 |
| Creatinine | 0.9 (0.8-1.2) | 0.9 (0.7-1.1) | 1.0 (0.8-1.2) | 2.19E-05 |
| Hemoglobin | 13.5 (12.3-14.7) | 13.4 (12.1-14.6) | 13.6 (12.5-14.8) | 5.70E-05 |
| Mean Corpuscular Volume | 87.2 (84.0-90.3) | 87.7 (84.2-90.7) | 86.8 (83.9-90.0) | 1.43E-01 |
| Platelets | 223.0 (174.0-285.0) | 241.0 (198.0-297.0) | 202.0 (156.0-266.0) | 1.41E-18 |
| Red Cell Distribution Width | 13.2 (12.5-14.3) | 13.2 (12.5-14.5) | 13.1 (12.4-14.0) | 1.59E-06 |
| Sodium | 139.0 (137.0-141.0) | 140.0 (138.0-142.0) | 139.0 (136.0-141.0) | 1.12E-14 |
| Prothrombin Time (INR) | 1.0 (1.0-1.1) | 1.1 (1.0-1.1) | 1.0 (1.0-1.1) | 8.96E-01 |
| Total Bilirubin | 0.6 (0.5-0.8) | 0.6 (0.4-0.9) | 0.6 (0.5-0.8) | 8.83E-03 |
| White Blood Cell Count | 7.6 (5.8-10.1) | 8.7 (7.0-11.1) | 6.6 (5.1-8.7) | 7.59E-38 |
| COVID-19 Positive Test a | 1661 (53.0%) | 0 (0%) | 1661 (100%) | – |
a Count (proportion) is reported for binary variables
Fig. 3 SHapley Additive exPlanations (SHAP) importance plots for the mortality and infection risk calculators. The five most important features are shown for each model. Gender is a binary feature (female is equal to 1, shown in red; male is equal to 0, shown in blue). Each row represents the impact of a feature on the outcome, with higher SHAP values indicating higher likelihood of a positive outcome
Fig. 4 Simplified flow diagram of DELPHI
Fig. 5 Projection accuracy for the United States
Implementation length and effect of each policy category as implemented in the US
| Restrictions | States | State-Days | Residual Infection Rate |
|---|---|---|---|
| None | 51 | 749 | 100 |
| Others | 25 | 218 | 66.8 ± 5.9 |
| Mass Gathering and Schools | 9 | 72 | 47.9 ± 9.0 |
| Mass Gathering, School, and Others | 37 | 805 | 42.2 ± 4.8 |
| Stay-at-Home Order | 39 | 1876 | 23.9 ± 4.0 |
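The residual infection rates above can be read as multiplicative factors on the infection rate observed without restrictions (100 for "None"). The following is a minimal sketch of that reading, assuming a plain discrete-time SIR model rather than the DELPHI model of Fig. 4. The baseline rate `BASELINE_BETA`, the recovery rate, the population, and the initial case count are hypothetical illustration values, not estimates from the paper.

```python
# Residual infection rates (% of the no-restriction rate), copied from the table above.
RESIDUAL_RATE_PCT = {
    "None": 100.0,
    "Others": 66.8,
    "Mass Gathering and Schools": 47.9,
    "Mass Gathering, School, and Others": 42.2,
    "Stay-at-Home Order": 23.9,
}

def scaled_infection_rate(baseline_rate: float, policy: str) -> float:
    """Infection rate that remains once a policy category is in place."""
    return baseline_rate * RESIDUAL_RATE_PCT[policy] / 100.0

def simulate_sir_peak(beta: float, gamma: float = 0.1,
                      population: int = 1_000_000,
                      initial_infected: int = 100,
                      days: int = 365) -> float:
    """Peak number of simultaneously infected people in a daily-step SIR model."""
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    peak = infected
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        peak = max(peak, infected)
    return peak

BASELINE_BETA = 0.3  # hypothetical daily infection rate with no restrictions

for policy in RESIDUAL_RATE_PCT:
    beta = scaled_infection_rate(BASELINE_BETA, policy)
    print(f"{policy}: beta={beta:.4f}, peak infected={simulate_sir_peak(beta):,.0f}")
```

With these illustrative parameters the scaled rate under a stay-at-home order falls below the recovery rate, so the simulated outbreak shrinks from its starting size instead of growing.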
Fig. 6 Reopening scenarios for New York
Fig. 7 United States predictions for mid-July under mass gathering, travel and work restrictions
Fig. 8 The edge of optimization to eliminate ventilator shortages
Fig. 9 Influence of additional buffer and federal surge availability on ventilator shortages and transfers
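The ventilator re-allocation step summarized in the abstract moves units from regions with spare capacity to regions facing a shortage, with a safety buffer retained by the sending region. The sketch below illustrates that idea with a simple greedy heuristic; it is a stand-in for illustration only and not the optimization model of the paper, which this record does not describe. The function name, the `buffer_pct` parameter, and the supply and demand figures are all hypothetical.

```python
from typing import Dict, List, Tuple

def greedy_reallocate(
    supply: Dict[str, int],
    demand: Dict[str, int],
    buffer_pct: int = 0,
) -> Tuple[List[Tuple[str, str, int]], Dict[str, int]]:
    """Greedily transfer ventilators from surplus regions to short regions.

    A region only gives away units beyond its own demand plus a buffer of
    buffer_pct percent (rounded up). Returns the list of transfers
    (source, destination, units) and the shortage left in each region.
    """
    surplus = {}
    shortage = {}
    for region in supply:
        buffer_units = (demand[region] * buffer_pct + 99) // 100  # integer ceiling
        surplus[region] = max(0, supply[region] - demand[region] - buffer_units)
        shortage[region] = max(0, demand[region] - supply[region])

    transfers = []
    # Serve the largest shortages first, drawing on the largest surpluses first.
    for dst in sorted(shortage, key=shortage.get, reverse=True):
        for src in sorted(surplus, key=surplus.get, reverse=True):
            if shortage[dst] == 0:
                break
            moved = min(surplus[src], shortage[dst])
            if moved > 0:
                transfers.append((src, dst, moved))
                surplus[src] -= moved
                shortage[dst] -= moved
    return transfers, shortage

# Hypothetical regions and numbers.
supply = {"A": 100, "B": 40, "C": 60}
demand = {"A": 50, "B": 70, "C": 60}
print(greedy_reallocate(supply, demand, buffer_pct=10))
```

Raising `buffer_pct` leaves fewer units available for transfer, which is the trade-off between buffer size and remaining shortage that Fig. 9 examines; an exact optimization model could also account for transfer distance and demand over time, which this heuristic ignores.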