Saar Shoer1,2, Tal Karady1,2, Ayya Keshet1,2, Smadar Shilo1,2,3, Hagai Rossman1,2, Amir Gavrieli1,2, Tomer Meir1,2, Amit Lavon1,2, Dmitry Kolobkov1,2, Iris Kalka1,2, Anastasia Godneva1,2, Ori Cohen1,2, Adam Kariv4, Ori Hoch4, Mushon Zer-Aviv4, Noam Castel4, Carole Sudre5, Anat Ekka Zohar6, Angela Irony6, Tim Spector5, Benjamin Geiger2, Dorit Hizi4, Varda Shalev6,7, Ran Balicer8, Eran Segal1,2.
Abstract
BACKGROUND: The gold standard for COVID-19 diagnosis is detection of viral RNA through PCR. Due to global limitations in testing capacity, effective prioritization of individuals for testing is essential.
Keywords: Artificial Intelligence; COVID-19; Diagnosis; Health Policies; Machine Learning; SARS-CoV-2
Year: 2020 PMID: 33073258 PMCID: PMC7547576 DOI: 10.1016/j.medj.2020.10.002
Source DB: PubMed Journal: Med (N Y) ISSN: 2666-6340
Figure 1. Study Population Flow Chart
Numbers represent recorded responses. Blue boxes show the responses used to construct the extended features model (top) and the primary model (bottom).
Baseline Characteristics of the Primary Model Population
| Characteristic, Mean (SD) or % | All Individuals n = 43,752 (100%) | IVR Version n = 33,737 (77.11%) | Online Version n = 10,015 (22.89%) | COVID-19 Undiagnosed n = 43,254 (98.862%) | COVID-19 Diagnosed n = 498 (1.138%) |
|---|---|---|---|---|---|
| Age in years | 44.941 (15.499) | 43.897 (15.244) | 48.460 (15.831) | 44.894 (15.47) | 49.076 (17.363) |
| Gender - male | 23,630 (54.009%) | 19,151 (56.766%) | 4,479 (44.723%) | 23,339 (53.958%) | 291 (58.434%) |
| COVID-19 diagnosed | 498 (1.138%) | 384 (1.138%) | 114 (1.138%) | 0 (0.0%) | 498 (100.0%) |
| Prior medical conditions | 8,070 (18.943%) | 5,176 (15.884%) | 2,894 (28.897%) | 7,946 (18.861%) | 124 (26.271%) |
| Feel well | 41,661 (95.221%) | 32,132 (95.243%) | 9,529 (95.147%) | 41,217 (95.291%) | 444 (89.157%) |
| Sore throat | 1,507 (3.445%) | 1,141 (3.382%) | 366 (3.655%) | 1,422 (3.288%) | 85 (17.068%) |
| Cough | 2,138 (4.887%) | 1,459 (4.325%) | 679 (6.78%) | 1,984 (4.587%) | 154 (30.924%) |
| Shortness of breath | 576 (1.317%) | 443 (1.313%) | 133 (1.328%) | 505 (1.168%) | 71 (14.257%) |
| Loss of taste or smell | 605 (1.388%) | 545 (1.624%) | 60 (0.599%) | 469 (1.088%) | 136 (27.812%) |
| Fever (body temperature above 38°C) | 77 (0.176%) | 53 (0.157%) | 24 (0.24%) | 64 (0.148%) | 13 (2.61%) |
Figure 2. Primary Model Performance
(A–C) Logistic Regression. (D–F) Gradient Boosting Decision Trees. auROC/auPR, area under the ROC/PR curve; ROC, receiver operator characteristic; PR, precision-recall. Confidence intervals are in parentheses. (A and D) ROC curve of our model consisting of 9 simple questions. (B and E) Precision-recall curve of our model. (C and F) Calibration curve. Top: blue dots represent deciles of predicted probabilities. The dotted diagonal line represents an ideal calibration. Bottom: log-scaled histogram of predicted probabilities of COVID-19 undiagnosed (green) and diagnosed (red). See also Figure S1 and Tables S3–S5.
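The quantities named in this caption can be illustrated with a short sketch. It is not the authors' code (the resources table lists sklearn for the actual analysis); it is a plain-Python illustration of how an auROC and a decile-based calibration curve are commonly computed. The function names `au_roc` and `calibration_by_quantile` and the toy labels and scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def au_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        # Tied scores share the average of the ranks they span.
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def calibration_by_quantile(
    y_true: Sequence[int], y_score: Sequence[float], n_bins: int = 10
) -> List[Tuple[float, float]]:
    """Return (mean predicted probability, observed positive fraction) per bin.

    Samples are sorted by predicted probability and split into n_bins
    equally sized groups (deciles when n_bins is 10).
    """
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    points = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if chunk:
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            frac_pos = sum(y for _, y in chunk) / len(chunk)
            points.append((mean_pred, frac_pos))
    return points


# Toy data, made up for illustration only (not study data).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(au_roc(labels, scores))
print(calibration_by_quantile(labels, scores, n_bins=2))
```

A perfectly calibrated model would place every returned point on the diagonal, where the mean predicted probability equals the observed fraction of diagnosed individuals.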
Figure 3. Comparison of Primary Model Predictions to New COVID-19 Cases in Israel over Time
(A) Primary model predictions, averaged across all individuals as a 3-day running average (solid blue) and shifted 4 days forward (dotted blue), compared to the number of newly confirmed COVID-19 cases in Israel reported by the Ministry of Health, also as a 3-day running average.
(B) Number of survey responses per day.
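The smoothing and shifting described in panel (A) can be sketched as follows. This is an illustration only: the caption does not say whether the running average is trailing or centered, so a trailing window is assumed here, and the helper names `running_average` and `shift_forward` and the toy series are hypothetical.

```python
from typing import List, Optional, Sequence


def running_average(values: Sequence[float], window: int = 3) -> List[Optional[float]]:
    """Trailing running average; None until a full window is available.

    Assumption: a trailing window. The source does not specify the alignment.
    """
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def shift_forward(values: Sequence[Optional[float]], days: int = 4) -> List[Optional[float]]:
    """Move each value `days` positions later in time, padding the start with None."""
    n = len(values)
    days = min(days, n)
    return [None] * days + list(values[:n - days])


# Toy daily mean predictions, made up for illustration only (not study data).
daily_mean_prediction = [0.010, 0.012, 0.014, 0.013, 0.011, 0.009, 0.008]
smoothed = running_average(daily_mean_prediction, window=3)
shifted = shift_forward(smoothed, days=4)
print(smoothed)
print(shifted)
```

Shifting the prediction curve forward lets a value computed on day t be compared with the confirmed case count on day t + 4.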
Figure 4. Primary Model Performance on an Independently Collected Dataset from the US, UK, and Sweden
(A) Area under the receiver operator characteristic curve (auROC) (purple).
(B) Area under the precision-recall curve (auPR) (orange).
(C) Number of survey responses per day.
(D) Receiver operator characteristic curve of our model consisting of 9 simple questions.
(E) Precision-recall curve of our model.
(F) Calibration curve. Top: blue dots represent deciles of predicted probabilities. Dotted diagonal line represents an ideal calibration. Bottom: log-scaled histogram of predicted probabilities of COVID-19 undiagnosed (green) and diagnosed (red).
Error bars represent CI. See also Table S4.
Figure 5. Feature Contribution Analysis
Mean absolute Shapley value (in units of log-odds) of (A) the primary model, including all features used in the model, and (B) the extended features model, for the 13 highest contributing features. See also Figure S2 and Table S6.
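To make "mean absolute Shapley value (in units of log-odds)" concrete, here is a minimal sketch for the simplest case, a linear model on the log-odds scale with features treated as independent, where the attribution of feature j for sample i reduces to coef_j * (x_ij - mean_j). It is not the authors' pipeline (the resources table lists the shap library for that), and it does not claim which model the figure was computed on. The coefficients, the feature names, and the helper names `linear_shap_values` and `mean_abs_shap` are hypothetical.

```python
from typing import Dict, List, Sequence


def linear_shap_values(
    coefs: Sequence[float], rows: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Per-sample attributions in log-odds for a linear model.

    Assumes independent features, so each attribution is
    coef_j * (x_ij - mean_j), with mean_j taken over `rows`.
    """
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(coefs))]
    return [[c * (x - m) for c, x, m in zip(coefs, r, means)] for r in rows]


def mean_abs_shap(
    shap_rows: Sequence[Sequence[float]], names: Sequence[str]
) -> Dict[str, float]:
    """Average |attribution| per feature, sorted from most to least important."""
    n = len(shap_rows)
    scores = {
        name: sum(abs(r[j]) for r in shap_rows) / n
        for j, name in enumerate(names)
    }
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Made-up coefficients and binary survey answers, for illustration only.
feature_names = ["loss_of_taste_or_smell", "cough"]
coefficients = [2.0, 0.5]
answers = [[1, 0], [0, 0], [0, 1], [0, 0]]

importance = mean_abs_shap(linear_shap_values(coefficients, answers), feature_names)
print(importance)
```

Averaging absolute values keeps positive and negative contributions from cancelling, so a feature that pushes predictions strongly in either direction ranks high.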
Figure 6. Feature Interpretation Analysis
(A) SHAP values (in units of log-odds) for positive report of a feature colored in red, negative report of a feature colored in blue, and missing answers in gray.
(B) SHAP values for age with number of responses as a histogram at the bottom.
(C–F) SHAP dependence plot of age versus its SHAP value in the model, stratified by positive (red) and negative (blue) responses of loss of taste or smell (C), cough (D), shortness of breath (E), and sore throat (F).
(G–J) SHAP interaction values of age with positive (red) and negative (blue) responses of loss of taste or smell (G), cough (H), shortness of breath (I), and sore throat (J). Error bars represent SD.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Models and their predicted probabilities in addition to tables of de-identified, aggregated data | This paper | |
| python | Version 3.7.6 | |
| sklearn | Version 0.21.3 | |
| xgboost | Version 1.0.2 | |
| shap | Version 0.35.0 | |
| scipy | Version 1.4.1 | |
| zepid | Version 0.8.1 | |
| matplotlib | Version 3.1.1 | |
| seaborn | Version 0.9.0 | |
| Survey source code | Rossman et al. | |
| Models creation and prediction source code | This paper | |