Steven J Krieg, Carolina Avendano, Evan Grantham-Brown, Aaron Lilienfeld Asbun, Jennifer J Schnur, Marie Lynn Miranda, Nitesh V Chawla.
Abstract
COVID-19 remains a global threat in the face of emerging SARS-CoV-2 variants and gaps in vaccine administration and availability. In this study, we analyze a data-driven COVID-19 testing program implemented at a mid-sized university, which utilized two simple, diverse, and easily interpretable machine learning models to predict which students were at elevated risk and should be tested. The program produced a positivity rate of 0.53% (95% CI 0.34-0.77%) from 20,862 tests, with 1.49% (95% CI 1.15-1.89%) of students testing positive within five days of the initial test. This is a significant increase from the general surveillance baseline, which produced a positivity rate of 0.37% (95% CI 0.28-0.47%) with 0.67% (95% CI 0.55-0.81%) testing positive within five days. Close contacts who were predicted by the data-driven models were tested much more quickly on average (0.94 days from reported exposure; 95% CI 0.78-1.11) than those who were manually contact traced (1.92 days; 95% CI 1.81-2.02). We further discuss how other universities, businesses, and organizations could adopt similar strategies to help quickly identify positive cases and reduce community transmission.
Year: 2022 PMID: 35149754 PMCID: PMC8837751 DOI: 10.1038/s41746-022-00562-4
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Summary of testing results. NR and LP represent the students in the adaptive cohort who were predicted by either the node risk or link prediction model, respectively, and NR + LP represents the students who were predicted by both models.
| Cohort | # Tests administered | Positive tests | Positivity rate |
|---|---|---|---|
| General surveillance | 79,932 | 297 | 0.37% [0.28%, 0.47%] |
| Adaptive | 20,862 | 111 | 0.53% [0.33%, 0.76%] |
| Adaptive (NR model only) | 10,251 | 50 | 0.49% [0.42%, 0.56%] |
| Adaptive (LP model only) | 8,089 | 32 | 0.40% [0.30%, 0.47%] |
| Adaptive (both NR + LP models) | 2,608 | 21 | 0.81% [0.51%, 1.24%] |
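The point estimates in the positivity rate column can be checked directly from the test counts. The sketch below only recomputes positives divided by tests; it does not attempt to reproduce the 95% confidence intervals, since the interval method is not stated in this excerpt. The names `cohorts` and `positivity_rate` are illustrative and not taken from the study.

```python
# Counts copied from the summary table above: (tests administered, positive tests).
cohorts = {
    "General surveillance": (79_932, 297),
    "Adaptive": (20_862, 111),
    "Adaptive (NR model only)": (10_251, 50),
    "Adaptive (LP model only)": (8_089, 32),
    "Adaptive (both NR + LP models)": (2_608, 21),
}


def positivity_rate(tests, positives):
    """Return the positivity rate as a percentage of tests administered."""
    return 100.0 * positives / tests


# Point estimates only; confidence intervals are deliberately not recomputed.
rates = {name: positivity_rate(t, p) for name, (t, p) in cohorts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
```

Rounded to two decimals, the recomputed rates agree with the point estimates reported in the table.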
Fig. 1 The percentage of students who tested positive during the initial appointment or a follow-up test within 14 days.
Day 0 is the day they were selected for the cohort. Shaded regions indicate 95% confidence intervals.
Fig. 2 The percentage of students from the adaptive cohort who tested positive as predicted by the node risk (NR) and/or link prediction (LP) models within 14 days.
Day 0 is the day they were selected for the adaptive cohort. Shaded regions indicate 95% confidence intervals.
Edge type weights learned by the node risk prediction model. These weights (ω) are utilized in Eq. 1. Each value represents the risk for a student given that they share an edge of type t with another student at i hops in the network.
| Edge type (t) | ω at i = 1 hop | ω at i = 2 hops |
|---|---|---|
| Shared address (roommates) | 0.9978 | 0.9180 |
| Shared dorm suite | 0.3898 | 0.3718 |
| Shared dorm floor | 0.3064 | 0.2672 |
| Shared dorm building | 0.0868 | 0.0351 |
| Enrolled in same course | 0.0010 | 0.0010 |
| Active on same sports team | 0.8903 | 0.8566 |
| Close contact | 0.9806 | 0.0543 |
Fig. 3 The distribution of average time to receive a test for students who were exposed to COVID-19 via close contact with another student.
Day 0 is the day the exposing student tested positive. Shaded regions indicate 95% confidence intervals.
Node base risk scores learned by the node risk prediction model. Higher scores indicate a greater transmission risk. To predict high-risk students, each student’s base risk score (P0) is propagated to neighboring nodes via Eq. 1. Q/I represents quarantine/isolation.
| # Days prior | Positive test | Assigned to Q/I | Reported symptoms/exposure |
|---|---|---|---|
| 1 | 0.9839 | 0.0294 | 0.0010 |
| 2 | 0.9208 | 0.1799 | 0.0283 |
| 3 | 0.9904 | 0.3884 | 0.0118 |
| 4 | 0.2026 | 0.2518 | 0.0010 |
| 5 | 0.0010 | 0.0043 | 0.0010 |
| 6 | 0.0679 | 0.1760 | 0.0010 |
| 7 | 0.3180 | 0.0518 | 0.0010 |
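To give a rough sense of how a base risk score might flow to a neighboring student, the sketch below combines the learned values with a noisy-OR rule. This is an assumption made purely for illustration: Eq. 1 is not reproduced in this excerpt, so the combination rule, the function `propagated_risk`, and the dictionary names are all hypothetical and should not be read as the authors' formula. The weights are taken from the first weight column of the edge type table, and only direct neighbors are considered.

```python
# Base risk scores (P0) copied from the table above, keyed by (event, days prior).
BASE_RISK = {
    ("positive_test", 1): 0.9839,
    ("assigned_qi", 3): 0.3884,
}

# Edge weights copied from the first weight column of the edge type table.
EDGE_WEIGHT = {
    "roommate": 0.9978,
    "same_course": 0.0010,
    "sports_team": 0.8903,
}


def propagated_risk(neighbors):
    """Combine risk from direct neighbors with a noisy-OR rule.

    ASSUMPTION: noisy-OR is an illustrative stand-in, not Eq. 1 from the paper.
    neighbors: list of (neighbor_base_risk, edge_type) pairs.
    """
    p_no_transmission = 1.0
    for base_risk, edge_type in neighbors:
        # Each neighbor independently "transmits" with probability P0 * weight.
        p_no_transmission *= 1.0 - base_risk * EDGE_WEIGHT[edge_type]
    return 1.0 - p_no_transmission


positive_yesterday = BASE_RISK[("positive_test", 1)]

# A student whose roommate tested positive one day ago.
roommate_risk = propagated_risk([(positive_yesterday, "roommate")])

# A student who only shares a course with that same positive case.
classmate_risk = propagated_risk([(positive_yesterday, "same_course")])

# A student with two risky neighbors: a positive teammate and a roommate in Q/I.
combined_risk = propagated_risk([
    (positive_yesterday, "sports_team"),
    (BASE_RISK[("assigned_qi", 3)], "roommate"),
])

print(f"roommate:  {roommate_risk:.4f}")
print(f"classmate: {classmate_risk:.4f}")
print(f"combined:  {combined_risk:.4f}")
```

Under this assumed rule, the roommate of a recent positive case scores far higher than a classmate, which mirrors the ordering of the learned edge weights. The absolute values depend entirely on the assumed combination rule.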