Md Mobashir Hasan Shandhi1, Peter J Cho1, Ali R Roghanizad1, Karnika Singh1, Will Wang1, Oana M Enache2, Amanda Stern1, Rami Sbahi1, Bilge Tatar1, Sean Fiscus1, Qi Xuan Khoo1, Yvonne Kuo1, Xiao Lu1, Joseph Hsieh1, Alena Kalodzitsa1, Amir Bahmani3, Arash Alavi3, Utsab Ray3, Michael P Snyder3, Geoffrey S Ginsburg4, Dana K Pasquale5,6, Christopher W Woods7,8, Ryan J Shaw9,10, Jessilyn P Dunn11,12.
Abstract
Mass surveillance testing can help control outbreaks of infectious diseases such as COVID-19. However, diagnostic test shortages are prevalent globally and continue to occur in the US with the onset of new COVID-19 variants and emerging diseases like monkeypox, demonstrating an unprecedented need for improving our current methods for mass surveillance testing. By targeting surveillance testing toward individuals who are most likely to be infected and, thus, increasing the testing positivity rate (i.e., percent positive in the surveillance group), fewer tests are needed to capture the same number of positive cases. Here, we developed an Intelligent Testing Allocation (ITA) method by leveraging data from the CovIdentify study (6765 participants) and the MyPHD study (8580 participants), including smartwatch data from 1265 individuals of whom 126 tested positive for COVID-19. Our rigorous model and parameter search uncovered the optimal time periods and aggregate metrics for monitoring continuous digital biomarkers to increase the positivity rate of COVID-19 diagnostic testing. We found that resting heart rate (RHR) features distinguished between COVID-19-positive and -negative cases earlier in the course of the infection than steps features, as early as 10 and 5 days prior to the diagnostic test, respectively. We also found that including steps features increased the area under the receiver operating characteristic curve (AUC-ROC) by 7-11% when compared with RHR features alone, while including RHR features improved the AUC of the ITA model's precision-recall curve (AUC-PR) by 38-50% when compared with steps features alone. The best AUC-ROC (0.73 ± 0.14 and 0.77 on the cross-validated training set and independent test set, respectively) and AUC-PR (0.55 ± 0.21 and 0.24) were achieved by using data from a single device type (Fitbit) with high-resolution (minute-level) data. 
Finally, we show that ITA generates up to a 6.5-fold increase in the positivity rate in the cross-validated training set and up to a 4.5-fold increase in the positivity rate in the independent test set, including both symptomatic and asymptomatic (up to 27%) individuals. Our findings suggest that, if deployed on a large scale and without needing self-reported symptoms, the ITA method could improve the allocation of diagnostic testing resources and reduce the burden of test shortages.
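The abstract's central argument, that a higher positivity rate means fewer tests for the same number of detected cases, can be illustrated with a little arithmetic. The sketch below is not from the study: the function names and the 10% random-testing rate are hypothetical, chosen only to show how a 4.5-fold increase translates into test savings in expectation.

```python
def expected_tests_needed(target_positives, positivity_rate):
    """Expected number of tests required to find `target_positives` cases
    when a fraction `positivity_rate` of tested individuals is positive."""
    if not 0 < positivity_rate <= 1:
        raise ValueError("positivity_rate must be in (0, 1]")
    return target_positives / positivity_rate


def fold_increase(targeted_rate, random_rate):
    """Ratio of the targeted positivity rate to the random-allocation rate."""
    return targeted_rate / random_rate


# Hypothetical numbers for illustration only (not reported in the study).
random_rate = 0.10      # assumed positivity rate under random allocation
targeted_rate = 0.45    # assumed rate under targeted allocation (4.5-fold)

tests_random = expected_tests_needed(100, random_rate)
tests_targeted = expected_tests_needed(100, targeted_rate)
print(f"fold increase: {fold_increase(targeted_rate, random_rate):.1f}")
print(f"tests for 100 positives: random={tests_random:.0f}, targeted={tests_targeted:.0f}")
```

Under these assumed rates, the number of tests needed shrinks by the same factor as the fold increase in positivity rate.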
Year: 2022 PMID: 36050372 PMCID: PMC9434073 DOI: 10.1038/s41746-022-00672-z
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Fig. 1 Overview of the Intelligent Testing Allocation (ITA) model, the CovIdentify cohort, and data.
a Overview of the ITA model in comparison to a Random Testing Allocation (RTA) model, demonstrating the benefit of using the ITA model over existing RTA methods to improve the positivity rate of diagnostic testing in resource-limited settings. Orange and blue human symbols represent individuals with and without COVID-19 infection, respectively. b A total of 7348 participants were recruited following informed consent in the CovIdentify study, of whom 1289 reported COVID-19 diagnostic tests (1157 diagnosed as negative for COVID-19 and 132 diagnosed as positive for COVID-19). c The top panel shows the time-averaged step count and the bottom panel shows the time-averaged resting heart rate (RHR) of all participants (n = 50) in the training set (Supplementary Fig. 3, blue) who tested positive for COVID-19, with the pre-defined baseline (between –60 and –22 days from the diagnostic test) and detection (between –21 and –1 days from the diagnostic test) periods marked with vertical black dashed lines. The dark green dashed lines and the light green dash-dotted lines display the baseline period mean and ± 2 standard deviations from the baseline mean, respectively. The light purple dashed vertical line shows the diagnostic test date.
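The baseline and detection periods described in panel c can be sketched as a simple split of a daily time series indexed by days relative to the diagnostic test. This is a minimal illustration, not the authors' code: the function names, the dictionary input format, and the use of the sample standard deviation are assumptions; only the window boundaries and the ± 2 SD band come from the caption.

```python
from statistics import mean, stdev

# Window boundaries in days relative to the diagnostic test (inclusive),
# taken from the figure caption.
BASELINE = (-60, -22)
DETECTION = (-21, -1)


def split_periods(daily):
    """Split `daily` (dict: day offset -> biomarker value, test day = 0)
    into chronologically ordered baseline and detection value lists."""
    ordered = sorted(daily.items())
    baseline = [v for d, v in ordered if BASELINE[0] <= d <= BASELINE[1]]
    detection = [v for d, v in ordered if DETECTION[0] <= d <= DETECTION[1]]
    return baseline, detection


def baseline_band(baseline, k=2.0):
    """Return (mean - k*SD, mean, mean + k*SD) of the baseline values.
    Uses the sample SD; the study's exact estimator is an assumption here."""
    m = mean(baseline)
    s = stdev(baseline)
    return m - k * s, m, m + k * s


# Synthetic resting-heart-rate values for illustration only.
daily_rhr = {d: 60.0 + (d % 2) for d in range(-60, 0)}
baseline, detection = split_periods(daily_rhr)
low, mid, high = baseline_band(baseline)
print(len(baseline), len(detection))
```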
Summary of the cohorts.
| Cohort | Total (test) | COVID+ (test) | COVID– (test) |
|---|---|---|---|
| All-Frequency (AF) | 520 (105) | 63 (13) | 457 (92) |
| All-High-Frequency (AHF) | 469 (97) | 54 (11) | 415 (86) |
| Fitbit-High-Frequency (FHF) | 280 (63) | 40 (7) | 240 (56) |
Counts are totals across training + test data; the number in parentheses is the test-set count.
Features extracted from the digital biomarkers (DBs) for the development of the ITA algorithm.
| Metric | Definition | Equation |
|---|---|---|
| Delta (Δ) | Deviation in digital biomarker from baseline median value | $\mathrm{DB}_{\mathrm{Detection}} - \mathrm{DB}_{\mathrm{Baseline,\,Median}}$ |
| Delta_Normalized | Delta normalized by baseline median value | $(\mathrm{DB}_{\mathrm{Detection}} - \mathrm{DB}_{\mathrm{Baseline,\,Median}}) / \mathrm{DB}_{\mathrm{Baseline,\,Median}}$ |
| Delta_Standardized | Delta standardized by baseline median and interquartile range (IQR) | $(\mathrm{DB}_{\mathrm{Detection}} - \mathrm{DB}_{\mathrm{Baseline,\,Median}}) / \mathrm{DB}_{\mathrm{Baseline,\,IQR}}$ |
| Z-score | Deviation in digital biomarker from baseline mean, standardized by baseline standard deviation (SD) | $(\mathrm{DB}_{\mathrm{Detection}} - \mathrm{DB}_{\mathrm{Baseline,\,Mean}}) / \mathrm{DB}_{\mathrm{Baseline,\,SD}}$ |
| Average | Average of interday deviation metrics | |
| Median | Median of interday deviation metrics | |
| Maximum | Maximum of interday deviation metrics | |
| Minimum | Minimum of interday deviation metrics | |
| Range | Range of interday deviation metrics | |
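The deviation metrics and their aggregates in the table above can be sketched in a few lines of Python. This is an illustrative reading of the table, not the study's implementation: the function and key names are hypothetical (including `z_score` for the mean/SD-standardized deviation), and the quartile method used for the IQR and the sample SD are assumptions.

```python
from statistics import mean, median, stdev, quantiles


def deviation_metrics(baseline, detection):
    """Per-day deviation metrics for detection-period values relative to
    summary statistics of the baseline period."""
    b_med = median(baseline)
    q1, _, q3 = quantiles(baseline, n=4)   # assumed quartile convention
    b_iqr = q3 - q1
    b_mean, b_sd = mean(baseline), stdev(baseline)
    return {
        "delta": [x - b_med for x in detection],
        "delta_normalized": [(x - b_med) / b_med for x in detection],
        "delta_standardized": [(x - b_med) / b_iqr for x in detection],
        "z_score": [(x - b_mean) / b_sd for x in detection],
    }


def aggregate(values):
    """Aggregate a list of per-day deviation values into summary features."""
    return {
        "average": mean(values),
        "median": median(values),
        "maximum": max(values),
        "minimum": min(values),
        "range": max(values) - min(values),
    }


# Synthetic values for illustration only.
metrics = deviation_metrics(baseline=[60, 62, 64, 66, 68], detection=[64, 70, 76])
features = {name: aggregate(vals) for name, vals in metrics.items()}
print(features["delta"])
```

Each combination of a deviation metric and an aggregate yields one candidate feature per biomarker and detection window.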
Fig. 2 Overview of digital biomarker exploration and feature engineering for the ITA model development on the AF cohort.
a Time-series plot of the deviation in digital biomarkers (ΔSteps and ΔRHR) in the detection window compared to baseline periods, between the participants diagnosed as COVID-19 positive and negative. The horizontal dashed line displays the baseline median and the confidence bounds show the 95% confidence intervals. b Heatmaps of steps and RHR features that are statistically significantly different (p value <0.05; unpaired t-tests) in a grid search across different combinations of detection end date (DED) and detection window length (DWL), with green boxes showing p values <0.05 and gray boxes showing p values ≥0.05. The p values are adjusted with the Benjamini–Hochberg method for multiple hypothesis correction. c Summary of the significant features (p value <0.05; unpaired t-tests) from b, with each box showing the number of statistically significant features for the different combinations of DED and DWL. The intersection of the significant features across DWLs of 3 and 5 days with a common DED of 1 day prior to the test date (as shown using the black rectangle) was used for the ITA model development. d Box plots comparing the distribution of the two most significant steps and RHR features between the participants diagnosed as COVID-19 positive and negative. The centerlines denote feature medians, bounds of boxes represent 25th and 75th percentiles, whiskers denote nonoutlier data range, and the diamonds denote outlier values.
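The Benjamini–Hochberg adjustment mentioned in panel b can be sketched as follows. This is a generic textbook implementation of the procedure, not the authors' code; the function name and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Each p value is scaled by m / rank (rank in ascending order), then a
    running minimum from the largest rank downward enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four features in one DED/DWL grid cell.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In this toy example only the first feature stays below 0.05 after adjustment, which shows why the number of significant features per grid cell can shrink once the correction is applied.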
Fig. 3 Prediction and ranking results of the ITA models on the training sets for the AF (a, d, and g), AHF (b, e, and h), and FHF (c, f, and i) cohorts using features from a combination of Steps and RHR (blue), Steps (green), and RHR (violet) digital biomarkers.
a–c Receiver operating characteristic curves (ROCs) and d–f precision-recall curves (PRCs) for the discrimination between COVID-19-positive participants and -negative participants in the training set. The light blue, light green, and light violet areas show one standard deviation from the mean of the ROCs/PRCs generated from 10-fold nested cross-validation on the training set, and the red dashed line shows the results based on a Random Testing Allocation (RTA) model (the null model). g–i The positivity rate of the diagnostic testing subpopulation as determined by ITA given a specific number of available diagnostic tests. The red dashed line displays the positivity rate/pretest probability of an RTA (null) model.
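The ranking results in panels g–i can be read as a "positivity rate at k" calculation: rank individuals by model score, test the top k, and compare the resulting positivity rate with that of random allocation. The sketch below illustrates this reading under stated assumptions; the function names, scores, and labels are hypothetical and not taken from the study.

```python
def positivity_rate_at_k(scores, labels, k):
    """Positivity rate among the k individuals with the highest scores.
    `labels` are 1 for a positive diagnostic test and 0 for a negative one."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of individuals")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def random_positivity_rate(labels):
    """Expected positivity rate under random allocation (the null model):
    the overall fraction of positives in the population."""
    return sum(labels) / len(labels)


# Hypothetical model scores and test outcomes for ten individuals.
scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.05, 0.3, 0.4, 0.6, 0.5]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

ita_rate = positivity_rate_at_k(scores, labels, k=2)
rta_rate = random_positivity_rate(labels)
print(f"ITA: {ita_rate:.2f}, RTA: {rta_rate:.2f}, fold: {ita_rate / rta_rate:.1f}")
```

Sweeping k from 1 to the population size traces a curve of the kind shown in the panels, which approaches the random-allocation rate as k approaches the full population.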
Fig. 4 Prediction and ranking results of the ITA models on the test set of the FHF cohort using RHR digital biomarkers.
a ROC and b PRC for the discrimination between COVID-19-positive participants (n = 7) and -negative participants (n = 56). The red dashed line shows the results based on an RTA model. c Positivity rate of the diagnostic testing subpopulation as determined by ITA given a specific number of available diagnostic tests. The red dashed line shows the positivity rate of an RTA (null) model.