Shakti K Davis1, Joseph L Natale2, Ashley E Mason3, Frederick M Hecht4, Wendy Hartogensis4, Natalie Damaso1, Kajal T Claypool1,5, Stephan Dilchert6, Subhasis Dasgupta7, Shweta Purawat7, Varun K Viswanath8, Amit Klein9, Anoushka Chowdhary4, Sarah M Fisher10, Claudine Anglo4, Karena Y Puldon4, Danou Veasna4, Jenifer G Prather4, Leena S Pandya4, Lindsey M Fox4, Michael Busch11, Casey Giordano12, Brittany K Mercado13, Jining Song7, Rafael Jaimes1, Brian S Baum1, Brian A Telfer1, Casandra W Philipson1, Paula P Collins1, Adam A Rao14, Edward J Wang8, Rachel H Bandi15, Bianca J Choe16, Elissa S Epel17, Stephen K Epstein18, Joanne B Krasnoff19, Marco B Lee20, Shi-Wen Lee21, Gina M Lopez22, Arpan Mehta23, Laura D Melville24, Tiffany S Moon25, Lilianne R Mujica-Parodi26, Kimberly M Noel27, Michael A Orosco28, Jesse M Rideout29, Janet D Robishaw19, Robert M Rodriguez30, Kaushal H Shah31, Jonathan H Siegal32, Amarnath Gupta2,7, Ilkay Altintas2,7, Benjamin L Smarr2,9.
Abstract
Early detection of diseases such as COVID-19 could be a critical tool in reducing disease transmission by helping individuals recognize when they should self-isolate, seek testing, and obtain early medical intervention. Consumer wearable devices that continuously measure physiological metrics hold promise as tools for early illness detection. We gathered daily questionnaire data and physiological data using a consumer wearable (Oura Ring) from 63,153 participants, of whom 704 self-reported possible COVID-19 disease. We selected 73 of these 704 participants with reliable confirmation of COVID-19 by PCR testing and high-quality physiological data for algorithm training to identify onset of COVID-19 using machine learning classification. The algorithm identified COVID-19 an average of 2.75 days before participants sought diagnostic testing with a sensitivity of 82% and specificity of 63%. The receiver operating characteristic (ROC) area under the curve (AUC) was 0.819 (95% CI [0.809, 0.830]). Including continuous temperature yielded an AUC 4.9% higher than without this feature. For further validation, we obtained SARS-CoV-2 antibody test results in a subset of participants and identified 10 additional participants who self-reported COVID-19 disease with antibody confirmation. The algorithm had an overall ROC AUC of 0.819 (95% CI [0.809, 0.830]), with a sensitivity of 90% and specificity of 80% in these additional participants. Finally, we observed substantial variation in accuracy based on age and biological sex. Findings highlight the importance of including temperature assessment, using continuous physiological features for alignment, and including diverse populations in algorithm development to optimize accuracy in COVID-19 detection from wearables.
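
To make the reported metrics concrete, the following minimal Python sketch shows how sensitivity, specificity, and ROC AUC can be computed from binary labels and classifier scores. The function names, the toy labels and scores, and the 0.5 threshold are assumptions for illustration only; they are not the study's data or code.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = illness day, 0 = healthy day.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes performance across all thresholds, which is why the abstract reports both.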
Year: 2022 PMID: 35236896 PMCID: PMC8891385 DOI: 10.1038/s41598-022-07314-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Enrollment and follow-up.
Participant characteristics.
| Characteristic | Overall* | COVID-19 Training* | COVID-19 Independent Validation* |
|---|---|---|---|
| 18–30 years | 8,555 (14%) | 9 (12%) | 1 (10%) |
| 31–40 years | 16,756 (27%) | 22 (31%) | 1 (10%) |
| 41–50 years | 17,502 (29%) | 18 (25%) | 3 (30%) |
| 51–80 years | 18,148 (30%) | 23 (32%) | 5 (50%) |
| 81+ years | 102 (0.2%) | 0 (0%) | 0 (0%) |
| Female | 24,374 (40%) | 29 (40%) | 2 (20%) |
| Male | 36,632 (60%) | 43 (60%) | 8 (80%) |
| Other | 56 (< 0.1%) | 0 (0%) | 0 (0%) |
| Non-Hispanic White | 50,130 (82%) | 61 (85%) | 9 (90%) |
| Non-Hispanic Black | 909 (1.5%) | 1 (1.4%) | 0 (0%) |
| African | 107 (0.2%) | 0 (0%) | 0 (0%) |
| American Indian | 98 (0.2%) | 0 (0%) | 0 (0%) |
| Native Hawaiian or Other Pacific Islander | 161 (0.3%) | 0 (0%) | 0 (0%) |
| Asian | 3,313 (5.4%) | 1 (1.4%) | 0 (0%) |
| South Asian | 936 (1.5%) | 0 (0%) | 0 (0%) |
| Middle Eastern | 595 (1.0%) | 1 (1.4%) | 1 (10%) |
| Other | 4,768 (7.8%) | 8 (11%) | 0 (0%) |
| Hispanic or Latino Origin | 3,570 (5.8%) | 11 (15%) | 0 (0%) |
| Non-Hispanic | 57,491 (94%) | 61 (85%) | 10 (100%) |
| Less than high school | 284 (0.5%) | 0 (0%) | 0 (0%) |
| High School/GED or some college | 7,958 (13%) | 13 (18%) | 1 (10%) |
| Associate degree or higher | 51,754 (85%) | 58 (81%) | 9 (90%) |
| Didn't specify | 1,066 (1.7%) | 1 (1.4%) | 0 (0%) |
| Frontline worker | 7,810 (13%) | 8 (11%) | 2 (20%) |
| 0 | 53,222 (87%) | 69 (96%) | 10 (100%) |
| 1 | 5,269 (8.6%) | 2 (2.8%) | 0 (0%) |
| 2 or more | 2,524 (4.1%) | 1 (1.4%) | 0 (0%) |
| No Symptoms | (n/a) | 8 (12%) | 0 (0%) |
| 1 to 3 Symptoms | (n/a) | 33 (48%) | 1 (12%) |
| 4 to 6 Symptoms | (n/a) | 24 (35%) | 7 (88%) |
| Greater than 6 Symptoms | (n/a) | 4 (5.8%) | 0 (0%) |
| 1 or more tests | 8,736 (14%) | 33 (45%) | 10 (100%) |
| No tests | 54,417 (86%) | 40 (55%) | 0 (0%) |
| Non-reactive | 7,710 (88%) | 6 (18%) | 0 (0%) |
| Reactive | 175 (2.0%) | 18 (55%) | 10 (100%) |
| Indeterminate | 851 (9.7%) | 9 (27%) | 0 (0%) |
*n (% of non-missing). ¹The mean number of daily symptoms was computed for the period spanning 3 days before to 3 days after the diagnosis date (for participants with a confirmed diagnosis). In the overall sample, n = 61,063 had available age data; n = 61,062 had available sex data, education data, and frontline worker status data; n = 61,017 had available race data; n = 61,061 had available ethnicity data; and n = 61,015 had available comorbid condition data. Within the COVID-19 training data, n = 72 had available data for age, sex, race, ethnicity, education, frontline worker status, and comorbid conditions; n = 69 had available data for symptoms. Within the COVID-19 independent validation data, n = 8 had available data for symptoms; n = 10 had available data for all other characteristics.
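
As a worked check of the "% of non-missing" convention, the short Python sketch below recomputes one table percentage using the per-characteristic denominator from the footnote rather than the full enrollment of 63,153. The helper name `pct_non_missing` is hypothetical.

```python
def pct_non_missing(count, n_available):
    """Percentage of participants in a category among those with non-missing data."""
    return 100.0 * count / n_available


# 18-30 years, overall sample: 8,555 participants out of 61,063 with age data.
age_18_30 = pct_non_missing(8555, 61063)

# The same count divided by all 63,153 enrolled participants gives a smaller share.
age_18_30_all = pct_non_missing(8555, 63153)
```

The first value rounds to the 14% shown in the table; dividing by all enrolled participants would understate each category's share.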
Figure 2. Algorithms aligned by PX can be used to classify COVID-19 infection. Each panel shows a set of receiver operating characteristic (ROC) curves with shading indicating the 95% CI. PX = date of maximal change from average over the 21-day DX region; SX = date of onset of one of four core symptoms of COVID-19; DX = date of diagnostic testing for COVID-19; HR = heart rate; HRV = heart rate variability; RR = respiratory rate. Numbers given relative to PX, SX, and DX refer to the number of days before (negative numbers) or after (positive numbers) each of these dates. Models trained by alignment to PX were more accurate as the evaluation window approached PX (A; from red pre-PX to blue post-PX; n = 73; in all cases, the number of negative training samples was 179,010; the numbers of positive training samples were 8678, 9059, 9527, 9719, and 9705, respectively), with peak accuracy at the window of PX + 0: PX + 2 days. Models trained by alignment to DX performed best when evaluated relative to PX (B; n = 41, restricted to the subset of individuals with reliable symptom onset reports). Models trained by alignment to PX, SX, and DX performed comparably when evaluated at PX + 0: PX + 2 days (C; n = 41). Exclusion of any physiological measure lowered performance, with the ROC AUC dropping the most when HRV was omitted (D; n = 73).
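
The window notation in this caption (for example, PX + 0: PX + 2 days) can be read as an inclusive range of day offsets around an anchor date. The Python sketch below illustrates that reading; the function names and the example date are hypothetical, and treating both endpoints as inclusive is an assumption.

```python
from datetime import date, timedelta


def window_dates(anchor, start_offset, end_offset):
    """List the dates from anchor + start_offset to anchor + end_offset, inclusive."""
    return [anchor + timedelta(days=d) for d in range(start_offset, end_offset + 1)]


def in_window(day, anchor, start_offset, end_offset):
    """True if `day` falls inside the inclusive offset window around `anchor`."""
    return start_offset <= (day - anchor).days <= end_offset


px = date(2020, 4, 10)  # hypothetical PX for one participant

# "PX + 0: PX + 2" covers PX itself and the two following days.
peak_window = window_dates(px, 0, 2)

# A pre-PX window such as "PX - 2: PX + 0" uses negative offsets.
is_pre_px = in_window(date(2020, 4, 9), px, -2, 0)
```

The same helpers apply unchanged when the anchor is SX or DX instead of PX.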
Figure 3. Continuous physiology data allow more precise alignment for machine learning training and sickness profiling. We analyzed continuous heart rate (HR, beats per min, blue) and respiratory rate (RR, breaths per min, purple) within the presumptive illness window of DX - 2 weeks: DX + 1 week (grey shaded region) to detect statistical deviations (A, dashed lines; zoom in B); the average location of the two detections defined PX (yellow). On average, the interval from PX to DX was 1 day longer than the interval from SX to DX (C); SX (n = 67) relies on self-report and so is missing in some cases when PX (n = 73) is present. Profiles of physiological data aligned by PX from the n = 73 cohort are shown for heart rate (HR) and heart rate variability (HRV; D), dermal temperature during wake and sleep (E), and estimated metabolic equivalents (MET) of physical activity and respiratory rate (RR; F). See Fig. 2 and Methods for definitions of DX, PX, and SX.
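
A minimal Python sketch of the PX idea described in this caption follows: find the day of largest deviation in HR and in RR within the window around DX, then average the two locations. The deviation statistic used here (absolute difference from the window mean of daily values) is an assumption, since the caption does not specify the detection method; the function names, the 21-day offset range, and the toy data are likewise hypothetical.

```python
from statistics import mean


def day_of_max_deviation(daily_values, offsets):
    """Return the day offset whose value deviates most from the window mean."""
    baseline = mean(daily_values)
    deviations = [abs(v - baseline) for v in daily_values]
    idx = max(range(len(deviations)), key=deviations.__getitem__)
    return offsets[idx]


def estimate_px(hr, rr, offsets):
    """Average the HR and RR detection days to get PX as an offset from DX."""
    return (day_of_max_deviation(hr, offsets) + day_of_max_deviation(rr, offsets)) / 2


# Assumed 21 daily offsets relative to DX: DX - 14 through DX + 6.
offsets = list(range(-14, 7))

# Toy daily summaries (hypothetical): flat baselines with one elevated day each.
hr = [60.0] * 21
hr[11] = 75.0  # HR peak at DX - 3
rr = [15.0] * 21
rr[13] = 19.0  # RR peak at DX - 1

px_offset = estimate_px(hr, rr, offsets)
```

In this toy case the two detections fall at DX - 3 and DX - 1, so PX lands at DX - 2. Because PX is derived from physiology rather than self-report, it can be computed even when SX is missing.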
Figure 4. Accuracy changes across different populations. The model trained at PX + 0: PX + 2 showed different performance accuracy (ROC AUC) when we segregated participants by antibody test result (A), sex (B), and age group (C); [95% CI] (N). Each panel uses the participants (n = 73) who reported positive diagnostic tests for SARS-CoV-2 and were included in algorithm training. Pos = positive, Indet = indeterminate, Neg = negative antibody test. The algorithm performed as expected on individuals with positive antibody tests (red), who were very similar to individuals with indeterminate antibody tests (purple). The algorithm was less accurate for individuals with negative antibody tests (green), consistent with the algorithm showing COVID-19 specificity. The ROC AUC for women was lower than the ROC AUC for men. Age groups showed different levels of overall accuracy that were not merely proportional to N.
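
Subgroup AUCs are reported here with 95% confidence intervals. One common way to obtain such an interval is a percentile bootstrap, sketched below in Python. This is an illustrative assumption: the caption does not state how the study computed its intervals, and the function names, resampling count, and toy data are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, seed=0):
    """Percentile-bootstrap 95% CI for the AUC, resampling observations with replacement."""
    if len(set(labels)) < 2:
        raise ValueError("need at least one positive and one negative label")
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # skip resamples that lost one of the classes
            continue
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    return aucs[int(0.025 * n_boot)], aucs[int(0.975 * n_boot) - 1]


# Toy subgroup (hypothetical): labels and classifier scores for one stratum.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.5, 0.3, 0.3, 0.2, 0.1, 0.8]

lower, upper = bootstrap_auc_ci(labels, scores, n_boot=200)
```

Running the same procedure separately for each stratum (for example, by sex or age group) gives one interval per subgroup; smaller strata generally produce wider intervals.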