Alexander J Boe1,2, Lori L McGee Koch1,3, Megan K O'Brien1,4, Nicholas Shawen1,5, John A Rogers6, Richard L Lieber2,4,7, Kathryn J Reid3, Phyllis C Zee3, Arun Jayaraman1,4.
Abstract
Polysomnography (PSG) is the current gold standard in high-resolution sleep monitoring; however, this method is obtrusive, expensive, and time-consuming. Conversely, commercially available wrist monitors such as ActiWatch can monitor sleep for multiple days and at low cost, but often overestimate sleep and cannot differentiate between sleep stages, such as rapid eye movement (REM) and non-REM. Wireless wearable sensors are a promising alternative for their portability and access to high-resolution data for customizable analytics. We present a multimodal sensor system measuring hand acceleration, electrocardiography, and distal skin temperature that outperforms the ActiWatch, detecting wake and sleep with a recall of 74.4% and 90.0%, respectively, as well as wake, non-REM, and REM with recall of 73.3%, 59.0%, and 56.0%, respectively. This approach will enable clinicians and researchers to more easily, accurately, and inexpensively assess long-term sleep patterns, diagnose sleep disorders, and monitor risk factors for disease in both laboratory and home settings.
Keywords: Biomarkers; Biotechnology; Computational science
Year: 2019 PMID: 31886412 PMCID: PMC6925191 DOI: 10.1038/s41746-019-0210-1
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Participant characteristics and PSG sleep architecture measures.
| Sleep quality metric | Mean (SD) |
|---|---|
| PSQI Global Score | 3.7 (2.1) |
| Total sleep time (min) | 425.75 (32.6) |
| Sleep efficiency (%) | 88.9 (6.8) |
| Sleep onset latency (min) | 15.1 (11.4) |
| Latency to persistent sleep (min) | 25.8 (23.5) |
| WASO (min) | 29.5 (23.7) |
| Stage 1 (%) | 4.4 (1.8) |
| Stage 2 (%) | 51.6 (8.6) |
| Stage SWS (%) | 27.0 (7.7) |
| REM sleep (%) | 16.9 (6.5) |
| REM latency (min) | 194 (89.2) |
PSQI Pittsburgh Sleep Quality Index, WASO wake after sleep onset, SWS slow wave sleep, REM rapid eye movement
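As an illustration of how several of the metrics in this table relate to an epoch-scored hypnogram, the sketch below derives total sleep time, sleep onset latency, WASO, and sleep efficiency from a list of stage labels. The stage labels, the 30 s epoch length, and the choices to treat the whole recording as time in bed and to count WASO up to the end of the recording are assumptions made for the example; the function name `sleep_summary` is hypothetical and is not taken from the study.

```python
def sleep_summary(hypnogram, epoch_min=0.5):
    """Summarize an epoch-wise hypnogram.

    hypnogram: list of stage labels, one per epoch; "W" means wake and any
    other label counts as sleep (hypothetical labeling).
    epoch_min: epoch length in minutes (0.5 = 30 s, an assumption).
    """
    asleep = [stage != "W" for stage in hypnogram]
    if not any(asleep):
        raise ValueError("hypnogram contains no sleep epochs")
    time_in_bed = len(hypnogram) * epoch_min   # whole recording, by assumption
    total_sleep = sum(asleep) * epoch_min
    onset = asleep.index(True)                 # index of the first sleep epoch
    # Wake after sleep onset: wake epochs from onset to the end of recording.
    waso = asleep[onset:].count(False) * epoch_min
    return {
        "total_sleep_time_min": total_sleep,
        "sleep_onset_latency_min": onset * epoch_min,
        "waso_min": waso,
        "sleep_efficiency_pct": 100.0 * total_sleep / time_in_bed,
    }


# Toy 10-minute recording: 2 min awake, 5 min asleep, 1 min awake, 2 min REM.
night = ["W"] * 4 + ["N2"] * 10 + ["W"] * 2 + ["R"] * 4
summary = sleep_summary(night)
```

With these definitions, sleep onset latency, total sleep time, and WASO sum to the time in bed, which is a convenient sanity check on any hypnogram.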
Fig. 1 Performance of ActiWatch wrist sensor.
Confusion matrix for the ActiWatch in a two-stage resolution, depicting average classification rate of wake and sleep stages. The ActiWatch demonstrated high recall of sleep (high sensitivity) but often misclassified wake as sleep (low specificity).
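The relationship between a two-stage confusion matrix and the sensitivity and specificity named in this caption can be sketched as follows. Here sleep is treated as the positive class, so sensitivity is the recall of sleep and specificity is the recall of wake, matching the caption's usage. The function name and the toy labels are hypothetical and do not reproduce the study's data.

```python
def wake_sleep_rates(y_true, y_pred):
    """Return (sensitivity, specificity) with sleep as the positive class."""
    counts = {(t, p): 0 for t in ("wake", "sleep") for p in ("wake", "sleep")}
    for t, p in zip(y_true, y_pred):
        counts[(t, p)] += 1                    # confusion-matrix cell (true, predicted)
    n_sleep = counts[("sleep", "sleep")] + counts[("sleep", "wake")]
    n_wake = counts[("wake", "wake")] + counts[("wake", "sleep")]
    if n_sleep == 0 or n_wake == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = counts[("sleep", "sleep")] / n_sleep   # recall of sleep
    specificity = counts[("wake", "wake")] / n_wake      # recall of wake
    return sensitivity, specificity


# Toy example: every sleep epoch is detected, but 3 of 4 wake epochs
# are scored as sleep, i.e. high sensitivity with low specificity.
true_stages = ["wake"] * 4 + ["sleep"] * 6
pred_stages = ["wake", "sleep", "sleep", "sleep"] + ["sleep"] * 6
sens, spec = wake_sleep_rates(true_stages, pred_stages)
```

A device that scores nearly everything as sleep will always look good on sensitivity alone, which is why both rates are reported together.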
Fig. 2 Performance of a bagging decision tree classifier for different sleep staging resolutions.
Confusion matrices (top), receiver operating characteristic (ROC) curves (middle), and interquartile range (IQR) plots of model performance (bottom), obtained from leave-one-subject-out cross-validation, for (a) two-stage wake vs. sleep classification, (b) three-stage wake vs. NREM vs. REM classification, and (c) four-stage wake vs. light vs. deep vs. REM classification. ROC curves show the trade-off between sensitivity and specificity for a given model across subjects (line: mean; shading: standard deviation). Area under the ROC curve (AUROC) is listed for each stage; a value of 1.0 denotes a perfect classifier, whereas a value of 0.5 denotes a classifier that performs no better than random and has no predictive power. IQR plots illustrate how well the model generalizes across subjects, with smaller ranges indicating more consistent performance and therefore higher generalizability irrespective of the subject (center line: median; box limits: upper and lower quartiles; whiskers: 1.5 × IQR; points: outliers).
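To make the AUROC values in this caption concrete, the sketch below computes AUROC for one stage in a one-vs-rest fashion, using the equivalent pairwise formulation: the probability that a randomly chosen epoch of the target stage receives a higher score than a randomly chosen epoch of any other stage, with ties counted as one half. This is an illustrative calculation, not the study's code; the function name and the toy scores are hypothetical.

```python
def auroc(is_target, scores):
    """AUROC for one class from binary labels and classifier scores.

    is_target: list of booleans, True where the epoch belongs to the stage.
    scores:    list of classifier scores for that stage (higher = more likely).
    """
    pos = [s for flag, s in zip(is_target, scores) if flag]
    neg = [s for flag, s in zip(is_target, scores) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one target and one non-target epoch")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0        # target epoch correctly ranked higher
            elif p == n:
                wins += 0.5        # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy scores for a "wake" class on four epochs.
labels = [True, True, False, False]
partial = auroc(labels, [0.9, 0.4, 0.5, 0.1])   # one of four pairs misordered
perfect = auroc(labels, [0.9, 0.8, 0.2, 0.1])   # all pairs correctly ordered
chance = auroc(labels, [0.5, 0.5, 0.5, 0.5])    # uninformative scores
```

The quadratic pairwise loop is chosen for readability; a rank-based computation gives the same value more efficiently on long recordings.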
Mean (SD) AUROC for different subsets of the proposed sensor system for the two-, three-, and four-stage resolution models.
| Sleep stage | ACC, ECG, TEMP (all) | ACC ND, ECG, all distal TEMP | ACC ND, ECG, all proximal TEMP | ACC ND, ECG, Chest TEMP (single proximal sensor), hand TEMP ND (single distal sensor) | ACC ND, ECG, hand TEMP ND | ACC ND, ECG | ACC ND | ECG | Hand TEMP ND |
|---|---|---|---|---|---|---|---|---|---|
| Wake (2-stage) | 0.89 (0.12) | 0.86 (0.16) | 0.84 (0.15) | 0.87 (0.16) | 0.85 (0.14) | 0.82 (0.19) | 0.72 (0.21) | 0.71 (0.16) | |
| Sleep (2-stage) | 0.89 (0.12) | 0.86 (0.16) | 0.84 (0.15) | 0.87 (0.16) | 0.85 (0.14) | 0.82 (0.19) | 0.72 (0.21) | 0.71 (0.16) | |
| Wake (3-stage) | 0.90 (0.11) | 0.87 (0.15) | 0.84 (0.16) | 0.87 (0.16) | 0.83 (0.17) | 0.81 (0.18) | 0.67 (0.24) | 0.72 (0.17) | |
| NREM (3-stage) | 0.74 (0.13) | 0.75 (0.13) | 0.75 (0.12) | 0.76 (0.13) | 0.76 (0.11) | 0.68 (0.11) | 0.71 (0.12) | 0.58 (0.15) | |
| REM (3-stage) | 0.66 (0.19) | 0.62 (0.17) | 0.63 (0.16) | 0.64 (0.15) | 0.65 (0.15) | 0.45 (0.13) | 0.68 (0.11) | 0.50 (0.14) | |
| Wake (4-stage) | 0.90 (0.11) | 0.87 (0.15) | 0.85 (0.14) | 0.88 (0.14) | 0.85 (0.14) | 0.82 (0.18) | 0.70 (0.24) | 0.71 (0.17) | |
| Light (4-stage) | 0.53 (0.10) | 0.57 (0.09) | 0.56 (0.09) | 0.58 (0.10) | 0.57 (0.10) | 0.58 (0.05) | 0.59 (0.09) | 0.51 (0.08) | |
| Deep (4-stage) | 0.71 (0.10) | 0.70 (0.09) | 0.68 (0.09) | 0.70 (0.09) | 0.70 (0.09) | 0.64 (0.10) | 0.65 (0.11) | 0.57 (0.11) | |
| REM (4-stage) | 0.68 (0.15) | 0.65 (0.14) | 0.66 (0.14) | 0.67 (0.13) | 0.68 (0.13) | 0.47 (0.09) | 0.70 (0.10) | 0.49 (0.13) | |
ACC accelerometer, ECG electrocardiography, TEMP skin temperature, ND non-dominant side
The minimum sensor set is presented in bold
Fig. 3 Effect of number of training subjects on model performance.
Mean and standard deviation (shading) of AUROC for (a) wake, (b) NREM, and (c) REM classes in the three-stage bagging classifier model. The gradual increase in AUROC for each class suggests that a training set larger than N = 11 would continue to improve classification performance.
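One way such a training-size analysis could be organized is sketched below: for each held-out subject, random subsets of the remaining subjects are drawn at each training-set size, and a model would then be trained on each subset and scored on the held-out subject. The number of repeats, the random sampling scheme, the subject IDs, and the function name are all assumptions for illustration; the source does not describe how the subsets were chosen.

```python
import random


def learning_curve_splits(subjects, sizes, repeats=3, seed=0):
    """Yield (test_subject, k, train_subjects) for a subject-wise learning curve."""
    rng = random.Random(seed)                  # fixed seed for reproducible splits
    for test_subject in subjects:
        pool = [s for s in subjects if s != test_subject]
        for k in sizes:
            if k > len(pool):
                raise ValueError("training size exceeds available subjects")
            for _ in range(repeats):
                # Sample k training subjects, never including the test subject.
                yield test_subject, k, sorted(rng.sample(pool, k))


# Hypothetical subject IDs; each yielded split would feed one train/test run.
subject_ids = ["S1", "S2", "S3", "S4", "S5"]
splits = list(learning_curve_splits(subject_ids, sizes=[1, 2, 4], repeats=2))
```

Averaging the per-split AUROC at each size `k` would then give the mean and standard deviation plotted against the number of training subjects.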
Comparison of current study results (proposed sensor set and ActiWatch) to previous wearable sensor work.
| Study | Sensor modalities | Subjects | Model | Specificity (detection of wake) | Sensitivity (detection of sleep) |
|---|---|---|---|---|---|
| Beattie et al. | ACC; PPG | 60 | Linear discriminant classifier; 54 features extracted from 30 s epochs | 69.3% | 94.6% |
| Aktaruzzaman et al. | ACT; ECG | 18 | Support vector machine classifier; 4 features extracted from 7 min epochs | 54% | 81% |
| De Zambotti et al. | PPG; ACC; Gyroscope; TEMP | 41 | Proprietary algorithm by ŌURA Ring (Oulu, Finland); direct comparison between wake/sleep output and PSG for each 30 s epoch | 48% | 96% |
| Fonseca et al. | ACC; PPG | 152 | Linear discriminant classifier; 54 features extracted from 30 s epochs | 58.2% | 96.9% |
| Razjouyan et al. | ACC; ACT | 21 | Threshold optimized by accuracy rate for wake/sleep score; based on 1 min epochs | 53.4% | 94.9% |
Bolded rows indicate results from the current study
Fig. 4 Sensor systems and placement during overnight sleep.
(a) Each participant wore three systems, including the proposed sensor set, consisting of accelerometers (ACC), electrocardiography (ECG), and skin temperature (TEMP), in addition to the wrist actigraphy control device measuring activity counts (ACT) and the gold standard system (PSG). (b) Size and profile comparison of the proposed sensors with the control device. (iButton: Copyright Maxim Integrated Products. Used by permission. ActiWatch: Permission to use ActiWatch Spectrum image was granted by Philips Respironics. BioStampRC: Permission to use BioStampRC image was granted by MC10, Inc.).
Fig. 5 Time synchronization of independent data collection systems.
ActiWatch was synchronized with the BioStampRC by aligning activity counts; BioStampRC was synchronized to the PSG by aligning ECG signals.
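The caption does not state how the signals were aligned. A common approach for this kind of alignment is to search for the time shift that maximizes the cross-correlation between two recordings of the same quantity, and the sketch below illustrates that idea under the assumption that both signals have already been resampled to a common rate. The function name, the toy signals, and the search window are hypothetical.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive result means `other` is delayed: other[i + lag] ~ reference[i].
    """
    ref_mean = sum(reference) / len(reference)
    oth_mean = sum(other) / len(other)
    ref = [x - ref_mean for x in reference]    # mean-center both signals
    oth = [x - oth_mean for x in other]

    def score(lag):
        # Cross-correlation at this lag over the overlapping samples only.
        total = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(oth):
                total += r * oth[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=score)


# Toy example: the same burst appears 3 samples later in the second signal.
signal_a = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
signal_b = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0]
lag_samples = best_lag(signal_a, signal_b, max_lag=5)
```

Dividing the lag by the common sampling rate converts it to a clock offset in seconds, which can then be subtracted from the delayed system's timestamps.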
Features extracted from the sensor data.
| Sensor modality | Sampling frequency (Hz) | No. of features | Features |
|---|---|---|---|
| Accelerometer | 62.5 | 33 | Mean (x,y,z); Minimum (x,y,z); Maximum (x,y,z); Range (x,y,z); Interquartile range (x,y,z); Standard deviation (x,y,z); Kurtosis (x,y,z); Root mean squared (x,y,z); Variance (x,y,z); Pearson’s coefficient (x,y,z); Pearson’s |
| ECG | 1000 | 14 | Mean R-R interval; Minimum R-R interval; Maximum R-R interval; Standard deviation R-R interval; RMSSD; NN50, PNN50; NN20, PNN20; VLF, LF, HF; LF/HF Ratio |
| Skin temperature | 0.0167 | 4 | Mean DPG; Minimum DPG; Maximum DPG; Range DPG |
RMSSD root mean square of successive differences; NNX number of successive R-R intervals that differ by more than X ms; PNNX ratio of NNX to total number of R-R intervals; VLF very low frequency power (activity in the 0.003–0.04 Hz frequency band); LF low frequency power (activity in the 0.04–0.15 Hz frequency band); HF high frequency power (activity in the 0.15–0.40 Hz frequency band); DPG distal-to-proximal gradient[35]
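The time-domain ECG features and the DPG features defined above can be sketched for a single epoch as follows. The code follows the definitions in the footnote, including PNNX as the ratio of NNX to the total number of R-R intervals. Taking DPG as distal minus proximal temperature and using the sample standard deviation are assumptions, as are the function names and the toy values; R-peak detection and the frequency-domain features (VLF, LF, HF) are not covered.

```python
import math
import statistics


def hrv_time_features(rr_ms):
    """Time-domain features from the R-R intervals (in ms) of one epoch."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two R-R intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]   # successive differences
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    nn20 = sum(1 for d in diffs if abs(d) > 20)
    return {
        "mean_rr": statistics.mean(rr_ms),
        "min_rr": min(rr_ms),
        "max_rr": max(rr_ms),
        "sd_rr": statistics.stdev(rr_ms),               # sample SD (assumption)
        "rmssd": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        "nn50": nn50,
        "pnn50": nn50 / len(rr_ms),                     # ratio to total R-R count
        "nn20": nn20,
        "pnn20": nn20 / len(rr_ms),
    }


def dpg_features(distal_c, proximal_c):
    """Mean, minimum, maximum, and range of the distal-to-proximal gradient."""
    dpg = [d - p for d, p in zip(distal_c, proximal_c)]  # distal minus proximal
    return {
        "mean_dpg": statistics.mean(dpg),
        "min_dpg": min(dpg),
        "max_dpg": max(dpg),
        "range_dpg": max(dpg) - min(dpg),
    }


# Toy epoch: five R-R intervals (ms) and three paired temperature samples (deg C).
ecg = hrv_time_features([800, 860, 850, 790, 815])
temp = dpg_features([33.0, 33.5, 34.0], [35.0, 35.0, 35.0])
```

In this toy epoch the successive differences are 60, -10, -60, and 25 ms, so two exceed 50 ms and three exceed 20 ms in magnitude.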