Daphne Chylinski1, Franziska Rudzik2,3, Dorothée Coppieters T Wallant4, Martin Grignard1, Nora Vandeleene1, Maxime Van Egroo1, Laurie Thiesse2,3, Stig Solbach2, Pierre Maquet1,5, Christophe Phillips1,6, Gilles Vandewalle1, Christian Cajochen2,3, Vincenzo Muto1.
Abstract
Arousals during sleep are transient accelerations of the EEG signal, considered to reflect sleep perturbations associated with poorer sleep quality. They are typically detected by visual inspection, which is time-consuming and subjective, and limits comparability across scorers, studies and research centres. We developed a fully automatic algorithm that aims to detect artefact and arousal events in whole-night EEG recordings, based on time-frequency analysis with adapted thresholds derived from individual data. We ran automated arousal detection on 35 sleep EEG recordings from healthy young and older individuals and compared it against human visual detection from two research centres to evaluate the algorithm's performance. Comparison across human scorers revealed high variability in the number of detected arousals, which was always lower than the number detected automatically. Despite indexing more events, automatic detection showed high agreement with human detection, as reflected by its correlation with human raters and very good Cohen's kappa values. Finally, the sex of participants and sleep stage did not influence performance, while age may impact automatic detection, depending on the human rater considered as gold standard. We propose our freely available algorithm as a reliable and time-saving alternative to visual detection of arousals.
Keywords: arousals; artefacts; automatic detection; electroencephalography; sleep
Year: 2020 PMID: 32803153 PMCID: PMC7115937 DOI: 10.3390/clockssleep2030020
Source DB: PubMed Journal: Clocks Sleep ISSN: 2624-5175
Distribution of recordings in the dataset across raters from Basel according to age and sex.
| Scorer | Young F | Young M | Young Total | Older F | Older M | Older Total |
|---|---|---|---|---|---|---|
| BAS1 | 3 | 3 | 6 | 4 | 5 | 9 |
| BAS2 | 0 | 4 | 4 | 2 | 2 | 4 |
| BAS3 | 2 | 2 | 4 | 0 | 1 | 1 |
| BAS4 | 2 | 2 | 4 | 2 | 1 | 3 |
| TOTAL | 7 | 11 | 18 | 8 | 9 | 17 |
Figure 1. Two levels of comparison were made: 1 s epoch or event. Gold standard human rater (HR) scoring is represented on the top line, with arousals marked by the hatched squares. Automatic detection (AD) is on the second line, with black squares marking detected events. Adapted from [34]. True positive (TP): 1 s epoch/event marked as arousal by both AD and HR. False positive (FP): 1 s epoch/event marked as arousal by AD but not by HR. True negative (TN): 1 s epoch/event marked as arousal-free by both AD and HR. False negative (FN): 1 s epoch/event marked as arousal-free by AD but as an arousal by HR.
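The epoch-level comparison described in the caption can be illustrated with a minimal Python sketch. It assumes each scoring is available as a sequence with one 0/1 flag per 1 s epoch (1 = arousal); the names `count_epochs`, `sensitivity`, `false_discovery_ratio` and `agreement` are hypothetical, and the formulas are the standard textbook definitions, which may differ in detail from the coefficients reported in the paper (some of which are computed at the event level).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class EpochCounts:
    tp: int  # arousal in both AD and HR
    fp: int  # arousal in AD only
    tn: int  # arousal-free in both
    fn: int  # arousal in HR only


def count_epochs(hr: Sequence[int], ad: Sequence[int]) -> EpochCounts:
    """Compare two per-second scorings, with HR taken as gold standard."""
    if len(hr) != len(ad):
        raise ValueError("HR and AD must cover the same number of 1 s epochs")
    tp = fp = tn = fn = 0
    for h, a in zip(hr, ad):
        if h and a:
            tp += 1
        elif a:
            fp += 1
        elif h:
            fn += 1
        else:
            tn += 1
    return EpochCounts(tp, fp, tn, fn)


def sensitivity(c: EpochCounts) -> float:
    """Share of HR arousal epochs that AD also marked: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def false_discovery_ratio(c: EpochCounts) -> float:
    """Share of AD arousal epochs that HR did not mark: FP / (FP + TP)."""
    return c.fp / (c.fp + c.tp)


def agreement(c: EpochCounts) -> float:
    """Share of epochs on which AD and HR agree: (TP + TN) / total."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Toy example: ten 1 s epochs (made-up values, not data from the study).
hr_scoring = [0, 1, 1, 1, 0, 0, 0, 0, 1, 1]
ad_scoring = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
counts = count_epochs(hr_scoring, ad_scoring)
print(counts, sensitivity(counts), false_discovery_ratio(counts), agreement(counts))
```

In this toy example, AD misses the first second of one HR arousal (one FN) and extends one second past it (one FP), the kind of boundary mismatch that lowers epoch-level scores even when both scorings mark the same event.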
Cohen’s κ values and their interpretation, from [38].
| Kappa Value | Interpretation |
|---|---|
| <0.00 | Poor |
| 0.00–0.20 | Slight |
| 0.21–0.40 | Fair |
| 0.41–0.60 | Moderate |
| 0.61–0.80 | Substantial |
| 0.81–1.00 | Almost perfect |
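As an illustration of how a κ value relates to the table above, the following Python sketch computes Cohen's kappa from the four confusion counts of a binary (arousal / arousal-free) comparison and maps it to the interpretation bands. The function names `cohens_kappa` and `interpret_kappa` are hypothetical, and the example counts are made up; this is the standard two-rater formula, not necessarily the exact computation used in the paper.

```python
import math


def cohens_kappa(tp: int, fp: int, tn: int, fn: int) -> float:
    """Cohen's kappa for two binary scorings: (p_o - p_e) / (1 - p_e)."""
    n = tp + fp + tn + fn
    if n == 0:
        raise ValueError("no epochs to compare")
    p_observed = (tp + tn) / n
    # Chance agreement from each rater's marginal rates of "arousal"
    # and "arousal-free".
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if math.isclose(p_expected, 1.0):
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_observed - p_expected) / (1.0 - p_expected)


def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to the interpretation bands of the table."""
    if kappa < 0.0:
        return "Poor"
    bands = [
        (0.20, "Slight"),
        (0.40, "Fair"),
        (0.60, "Moderate"),
        (0.80, "Substantial"),
        (1.00, "Almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Made-up counts: 100 epochs, 90 of them scored identically.
k = cohens_kappa(tp=40, fp=5, tn=50, fn=5)
print(round(k, 3), interpret_kappa(k))
```

Because kappa discounts the agreement expected by chance, 90% raw agreement in this example yields a κ of about 0.80 rather than 0.90.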
Agreement coefficients (mean ± standard deviation) between HRs, with each HR compared in turn to the other considered as gold standard: Ss (inter-rater agreement); κs (Cohen’s kappa); See (sensitivity); Cs (mean overlap of events); FDRe (false discovery ratio).
| Gold Standard | Compared | Ss | κs | See | Cs | FDRe |
|---|---|---|---|---|---|---|
| BAS | DC | 94 ± 3% | 0.97 ± 0.02 | 58 ± 16% | 72 ± 7% | 36 ± 12% |
| DC | BAS | 89 ± 4% | 0.94 ± 0.02 | 81 ± 26% | 78 ± 12% | 78 ± 9% |
Figure 2. Total number of detected arousal events over all 35 recordings for Basel HR (BAS), Liège HR (DC), and automatic detection (AD), as well as their overlap.
Agreement coefficients (mean ± standard deviation) for all recordings between AD and HR scoring, for both HR inclusive and HR conservative detection as gold standard. EMG indicates automatically detected events that are accompanied by a submental EMG tone increase.
| Gold Standard | Ss | κs | See | Cs | FDRe |
|---|---|---|---|---|---|
| HR inclusive | 86 ± 6% | 0.93 ± 0.03 | 67 ± 23% | 59 ± 13% | 61 ± 16% |
| HR conservative | 88 ± 4% | 0.94 ± 0.02 | 83 ± 26% | 58 ± 14% | 74 ± 12% |
Figure 3. Box plot of Cohen’s kappa values for HR inclusive and HR conservative as gold standard (values for all AD arousals in grey, for EMG-associated AD only in white). The boxes’ central lines indicate the medians of κ values, with the bottom and upper edges showing the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points not considered outliers; outliers were not removed from the plot.
Raw number of arousals detected per rater for young and older individuals (mean ± SD).
| | DC | BAS * | Inclusive | Conservative | AD | AD EMG |
|---|---|---|---|---|---|---|
| Young | 63 ± 31 | 84 ± 30 | 93 ± 37 | 51 ± 19 | 208 ± 48 | 68 ± 22 |
| Old | 79 ± 34 | 141 ± 53 | 142 ± 46 | 67 ± 23 | 193 ± 39 | 70 ± 25 |
* significant difference between age groups (p = 0.0006).
Results of GLMMs for each agreement coefficient and age/sex for Automatic Detection (AD) against HR inclusive and conservative detection as gold standard, with all AD events (normal) and only AD EMG-associated events (italic).
| Gold Standard | AD Arousals | Effect | Ss | κs | See | Cs | FDRe |
|---|---|---|---|---|---|---|---|
| HR inclusive | All | Sex | F = 2.91 | F = 2.83 | F = 2.12 | F = 1.63 | F = 0.48 |
| HR conservative | All | Age | F = 0.00 | F = 0.01 | F = 0.99 | F = 0.03 | F = 3.52 |
| HR conservative | All | Sex | F = 1.64 | F = 1.62 | F = 2.49 | F = 1.24 | F = 0.82 |

For HR inclusive (all AD events), the Age row is only partly preserved: the values F = 2.83, F = 2.49 and F = 1.42 survive, but their column assignment and the remaining cells cannot be determined.
All F tests had 1 (main effect) and 32 (error) degrees of freedom. * Significant after correcting for multiple comparisons (p < 0.005, Bonferroni corrected).
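The corrected threshold is consistent with a standard Bonferroni adjustment. As a worked example, assuming a nominal significance level of α = 0.05 and m = 10 comparisons (five agreement coefficients × two factors, age and sex; both values are assumptions, not stated in the text above):

```latex
\alpha_{\text{corrected}} = \frac{\alpha}{m} = \frac{0.05}{10} = 0.005
```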
Figure 4. Box plot of Cohen’s kappa values for HR inclusive and conservative detection as gold standard by sleep stage. The boxes’ central lines indicate the medians of κ values, with the bottom and upper edges showing the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points not considered outliers, and outliers were not removed from the plot.
Figure 5. Correlation between all AD arousals and HR for inclusive (A) and conservative (B) detection.
Figure 6. Mean relative frequency power for the theta, alpha, and beta bands for arousals detected only by AD (grey), and those detected by AD and HR (white). Error bars show standard deviation.