Aoife Roebuck, Gari D Clifford.
Abstract
Obstructive sleep apnea (OSA) is a disorder characterized by repeated pauses in breathing during sleep, which lead to deoxygenation and voiced chokes at the end of each episode. OSA is associated with daytime sleepiness and an increased risk of serious conditions such as cardiovascular disease, diabetes, and stroke. Between 2 and 7% of the adult population globally has OSA, but it is estimated that up to 90% of those affected are undiagnosed and untreated. Diagnosis of OSA requires expensive and cumbersome screening. Audio offers a potential non-contact alternative, particularly now that capable signal processing is available on virtually every phone. Previous studies have focused on the classification of snoring and apneic chokes. However, such approaches require accurate identification of events, which has led to limited accuracy and small study populations. In this work, we propose an alternative approach in which multiscale entropy (MSE) coefficients are presented to a classifier to identify disorder in vocal patterns indicative of sleep apnea. A database of 858 patients was used, the largest reported in this domain. Apneic choke, snore, and noise events encoded with speech analysis features were input into a linear classifier. Coefficients of MSE derived from the first 4 h of each recording were used to train and test a random forest to classify patients as apneic or not. Standard speech analysis approaches for event classification achieved an out-of-sample accuracy (Ac) of 76.9% with a sensitivity (Se) of 29.2% and a specificity (Sp) of 88.7%, but with high variance. For OSA severity classification, MSE provided an out-of-sample Ac of 79.9%, Se of 66.0%, and Sp of 88.8%. Including demographic information improved the MSE-based classification performance to Ac = 80.5%, Se = 69.2%, and Sp = 87.9%. These results indicate that audio recordings could be used in screening for OSA, but are generally under-sensitive.
Keywords: LPC; MFCCs; MSE; OSA; audio
Year: 2015 PMID: 26380256 PMCID: PMC4550787 DOI: 10.3389/fbioe.2015.00114
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
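The abstract names multiscale entropy (MSE) as the feature fed to the classifier but does not give its parameters. The following is a minimal sketch of the commonly used formulation (coarse-graining followed by sample entropy at each scale), which is assumed here rather than confirmed by the record. The function names, the embedding dimension `m`, the tolerance factor `r_factor`, and `max_scale` are illustrative choices, not values taken from the study.

```python
import math

def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(x) // scale
    return [sum(x[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B).

    B counts pairs of templates of length m that lie within tolerance r
    (Chebyshev distance); A counts the same for length m + 1.
    Returns NaN when either count is zero (entropy undefined).
    """
    n = len(x)

    def count_matches(length):
        count = 0
        # Use the same n - m starting points for both template lengths.
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("nan")
    return -math.log(a / b)

def multiscale_entropy(x, max_scale=5, m=2, r_factor=0.2):
    """Sample entropy of the coarse-grained series at scales 1..max_scale.

    The tolerance is fixed at r_factor times the standard deviation of the
    original series and reused at every scale (an assumed convention).
    """
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    r = r_factor * sd
    return [sample_entropy(coarse_grain(x, s), m, r)
            for s in range(1, max_scale + 1)]
```

Each recording would then contribute one vector of `max_scale` coefficients as classifier input. The pairwise loop is quadratic in the series length, so a practical implementation on hours of audio would need a faster matching scheme or a downsampled envelope.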
Subject demographics for each sub-group: normal, snorer, mild OSA, moderate OSA, and severe OSA (mean ± σ).
| Group | Normal | Snorer | Mild | Moderate | Severe |
|---|---|---|---|---|---|
| Gender | 80 m, 75 f | 166 m, 91 f | 79 m, 28 f | 94 m, 30 f | 167 m, 48 f |
| Age (years) | 45.9 ± 17.1 | 46.5 ± 12.0 | 50.5 ± 11.4 | 53.1 ± 12.4 | 52.5 ± 12.6 |
| Neck (cm) | 39.4 ± 4.6 | 41.4 ± 4.3 | 41.9 ± 4.1 | 42.9 ± 3.8 | 45.0 ± 4.8 |
| Height (cm) | 171.2 ± 10.7 | 173.5 ± 10.4 | 174.2 ± 9.9 | 173.0 ± 9.7 | 175.0 ± 9.1 |
| Weight (kg) | 77.7 ± 23.0 | 96.0 ± 24.2 | 212.0 ± 48.8 | 221.2 ± 49.5 | 247.3 ± 74.4 |
| AHI (events/h) | 4.4 ± 7.5 | 6.4 ± 7.4 | 10.6 ± 9.0 | 21.5 ± 11.6 | 47.5 ± 24.5 |
| ODI (events/h) | 3.7 ± 3.5 | 6.0 ± 5.2 | 10.3 ± 7.0 | 22.0 ± 11.6 | 56.8 ± 32.4 |
| BMI (kg/m²) | 29.6 ± 7.9 | 32.0 ± 8.4 | 31.9 ± 7.9 | 33.8 ± 8.5 | 36.9 ± 11.2 |
| ESS | 11.0 ± 5.6 | 12.0 ± 5.2 | 12.2 ± 4.7 | 12.7 ± 4.7 | 14.1 ± 5.3 |
Neck, neck circumference; m, male; f, female.
Demographics of annotated subjects (mean ± σ), m, male; f, female.
| Parameter | Subjects (mean ± σ) |
|---|---|
| Gender | 17 m, 5 f |
| Age (years) | 48.9 ± 15.3 |
| Neck (cm) | 45.7 ± 3.8 |
| Height (cm) | 177.3 ± 10.7 |
| Weight (kg) | 107.4 ± 24.4 |
| AHI (events/h) | 32.4 ± 31.6 |
| ODI (events/h) | 35.7 ± 34.5 |
| BMI (kg/m²) | 34.3 ± 8.9 |
| ESS | 11.7 ± 5.3 |
The number of each event type at the four different window sizes used.
| Window | 0.5 s | 1 s | 2 s | 3 s |
|---|---|---|---|---|
| F | 175 | 175 | 155 | 82 |
| S | 201 | 201 | 201 | 159 |
| N | 190 | 189 | 185 | 167 |
Figure 1. Pole-zero plots for a choke event and a snoring event. The pole locations differ clearly between the two event types, indicating that it might be possible to distinguish between them. (A) Pole-zero plot for a choke event, where the poles are indicated by the blue crosses. (B) Pole-zero plot for a snoring event, where the poles are indicated by the blue crosses.
Statistics when using clinical thresholds on the demographics, AHI and ODI where both AHI and ODI were automatically calculated by the software.
| Feature | Threshold | Se (%) | Sp (%) | PPV (%) | NPV (%) | Ac (%) |
|---|---|---|---|---|---|---|
| Gender | male | 77.5 | 36.7 | 45.1 | 70.8 | 53.1 |
| Age | 50.0 | 61.7 | 59.0 | 50.2 | 69.6 | 60.0 |
| Neck | 40.0 | 84.9 | 40.4 | 51.7 | 78.1 | 59.5 |
| BMI | 35.0 | 45.0 | 73.8 | 53.4 | 66.8 | 62.3 |
| ESS | 15.0 | 46.4 | 66.4 | 48.5 | 64.5 | 58.3 |
| AHI | 5.0 | 97.4 | 55.3 | 59.2 | 96.9 | 72.1 |
| AHI | 10.0 | 92.9 | 80.1 | 75.6 | 94.4 | 85.2 |
| AHI | ||||||
| AHI | 20.0 | 71.4 | 94.4 | 89.5 | 83.2 | 85.2 |
| AHI | 30.0 | 53.0 | 97.9 | 94.4 | 75.8 | 80.0 |
| ODI | 5.0 | 97.6 | 54.2 | 58.7 | 97.2 | 71.6 |
| ODI | 10.0 | 94.0 | 81.1 | 76.8 | 95.3 | 86.3 |
| ODI | ||||||
| ODI | 20.0 | 74.3 | 96.0 | 92.5 | 84.9 | 87.3 |
| ODI | 30.0 | 56.2 | 98.6 | 96.4 | 77.2 | 81.6 |
The metrics in bold are the baseline to beat as this is the classification problem being addressed: normal/snorer/mild OSA vs. moderate OSA/severe OSA.
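The columns of the threshold table (Se, Sp, PPV, NPV, Ac) all follow from a single confusion matrix. The sketch below shows one way to compute them, assuming a subject is flagged positive when the feature value is at or above the threshold (the record does not state whether the comparison is inclusive) and that the positive class is moderate or severe OSA. The function name `screening_metrics` and the toy values are hypothetical and not taken from the study.

```python
def screening_metrics(values, labels, threshold):
    """Return Se, Sp, PPV, NPV and Ac (all in %) for a single threshold.

    values:    per-subject feature values (e.g. AHI in events/h)
    labels:    True if the subject belongs to the positive class
    threshold: predict positive when value >= threshold (assumed convention)
    """
    tp = fp = tn = fn = 0
    for value, is_positive in zip(values, labels):
        predicted = value >= threshold
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif not predicted and is_positive:
            fn += 1
        else:
            tn += 1

    def pct(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return 100.0 * num / den if den else float("nan")

    return {
        "Se": pct(tp, tp + fn),
        "Sp": pct(tn, tn + fp),
        "PPV": pct(tp, tp + fp),
        "NPV": pct(tn, tn + fn),
        "Ac": pct(tp + tn, tp + fp + tn + fn),
    }

# Toy example with made-up values, for illustration only.
example = screening_metrics(
    values=[3, 8, 12, 25, 40, 18],
    labels=[False, False, False, True, True, True],
    threshold=10,
)
```

Raising the threshold trades sensitivity for specificity, which is the pattern visible in the AHI and ODI rows of the table.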
Performance when using standard speech analysis techniques and LDA on the test data.
| Window | Data | Se (%) | Sp (%) | PPV (%) | NPV (%) | Ac (%) | AUC |
|---|---|---|---|---|---|---|---|
| 0.5 s | L | 2.6 ± 3.6 | 96.3 ± 2.5 | 12.4 ± 17.0 | 69.7 ± 8.6 | 68.0 ± 8.2 | 0.58 ± 0.13 |
| | C | 5.1 ± 5.0 | 89.7 ± 7.4 | 17.3 ± 17.4 | 68.8 ± 9.2 | 64.1 ± 7.8 | 0.51 ± 0.10 |
| | L and C | 12.6 ± 4.0 | 81.4 ± 17.3 | 31.5 ± 17.2 | 67.0 ± 5.4 | 60.0 ± 10.7 | 0.53 ± 0.14 |
| | L and D | 29.8 ± 38.9 | 66.2 ± 28.7 | 34.6 ± 41.7 | 71.1 ± 6.2 | 56.7 ± 12.7 | 0.57 ± 0.09 |
| | C and D | 18.2 ± 20.0 | 80.2 ± 12.2 | 29.0 ± 16.8 | 70.8 ± 12.1 | 60.7 ± 4.0 | 0.57 ± 0.14 |
| | L, C, and D | 38.8 ± 37.7 | 69.9 ± 27.3 | 33.5 ± 31.8 | 75.6 ± 17.6 | 57.0 ± 13.5 | 0.61 ± 0.17 |
| 1 s | L | 0.9 ± 1.3 | 97.8 ± 2.2 | NaN ± NaN | 69.0 ± 9.8 | 68.0 ± 9.3 | 0.48 ± 0.07 |
| | C | 6.9 ± 6.1 | 90.6 ± 5.5 | 21.7 ± 15.9 | 69.6 ± 11.7 | 66.1 ± 12.6 | 0.50 ± 0.08 |
| | L and C | 5.6 ± 4.7 | 86.1 ± 5.6 | 19.4 ± 19.4 | 68.7 ± 12.7 | 62.1 ± 9.0 | 0.49 ± 0.07 |
| | L and D | 26.4 ± 27.9 | 72.4 ± 22.0 | 29.7 ± 27.9 | 69.4 ± 11.2 | 61.2 ± 14.0 | 0.53 ± 0.15 |
| | C and D | 20.1 ± 11.9 | 75.4 ± 6.9 | 26.6 ± 18.3 | 69.3 ± 2.6 | 59.2 ± 4.4 | 0.54 ± 0.15 |
| | L, C, and D | 18.5 ± 16.5 | 80.2 ± 17.8 | 20.6 ± 15.2 | 70.1 ± 12.9 | 63.2 ± 12.9 | 0.53 ± 0.08 |
| 2 s | L | 9.1 ± 7.2 | 93.9 ± 5.8 | NaN ± NaN | 73.1 ± 16.3 | 69.8 ± 13.9 | 0.57 ± 0.06 |
| | C | 25.8 ± 12.3 | 85.6 ± 6.4 | 42.6 ± 13.7 | 74.4 ± 4.5 | 68.6 ± 5.9 | 0.67 ± 0.08 |
| | L and C | 26.0 ± 14.7 | 86.2 ± 11.7 | 45.3 ± 15.5 | 73.7 ± 10.3 | 67.7 ± 10.2 | 0.65 ± 0.11 |
| | L and D | 31.5 ± 34.8 | 80.8 ± 20.8 | NaN ± NaN | 75.6 ± 11.5 | 64.9 ± 6.9 | 0.62 ± 0.07 |
| | C and D | 31.0 ± 19.2 | 83.0 ± 11.2 | 37.5 ± 31.2 | 76.4 ± 11.2 | 68.5 ± 8.1 | 0.65 ± 0.09 |
| | L, C, and D | 39.4 ± 19.6 | 85.0 ± 7.6 | 48.5 ± 20.5 | 77.6 ± 10.6 | 70.6 ± 7.0 | 0.73 ± 0.03 |
| 3 s | L | 6.4 ± 6.1 | 93.9 ± 4.9 | 25.5 ± 27.7 | 80.4 ± 7.5 | 77.0 ± 8.2 | 0.61 ± 0.08 |
| | C | 10.0 ± 13.7 | 94.5 ± 5.0 | NaN ± NaN | 81.8 ± 9.3 | 78.9 ± 10.5 | 0.62 ± 0.08 |
| | L and C | 28.7 ± 25.6 | 93.6 ± 7.4 | 45.3 ± 41.0 | 86.0 ± 13.0 | 82.1 ± 11.0 | 0.71 ± 0.19 |
| | L and D | 17.3 ± 19.7 | 86.3 ± 13.0 | 14.2 ± 8.5 | 81.0 ± 13.3 | 73.8 ± 15.4 | 0.68 ± 0.17 |
| | C and D | 34.8 ± 19.1 | 89.7 ± 10.6 | 52.6 ± 19.4 | 84.4 ± 6.5 | 77.6 ± 4.9 | 0.76 ± 0.10 |
| | L, C, and D | 29.2 ± 17.4 | 88.7 ± 8.9 | 49.1 ± 29.9 | 83.3 ± 9.1 | | |
L = LPC, C = MFCC, D = demographics. NaN indicates that the classifier never identified a true positive. The metrics in bold indicate the best performance.
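The `NaN ± NaN` entries in the PPV column are consistent with the ratio TP / (TP + FP) being undefined in at least one cross-validation fold and that NaN propagating into the summary. The record does not describe how folds were aggregated, so the sketch below is an assumption: it uses the sample standard deviation and returns NaN for both statistics as soon as any fold is undefined. The names `ppv` and `mean_std` are illustrative.

```python
import math
import statistics

def ppv(tp, fp):
    """Positive predictive value in %, NaN when no positives were predicted."""
    return 100.0 * tp / (tp + fp) if (tp + fp) else float("nan")

def mean_std(fold_values):
    """Mean and sample standard deviation across folds.

    If any fold is NaN, both statistics are reported as NaN, which would
    produce a "NaN ± NaN" table entry (assumed aggregation rule).
    """
    if any(math.isnan(v) for v in fold_values):
        return float("nan"), float("nan")
    return statistics.mean(fold_values), statistics.stdev(fold_values)

# Made-up fold counts (tp, fp), for illustration only.
folds_ok = [ppv(2, 2), ppv(1, 3), ppv(3, 1)]   # 50.0, 25.0, 75.0
folds_bad = [ppv(2, 2), ppv(0, 0), ppv(3, 1)]  # middle fold is undefined
```

An alternative is to average only the defined folds, but that hides how often the classifier made no positive predictions at all.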
LDA Performance when using MSE and demographics on the test data.
| Features | Data set | Se (%) | Sp (%) | PPV (%) | NPV (%) | Ac (%) | AUC |
|---|---|---|---|---|---|---|---|
| MSE | Train | 42.0 ± 14.0 | 81.8 ± 9.5 | 62.5 ± 7.2 | 68.7 ± 2.8 | 66.2 ± 1.1 | 0.66 ± 0.01 |
| | Test | 41.1 ± 14.3 | 78.5 ± 11.7 | 58.8 ± 15.6 | 67.1 ± 8.2 | 63.3 ± 5.2 | 0.64 ± 0.03 |
| MSE | Train | 57.0 ± 4.0 | 78.7 ± 3.3 | 63.8 ± 1.3 | 73.7 ± 1.6 | 70.2 ± 1.3 | 0.76 ± 0.01 |
| | Test | 59.1 ± 7.7 | 77.5 ± 2.8 | 64.2 ± 5.8 | 73.1 ± 7.5 | 69.6 ± 3.4 | 0.74 ± 0.03 |
Performance of the RF when using MSE and demographics on the test data.
| Features | Se (%) | Sp (%) | PPV (%) | NPV (%) | Ac (%) | AUC |
|---|---|---|---|---|---|---|
| MSE | 66.0 ± 6.8 | 88.8 ± 1.8 | 79.0 ± 5.1 | 80.1 ± 4.1 | 80.0 ± 3.2 | 0.86 ± 0.04 |
| MSE + demos | 69.2 ± 5.9 | 87.9 ± 3.9 | 79.0 ± 5.3 | 81.2 ± 5.8 | | |
The metrics in bold indicate the best performance.