Taehoon Kim, Jeong-Whun Kim, Kyogu Lee.
Abstract
PURPOSE: Breathing sounds during sleep are altered and characterized by various acoustic specificities in patients with sleep disordered breathing (SDB). This study aimed to identify acoustic biomarkers indicative of the severity of SDB by analyzing the breathing sounds collected from a large number of subjects during entire overnight sleep.
Keywords: Acoustic biomarker; Apnea–hypopnea index; Deep neural network; Polysomnography screening test; Sleep disordered breathing
Year: 2018 PMID: 29391025 PMCID: PMC5796501 DOI: 10.1186/s12938-018-0448-x
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819
Fig. 1 Acquisition of sleep sounds and PSG reports in the sleep laboratory. Audio data and PSG reports were recorded from the PSG system. After acquisition, two filtering stages were applied to eliminate unwanted noise for 120 patients
Fig. 2 Audio feature extraction framework. Audio features were extracted in 5-s windows. Statistical values (means and standard deviations) of the features were then calculated over the whole sleep period
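The windowing-then-summarizing scheme in Fig. 2 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it computes only two of the listed features (RMS and zero crossings), and the function names, the non-overlapping windows, and the use of the population standard deviation are assumptions.

```python
import math
from statistics import mean, pstdev


def frame_signal(samples, sample_rate, window_s=5.0):
    """Split a signal into consecutive windows of window_s seconds.

    Assumption: windows do not overlap and a trailing partial window is dropped.
    """
    size = int(sample_rate * window_s)
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def rms(window):
    """Root mean square of one window."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def zero_crossings(window):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(window, window[1:]) if (a >= 0) != (b >= 0))


def summarize(samples, sample_rate, window_s=5.0):
    """Return {feature: (mean, standard deviation)} over all windows."""
    windows = frame_signal(samples, sample_rate, window_s)
    per_window = {
        "rms": [rms(w) for w in windows],
        "zc": [zero_crossings(w) for w in windows],
    }
    # Assumption: population standard deviation over the whole recording.
    return {name: (mean(values), pstdev(values)) for name, values in per_window.items()}
```

Each recording is reduced to one fixed-length vector of per-feature statistics, regardless of how long the subject slept.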
Table 1 List of extracted audio features
| Feature (abbreviation) | Description | # of variables |
|---|---|---|
| Spectral centroid (SC) | Center of mass of the spectrum | 1 |
| Spectral rolloff point (SR) | Right skewness of the power spectrum | 1 |
| Spectral flux (SF) | Amount of spectral change of the signal | 1 |
| Compactness | Sum of results of fast Fourier transform over frequency bins | 1 |
| Spectral variability (SV) | Variance of the magnitude spectrum | 1 |
| Root mean square (RMS) | Power of the signal | 1 |
| Fraction of low energy windows (FLEW) | Quietness of the signal relative to the rest of the signal | 1 |
| Zero crossings (ZC) | The number of times the signal changes sign from one sample to another | 1 |
| Strongest beat (SB) | Highest bin in the beat histogram | 1 |
| Beat sum (BS) | Sum of all values in the beat histogram | 1 |
| Strength of strongest beat (SSB) | Strength of the strongest beat in the signal | 1 |
| Strongest frequency via ZC (SF-ZC) | Strongest frequency in the signal by looking at the ZC | 1 |
| Strongest frequency via SC (SF-SC) | Strongest frequency in the signal by looking at the SC | 1 |
| Strongest frequency via FFT max (SF-FFT) | Highest bin in the power spectrum | 1 |
| MFCC | Short-term power spectrum based on the nonlinear mel scale of frequency | 13 (0–12) |
| Constant-Q based MFCC (CQ-MFCC) | MFCC that directly calculates the logarithmic frequency bins | |
| Linear predictive coding (LPC) | Spectral envelope based on the information of a linear predictive model | 10 (0–9) |
| Method of moments (MM) | Calculation of the first 5 statistical moments | 5 (0–4) |
| Relative difference function (RDF) | Log of the derivative of the RMS | 1 |
| Area method of moments (AMoM) | Numeric quantities at some distance from a reference point or axis | 10 (0–9) |
| AMoM of MFCC | AMoM derived with MFCC values instead of the density distribution function | 10 (0–9) |
| AMoM of CQ-MFCC | AMoM derived with CQ-MFCC values instead of the density distribution function | 10 (0–9) |
| AMoM of Log of CQ Transform (LCQT) | AMoM derived with Log Constant-Q Transform values instead of the density distribution function | 10 (0–9) |
Fig. 3 Selection of the acoustic biomarker. Among the statistical values of the audio features, SDB-severity group discriminators were determined with Tukey-HSD tests. The union of all SDB-severity group discriminators is defined as the acoustic biomarker
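The selection step in Fig. 3 can be sketched as set logic over pairwise test results. The sketch assumes the Tukey-HSD p-values have already been computed elsewhere and are passed in as a dictionary. The rule used here, that a feature discriminates a group when that group differs significantly from every other group, is an assumption, since the caption does not spell out the criterion; the function names are hypothetical.

```python
GROUPS = ("normal", "mild", "moderate", "severe")


def group_discriminators(pairwise_p, alpha=0.05):
    """Map each severity group to the set of features that discriminate it.

    pairwise_p: {feature: {frozenset({group_a, group_b}): p_value}}, holding
    precomputed Tukey-HSD p-values for all six group pairs.
    Assumed rule: a feature discriminates group g if g differs from each of
    the other three groups with p < alpha.
    """
    result = {g: set() for g in GROUPS}
    for feature, pvals in pairwise_p.items():
        for g in GROUPS:
            others = [o for o in GROUPS if o != g]
            if all(pvals[frozenset((g, o))] < alpha for o in others):
                result[g].add(feature)
    return result


def acoustic_biomarker(discriminators):
    """Union of all group discriminators, as described in the caption."""
    return set().union(*discriminators.values())
```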
Fig. 4 Process of qTM extraction. First, absolute magnitude values are quantized into three levels (silence, low-level signal, high-level signal) for simplification. Among the silence periods, apnea candidate periods are determined according to AASM standards, so that the signal is finally quantized into four levels. Temporal transitions of the quantized magnitudes are then derived, and transition probabilities are calculated over the whole sleep period
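The qTM steps in Fig. 4 can be sketched as below. Several details are assumptions rather than values from the source: the two magnitude thresholds, the 10 s minimum silence duration for an apnea candidate (taken from the usual AASM minimum event duration), the frame length, and the row-wise normalization of the transition counts.

```python
SILENCE, LOW, HIGH, APNEA = 0, 1, 2, 3


def quantize(magnitudes, low_thr, high_thr):
    """Quantize absolute magnitudes into silence, low-level, and high-level."""
    levels = []
    for m in magnitudes:
        a = abs(m)
        levels.append(SILENCE if a < low_thr else LOW if a < high_thr else HIGH)
    return levels


def mark_apnea_candidates(levels, frame_s, min_apnea_s=10.0):
    """Relabel silence runs lasting at least min_apnea_s seconds as APNEA."""
    out = list(levels)
    i = 0
    while i < len(out):
        if out[i] != SILENCE:
            i += 1
            continue
        j = i
        while j < len(out) and out[j] == SILENCE:
            j += 1
        if (j - i) * frame_s >= min_apnea_s:
            out[i:j] = [APNEA] * (j - i)
        i = j
    return out


def transition_matrix(levels, n_states=4):
    """Row-normalized transition probabilities between consecutive frames."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(levels, levels[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

With four levels, the matrix has 16 entries, each of which can serve as one input feature for a classifier.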
Fig. 5 Structure of the deep neural network. The network contains two hidden layers with 50 and 25 nodes respectively, two dropout layers, and an output layer with 4 nodes for 4-class classification
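The layer layout in Fig. 5 can be illustrated with a small forward pass in plain Python. Only the layer sizes (50, 25, 4) and the presence of two dropout layers come from the caption. The ReLU activations, the softmax output, the dropout rate of 0.5, the placement of dropout after each hidden layer, and the random initialization are assumptions, and no training loop is shown.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer; weights holds one row of input weights per output node."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def dropout(v, rate, rng, training):
    """Inverted dropout: active only during training."""
    if not training or rate == 0.0:
        return list(v)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in v]


def softmax(v):
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def build_network(n_features, rng):
    """Random weights for layers n_features -> 50 -> 25 -> 4."""
    sizes = [n_features, 50, 25, 4]
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(x, layers, rng, training=False, rate=0.5):
    """Return class probabilities for the 4 severity groups."""
    h = x
    for weights, bias in layers[:-1]:
        h = dropout(relu(dense(h, weights, bias)), rate, rng, training)
    weights, bias = layers[-1]
    return softmax(dense(h, weights, bias))
```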
Clinical statistics of the population for each group (N = 120)
| | Normal | Mild | Moderate | Severe | p value |
|---|---|---|---|---|---|
| Age | 44.1 ± 20.5 | 54.8 ± 14.4 | 53.9 ± 13.3 | 50.3 ± 16.7 | 0.0543 |
| Body mass index | 23.0 ± 3.9 | 24.4 ± 3.3 | 26.9 ± 3.2 | 27.3 ± 4.19 | 1.31e−5* |
| Apnea–hypopnea index | 1.3 ± 1.3 | 9.1 ± 2.6 | 22.1 ± 4.3 | 57.5 ± 19.7 | 1.91e−43* |
Mean ± standard deviation, * p < 0.05
List of SDB severity group discriminators
| | Raw features | First derivatives |
|---|---|---|
| Normal group discriminator | ||
| Mean | Compactness, FLEW, RDF, AMoM of LCQT 2,7-9 | – |
| Standard deviation | Compactness, MFCC 0,3-11, LPC 6-7, AMoM of MFCC 0-1,4,6, AMoM of LCQT 0,2,4,7-9, AMoM of CQ-MFCC 2,4,7, sb 1-8, A weighted, C weighted, L weighted, peak DB, peak DBA, peak DBC | Compactness, FLEW, MFCC 0,2-3,5-12 |
| Minimum | sb6, sb8 | – |
| Mild group discriminator | ||
| Mean | – | – |
| Standard deviation | – | – |
| Moderate group discriminator | ||
| Mean | – | – |
| Standard deviation | FLEW, MFCC 1 | ZC, SF-ZC, MFCC 1, LPC 2,5 |
| Severe group discriminator | ||
| Mean | SF, MM 0, AMoM 0,3,7, AMoM of MFCC 3, AMoM of LCQT 2-3,7-9, AMoM of CQ-MFCC 3,5,9 | |
| Standard deviation | SF, SV, RMS, ZC, SF-ZC, SF-FFT, LPC 1, MM 0, RDF, AMoM 1,3,6-8, AMoM of MFCC 0,1,3,4,6, AMoM of LCQT 0-4, 6-9, AMoM of CQ-MFCC 1-4,6,7, A_weighted, C_weighted, L_weighted, peak_DB, peak_DBA, peak_DBC | SF, SV, RMS, SF-FFT, LPC 1,7, MM 0,2, RDF |
Abbreviations of features are listed in Table 1
Fig. 6 Distribution of subject groups. Distributions of the 120 subjects, obtained with the t-SNE algorithm, using (a) all audio features and (b) the discriminators (acoustic biomarker). With the acoustic biomarker, group separability increased, which has a direct effect on classification performance
Fig. 7 Performance of classification using all audio features (baseline). Specificity, sensitivity and area under the ROC curve are depicted when all audio features are adopted as input features. A confusion matrix of the 4-group classification is also presented
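For a 4-group confusion matrix such as the one in Fig. 7, per-group sensitivity and specificity can be derived in a one-vs-rest fashion. The sketch below assumes that convention, which the caption does not state, and the function name is hypothetical.

```python
def one_vs_rest_metrics(confusion):
    """Per-class sensitivity and specificity from a square confusion matrix.

    confusion[i][j] is the number of subjects of true class i predicted as class j.
    Each class is scored against all other classes pooled together.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp
        metrics.append({
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        })
    return metrics
```

Any matrix passed to this function in an example is illustrative input only; the study's own counts appear in the figure and are not reproduced here.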
Fig. 8 Comparison of performance using audio features extracted with various window sizes
Fig. 9 Performance of classification when components of the qTM are added as input features
Fig. 10 Performance of classification when the acoustic biomarkers are adopted instead of all audio features
Fig. 11 Performance of classification when both the acoustic biomarker and the qTM are adopted
Fig. 12 Comparison of performance using various classifiers
Fig. 13 Comparison of performance using different feature sets chosen with an SVM-based feature selection algorithm
Fig. 14 Performance of binary classifications under various thresholds
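The thresholded binary setting in Fig. 14 can be sketched as follows. The AHI cutoffs of 5, 15 and 30 events per hour are the conventional adult severity boundaries and are an assumption here, since the caption does not list the thresholds used; the function names are hypothetical.

```python
# Assumption: conventional adult AHI cutoffs (events per hour).
AHI_CUTOFFS = (5, 15, 30)
GROUPS = ("normal", "mild", "moderate", "severe")


def severity_group(ahi):
    """Map an apnea-hypopnea index to one of the four severity groups."""
    return GROUPS[sum(ahi >= c for c in AHI_CUTOFFS)]


def binary_labels(ahi_values, threshold):
    """1 if AHI is at or above the threshold, else 0."""
    return [int(a >= threshold) for a in ahi_values]


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec
```

Repeating the evaluation once per cutoff gives one sensitivity and specificity pair per threshold, which is the kind of comparison the figure reports.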