Fernando Espinoza-Cuadros, Rubén Fernández-Pozo, Doroteo T. Toledano, José D. Alcázar-Ramírez, Eduardo López-Gonzalo, Luis A. Hernández-Gómez.
Abstract
BACKGROUND: Obstructive sleep apnea (OSA) is a common sleep disorder characterized by recurring breathing pauses during sleep caused by a blockage of the upper airway (UA). The altered UA structure or function in OSA speakers has led researchers to hypothesize that automatic speech analysis could be used for OSA assessment. In this paper we critically review several approaches that use speech analysis and machine learning techniques for OSA detection, and discuss the limitations that can arise when machine learning techniques are used for diagnostic applications.
Year: 2016 PMID: 26897500 PMCID: PMC4761156 DOI: 10.1186/s12938-016-0138-5
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819
Descriptive statistics on the 426 male subjects
| Clinical variables | Mean | SD | Range |
|---|---|---|---|
| AHI | 22.5 | 18.1 | 0.0–102.0 |
| Weight (kg) | 91.7 | 17.3 | 61.0–162.0 |
| Height (cm) | 175.3 | 7.1 | 152.0–197.0 |
| BMI (kg/m²) | 29.8 | 5.1 | 20.1–52.1 |
| Age (years) | 48.8 | 12.5 | 20.0–85.0 |
| Cervical perimeter (cm) | 42.2 | 3.2 | 34.0–53.0 |
AHI apnea–hypopnea index, BMI body mass index, SD standard deviation
Fig. 1 Acoustic representation of utterances
Implementation tools
| Toolᵃ | Function name | Function description | Parameters |
|---|---|---|---|
| HTK | HCopy | Extract the MFCC coefficients | No. DFT bins = 512; No. filters = 26; No. MFCC coeff. = 19; No. ΔMFCC coeff. = 19 |
| MSR Identity Toolboxᵇ | GMM_em | GMM–UBM training | No. mixtures = 512; No. of expectation maximization iterations = 10; Feature sub-sampling factor = 1 |
| MSR Identity Toolboxᵇ | MapAdapt | GMM adaptation | Adaptation algorithm = MAP; No. mixtures = 512; MAP relevance factor = 10 |
| MSR Identity Toolboxᵇ | Train_tv_space | Total variability matrix training | Dimension of total variability matrix = {400, 300, 200, 100, 50, 30}; Number of iterations = 5 |
| MSR Identity Toolboxᵇ | Extract_ivector | I-vector training | Dimension of total variability matrix = {400, 300, 200, 100, 50, 30} |
| LIBSVM | SVM_train | SVR training | Grid search parameters: |
| LIBSVM | SVM_predict | SVR regression | Grid search parameters: |
ᵃ All the implementation tools were used under the Linux Ubuntu 12.04 LTS operating system
ᵇ Executed on Matlab 2014a
Fig. 2 GMM and supervector modelling
Fig. 3 Representation of k-fold cross-validation and grid search for SVR regression and predicting clinical variables
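The combination of k-fold cross-validation and grid search named in the Fig. 3 caption can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: a closed-form one-dimensional ridge regression stands in for SVR so that the example needs no external library, and the function names, the grid values and the choice of `k=5` are all hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_ridge_1d(xs, ys, lam):
    """Closed-form 1-D ridge fit y ~ w*x + b (a stand-in for SVR training)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / (sxx + lam)  # lam shrinks the slope towards zero
    return w, mean_y - w * mean_x


def mean_abs_error(model, xs, ys):
    """Mean absolute error of the fitted line on the given samples."""
    w, b = model
    return sum(abs(w * x + b - y) for x, y in zip(xs, ys)) / len(xs)


def cross_validated_mae(xs, ys, folds, lam):
    """Average held-out MAE over all folds for one hyperparameter value."""
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for fold in folds for i in fold if i not in held]
        model = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        errors.append(
            mean_abs_error(model, [xs[i] for i in held_out], [ys[i] for i in held_out])
        )
    return sum(errors) / len(errors)


def grid_search(xs, ys, grid, k=5):
    """Return the grid value with the lowest cross-validated MAE, plus all scores."""
    folds = k_fold_indices(len(xs), k)
    scores = {lam: cross_validated_mae(xs, ys, folds, lam) for lam in grid}
    best = min(scores, key=scores.get)
    return best, scores
```

Every candidate is scored on the same folds, so differences in the scores reflect the hyperparameter rather than the data split.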
Speakers’ height estimation results
| Regression method | Mean absolute error (cm) | Correlation coefficient (ρ) |
|---|---|---|
| I-vector–LSSVR [ | 6.2 | 0.41ᵇ |
| Supervector–SVR | 5.37 | 0.34ᵃ |
| I-vector–SVR | 5.06 | 0.45ᵃ |
ᵃ These values are significant beyond the 0.01 level of confidence
ᵇ Level of confidence is not reported
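The two figures of merit reported in these tables, mean absolute error and correlation coefficient, can be computed as in the minimal sketch below. It assumes that ρ is a Pearson-type correlation between true and estimated values, which the tables themselves do not state; the function names and the sample heights are hypothetical.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and estimated values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Pearson correlation between true and estimated values."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical heights in cm, for illustration only (not data from the study).
true_heights = [170.0, 175.0, 180.0, 185.0]
estimated_heights = [172.0, 174.0, 181.0, 183.0]
mae = mean_absolute_error(true_heights, estimated_heights)
rho = pearson_correlation(true_heights, estimated_heights)
```

The two measures are complementary: MAE is expressed in the units of the variable, while ρ is unitless and insensitive to a constant offset in the estimates.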
Speakers’ age estimation results
| Regression method | Mean absolute error (years) | Correlation coefficient (ρ) |
|---|---|---|
| I-vector–WCCN–SVR [ | 6.0 | 0.77ᵇ |
| Supervector–SVR | 7.75 | 0.66ᵃ |
| I-vector–SVR | 7.87 | 0.63ᵃ |
ᵃ These values are significant beyond the 0.01 level of confidence
ᵇ Level of confidence is not reported
Speakers’ clinical variables estimation using supervector-SVR (linear kernel)
| Clinical variable | MAE | ρ |
|---|---|---|
| AHI | 14.26 | 0.17 |
| Height (cm) | 5.37 | 0.34 |
| Age (years) | 7.75 | 0.66 |
| Weight (kg) | 12.58 | 0.31 |
| BMI (kg/m²) | 3.81 | 0.23 |
| CP (cm) | 2.29 | 0.42 |
AHI apnea–hypopnea index, BMI body mass index, CP cervical perimeter
The correlation coefficients (ρ) are significant beyond the 0.01 level of confidence
Speakers’ clinical variables estimation using i-vectors-SVR (linear kernel)
| Clinical variable | MAE (dim 400) | MAE (dim 300) | MAE (dim 200) | MAE (dim 100) | MAE (dim 50) | MAE (dim 30) | ρ (dim 400) | ρ (dim 300) | ρ (dim 200) | ρ (dim 100) | ρ (dim 50) | ρ (dim 30) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AHI | 13.68 | 13.64 | 13.55 | 13.23 | 13.40 | 13.85 | 0.23 | 0.21 | 0.24 | 0.30 | 0.27 | 0.20 |
| Height (cm) | 5.21 | 5.23 | 5.11 | 5.06 | 5.29 | 5.38 | 0.40 | 0.41 | 0.43 | 0.45 | 0.36 | 0.34 |
| Age (years) | 8.16 | 7.87 | 8.11 | 8.29 | 8.77 | 9.16 | 0.61 | 0.63 | 0.61 | 0.59 | 0.52 | 0.44 |
| Weight (kg) | 12.31 | 12.23 | 12.25 | 11.86 | 12.16 | 12.31 | 0.34 | 0.35 | 0.36 | 0.39 | 0.35 | 0.31 |
| BMI (kg/m²) | 3.59 | 3.65 | 3.67 | 3.69 | 3.74 | 3.80 | 0.33 | 0.30 | 0.29 | 0.28 | 0.26 | 0.18 |
| CP (cm) | 2.28 | 2.26 | 2.20 | 2.26 | 2.31 | 2.42 | 0.44 | 0.45 | 0.49 | 0.47 | 0.44 | 0.32 |
AHI apnea–hypopnea index, BMI body mass index, CP cervical perimeter
The correlation coefficients (ρ) are significant beyond the 0.01 level of confidence
Speakers’ clinical variables estimation using i-vectors-SVR (RBF kernel)
| Clinical variable | MAE (dim 400) | MAE (dim 300) | MAE (dim 200) | MAE (dim 100) | MAE (dim 50) | MAE (dim 30) | ρ (dim 400) | ρ (dim 300) | ρ (dim 200) | ρ (dim 100) | ρ (dim 50) | ρ (dim 30) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AHI | 14.04 | 13.91 | 13.63 | 13.48 | 13.84 | 14.12 | 0.00 | 0.17 | 0.25 | 0.26 | 0.18 | 0.02 |
| Height (cm) | 5.28 | 5.23 | 5.16 | 5.24 | 5.46 | 5.43 | 0.40 | 0.41 | 0.42 | 0.41 | 0.29 | 0.32 |
| Age (years) | 9.46 | 9.22 | 8.29 | 8.68 | 9.10 | 9.53 | 0.42 | 0.51 | 0.61 | 0.57 | 0.50 | 0.41 |
| Weight (kg) | 12.39 | 12.82 | 12.18 | 12.11 | 12.27 | 12.59 | 0.29 | 0.18 | 0.32 | 0.35 | 0.34 | 0.24 |
| BMI (kg/m²) | 3.73 | 3.70 | 3.66 | 3.68 | 3.72 | 3.77 | 0.20 | 0.18 | 0.27 | 0.27 | 0.21 | 0.14 |
| CP (cm) | 2.38 | 2.42 | 2.32 | 2.34 | 2.42 | 2.44 | 0.31 | 0.26 | 0.42 | 0.40 | 0.31 | 0.26 |
AHI apnea–hypopnea index, BMI body mass index, CP cervical perimeter
The correlation coefficients (ρ) are significant beyond the 0.01 level of confidence
OSA Classification using estimated AHI values
| Feature | Accuracy (%) | Sensitivity (%) | Specificity (%) | ROC AUC |
|---|---|---|---|---|
| Supervectors | 68 | 89 | 18 | 0.58 |
| I-vectors (dim 100) | 71 | 92 | 20 | 0.64 |
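Accuracy, sensitivity and specificity of this kind are obtained by thresholding both the measured and the estimated AHI and counting agreements. The sketch below is illustrative only: the cut-off of 10 is an assumption (the table does not state which AHI threshold separates OSA from non-OSA), and the function names and sample values are hypothetical.

```python
def classify_by_threshold(ahi_values, threshold):
    """Label each subject as OSA (True) when AHI is at or above the threshold."""
    return [ahi >= threshold for ahi in ahi_values]


def screening_metrics(true_ahi, estimated_ahi, threshold=10.0):
    """Accuracy, sensitivity and specificity of the thresholded AHI estimates.

    The default threshold is an assumed value chosen for illustration.
    """
    actual = classify_by_threshold(true_ahi, threshold)
    predicted = classify_by_threshold(estimated_ahi, threshold)
    tp = sum(a and p for a, p in zip(actual, predicted))
    tn = sum((not a) and (not p) for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    return {
        "accuracy": (tp + tn) / len(actual),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical AHI values, for illustration only (not data from the study).
measured = [2.0, 4.0, 12.0, 30.0, 45.0, 8.0]
estimated = [11.0, 5.0, 15.0, 20.0, 9.0, 12.0]
metrics = screening_metrics(measured, estimated)
```

A profile with high sensitivity and low specificity, as in the table above, arises when most non-OSA subjects are also estimated to lie above the threshold.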
Test characteristics of previous research using speech analysis and machine learning for AHI classification and regression
| Study | Population characteristics | Classification: correct classification rate (%) | Classification: sensitivity (%) | Classification: specificity (%) | Regression: correlation coefficient |
|---|---|---|---|---|---|
| GMMs [ | 80 male subjects | 81 | 77.5 | 85 | _ |
| HMMs [ | 80 male subjects | 85 | _ | _ | _ |
| Several feature selection and classification schemes [ | 248 subjects | 82.85 | 81.49 | 84.69 | _ |
| Feature selection and GMMs [ | 93 subjects | _ | 86 | 84 | _ |
| Feature selection and GMMs [ | 103 male subjects | 80 | 80.65 | 80 | _ |
| Feature selection, supervectors and SVR [ | 131 males | _ | _ | _ | 0.67ᵃ |
| I-vectors/supervectors and SVR (this study) | 426 males | 71.06 | 92.92 | 20.6 | 0.30 |
ᵃ Results using speech features plus age and BMI
Spearman’s correlation between clinical variables
| Feature | AHI | Weight | Height | BMI | Age | CP |
|---|---|---|---|---|---|---|
| AHI | 1 | 0.41ᵃ | −0.007 | 0.44ᵃ | 0.16ᵃ | 0.40ᵃ |
| Weight | 0.41ᵃ | 1 | 0.40ᵃ | 0.89ᵃ | −0.11ᵃ | 0.71ᵃ |
| Height | −0.007 | 0.40ᵃ | 1 | −0.02 | −0.35ᵃ | 0.13ᵃ |
| BMI | 0.44ᵃ | 0.89ᵃ | −0.02 | 1 | 0.04 | 0.72ᵃ |
| Age | 0.16ᵃ | −0.11ᵃ | −0.35ᵃ | 0.04 | 1 | 0.16ᵃ |
| CP | 0.40ᵃ | 0.71ᵃ | 0.13ᵃ | 0.72ᵃ | 0.16ᵃ | 1 |
ᵃ The correlation coefficients (ρ) are significant beyond the 0.01 level of confidence
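Spearman's correlation, as used in the matrix above, is the Pearson correlation computed on ranks instead of raw values. A minimal sketch follows, assuming tied values receive the average of the ranks they span (a common convention; the source does not say how ties were handled). The function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared_rank
        pos = end + 1
    return ranks


def spearman_correlation(xs, ys):
    """Spearman's rho: Pearson correlation of the rank-transformed samples."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

Because only ranks enter the computation, any monotonic relationship yields a coefficient of magnitude 1, whether or not it is linear.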
Wilcoxon two-sample test for MEAN_HNR_VA_A contrasting gender and groups of extreme OSA male speakers
| Mean_HNR_VA_A | Female | Male | p value (gender) | Male (AHI ≤5) | Male (AHI ≥30) | p value (extreme OSA male speakers) |
|---|---|---|---|---|---|---|
| Median | 19.43 | 17.07 | <0.0001 | 17.46 | 16.38 | 0.06 |
| SD | 3.98 | 4.23 | | 3.89 | 4.32 | |
| # Samples | 171 | 426 | | 69 | 129 | |
Speakers’ AHI estimation using supervector generated by five high-order cepstral and LPC coefficients [14]
| Set of clinical variables | MAE | Correlation coefficient (ρ) | p value |
|---|---|---|---|
| a15, ΔΔc9, a17, ΔΔc12, c16 | 14.33 | 0.12 | 0.008 |
| AGE + BMI | 12.96 | 0.38 | <0.00001 |
| (a15, ΔΔc9, a17, ΔΔc12, c16) + AGE + BMI | 12.24 | 0.46 | <0.00001 |
p values are given for correlation coefficient (ρ)