Zeinab Mahmoudi, Saeed Rahati, Mohammad Mahdi Ghasemi, Vahid Asadpour, Hamid Tayarani, Mohsen Rajati.
Abstract
BACKGROUND: Speech production and phonetic features gradually improve in children who receive auditory feedback after cochlear implantation or while using hearing aids. The aim of this study was to develop and evaluate automated classification of voice disorders in children with cochlear implants and hearing aids.
Year: 2011 PMID: 21235800 PMCID: PMC3029214 DOI: 10.1186/1475-925X-10-3
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819
Comparison of the disorder levels defined in the study with the Speech Intelligibility Rating (SIR) criteria
| SIR score | Levels of intelligibility in SIR criteria | Levels of voice disorder in the study |
|---|---|---|
| 1 | Connected speech is unintelligible. | Level 1 |
| 2 | Connected speech is unintelligible. | Level 2 |
| 3 | Connected speech is intelligible to a listener who concentrates on lip-reading. | Level 2 |
| 4 | Connected speech is intelligible to a listener who has little experience of a deaf person's speech. | Level 3 |
| 5 | Connected speech is intelligible to all listeners. | Level 4 |
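
The correspondence in the table above can be held as a small lookup. This is only an illustrative sketch: the names `SIR_TO_LEVEL` and `disorder_level` are hypothetical and do not come from the study, and the mapping is copied as printed in the table.

```python
# Hypothetical helper: maps an SIR score (1-5) to the study's disorder level (1-4),
# exactly as listed in the comparison table.
SIR_TO_LEVEL = {1: 1, 2: 2, 3: 2, 4: 3, 5: 4}


def disorder_level(sir_score):
    """Return the study's voice disorder level for a given SIR score."""
    if sir_score not in SIR_TO_LEVEL:
        raise ValueError(f"SIR score must be an integer from 1 to 5, got {sir_score!r}")
    return SIR_TO_LEVEL[sir_score]


if __name__ == "__main__":
    for score in range(1, 6):
        print(f"SIR {score} -> Level {disorder_level(score)}")
```

Note that the mapping is many-to-one: SIR scores 2 and 3 both fall into Level 2.
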
Demographic data of children participating in the study
| Level of voice disorder | Average age (months) | Average age at CI or start of HA use (months) | Average time since CI or start of HA use (months) | Number of children in each level |
|---|---|---|---|---|
| Level 1 | 52 | 45.3 | 18 | 6 (4 male & 2 female) |
| Level 2 | 58 | 38 | 20.16 | 6 (4 male & 2 female) |
| Level 3 | 59 | 28.5 | 29.3 | 6 (3 male & 3 female) |
| Level 4 | 72 | - | - | 12 (8 male & 4 female) |
Input features to the recognition system
| Feature | Description |
|---|---|
| f0 | Fundamental frequency of the voice signal |
| RI (relative intensity) | Ratio of intensity to the maximum intensity in the syllable |
| f1 | Frequency of the first formant |
| f2 | Frequency of the second formant |
| f3 | Frequency of the third formant |
| f1/f2 | Ratio of first to second formant frequencies |
| Nasality (1/(Af1-A1k)) | Reciprocal of the difference between the amplitude of the first formant and the extra spectral peak at 1 kHz |
| Entropy | Approximate entropy of the voice signal |
| Fractal dimension | Fractal dimension of the speech phase space |
| Lyapunov exponent | Lyapunov exponent of the voice signal |
| Mean energy of wavelet coefficients | Mean energy of wavelet coefficients in scales 7, 8 and 9 |
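
Three of the features in the table are simple ratios, and a minimal sketch of them follows. It assumes the underlying measurements (per-frame intensity, formant frequencies, and the two spectral amplitudes) have already been extracted by some other tool. The function names and the choice of units are assumptions for illustration, not the study's implementation.

```python
def relative_intensity(frame_intensities):
    """RI: each frame's intensity divided by the maximum intensity in the syllable."""
    peak = max(frame_intensities)
    if peak <= 0:
        raise ValueError("syllable must contain a positive intensity value")
    return [value / peak for value in frame_intensities]


def formant_ratio(f1_hz, f2_hz):
    """f1/f2: ratio of the first to the second formant frequency."""
    if f2_hz == 0:
        raise ValueError("second formant frequency must be non-zero")
    return f1_hz / f2_hz


def nasality(a_f1, a_1k):
    """Nasality = 1 / (Af1 - A1k).

    a_f1: amplitude of the first formant.
    a_1k: amplitude of the extra spectral peak at 1 kHz.
    Both are assumed to be on the same amplitude scale (an assumption here;
    the table does not state the unit).
    """
    difference = a_f1 - a_1k
    if difference == 0:
        raise ValueError("Af1 and A1k are equal, so the feature is undefined")
    return 1.0 / difference


if __name__ == "__main__":
    print(relative_intensity([1.0, 2.0, 4.0]))
    print(formant_ratio(500.0, 1500.0))
    print(nasality(30.0, 10.0))
```

Because nasality is a reciprocal, it grows as the 1 kHz peak approaches the first-formant amplitude, so the sketch rejects the degenerate case where the two are equal.
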
Figure 1. Main diagram of the study.
Figure 2. Detailed diagram of the system.
Figure 3. Data division in different parts of the system.
Classification rate for all words using frame-based features (a)
| Word | Level 1 accuracy | Level 2 accuracy | Level 3 accuracy | Level 4 accuracy |
|---|---|---|---|---|
| 'mashin' | 86.87% | 80% | 98.12% | 66.7% |
| 'mar' | 91.25% | 71.25% | 100% | 51.51% |
| 'moosh' | 83.12% | 67.5% | 100% | 86.97% |
| 'gav' | 90% | 40% | 100% | 94% |
| 'mouz' | 87.5% | 60.62% | 100% | 95.76% |

(a) Features: f0, f1, f2, f1/f2, f3, RI, nasality and approximate entropy
Figure 4. Phase space reconstruction of the voice samples from the word 'mouz'. a: Phase space of 'mouz' signal from level 1. b: Phase space of 'mouz' signal from level 2. c: Phase space of 'mouz' signal from level 3. d: Phase space of 'mouz' signal from level 4.
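
A phase space such as the one in Figure 4 is commonly reconstructed from a one-dimensional signal by time-delay embedding. The sketch below shows that general technique only. The embedding dimension and delay used in the study are not given here, so `dim` and `delay` are placeholder parameters, and `delay_embed` is a hypothetical name.

```python
def delay_embed(signal, dim=3, delay=1):
    """Time-delay embedding of a 1-D signal.

    Each reconstructed point is
        (x[i], x[i + delay], ..., x[i + (dim - 1) * delay]).
    dim and delay are illustrative defaults, not values from the study.
    """
    if dim < 1 or delay < 1:
        raise ValueError("dim and delay must be positive integers")
    n_points = len(signal) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal is too short for this dim and delay")
    return [
        tuple(signal[i + k * delay] for k in range(dim))
        for i in range(n_points)
    ]


if __name__ == "__main__":
    samples = [0, 1, 2, 3, 4]
    for point in delay_embed(samples, dim=2, delay=2):
        print(point)
```

Plotting the resulting points for a voiced segment gives a trajectory of the kind shown in the figure panels. Features such as the fractal dimension are then computed on that trajectory rather than on the raw waveform.
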
Average classification rate for subgroups of features and different fusion rules
| Subgroup | Features | Fusion method | Level 1 accuracy | Level 2 accuracy | Level 3 accuracy | Level 4 accuracy | Average accuracy across all levels |
|---|---|---|---|---|---|---|---|
| 1 | f0, f1, f2, f1/f2, f3, RI, nasality, approximate entropy | Stacked fusion | not classified | 54.0% | 75.0% | 80.0% | 65.7% |
| | | MVR | 93.8% | 68.8% | 100.0% | 87.9% | 87.4% |
| | | Linear combination | 93.8% | 87.5% | 100.0% | 93.9% | 93.8% |
| 2 | f0, f1, f2, f1/f2, f3, nasality, approximate entropy, fractal dimension | Stacked fusion | 63.0% | 65.2% | 76.3% | 82.0% | 71.2% |
| | | MVR | 100.0% | 81.2% | 100.0% | 91.0% | 93.1% |
| | | Linear combination | 87.5% | 81.2% | 100.0% | 100.0% | 92.2% |
| 3 | f0, f1, f2, f1/f2, f3, RI, nasality, approximate entropy, Lyapunov exponent | Stacked fusion | not classified | not classified | 71.2% | 79.4% | 62.5% |
| | | MVR | 100.0% | 75.0% | 100.0% | 91.0% | 91.5% |
| | | Linear combination | 87.5% | 81.2% | 100.0% | 94.0% | 90.7% |
| 4 | f0, f1, f2, f1/f2, f3, RI, nasality, approximate entropy, fractal dimension, Lyapunov exponent, wavelet coefficients in 3 scales | Stacked fusion | 60.0% | 62.0% | 80.0% | 87.5% | 73.8% |
| | | MVR | 100.0% | 93.8% | 100.0% | 94.0% | 96.9% |
| | | Linear combination | 100.0% | 68.8% | 100.0% | 100.0% | 92.2% |
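
Two of the fusion rules in the table can be sketched generically. The sketch assumes that MVR denotes a majority voting rule over the labels produced by several classifiers (for example, one per word), and that linear combination means a weighted sum of per-level classifier scores. The tie-breaking rule, the equal default weights and the function names are assumptions for illustration. The study's actual configuration is not given here, and stacked fusion is not shown.

```python
from collections import Counter


def majority_vote(labels):
    """Return the level predicted by most classifiers.

    Ties are broken in favour of the lowest level number (an arbitrary
    choice made for this sketch).
    """
    if not labels:
        raise ValueError("at least one label is required")
    counts = Counter(labels)
    top = max(counts.values())
    return min(level for level, count in counts.items() if count == top)


def linear_combination(score_vectors, weights=None):
    """Fuse per-classifier score vectors by a weighted sum and pick the best level.

    score_vectors: one list of scores per classifier, one score per level.
    weights: one weight per classifier; equal weights are assumed if omitted.
    Returns a 1-based level index.
    """
    if not score_vectors:
        raise ValueError("at least one score vector is required")
    if weights is None:
        weights = [1.0 / len(score_vectors)] * len(score_vectors)
    if len(weights) != len(score_vectors):
        raise ValueError("need exactly one weight per classifier")
    n_levels = len(score_vectors[0])
    combined = [
        sum(w * scores[level] for w, scores in zip(weights, score_vectors))
        for level in range(n_levels)
    ]
    return combined.index(max(combined)) + 1


if __name__ == "__main__":
    print(majority_vote([1, 2, 2, 4, 2]))
    print(linear_combination([
        [0.1, 0.6, 0.2, 0.1],
        [0.2, 0.2, 0.5, 0.1],
        [0.1, 0.5, 0.3, 0.1],
    ]))
```

Majority voting discards each classifier's confidence and keeps only its decision, whereas the weighted sum lets a confident classifier outweigh several uncertain ones.
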