Nemuel D Pah, Mohammod A Motin, Dinesh K Kumar.
Abstract
Dysarthria is an early symptom of Parkinson's disease (PD) and has been proposed for detecting and monitoring the disease, with potential for telehealth. However, because of inherent differences between the voices of different people, computerized analysis has not demonstrated high performance that is consistent across datasets. The aim of this study was to improve the performance of detecting PD voices and to test it on different datasets. The study investigated the effectiveness of three groups of phoneme parameters, i.e. voice intensity variation, perturbation of glottal vibration, and apparent vocal tract length (VTL), for differentiating people with PD from healthy subjects using two public databases. The parameters were extracted from five sustained phonemes, /a/, /e/, /i/, /o/, and /u/, recorded from 50 PD patients and 50 healthy subjects of the PC-GITA dataset. The features were statistically investigated and then classified using a Support Vector Machine (SVM). This was repeated on the Viswanathan dataset, with smartphone-based recordings of /a/, /o/, and /m/ from 24 PD and 22 age-matched healthy people. VTL parameters gave the highest difference between the voices of people with PD and healthy subjects; classification accuracy with the five vowels of the PC-GITA dataset was 84.3%, while the accuracy for the other features was between 54% and 69.2%. The accuracy for Viswanathan's dataset was 96.0%. This study has demonstrated that VTL obtained from recordings of phonemes using a smartphone can accurately identify people with PD. The analysis was fully computerized and automated, and this has the potential for telehealth diagnosis of PD.
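The abstract does not state how apparent VTL is derived from the formants. The following is a minimal sketch, assuming the common quarter-wavelength model (a uniform tube closed at the glottis), in which the k-th formant F_k implies a length of (2k − 1)·c / (4·F_k). The speed of sound `c`, the function name, and the formant values are illustrative assumptions, not values taken from the study.

```python
# Assumed speed of sound in warm, humid air, in cm/s (not stated in the source).
SPEED_OF_SOUND_CM_S = 35_000.0


def apparent_vtl_cm(formant_hz: float, k: int, c: float = SPEED_OF_SOUND_CM_S) -> float:
    """Apparent vocal tract length (cm) implied by the k-th formant.

    Uses the quarter-wavelength resonator model: F_k = (2k - 1) * c / (4 * L),
    solved for L.
    """
    if formant_hz <= 0:
        raise ValueError("formant frequency must be positive")
    if k < 1:
        raise ValueError("formant index k starts at 1")
    return (2 * k - 1) * c / (4.0 * formant_hz)


# Hypothetical formant frequencies (Hz) for one sustained /a/ recording.
formants_hz = {1: 800.0, 2: 1400.0, 3: 2600.0, 4: 3600.0}

# One apparent length per formant, i.e. VTL(F1) ... VTL(F4).
vtl_cm = {k: apparent_vtl_cm(f, k) for k, f in formants_hz.items()}

for k, length in vtl_cm.items():
    print(f"VTL(F{k}) = {length:.2f} cm")
```

Under this model each formant yields its own length estimate, which is why four VTL features per phoneme can be passed to a classifier.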
Year: 2022 PMID: 35690657 PMCID: PMC9188600 DOI: 10.1038/s41598-022-13865-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Participants’ demographics of PC-GITA database.
| | PD subjects (male) | PD subjects (female) | Control subjects (male) | Control subjects (female) | p-value |
|---|---|---|---|---|---|
| # Subjects | 25 | 25 | 25 | 25 | |
| Age (years) | 61.56 ± 11.63 | 60.72 ± 72.66 | 60.36 ± 11.56 | 61.44 ± 6.98 | 0.966* |
| UPDRS | 35.92 ± 22.77 | 37.56 ± 14.03 | | | 0.760+ |
| H&Y | 2.30 ± 0.94 | 2.28 ± 0.54 | | | 0.927+ |
| Years diagnosed | 8.86 ± 5.88 | 12.58 ± 11.52 | | | 0.157+ |
*Calculated using ANOVA with 95% confidence level.
+Calculated using unpaired t-test with 95% confidence level.
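The unpaired t-test p-values in the demographics table can be approximated from the summary statistics alone. The sketch below assumes a pooled-variance (Student) t-test and that every one of the 25 male and 25 female PD subjects has a score; neither detail is stated in the table. The function names are hypothetical, and the p-value is obtained by numerically integrating the t density with Simpson's rule so that only the standard library is needed.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value: 1 minus the probability mass within [-|t|, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_mass = total * h / 3  # integral of the density from 0 to |t|
    return max(0.0, 1.0 - 2.0 * half_mass)


def unpaired_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Pooled-variance t-test from group means, SDs, and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (m1 - m2) / se
    return t, df, two_sided_p(t, df)


# UPDRS of male vs. female PD subjects, values from the table (n = 25 assumed).
t_updrs, df_updrs, p_updrs = unpaired_t_from_summary(35.92, 22.77, 25, 37.56, 14.03, 25)
print(f"t = {t_updrs:.3f}, df = {df_updrs}, p = {p_updrs:.3f}")
```

Under these assumptions the result lands close to the tabulated p-value for the UPDRS row, which suggests the table compares male and female PD subjects.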
Figure 1. The waveforms of the five vowels recorded from the control subjects and the PD subjects.
Participants’ demographics of Viswanathan’s database.
| | Control subjects | PD subjects | p-value |
|---|---|---|---|
| Number of subjects | 22 | 24 | |
| Age | 66.30 ± 6.20 | 71.92 ± 7.07 | 0.008 |
| PD- | N/A | 25.54 ± 8.78 | 1.42e−05 (PD-off vs PD-on) |
| PD- | N/A | 19.33 ± 9.30 | |
| MoCA | 28.30 ± 1.34 | 27.25 ± 2.67 | 0.118 |
| Duration of disease (years) | N/A | 5.29 ± 2.99 | |
Statistical distribution of the parameters and the results of the Mann-Whitney U test.
| Parameter | Phoneme | Control (mean ± SD) | PD (mean ± SD) | p-value | Effect size | Parameter | Phoneme | Control (mean ± SD) | PD (mean ± SD) | p-value | Effect size |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Intensity (SD) | a | 1.62 ± 0.85 | 1.95 ± 1.25 | 0.051 | −0.387 | F1(SD) | a | 4.87E+1 ± 4.93E+1 | 7.41E+1 ± 8.22E+1 | 0.000 | −0.515 |
| | e | 1.66 ± 1.03 | 2.21 ± 1.35 | 0.000 | −0.528 | | e | 2.87E+1 ± 5.03E+1 | 3.20E+1 ± 3.66E+1 | 0.001 | −0.065 |
| | i | 1.84 ± 1.03 | 2.25 ± 1.36 | 0.018 | −0.394 | | i | 4.08E+1 ± 1.16E+2 | 5.48E+1 ± 1.26E+2 | 0.001 | −0.120 |
| | o | 1.79 ± 1.01 | 2.15 ± 1.26 | 0.020 | −0.353 | | o | 4.29E+1 ± 3.57E+1 | 4.87E+1 ± 3.63E+1 | 0.018 | −0.163 |
| | u | 1.69 ± 1.05 | 2.35 ± 1.40 | 0.000 | −0.633 | | u | 4.83E+1 ± 3.92E+1 | 5.13E+1 ± 4.23E+1 | 0.412 | −0.078 |
| Intensity (range) | a | 6.12 ± 3.11 | 7.34 ± 4.37 | 0.039 | −0.392 | F2(SD) | a | 7.89E+1 ± 9.20E+1 | 1.16E+2 ± 1.44E+2 | 0.005 | −0.403 |
| | e | 6.28 ± 3.63 | 8.09 ± 4.50 | 0.000 | −0.498 | | e | 5.73E+1 ± 5.03E+1 | 7.73E+1 ± 6.75E+1 | 0.000 | −0.397 |
| | i | 6.62 ± 3.37 | 8.22 ± 4.56 | 0.004 | −0.473 | | i | 6.93E+1 ± 8.55E+1 | 1.04E+2 ± 1.21E+2 | 0.000 | −0.411 |
| | o | 6.70 ± 3.65 | 7.95 ± 4.33 | 0.010 | −0.343 | | o | 1.77E+2 ± 2.76E+2 | 1.84E+2 ± 2.50E+2 | 0.002 | −0.023 |
| | u | 6.28 ± 3.45 | 8.41 ± 4.60 | 0.000 | −0.615 | | u | 2.91E+2 ± 3.24E+2 | 2.63E+2 ± 2.98E+2 | 0.789 | 0.086 |
| Jitter (abs) | a | 4.09E−5 ± 3.32E−5 | 5.70E−5 ± 5.10E−5 | 0.005 | −0.485 | F3(SD) | a | 1.09E+2 ± 1.18E+2 | 1.35E+2 ± 1.22E+2 | 0.029 | −0.221 |
| | e | 3.56E−5 ± 2.96E−5 | 4.81E−5 ± 3.93E−5 | 0.001 | −0.424 | | e | 8.97E+1 ± 7.35E+1 | 1.14E+2 ± 9.06E+1 | 0.002 | −0.325 |
| | i | 3.54E−5 ± 2.98E−5 | 4.46E−5 ± 4.20E−5 | 0.037 | −0.308 | | i | 1.12E+2 ± 7.80E+1 | 1.38E+2 ± 9.85E+1 | 0.017 | −0.334 |
| | o | 3.70E−5 ± 3.83E−5 | 4.82E−5 ± 5.30E−5 | 0.100 | −0.292 | | o | 1.21E+2 ± 1.38E+2 | 1.20E+2 ± 1.04E+2 | 0.030 | 0.008 |
| | u | 2.90E−5 ± 2.02E−5 | 4.35E−5 ± 4.31E−5 | 0.005 | −0.716 | | u | 1.87E+2 ± 1.74E+2 | 1.80E+2 ± 1.68E+2 | 0.812 | 0.041 |
| Jitter (rel) | a | 5.84E−3 ± 3.55E−3 | 8.62E−3 ± 7.11E−3 | 0.002 | −0.782 | F4(SD) | a | 1.66E+2 ± 1.48E+2 | 1.84E+2 ± 1.47E+2 | 0.308 | −0.119 |
| | e | 5.07E−3 ± 3.21E−3 | 7.69E−3 ± 6.28E−3 | 0.000 | −0.815 | | e | 1.91E+2 ± 1.72E+2 | 1.73E+2 ± 1.50E+2 | 0.581 | 0.105 |
| | i | 5.27E−3 ± 3.18E−3 | 7.31E−3 ± 6.51E−3 | 0.006 | −0.641 | | i | 1.71E+2 ± 1.59E+2 | 1.61E+2 ± 1.32E+2 | 0.828 | 0.064 |
| | o | 5.30E−3 ± 4.21E−3 | 7.53E−3 ± 7.68E−3 | 0.011 | −0.530 | | o | 1.64E+2 ± 1.69E+2 | 1.60E+2 ± 1.30E+2 | 0.194 | 0.027 |
| | u | 4.60E−3 ± 2.43E−3 | 7.25E−3 ± 6.46E−3 | 0.000 | −1.094 | | u | 2.37E+2 ± 2.13E+2 | 2.07E+2 ± 1.66E+2 | 0.733 | 0.143 |
| Shimmer (abs) | a | 4.72E−1 ± 2.33E−1 | 6.11E−1 ± 3.07E−1 | 0.000 | −0.596 | VTL(F1) | a | 10.73 ± 1.75 | 11.11 ± 1.88 | 0.101 | −0.216 |
| | e | 4.45E−1 ± 2.07E−1 | 5.77E−1 ± 2.71E−1 | 0.000 | −0.638 | | e | 17.59 ± 2.43 | 17.97 ± 2.48 | 0.291 | −0.157 |
| | i | 4.30E−1 ± 2.06E−1 | 5.19E−1 ± 2.43E−1 | 0.000 | −0.436 | | i | 23.66 ± 4.68 | 23.98 ± 4.56 | 0.436 | −0.068 |
| | o | 4.02E−1 ± 1.84E−1 | 5.45E−1 ± 3.20E−1 | 0.000 | −0.772 | | o | 16.27 ± 2.20 | 16.57 ± 2.75 | 0.447 | −0.140 |
| | u | 3.86E−1 ± 2.05E−1 | 5.24E−1 ± 2.87E−1 | 0.000 | −0.671 | | u | 19.92 ± 3.28 | 20.90 ± 4.02 | 0.090 | −0.300 |
| Shimmer (rel) | a | 4.93E−2 ± 2.55E−2 | 6.47E−2 ± 3.44E−2 | 0.000 | −0.606 | VTL(F2) | a | 18.82 ± 2.40 | 18.01 ± 2.77 | 0.001 | 0.337 |
| | e | 4.58E−2 ± 2.21E−2 | 5.91E−2 ± 3.03E−2 | 0.000 | −0.601 | | e | 12.04 ± 1.22 | 12.05 ± 1.55 | 0.817 | −0.005 |
| | i | 4.39E−2 ± 2.20E−2 | 5.28E−2 ± 2.74E−2 | 0.001 | −0.407 | | i | 10.80 ± 1.05 | 11.15 ± 1.43 | 0.088 | −0.336 |
| | o | 4.03E−2 ± 1.95E−2 | 5.51E−2 ± 3.60E−2 | 0.000 | −0.759 | | o | 26.76 ± 5.18 | 25.42 ± 5.33 | 0.030 | 0.258 |
| | u | 3.88E−2 ± 2.06E−2 | 5.26E−2 ± 3.28E−2 | 0.000 | −0.672 | | u | 27.25 ± 8.78 | 25.91 ± 8.55 | 0.059 | 0.153 |
| Pitch(SD) | a | 8.56E+0 ± 9.51E+0 | 1.30E+1 ± 1.58E+1 | 0.076 | −0.470 | VTL(F3) | a | 15.89 ± 1.27 | 15.51 ± 1.43 | 0.024 | 0.293 |
| | e | 8.59E+0 ± 1.38E+1 | 1.67E+1 ± 2.39E+1 | 0.000 | −0.590 | | e | 15.74 ± 1.23 | 15.48 ± 1.37 | 0.043 | 0.209 |
| | i | 8.54E+0 ± 1.10E+1 | 1.24E+1 ± 1.49E+1 | 0.045 | −0.346 | | i | 14.56 ± 1.13 | 14.22 ± 1.16 | 0.002 | 0.299 |
| | o | 1.07E+1 ± 1.48E+1 | 1.59E+1 ± 2.30E+1 | 0.035 | −0.350 | | o | 15.45 ± 1.43 | 15.29 ± 1.62 | 0.376 | 0.117 |
| | u | 8.74E+0 ± 1.09E+1 | 1.29E+1 ± 1.52E+1 | 0.003 | −0.386 | | u | 14.91 ± 1.45 | 15.06 ± 1.61 | 0.430 | −0.102 |
| HNR | a | 1.89E+1 ± 4.01E+0 | 1.64E+1 ± 4.77E+0 | 0.000 | 0.643 | VTL(F4) | a | 16.08 ± 1.45 | 16.15 ± 1.30 | 0.379 | −0.053 |
| | e | 1.98E+1 ± 3.96E+0 | 1.78E+1 ± 4.73E+0 | 0.000 | 0.493 | | e | 16.07 ± 1.59 | 15.93 ± 1.48 | 0.535 | 0.085 |
| | i | 2.10E+1 ± 4.23E+0 | 1.99E+1 ± 4.71E+0 | 0.070 | 0.257 | | i | 15.76 ± 1.47 | 15.46 ± 1.51 | 0.017 | 0.205 |
| | o | 2.41E+1 ± 4.19E+0 | 2.18E+1 ± 5.33E+0 | 0.000 | 0.553 | | o | 15.95 ± 1.27 | 15.74 ± 1.28 | 0.128 | 0.163 |
| | u | 2.62E+1 ± 4.42E+0 | 2.39E+1 ± 5.04E+0 | 0.000 | 0.524 | | u | 15.24 ± 1.30 | 15.16 ± 1.45 | 0.467 | 0.062 |
| NHR | a | 5.11E−2 ± 4.32E−2 | 8.66E−2 ± 8.37E−2 | 0.000 | −0.823 | | | | | | |
| | e | 3.69E−2 ± 3.00E−2 | 6.50E−2 ± 6.83E−2 | 0.000 | −0.937 | | | | | | |
| | i | 3.12E−2 ± 2.92E−2 | 4.53E−2 ± 4.55E−2 | 0.007 | −0.486 | | | | | | |
| | o | 3.02E−2 ± 3.19E−2 | 4.74E−2 ± 6.60E−2 | 0.015 | −0.541 | | | | | | |
| | u | 1.80E−2 ± 1.97E−2 | 3.17E−2 ± 4.56E−2 | 0.002 | −0.697 | | | | | | |
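The group comparison in the table above can be sketched as follows. The Mann-Whitney U test here uses the normal approximation with tie correction and no continuity correction, which is one common variant and not necessarily the one used in the study. The effect-size definition is not stated in the table; the tabulated values appear consistent with (control mean − PD mean) / control SD, and that inference is what the second function implements. All function names and the sample data are hypothetical.

```python
import math


def mann_whitney_u(control, pd_group):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U for the control sample, p-value).
    """
    n1, n2 = len(control), len(pd_group)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in control] + [(v, 1) for v in pd_group])

    rank_sum_control = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j shared by tied values
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_control += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u_control = rank_sum_control - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_control, 1.0  # all observations identical
    z = (u_control - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_control, p


def effect_size_control_sd(mean_control, sd_control, mean_pd):
    """Standardised mean difference scaled by the control SD (assumed definition)."""
    return (mean_control - mean_pd) / sd_control


# Hypothetical feature values for a handful of recordings per group.
u, p = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(f"U = {u}, p = {p:.4f}")

# Summary statistics of Intensity (SD) for /a/, taken from the table.
d = effect_size_control_sd(1.62, 0.85, 1.95)
print(f"effect size = {d:.3f}")
```

With this sign convention a negative effect size means the PD group has the larger mean, which matches the direction of most rows in the table.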
The SVM classification results of PC-GITA database.
| Input parameter to SVM | Phoneme | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| [SD(Intensity), range(Intensity)] | /a/ | 53.3 | 56.7 | 50.0 |
| | /e/ | 49.7 | 60.0 | 39.3 |
| | /i/ | 56.7 | 64.0 | 49.3 |
| | /o/ | 48.3 | 58.0 | 38.7 |
| | /u/ | 50.3 | 54.7 | 46.0 |
| | /e/ + /o/ + /u/ | | | |
| [Jitt(abs), Jitt(rel), Shim(abs), Shim(rel), SD(pitch), HNR, NHR] | /a/ | 59.9 | 64.4 | 55.3 |
| | /e/ | 61.5 | 63.1 | 60.0 |
| | /i/ | 65.2 | 69.8 | 60.7 |
| | /o/ | 61.2 | 63.8 | 58.7 |
| | /u/ | 62.5 | 72.5 | 52.7 |
| | /e/ + /i/ + /o/ | | | |
| [SD(F1), SD(F2), SD(F3), SD(F4)] | /a/ | 61.3 | 72.7 | 50.0 |
| | /e/ | 62.0 | 71.3 | 52.7 |
| | /i/ | 59.0 | 70.7 | 47.3 |
| | /o/ | 57.3 | 72.0 | 42.7 |
| | /u/ | 54.3 | 46.7 | 62.0 |
| | /a/ + /e/ + /i/ + /o/ + /u/ | | | |
| [VTL(F1), VTL(F2), VTL(F3), VTL(F4)] | /a/ | 69.3 | 70.7 | 68.0 |
| | /e/ | 65.3 | 64.7 | 66.0 |
| | /i/ | 73.0 | 76.0 | 70.0 |
| | /o/ | 66.3 | 70.7 | 62.0 |
| | /u/ | 63.7 | 67.3 | 60.0 |
| | /a/ + /e/ + /i/ + /o/ + /u/ | | | |
| Ten highest-ranked features selected by Relief-F: VTL(F4) of /o/; VTL(F1) of /i/; VTL(F2) of /o/; VTL(F3) of /u/; std(F1) of /o/; std(F2) of /o/; VTL(F1) of /e/; VTL(F1) of /a/; VTL(F2) of /i/; VTL(F2) of /u/ | | | | |
Significant values are in bold.
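For reference, the three reported metrics follow from the confusion matrix of a binary classifier, with PD treated as the positive class. The sketch below uses hypothetical counts chosen only for illustration; the function name and the counts are assumptions, and the evaluation protocol used in the study is not described here.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Accuracy, sensitivity, and specificity in percent.

    tp / fn: PD recordings classified as PD / as control.
    tn / fp: control recordings classified as control / as PD.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts with equally sized classes (150 recordings each).
acc, sens, spec = classification_metrics(tp=85, fn=65, tn=75, fp=75)
print(f"accuracy = {acc:.1f}%, sensitivity = {sens:.1f}%, specificity = {spec:.1f}%")
```

When the two classes are the same size, accuracy is the mean of sensitivity and specificity, which is the pattern seen in the rows of the table above.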
The SVM classification results of Viswanathan’s database.
| Input parameter to SVM | Phoneme | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| [VTL(F1), VTL(F2), VTL(F3), VTL(F4)] | /a/ | 85.8 | 87.0 | 84.5 |
| | /o/ | 79.0 | 77.5 | 80.5 |
| | /m/ | 87.8 | 87.5 | 88.0 |
| | /a/ + /o/ | 90.0 | 90.5 | 89.5 |
| | /o/ + /m/ | 91.3 | 91.0 | 91.5 |
| | /a/ + /o/ + /m/ | 94.0 | 93.5 | 94.5 |
Significant values are in bold.