| Literature DB >> 35884305 |
Guidong Bao, Mengchen Lin, Xiaoqian Sang, Yangcan Hou, Yixuan Liu, Yunfeng Wu.
Abstract
This article proposes a novel semi-supervised competitive learning (SSCL) algorithm for vocal pattern classification in Parkinson's disease (PD). The acoustic parameters of voice records were grouped into the families of jitter, shimmer, harmonic-to-noise, frequency, and nonlinear measures. Linear correlations were computed within each acoustic parameter family. According to the correlation matrix results, the jitter, shimmer, and harmonic-to-noise parameters were highly correlated in terms of Pearson's correlation coefficient. The principal component analysis (PCA) technique was then applied to eliminate redundant dimensions of the acoustic parameters within each family. The Mann-Whitney-Wilcoxon hypothesis test was used to evaluate whether the PCA-projected features differed significantly between the healthy subjects and the PD patients. Eight dominant PCA-projected features were selected based on the eigenvalue threshold criterion and the statistical significance level (p < 0.05) of the hypothesis test. The SSCL algorithm proposed in this paper comprises the procedures of competitive prototype seed selection, K-means optimization, and nearest-neighbor classification. The pattern classification experiments showed that the proposed SSCL method provides excellent diagnostic performance in terms of accuracy (0.838), recall (0.825), specificity (0.85), precision (0.846), F-score (0.835), Matthews correlation coefficient (0.675), area under the receiver operating characteristic curve (0.939), and Kappa coefficient (0.675), consistently better than the results of conventional KNN or SVM classifiers.
Keywords: K-means clustering; Parkinson’s disease; competitive learning; dysphonia; k-nearest neighbor; pattern recognition; semi-supervised learning
Year: 2022 PMID: 35884305 PMCID: PMC9312485 DOI: 10.3390/bios12070502
Source DB: PubMed Journal: Biosensors (Basel) ISSN: 2079-6374
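The three SSCL stages named in the abstract (competitive prototype seed selection, K-means optimization, nearest-neighbor classification) can be sketched as follows. This is a minimal, hypothetical reading of the procedure, not the authors' exact algorithm: prototypes are seeded from the labeled class means, refined by winner-take-all online K-means updates on the unlabeled records, and new samples are assigned to the nearest prototype.

```python
import numpy as np

def sscl_fit(X_lab, y_lab, X_unlab, n_iter=50, lr=0.05, seed=0):
    """Fit one prototype per class: seeds come from the labeled class
    means, then a winner-take-all (online K-means style) update refines
    the prototypes on the unlabeled pool."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y_lab)
    protos = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    for _ in range(n_iter):
        for x in rng.permutation(X_unlab, axis=0):
            winner = np.argmin(np.linalg.norm(protos - x, axis=1))
            protos[winner] += lr * (x - protos[winner])  # move winner toward x
    return classes, protos

def sscl_predict(classes, protos, X):
    """Nearest-prototype classification of new feature vectors."""
    d = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]
```

The learning rate, iteration count, and one-prototype-per-class choice are illustrative assumptions; the paper's seed-selection rule may differ.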
Subject groups of HC and PD patients, with the age statistics presented as mean ± SD.

| Subject Groups | HC Group | PD Group |
|---|---|---|
| Male | 22 (55%) | 27 (67.5%) |
| Female | 18 (45%) | 13 (32.5%) |
| Age (years, mean ± SD) | 66.38 ± 8.38 | 69.58 ± 7.82 |
Description of acoustic parameter families derived from the voice records in HC and PD subject groups.
| Parameter Family | Abbreviation | Parameter Description |
|---|---|---|
| Jitter | Jitter-Rel | Relative jitter |
| | Jitter-Abs | Absolute jitter |
| | Jitter-RAP | Relative average perturbation |
| | Jitter-PPQ | Pitch perturbation quotient |
| Shimmer | Shim-Loc | Local shimmer |
| | Shim-dB | Shimmer in dB |
| | Shim-APQ3 | 3-point amplitude perturbation quotient |
| | Shim-APQ5 | 5-point amplitude perturbation quotient |
| | Shim-APQ11 | 11-point amplitude perturbation quotient |
| Harmonic-to-noise | HNR05 | Harmonic-to-noise ratio in 0–500 Hz |
| | HNR15 | Harmonic-to-noise ratio in 0–1500 Hz |
| | HNR25 | Harmonic-to-noise ratio in 0–2500 Hz |
| | HNR35 | Harmonic-to-noise ratio in 0–3500 Hz |
| | HNR38 | Harmonic-to-noise ratio in 0–3800 Hz |
| Nonlinear | RPDE | Recurrence period density entropy |
| | DFA | Detrended fluctuation analysis |
| | PPE | Pitch period entropy |
| | GNE | Glottal-to-noise excitation ratio |
| Frequency | MFCC 0 to 12 | Mel-frequency cepstral coefficients |
| | Delta 0 to 12 | Derivatives of the mel-frequency cepstral coefficients |
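Each parameter family's redundant dimensions were removed with PCA under an eigenvalue threshold criterion. A minimal sketch, assuming standardized inputs and a Kaiser-style threshold of 1 (the paper's exact threshold value is an assumption here):

```python
import numpy as np

def pca_project(X, eig_threshold=1.0):
    """PCA via eigen-decomposition of the covariance of standardized
    data; keep only components whose eigenvalue exceeds the threshold."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize columns
    vals, vecs = np.linalg.eigh(np.cov(Z, rowvar=False))  # ascending order
    keep = vals > eig_threshold
    order = np.argsort(vals[keep])[::-1]  # largest eigenvalue first
    return Z @ vecs[:, keep][:, order]
```

On a highly correlated family (like the jitter or shimmer measures), this typically collapses the family to a single dominant projected feature, which matches the one PCA feature per correlated family in the hypothesis-test table below.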
Figure 1. Flowchart of the voice detection procedures, which contain vocal parameter analysis, dimensionality reduction, feature selection using the Mann–Whitney–Wilcoxon hypothesis test, pattern analysis based on semi-supervised competitive learning, and classification result evaluation.
Figure 2. Pearson correlation coefficient results for the families of (a) Jitter, (b) Shimmer, (c) HNR, (d) Nonlinear, (e) MFCC, and (f) Frequency Delta vocal parameters.
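The within-family correlation screening of Figure 2 amounts to a plain Pearson correlation matrix per family. The sketch below uses synthetic stand-in values for the four jitter parameters (the real measurements are not reproduced here):

```python
import numpy as np

# Hypothetical jitter-family matrix: rows = voice records, columns =
# Jitter-Rel, Jitter-Abs, Jitter-RAP, Jitter-PPQ (synthetic values that
# share a common underlying perturbation signal, hence high correlation).
rng = np.random.default_rng(0)
base = rng.normal(size=100)
jitter = np.column_stack([base + rng.normal(scale=s, size=100)
                          for s in (0.1, 0.2, 0.3, 0.4)])

# Pearson correlation matrix across the four parameters.
R = np.corrcoef(jitter, rowvar=False)  # shape (4, 4), symmetric, diag = 1
```

High off-diagonal entries in `R` are what justify compressing the family with PCA.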
Mann–Whitney–Wilcoxon hypothesis test results for the vocal features derived from the PCA approach. A p-value < 0.05 indicates a statistically significant difference, marked with *. Null hypothesis: the data samples from the two subject groups are not significantly different; 1: the null hypothesis is rejected (p-value marked with *); 0: the null hypothesis is accepted.
| Vocal Features | Null Hypothesis | p-Value |
|---|---|---|
| Jitter-PCA | 1 | 0.0036 * |
| Shimmer-PCA | 1 | 0.0007 * |
| HNR-PCA | 1 | 0.0001 * |
| Nonlinear-RPDE | 0 | 0.1779 |
| Nonlinear-DFA | 0 | 0.3233 |
| Nonlinear-PPE | 1 | 0.0476 * |
| Nonlinear-GNE | 1 | 0.0001 * |
| Frequency-MFCC-PCA1 | 1 | 0.0001 * |
| Frequency-MFCC-PCA2 | 0 | 0.2305 |
| Frequency-MFCC-PCA3 | 0 | 0.2926 |
| Frequency-MFCC-PCA4 | 0 | 0.4885 |
| Frequency-MFCC-PCA5 | 0 | 0.2856 |
| Frequency-MFCC-PCA6 | 0 | 0.2952 |
| Frequency-Delta-PCA1 | 1 | 0.0001 * |
| Frequency-Delta-PCA2 | 0 | 0.1530 |
| Frequency-Delta-PCA3 | 0 | 0.0579 |
| Frequency-Delta-PCA4 | 0 | 0.1624 |
| Frequency-Delta-PCA5 | 1 | 0.0369 * |
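The feature screening in the table is a rank-based Mann–Whitney–Wilcoxon test per feature. A self-contained sketch using the large-sample normal approximation (adequate for groups of about 40 subjects; tie correction is omitted, since the continuous acoustic features rarely tie):

```python
import numpy as np
from math import erf, sqrt

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney-Wilcoxon test via the normal approximation.

    Returns the U statistic for sample x and the two-sided p-value."""
    n1, n2 = len(x), len(y)
    ranks = np.concatenate([x, y]).argsort().argsort() + 1.0  # 1-based ranks
    u1 = ranks[:n1].sum() - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # 2 * normal sf(|z|)
    return u1, p
```

Features whose p-value falls below 0.05 (rows marked * above) are the ones retained for classification.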
Classification results of vocal patterns of HC subjects and PD patients. N/A: not applicable.

| Metric | Bayesian | Two-Stage | KNN | SVM | SSCL |
|---|---|---|---|---|---|
| Accuracy ± SD | 0.752 ± 0.086 | 0.779 ± 0.08 | 0.806 ± 0.031 | 0.825 ± 0.03 | 0.838 ± 0.029 |
| Recall ± SD | 0.718 ± 0.132 | 0.765 ± 0.135 | 0.812 ± 0.044 | 0.8 ± 0.045 | 0.825 ± 0.042 |
| Specificity ± SD | 0.786 ± 0.135 | 0.792 ± 0.15 | 0.8 ± 0.045 | 0.85 ± 0.04 | 0.85 ± 0.04 |
| Precision ± SD | 0.785 ± 0.118 | 0.806 ± 0.115 | 0.802 ± 0.044 | 0.842 ± 0.042 | 0.846 ± 0.041 |
| F-score ± SD | 0.75 ± 0.024 | 0.785 ± 0.022 | 0.807 ± 0.012 | 0.821 ± 0.011 | 0.835 ± 0.011 |
| MCC ± SD | 0.505 ± 0.096 | 0.557 ± 0.089 | 0.613 ± 0.049 | 0.651 ± 0.046 | 0.675 ± 0.043 |
| AUC ± SD | N/A | 0.879 ± 0.067 | 0.855 ± 0.029 | 0.868 ± 0.043 | 0.939 ± 0.018 |
| Kappa ± SD | N/A | N/A | 0.613 ± 0.062 | 0.65 ± 0.06 | 0.675 ± 0.058 |
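Every metric in the table follows from a 2×2 confusion matrix. As a consistency check, hypothetical counts implied by 40 subjects per group and the reported SSCL recall (0.825) and specificity (0.85), namely TP = 33, FN = 7, TN = 34, FP = 6, reproduce the SSCL column:

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute the table's metrics from a 2x2 confusion matrix."""
    n = tp + fp + tn + fn
    acc = (tp + tn) / n
    rec = tp / (tp + fn)                     # sensitivity / recall
    spec = tn / (tn + fp)
    prec = tp / (tp + fp)
    f1 = 2 * prec * rec / (prec + rec)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_e = ((tp + fp) * (tp + fn) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (acc - p_e) / (1 - p_e)
    return acc, rec, spec, prec, f1, mcc, kappa
```

With these assumed counts, `binary_metrics(33, 6, 34, 7)` yields accuracy 0.8375, precision ≈ 0.846, F-score ≈ 0.835, MCC ≈ 0.675, and kappa 0.675, matching the SSCL column to the reported precision.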
Figure 3. ROC curves generated by the SVM with radial basis functions, the KNN, and the SSCL methods. The AUC values estimated by the SVM, KNN, and SSCL methods were 0.868 ± 0.043, 0.855 ± 0.029, and 0.939 ± 0.018, respectively.
Summary of the misclassified voice records in percentage and their corresponding subject group and gender information. Within each classifier, the male and female totals sum to 100%.

| Group | KNN Male | KNN Female | SVM Male | SVM Female | SSCL Male | SSCL Female |
|---|---|---|---|---|---|---|
| HC | 25.8% | 25.8% | 25% | 17.9% | 30.8% | 15.4% |
| PD | 38.7% | 9.7% | 46.4% | 10.7% | 53.8% | 0% |
| Total | 64.5% | 35.5% | 71.4% | 28.6% | 84.6% | 15.4% |