Betul Erdogdu Sakar, Gorkem Serbes, C Okan Sakar.
Abstract
Recently proposed Parkinson's Disease (PD) telediagnosis systems based on detecting dysphonia achieve very high classification rates in discriminating healthy subjects from PD patients. However, in these studies the data used to construct the classification model contain speech recordings of both early and late PD patients with different severities of speech impairment, which leads to unrealistic results. In a more realistic scenario, an early telediagnosis system is expected to be used in suspicious cases by healthy subjects or by early PD patients with mild speech impairment. In this paper, considering the critical importance of early diagnosis in the treatment of the disease, we evaluate the ability of vocal features in early telediagnosis of PD using machine learning techniques with a two-step approach. In the first step, using only patient data, we aim to determine the patient group with relatively greater severity of speech impairment, using the Unified Parkinson's Disease Rating Scale (UPDRS) score as an index of disease progression. For this purpose, we use three supervised and two unsupervised learning techniques. In the second step, we exclude the samples of this group of patients from the dataset, create a new dataset consisting of the samples of PD patients with less severe speech impairment and of healthy subjects, and use three classifiers with various settings to address this binary classification problem. In this classification problem, the highest accuracy of 96.4% and Matthews Correlation Coefficient of 0.77 are obtained using support vector machines with a third-degree polynomial kernel, showing that vocal features can be used to build a decision support system for early telediagnosis of PD.
Year: 2017 PMID: 28792979 PMCID: PMC5549905 DOI: 10.1371/journal.pone.0182428
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
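
The second step described in the abstract (dropping the more severely affected patients and classifying the rest against healthy subjects) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `build_early_dataset` and `knn_predict`, the input layout, and the Euclidean distance are all assumptions made for the example. Only the "UPDRS below threshold" rule and k = 3 are taken from the record.

```python
from collections import Counter
from math import dist


def build_early_dataset(pd_samples, healthy_samples, updrs_threshold):
    """Build the step-2 dataset (hypothetical helper).

    pd_samples: list of (feature_vector, updrs_score) pairs.
    healthy_samples: list of feature_vectors.
    Keeps only PD samples whose UPDRS is below the threshold (label 1)
    and appends all healthy samples (label 0).
    """
    X, y = [], []
    for features, updrs in pd_samples:
        if updrs < updrs_threshold:
            X.append(features)
            y.append(1)
    for features in healthy_samples:
        X.append(features)
        y.append(0)
    return X, y


def knn_predict(X_train, y_train, query, k=3):
    """Majority vote among the k nearest training samples.

    Euclidean distance is an assumption; the record does not preserve
    which distance measures were evaluated.
    """
    nearest = sorted(zip(X_train, y_train), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Calling `build_early_dataset(pd_samples, healthy_samples, 15)` and then `knn_predict(X, y, query)` would classify one new feature vector; the threshold of 15 is only an example value.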
Definitions of vocal features.
| Vocal Feature | Description |
|---|---|
| Jitter(%) | Average absolute difference between consecutive periods, divided by the average period. |
| Jitter(Abs) | Average absolute difference between consecutive periods which gives information about the cycle-to-cycle variation of fundamental frequency given in seconds. |
| Jitter:RAP | Relative Average Perturbation (RAP), which is the average absolute difference between a period and the average of it and its two neighbours, divided by the average period. |
| Jitter:PPQ5 | Five-point Period Perturbation Quotient, computed as the average absolute difference between a period and the average of it and its four closest neighbours, divided by the average period. |
| Jitter:DDP | Average absolute difference between consecutive differences between consecutive periods, divided by the average period. |
| Shimmer | Average absolute difference between the amplitudes of consecutive periods, divided by the average amplitude. |
| Shimmer(dB) | Average absolute base-10 logarithm of the difference between the amplitudes of consecutive periods, multiplied by 20. It gives information about the variability of the peak-to-peak amplitude in decibels. |
| Shimmer:APQ3 | Three-point Amplitude Perturbation Quotient, the average absolute difference between the amplitude of a period and the average of the amplitudes of its neighbours, divided by the average amplitude. |
| Shimmer:APQ5 | Five-point Amplitude Perturbation Quotient, the average absolute difference between the amplitude of a period and the average of the amplitudes of it and its four closest neighbours, divided by the average amplitude. |
| Shimmer:APQ11 | 11-point Amplitude Perturbation Quotient, the average absolute difference between the amplitude of a period and the average of the amplitudes of it and its ten closest neighbours, divided by the average amplitude. |
| Shimmer:DDA | Average absolute difference between consecutive differences between the amplitudes of consecutive periods. |
| Noise to Harmonics Ratio (NHR) | Amplitude of noise relative to tonal components. It quantifies the noise which occurs due to turbulent airflow, resulting from incomplete vocal fold closure in speech pathologies. |
| Harmonics to Noise Ratio (HNR) | Amplitude of tonal relative to noise components. It has the same aim as NHR. |
| Recurrence period density entropy (RPDE) | Addresses the ability of the vocal folds to sustain stable vocal fold vibrations, quantifying the deviations from exact periodicity. |
| Detrended fluctuation analysis (DFA) | Quantifies the self-similarity of the noise present in the speech caused by the turbulent air flow. |
| Pitch period entropy (PPE) | Measures the impaired control of stable pitch during sustained phonations. |
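
The perturbation definitions in the table above translate directly into a few lines of code. The sketch below assumes the glottal periods (in seconds) and per-period amplitudes have already been extracted from a recording; the function names are hypothetical and the code is not the tool used in the study. Note that the "divided by the average period" measures are returned as ratios, which matches the magnitudes in the statistics table.

```python
from statistics import mean


def jitter_abs(periods):
    """Jitter(Abs): mean absolute difference between consecutive periods."""
    return mean(abs(b - a) for a, b in zip(periods[:-1], periods[1:]))


def jitter_ratio(periods):
    """Jitter(%): Jitter(Abs) divided by the mean period (as a ratio)."""
    return jitter_abs(periods) / mean(periods)


def jitter_rap(periods):
    """Jitter:RAP: mean absolute difference between a period and the
    average of it and its two neighbours, divided by the mean period."""
    devs = [
        abs(periods[i] - mean(periods[i - 1:i + 2]))
        for i in range(1, len(periods) - 1)
    ]
    return mean(devs) / mean(periods)


def jitter_ddp(periods):
    """Jitter:DDP: mean absolute difference between consecutive
    differences of consecutive periods, divided by the mean period."""
    first = [b - a for a, b in zip(periods[:-1], periods[1:])]
    second = [abs(b - a) for a, b in zip(first[:-1], first[1:])]
    return mean(second) / mean(periods)


def shimmer(amplitudes):
    """Shimmer: the same construction as Jitter(%), applied to amplitudes."""
    return jitter_ratio(amplitudes)
```

It follows algebraically from these definitions that Jitter:DDP equals three times Jitter:RAP, which agrees with the reported means (0.0090 and 0.0030).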
Statistical parameters of vocal features.
| Vocal Feature | Minimum | Maximum | Median | Mean | Std. Dev. |
|---|---|---|---|---|---|
| Jitter(%) | 0.0008 | 0.1000 | 0.0049 | 0.0062 | 0.0056 |
| Jitter(Abs) | 0 | 0.0004 | 0 | 0.0000 | 0.0001 |
| Jitter:RAP | 0.0003 | 0.0575 | 0.0022 | 0.0030 | 0.0031 |
| Jitter:PPQ5 | 0.0004 | 0.0696 | 0.0025 | 0.0033 | 0.0037 |
| Jitter:DDP | 0.0010 | 0.1726 | 0.0067 | 0.0090 | 0.0094 |
| Shimmer | 0.0031 | 0.2686 | 0.0275 | 0.0340 | 0.0258 |
| Shimmer(dB) | 0.0260 | 2.1070 | 0.253 | 0.3110 | 0.2303 |
| Shimmer:APQ3 | 0.0016 | 0.1627 | 0.0137 | 0.0172 | 0.0132 |
| Shimmer:APQ5 | 0.0019 | 0.1670 | 0.0159 | 0.0201 | 0.0167 |
| Shimmer:APQ11 | 0.0025 | 0.2755 | 0.0227 | 0.0275 | 0.0200 |
| Shimmer:DDA | 0.0048 | 0.488 | 0.0411 | 0.0515 | 0.0397 |
| Noise to Harmonics Ratio | 0.0003 | 0.7483 | 0.0184 | 0.0321 | 0.0597 |
| Harmonics to Noise Ratio | 1.6590 | 37.875 | 21.92 | 21.6795 | 4.2911 |
| Recurrence period density entropy | 0.1510 | 0.9661 | 0.5423 | 0.5415 | 0.1010 |
| Detrended fluctuation analysis | 0.5140 | 0.8656 | 0.6436 | 0.6532 | 0.0709 |
| Pitch period entropy | 0.0220 | 0.7317 | 0.2055 | 0.2196 | 0.0915 |
Fig 1. Number of positive (above threshold) and negative (below threshold) instances with respect to the determined UPDRS threshold.
Fig 2. (Left) Test set classification accuracies and (right) Matthews correlation coefficients obtained with the k-NN classifier under various UPDRS threshold values.
Fig 3. (Left) Test set classification accuracies and (right) Matthews correlation coefficients obtained with the SVM classifier under various UPDRS threshold values.
Fig 4. (Left) Test set classification accuracies and (right) Matthews correlation coefficients obtained with the ELM classifier under various UPDRS threshold values.
Fig 5. Summary of the results obtained with the best settings of the classifiers: (left) Matthews correlation coefficients of the classifiers obtained with their best settings; (right) ROC space of the classifiers when the UPDRS threshold is set to 15.
Fig 6. Scatter of PD data on the first three principal components with UPDRS threshold values of (top left) 15, (top right) 20, (bottom left) 25, and (bottom right) 30.
Fig 7. Absolute difference between cluster 1 and cluster 2 in the ratio of the number of patients whose UPDRS is below the corresponding threshold to the number of all patients in the cluster.
Ranking of the vocal features based on their mutual information with UPDRS level discretized according to the determined optimal threshold that can be discriminated by machine learning methods.
| Ranking | Dysphonia Measurement | MI Score |
|---|---|---|
| 1 | DFA | 0.0413 |
| 2 | PPE | 0.0302 |
| 3 | RPDE | 0.0287 |
| 4 | HNR | 0.0277 |
| 5 | NHR | 0.0202 |
| 6 | Jitter:PPQ5 | 0.0189 |
| 7 | Jitter(%) | 0.0189 |
| 8 | Jitter(Abs) | 0.0163 |
| 9 | Shimmer:APQ11 | 0.0163 |
| 10 | Jitter:DDP | 0.0155 |
| 11 | Jitter:RAP | 0.0154 |
| 12 | Shimmer(dB) | 0.0120 |
| 13 | Shimmer | 0.0112 |
| 14 | Shimmer:APQ3 | 0.0084 |
| 15 | Shimmer:DDA | 0.0084 |
| 16 | Shimmer:APQ5 | 0.0081 |
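
A ranking of this kind can be reproduced in outline with a plug-in mutual information estimate. The sketch below is an assumption-laden illustration: the record does not state which MI estimator, binning, or logarithm base was used, so the equal-width binning, the `n_bins` parameter, the natural logarithm, and the function names are all hypothetical choices. The UPDRS score is binarized at the chosen threshold, as the caption describes.

```python
from collections import Counter
from math import log


def discretize(values, n_bins=10):
    """Equal-width binning of a continuous feature (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_features(features, updrs, threshold, n_bins=10):
    """Rank features by MI with the thresholded UPDRS label.

    features: dict mapping feature name -> list of values per sample.
    updrs: list of UPDRS scores, one per sample.
    """
    labels = [int(u >= threshold) for u in updrs]
    scores = {
        name: mutual_information(discretize(vals, n_bins), labels)
        for name, vals in features.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With small samples a plug-in estimate of this kind is biased upward, so absolute scores from such a sketch should not be compared with the values in the table; only the relative ordering is of interest.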
Accuracies and MCC values obtained with various settings of k-NN, SVM, and ELM on the dataset consisting of the samples of PD patients whose UPDRS is below this threshold and 8 healthy subjects.
| k-NN (k = 3) | | | SVM | | | ELM | | |
|---|---|---|---|---|---|---|---|---|
| Distance | Accu. (%) | MCC | Kernel | Accu. (%) | MCC | Kernel | Accu. (%) | MCC |
| | 94.69±0.01 | 0.63±0.10 | | 89.38±0.02 | 0.58±0.06 | | 90.07±0.02 | 0.51±0.09 |
| | 94.99±0.01 | 0.65±0.11 | | | | | 91.93±0.02 | 0.56±0.09 |
| | 93.30±0.01 | 0.40±0.14 | | 91.86±0.02 | 0.63±0.07 | | 91.93±0.02 | 0.41±0.16 |
Best results obtained with k-NN, SVM and ELM with statistical significance tests.
| | k-NN (1) | SVM (2) | ELM (3) | Statistical Significance 1–2 | Statistical Significance 1–3 | Statistical Significance 2–3 |
|---|---|---|---|---|---|---|
| Accuracy (%) | 94.99 | 96.43 | 91.93 | | | |
| MCC | 0.65 | 0.77 | 0.56 | | | |
Paired t-test: ** p < 0.01; * p < 0.05.
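
Because the second-step dataset contains only 8 healthy subjects, accuracy and the Matthews correlation coefficient can diverge considerably, which is why both are reported. The sketch below computes the two metrics from confusion-matrix counts using the standard MCC formula; the function names are hypothetical, and returning 0 when the denominator vanishes is a common convention rather than something stated in the record.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient:
    (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn))."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom
```

For example, with made-up counts of 90 true positives, 4 true negatives, 4 false positives, and 2 false negatives, accuracy is 0.94 while the MCC is only about 0.55, illustrating how a small negative class pulls the two metrics apart.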