He-Hua Zhang1, Liuyang Yang2, Yuchuan Liu2, Pin Wang2, Jun Yin1, Yongming Li3,4, Mingguo Qiu5, Xueru Zhu2, Fang Yan2.
Abstract
BACKGROUND: The use of speech-based data in the classification of Parkinson disease (PD) has been shown in recent years to provide an effective, non-invasive mode of classification. Thus, there has been increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classification is reducing noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effective, the ability to invoke instance selection has seldom been examined.
Keywords: Classification of Parkinson disease; Decorrelated neural network ensembles (DNNE); Ensemble learning; Multi-edit-nearest-neighbor algorithm (MENN); Optimal selection of speech samples; Random forest (RF)
Year: 2016 PMID: 27852279 PMCID: PMC5112697 DOI: 10.1186/s12938-016-0242-6
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819
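The result tables below report accuracy, sensitivity, and specificity. As a reference for how these figures are read (taking PD patients as the positive class, an assumption, since the record does not state the convention), a minimal pure-Python computation:

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels,
    with 1 = PD patient (positive) and 0 = healthy control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)   # true-positive rate (patients found)
    specificity = tn / (tn + fp)   # true-negative rate (controls found)
    return accuracy, sensitivity, specificity
```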
Description of the speech samples collected from the same subject
| No of speech samples | Description of speech samples |
|---|---|
| 1st | Sustained vowel of aaa |
| 2nd | Sustained vowel of ooo |
| 3rd | Sustained vowel of uuu |
| 4th–13th | Numbers from 1 to 10 |
| 14th–17th | Short sentences |
| 18th–26th | Words |
Fig. 1 Flowchart of the PD_MEdit_EL algorithm
Fig. 2 Edited results of training samples: a prior to MENN and b with MENN (1st dimension: the 1st feature of the PD data; 2nd dimension: the 2nd feature of the PD data)
Classification accuracy for the “Training_Data” set
| | Leave-one-out method | | | Leave-one-subject-out method | | |
|---|---|---|---|---|---|---|
| | Accuracy | Sensitivity | Specificity | Accuracy | Sensitivity | Specificity |
| DNNE (with MENN) | | | | | | |
| Mean | 67.71 | 71.71 | 64 | 73.37 | 71 | |
| Std | 0.037 | 0.018 | 0.033 | 0.042 | 0.056 | 0.032 |
| Best | 68.79 | 72.83 | | 85 | 90 | 80 |
| RF (with MENN) | | | | | | |
| Mean | 70.93 | 73.27 | 67.59 | 81.50 | 92.50 | 70.50 |
| Std | 0.031 | 0.0052 | 0.0063 | 0.0105 | 0.0225 | 0.0211 |
| Best | 76.07 | 73.65 | 68.46 | 88 | 97 | 88 |
| SVM (linear) | | | | | | |
| Mean | 63.65 | 63.85 | 63.46 | 65 | 65 | 65 |
| Std | 0 | 0 | 0 | 0 | 0 | 0 |
| Best | 63.65 | 63.85 | 63.46 | 65 | 65 | 65 |
| SVM (RBF) | | | | | | |
| Mean | 63.08 | 73.08 | 57.08 | 67.5 | 80 | 55 |
| Std | 0 | 0 | 0 | 0 | 0 | 0 |
| Best | 63.08 | 73.08 | 57.08 | 67.5 | 80 | 55 |
| Method in [1] | | | | | | |
| Mean | – | – | – | 52.06 | 54.92 | 49.22 |
| Best | – | – | – | 85 | 80 | 90 |
| Method in [27] | | | | | | |
| Mean | – | – | – | – | – | – |
| Best | – | – | – | 70, KNN (k = 1) | 80, KNN (k = 1) | 60, KNN (k = 1) |
DNNE (with MENN) and RF (with MENN) denote the proposed PD_MEdit_EL algorithm; SVM (linear): SVM with the linear kernel function; SVM (RBF): SVM with the radial basis function kernel; Method in [1]: classification algorithm from Ref. [1]; Method in [27]: classification algorithm from Ref. [27]
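Two evaluation schemes appear above: leave-one-out over individual speech samples, and leave-one-subject-out, where all 26 samples of one subject are held out together. A sketch of the subject-level split (the subject IDs here are illustrative, not the study's data):

```python
def leave_one_subject_out(subject_ids):
    """Yield (train_indices, test_indices) pairs, holding out every
    sample belonging to one subject per fold."""
    for held_out in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield train, test
```

With 40 training subjects and 26 samples each, this scheme yields 40 folds of 26 test samples apiece, so no subject contributes samples to both sides of a split.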
Classification accuracy of the “Test_Data” set
| | For samples (accuracy) (%) | For subjects (accuracy) (%) |
|---|---|---|
| DNNE (with MENN) | | |
| Mean | | |
| Std | | |
| Best | | |
| RF (with MENN) | | |
| Mean | | |
| Std | | |
| Best | | |
| SVM (linear kernel) | | |
| Mean | 68.45 | 67.86 |
| Std | 0 | 0 |
| Best | 68.45 | 67.86 |
| Method in [ | | |
| Mean | – | 75 |
| Best | – | 75 |
Fig. 3 Classification performance of different classification algorithms using the “Training_Data” set. The y-axis: classification accuracy; DNNE_MENN and RF_MENN: the proposed PD_MEdit_EL algorithm; SVM_Linear: SVM with the linear kernel function; SVM_RBF: SVM with the radial basis function kernel; Method in [27]: classification algorithm from Ref. [27]
Fig. 4 Classification performance of different classification algorithms using the “Test_Data” set. The y-axis: classification accuracy; DNNE_MENN and RF_MENN: the proposed PD_MEdit_EL algorithm; SVM_Linear: SVM with the linear kernel function; SVM_RBF: SVM with the radial basis function kernel; Method in [27]: classification algorithm from Ref. [27]
Fig. 5 Classification accuracy of different algorithms for the “Training_Data” set. The y-axis: classification accuracy; the x-axis: the number of runs for each algorithm; DNNE_MENN and RF_MENN: the proposed PD_MEdit_EL algorithm; SVM_Linear: SVM with the linear kernel function; SVM_RBF: SVM with the radial basis function kernel
Fig. 6 Classification accuracy of different algorithms for the “Test_Data” set. The y-axis: classification accuracy; the x-axis: the number of runs for each algorithm; DNNE_MENN and RF_MENN: the proposed PD_MEdit_EL algorithm; SVM_Linear: SVM with the linear kernel function; SVM_RBF: SVM with the radial basis function kernel
Classification accuracy for the PD data from Max Little et al.
| | Tenfold CV (%) | | |
|---|---|---|---|
| | Accuracy | Sensitivity | Specificity |
| RF (with MENN) | | | |
| Mean | | | |
| Std | 0.058 | 0.067 | 0.175 |
| SVM (RBF)_relief | | | |
| Mean | 84.5 | 95.2 | 52.5 |
| Std | 0.061 | 0.067 | 0.175 |
RF (with MENN) denotes the proposed PD_MEdit_EL algorithm; SVM (RBF)_relief: SVM with the RBF kernel and relief feature selection
Classification results in terms of sample dependence
| | Leave-one-out method | | | Leave-one-subject-out method | | |
|---|---|---|---|---|---|---|
| | Accuracy | Sensitivity | Specificity | Accuracy | Sensitivity | Specificity |
| RF_MENN | ||||||
| Mean | 70.93 | 73.27 | 67.59 | 81.50 | 92.50 | 70.50 |
| Std | 0.031 | 0.0052 | 0.0063 | 0.0105 | 0.0225 | 0.0211 |
| Best | 76.07 | 73.65 | 68.46 | 88 | 97 | 88 |
| RF_MENN_inDe | ||||||
| Mean | 55.1 | 59.2 | 51.0 | 65.0 | 75.0 | 55 |
| Std | 0.003 | 0.0063 | 0.0043 | 0.0211 | 0.0242 | 0.0369 |
| Best | 55.4 | 62.9 | 53.5 | 67.5 | 78.5 | 59.5 |
| SVM_RBF | ||||||
| Mean | 63.08 | 73.08 | 57.08 | 67.5 | 80 | 55 |
| Std | 0 | 0 | 0 | 0 | 0 | 0 |
| Best | 63.08 | 73.08 | 57.08 | 67.5 | 80 | 55 |
| SVM_RBF_inDe | ||||||
| Mean | 55.0 | 62.1 | 47.8 | 56.0 | 71.5 | 40.5 |
| Std | 0 | 0 | 0 | 0 | 0 | 0 |
| Best | 55.0 | 62.1 | 47.8 | 56.0 | 71.5 | 40.5 |
RF_MENN denotes the RF + MENN algorithm; RF_MENN_inDe: RF_MENN in the setting where, when a sample is classified, the other samples from the same subject are not used to build the classification model; SVM_RBF: SVM with the RBF kernel; SVM_RBF_inDe: SVM_RBF under the same same-subject exclusion
Comparison before and after applying the MENN algorithm
| Comparison results | Number of training samples (number of subjects) | Number of training samples of healthy subjects (number of subjects) | Number of training samples of patient subjects (number of subjects) |
|---|---|---|---|
| Before applying MENN algorithm | 1039 (40) | 519 (20) | 520 (20) |
| After applying MENN algorithm | 731 (40) | 364 (20) | 367 (20) |
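The reduction above (from 1039 to 731 training samples) comes from MENN's nearest-neighbour editing. MENN repeats the edit over random partitions of the training set until no further samples are removed; the sketch below is a single deterministic editing pass (a Wilson-style simplification, not the study's exact procedure):

```python
def edit_samples(X, y):
    """Keep only samples whose nearest neighbour (excluding the sample
    itself) carries the same class label; mislabelled or noisy samples
    lying near the opposite class are dropped."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    kept = []
    for i in range(len(X)):
        # nearest neighbour of sample i among all other samples
        j = min((k for k in range(len(X)) if k != i),
                key=lambda k: sq_dist(X[k], X[i]))
        if y[j] == y[i]:
            kept.append(i)
    return kept
```

On a toy two-cluster dataset, a point placed near one cluster but given the other cluster's label is removed, while all cleanly labelled points survive.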
Fig. 7 Distribution of samples (1) before and (2) after applying the MENN algorithm (1st dimension: the 1st feature of the PD data; 2nd dimension: the 2nd feature; 3rd dimension: the 3rd feature)
Improvement with the MENN algorithm for the “Training_Data” set
| | Leave-one-out | | | Leave-one-subject-out | | |
|---|---|---|---|---|---|---|
| | Accuracy | Sensitivity | Specificity | Accuracy | Sensitivity | Specificity |
| DNNE (without MENN) | | | | | | |
| Mean | 64.09 | 70 | 50.67 | 69.25 | 68.50 | 70 |
| Std | 0.0178 | 0.0206 | 0.0477 | 0.0274 | 0.0298 | 0.0339 |
| Best | 66.92 | 76.71 | 63.85 | 75 | 75 | 75 |
| DNNE (with MENN) | | | | | | |
| Mean | 67.71 | 71.71 | 64 | 73.37 | 71 | |
| Std | 0.037 | 0.018 | 0.033 | 0.042 | 0.056 | 0.032 |
| Best | 68.79 | 72.83 | | 85 | 90 | 80 |
| RF (without MENN) | | | | | | |
| Mean | 67.39 | 71.62 | 60 | 78.25 | 85.50 | 71.50 |
| Std | 0.0141 | 0.0297 | 0.0165 | 0.041 | 0.035 | 0.074 |
| Best | 68.46 | | 62.36 | 85 | 90 | 85 |
| RF (with MENN) | | | | | | |
| Mean | 70.93 | 73.27 | 67.59 | 81.50 | 92.50 | 70.50 |
| Std | 0.031 | 0.0052 | 0.0063 | 0.0105 | 0.0225 | 0.0211 |
| Best | 76.07 | 73.65 | 68.46 | 88 | 97 | 88 |
Improvement with the MENN algorithm for the “Test_Data” set
| | For samples (accuracy) (%) | For subjects (accuracy) (%) |
|---|---|---|
| DNNE (without MENN) | | |
| Mean | 80.42 | 87.5 |
| Std | 0.0327 | 0.0342 |
| Best | 86.9 | 92.86 |
| DNNE (with MENN) | | |
| Mean | | |
| Std | | |
| Best | | |
| RF (without MENN) | | |
| Mean | 56.37 | 54.28 |
| Std | 0.0184 | 0.0471 |
| Best | 59.52 | 60.71 |
| RF (with MENN) | | |
| Mean | | |
| Std | | |
| Best | | |
Establishing method significance using the “Training_Data” set
| Significant difference | LOO | | | LOSO | | |
|---|---|---|---|---|---|---|
| | Accuracy | Sensitivity | Specificity | Accuracy | Sensitivity | Specificity |
| DNNE (with MENN) | ||||||
| SVM (linear) | <0.0001 | <0.0001 | 0.1888 | <0.0001 | <0.0001 | 0.0169 |
| SVM (RBF) | <0.0001 | <0.0001 | <0.0001 | 0.0695 | <0.0001 | 0.0207 |
| DNNE (with MENN) | ||||||
| RF (with MENN) | 0.3944 | 0.9469 | 0.2596 | 0.1358 | 0.1071 | 0.6319 |
| RF (with MENN) | ||||||
| SVM (linear) | <0.0001 | <0.0001 | 0.6265 | <0.0001 | <0.0001 | 0.0111 |
| SVM (RBF) | 0.1454 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | 0.1558 |
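The p-values above compare per-run results between pairs of methods. This record does not state which statistical test was used; the sketch below computes a two-sample Welch t statistic with a normal approximation for the two-sided tail, one plausible way to obtain such values from repeated runs:

```python
import math
from statistics import mean, variance

def welch_p(runs_a, runs_b):
    """Approximate two-sided p-value for a difference in mean accuracy
    between two sets of independent runs (Welch t statistic; normal
    approximation for the tail probability, adequate for moderately
    large run counts)."""
    se = math.sqrt(variance(runs_a) / len(runs_a)
                   + variance(runs_b) / len(runs_b))
    t = (mean(runs_a) - mean(runs_b)) / se
    return math.erfc(abs(t) / math.sqrt(2))
```

Identical run distributions give p = 1, while clearly separated accuracy distributions give a p-value near zero, matching the pattern of <0.0001 entries in the tables.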
Establishing method significance using the “Test_Data” set
| Significant difference | For samples | For subjects |
|---|---|---|
| | Accuracy | Accuracy |
| DNNE (with MENN) | ||
| SVM (linear) | <0.0001 | <0.0001 |
| SVM (RBF) | <0.0001 | <0.0001 |
| DNNE (with MENN) | ||
| RF (with MENN) | <0.0001 | 0.0382 |
| RF (with MENN) | ||
| SVM (linear) | <0.0001 | <0.0001 |
| SVM (RBF) | <0.0001 | <0.0001 |