Nada R Yousif, Hossam Magdy Balaha, Amira Y Haikal, Eman M El-Gendy.
Abstract
Parkinson's disease (PD) is a slowly progressing neurodegenerative disorder whose symptoms are often identified only at late stages. Early diagnosis and treatment of PD can help relieve the symptoms and delay progression, but this is very challenging because the symptoms of PD resemble those of other diseases. The current study proposes a generic framework for the diagnosis of PD using handwritten images and/or speech signals. For the handwriting images, 8 pre-trained convolutional neural networks (CNNs), fine-tuned via transfer learning with hyperparameters optimized by the Aquila Optimizer, were trained on the NewHandPD dataset to diagnose PD. For the speech signals, features from the MDVR-KCL dataset are extracted (1) numerically, using 16 feature extraction algorithms, and fed to 4 different machine learning algorithms tuned by the Grid Search algorithm, and (2) graphically, using 5 different techniques, and fed to the same 8 pre-trained CNN structures. The authors propose a new feature extraction technique for the voice dataset based on segmenting the speech signals with variable segment durations, i.e., using different durations in the segmentation phase. With the proposed technique, 5 datasets with 281 numerical features are generated. Results from the different experiments are collected and recorded. For the NewHandPD dataset, the best reported metric is 99.75%, achieved by the VGG19 structure. For the MDVR-KCL dataset, the best reported metrics are 99.94%, achieved by the KNN and SVM ML algorithms on the combined numerical features, and 100%, achieved by the VGG19 structure on the combined mel-spectrogram graphical features. These results surpass other state-of-the-art studies.
Keywords: Feature extraction; Hyperparameters optimization; Machine learning (ML); Parkinson disease (PD); Speech segmentation; Transfer learning (TL); Voice segmentation
Year: 2022 PMID: 36042792 PMCID: PMC9411848 DOI: 10.1007/s12652-022-04342-6
Source DB: PubMed Journal: J Ambient Intell Humaniz Comput
Summary of related studies concerning PD
| Reference | Year | Approach | Dataset | Dataset type | Pros. | Cons. | Best accuracy |
|---|---|---|---|---|---|---|---|
| Pereira et al. | 2015 | ML using NB, SVM, and OPF | HandPD | Image | Proposing “HandPD” dataset | Achieved accuracy is low | 78.9% using NB |
| Pereira et al. | 2016 | CNN | | | Proposing an extension to the “HandPD” dataset using signals from a smartpen from meander and spiral drawings | (1) The use of an imbalanced dataset with more healthy samples and (2) the usage of tablet-based devices requires specific conditions for good quality | 80.19% |
| Pereira et al. | 2016 | Metaheuristics + CNN | | | Usage of metaheuristic algorithms to tune the hyperparameters | The usage of an imbalanced dataset with more healthy samples | 90.39% |
| Pereira et al. | 2018 | CNN | | | (1) CNN is applied for learning features from handwritten dynamics and (2) proposing the “NewHandPD” dataset extracted by the use of a smartpen | Processing of the time-series data in a black-box manner | 95% |
| Senatore et al. | 2019 | CGP | | | The usage of Cartesian Genetic Programming to provide explicit classification rules | Poor results for spiral images | 72.36% |
| Impedovo | 2019 | SVM with a linear kernel | PaHaW | | Usage of velocity signals | Useful in online handwriting only | 98.44% |
| Naseer et al. | 2020 | CNN using AlexNet | | | (1) The usage of fine-tuned pretrained models and (2) the usage of k-fold cross-validation | (1) No consideration of dimensionality reduction and (2) vulnerability to acoustic conditions | 98.28% |
| Kamran et al. | 2021 | CNN using AlexNet, GoogLeNet, VGG, and ResNet | HandPD, NewHandPD, and Parkinson’s Drawing datasets | | (1) The usage of several datasets and (2) high achieved accuracy | Poor accuracy in case of scratch CNN | 99.22% using AlexNet |
| Sakar et al. | 2013 | SVM, KNN | Speech data | Voice | Proposal of a voice dataset for Parkinson’s disease | Results are biased | 77.5% |
| Caliskan et al. | 2017 | DNN | OPD and PSD | | Remote diagnosis ability | Low accuracy | 93.79% |
| Tuncer and Dogan | 2019 | SVM, 1NN, DT, and logistic regression | Vowel | | Gender classification is taken into account | The usage of small data | 97.62% by 1NN |
| Zahid et al. | 2020 | AlexNet | pc-Gita | | (1) The usage of deep features of speech and (2) proving that pronunciation of vowels is sufficient for diagnosis | Poor accuracy for isolated words | 99.7% |
Fig. 1 The Parkinson's disease learning and optimization framework
Fig. 2 Samples from the NewHandPD dataset classes
Fig. 3 Presentation of the proposed voice-record segmentation approach
Summary of the number of numerical features extracted from the MDVR-KCL dataset
| Category | Technique | No. features |
|---|---|---|
| MFCC | Slaney | 40 |
| | HTK | 40 |
| Mel-Spectrogram | | 128 |
| Chroma-based | Chroma-only | 12 |
| | STFT | 12 |
| | CQT | 12 |
| | CENS | 12 |
| RMSE | | 1 |
| Spectral-based | Contrast | 7 |
| | Flatness | 1 |
| | Centroid | 1 |
| | Bandwidth | 1 |
| | Roll-off Frequency | 1 |
| ZCR | | 1 |
| Tonnetz | Normal | 6 |
| | Harmonic | 6 |
| Total | | 281 |
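Most of the entries in this table are standard audio descriptors. As a rough, numpy-only illustration (not the authors' implementation; in practice a library such as librosa computes these), zero-crossing rate, RMS energy, and spectral centroid can each be computed per frame like this:

```python
import numpy as np

def frame(signal, frame_len=2048, hop=512):
    """Slice a 1-D signal into overlapping frames (no padding)."""
    n = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop:i * hop + frame_len] for i in range(n)])

def zcr(frames):
    """Zero-crossing rate: fraction of adjacent-sample sign changes."""
    signs = np.sign(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def rms(frames):
    """Root-mean-square energy per frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))

def spectral_centroid(frames, sr=16000):
    """Magnitude-weighted mean frequency of each (Hann-windowed) frame."""
    win = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * win, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mag @ freqs) / np.maximum(mag.sum(axis=1), 1e-12)

# Sanity check on a 440 Hz tone sampled at 16 kHz.
sr = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
frames = frame(tone)
print(round(float(np.mean(zcr(frames))), 3),
      round(float(np.mean(rms(frames))), 3))  # ≈ 0.055 (= 2·440/16000) and ≈ 0.707 (= 1/√2)
```

The single-valued features (flatness, bandwidth, roll-off, RMSE, ZCR) contribute one number per frame, which is why they each add only 1 to the 281-feature total once aggregated.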
Summary of the number of graphs extracted from the MDVR-KCL dataset
| Segment duration (s) | No. PD | No. HC | Total |
|---|---|---|---|
| 5 | 310 | 420 | 730 |
| 15 | 258 | 366 | 624 |
| 30 | 126 | 179 | 305 |
| 60 | 57 | 79 | 136 |
| Total | 751 | 1044 | 1795 |
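The variable-duration segmentation behind these counts can be sketched as follows (a minimal numpy sketch; dropping a trailing remainder shorter than one full segment is an assumption, since the record does not state the boundary handling):

```python
import numpy as np

def segment(signal, sr, duration_s):
    """Split a 1-D audio signal into non-overlapping segments of
    duration_s seconds; a trailing remainder shorter than one full
    segment is dropped (assumed boundary handling)."""
    seg_len = int(sr * duration_s)
    n = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n)]

sr = 16000
record = np.zeros(sr * 64)  # a 64-second stand-in recording
for d in (5, 15, 30, 60):
    print(d, len(segment(record, sr, d)))  # 5→12, 15→4, 30→2, 60→1
```

Shorter durations yield more segments per recording, which is why the 5-second split produces the largest dataset in the table.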
Fig. 4 Sample graphs for each technique for the 60-second segment duration
The ranges for each hyperparameter
| Optimizer | Category | Definition | Range |
|---|---|---|---|
| AO | CNN Learning | Loss Function | Categorical Crossentropy, Categorical Hinge, KL Divergence, Poisson, Squared Hinge, and Hinge |
| | | Batch Size | From 8 to 64 with a step of 8 |
| | | Weights (i.e., Parameters) Optimizer | Adam, Nadam, Adagrad, Adadelta, Adamax, RMSProp, SGD, Ftrl, SGD Nesterov, RMSProp Centered, and Adam AMSGrad |
| | CNN Model Structure | Dropout ratio | [0.0, 0.6] |
| | | TL learning ratio | From 0 to 100 with a step of 1 |
| | CNN Data Augmentation | Rotation Range | From 0 to 45 with a step of 1 |
| | | Width Shift Range | [0, 0.25] |
| | | Height Shift Range | [0, 0.25] |
| | | Shear Range | [0, 0.25] |
| | | Zoom Range | [0, 0.25] |
| | | Horizontal Flipping | [True, False] |
| | | Vertical Flipping | [True, False] |
| | | Brightness Change (From) | [0.5, 2.0] |
| | | Brightness Change (To) | [0.5, 2.0] |
| GS | KNN | nNeighbors | [1, 2, 3, 5, 7, 10] |
| | | leafSize | [1, 5, 10, 15] |
| | | p | [1, 2] |
| | SVM | degree | [1, 2, 3, 4, 5] |
| | | C | [0.1, 1, 10, 100, 1000] |
| | | gamma | [1, 0.1, 0.01, 0.001, 0.0001] |
| | | kernel | [Linear, Poly, RBF, Sigmoid, Precomputed] |
| | DT | criterion | [Gini, Entropy] |
| | | splitter | [Best, Random] |
| | | maxDepth | From 3 to 14 with a step of 1 |
| | NB | alpha | [0, 0.1, 0.5, 1.0, 1.5, 2, 3, 5, 10] |
| | Variance Threshold | threshold | [0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5] |
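The GS stage can be sketched with scikit-learn's `GridSearchCV`; a minimal example using the KNN grid from the table on synthetic stand-in data (sklearn's spellings of the parameter names; the real experiments use the extracted voice features):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Stand-in features; the real pipeline uses the 281 MDVR-KCL features.
X, y = make_classification(n_samples=300, n_features=20, random_state=42)

# The KNN grid from the table above: 6 x 4 x 2 = 48 configurations.
param_grid = {
    "n_neighbors": [1, 2, 3, 5, 7, 10],
    "leaf_size": [1, 5, 10, 15],
    "p": [1, 2],
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10,
                      scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 4))
```

The other three grids (SVM, DT, NB) plug into the same call with their own estimators and parameter dictionaries.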
Fig. 5 Parkinson's disease (PD) patient diagnosis
Summary of the ML numerical experiments (i.e., first category experiments)
| Duration (s) | Algorithm | Accuracy | Precision | Recall | F1 | AUC | Scaler | Variance threshold | Best classifier parameters |
|---|---|---|---|---|---|---|---|---|---|
| 5 | KNN | 98.36% | 97.76% | 98.39% | 98.07% | 98.36% | Min Max | 0.01 | leafSize = 1, nNeighbors = 1, and p = 1 |
| | DT | 83.42% | 89.54% | 69.03% | 77.96% | 81.54% | Normalizer | 0 | criterion = entropy and maxDepth = 5 |
| | NB | 57.53% | 0% | 0% | 0% | 50.00% | Normalizer | 0 | alpha = 0 |
| | SVM | 98.90% | 98.71% | 98.71% | 98.71% | 98.88% | Min Max | 0.01 | C = 0.1, degree = 5, gamma = 1, and kernel = poly |
| 15 | KNN | 99.04% | 98.84% | 98.84% | 98.84% | 99.01% | Max Abs | 0 | leafSize = 1 and nNeighbors = 1 |
| | DT | 96.63% | 96.47% | 95.35% | 95.91% | 96.44% | Normalizer | 0 | criterion = entropy and maxDepth = 13 |
| | NB | 58.65% | 0% | 0% | 0% | 50.00% | Normalizer | 0 | alpha = 0 |
| | SVM | 99.04% | 98.46% | 99.22% | 98.84% | 99.07% | Min Max | 0.01 | C = 100, degree = 1, and gamma = 0.1 |
| 30 | KNN | 98.03% | 100% | 95.24% | 97.56% | 97.62% | Max Abs | 0.05 | leafSize = 1, nNeighbors = 1, and p = 1 |
| | DT | 97.38% | 99.17% | 94.44% | 96.75% | 96.94% | Standardization | 0.10 | maxDepth = 10 and splitter = random |
| | NB | 58.69% | 0% | 0% | 0% | 50.00% | Normalizer | 0 | alpha = 0 |
| | SVM | 98.03% | 100% | 95.24% | 97.56% | 97.62% | Max Abs | 0.05 | C = 0.1, gamma = 1, and kernel = poly |
| 60 | KNN | 98.53% | 100% | 96.49% | 98.21% | 98.25% | Max Abs | 0.01 | leafSize = 1, nNeighbors = 1, and p = 1 |
| | DT | 83.82% | 92.68% | 66.67% | 77.55% | 81.43% | Normalizer | 0 | maxDepth = 3 and splitter = random |
| | NB | 58.09% | 0% | 0% | 0% | 50.00% | Normalizer | 0 | alpha = 0 |
| | SVM | 94.85% | 96.30% | 91.23% | 93.69% | 94.35% | Min Max | 0 | C = 1, degree = 1, and gamma = 0.1 |
| Combined | KNN | 99.94% | 100% | 99.87% | 99.93% | 99.93% | Max Abs | 0 | leafSize = 1 and nNeighbors = 1 |
| | DT | 97.38% | 96.56% | 97.20% | 96.88% | 97.36% | Standardization | 0.50 | criterion = entropy and maxDepth = 14 |
| | NB | 58.16% | 0% | 0% | 0% | 50.00% | Normalizer | 0 | alpha = 0 |
| | SVM | 99.94% | 99.87% | 100% | 99.93% | 99.95% | Max Abs | 0 | C = 100, degree = 1, and gamma = 0.1 |
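Each row above pairs a classifier with a best scaler and variance threshold; one way to wire these together is a scikit-learn pipeline. A sketch on stand-in data, mirroring the best 5-second SVM configuration (the order of scaling and thresholding is assumed, as the record does not state it):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=10, random_state=0)

# Min-Max scaling, a 0.01 variance threshold, then a poly-kernel SVM,
# matching the best 5-s SVM row in the table above.
pipe = make_pipeline(
    MinMaxScaler(),
    VarianceThreshold(threshold=0.01),
    SVC(C=0.1, degree=5, gamma=1, kernel="poly"),
)
scores = cross_val_score(pipe, X, y, cv=10)
print(round(float(scores.mean()), 3))
```

Wrapping the scaler and threshold inside the pipeline keeps them fit only on each training fold, avoiding leakage into the validation folds.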
Summary of the confusion matrices (i.e., first category experiments)
The top-1 records by accuracy for the pre-trained models on NewHandPD
| # | MobileNet | MobileNetV2 | MobileNetV3Small | MobileNetV3Large | ResNet50 | VGG16 | VGG19 | InceptionResNetV2 |
|---|---|---|---|---|---|---|---|---|
| Loss function | Categorical crossentropy | Categorical crossentropy | KL divergence | KL divergence | KL divergence | KL divergence | Poisson | Categorical crossentropy |
| Batch size | 24 | 8 | 8 | 24 | 56 | 40 | 48 | 56 |
| Dropout ratio | 0.42 | 0 | 0.37 | 0.20 | 0.41 | 0.26 | 0.33 | 0.35 |
| TL learn ratio | 89% | 0% | 84% | 55% | 22% | 57% | 67% | 58% |
| Weights optimizer | SGD Nesterov | Adam | Adagrad | SGD Nesterov | Adagrad | Adagrad | SGD | SGD Nesterov |
| Rotation range | ||||||||
| Width shift range | 0.24 | 0 | 0.21 | 0.07 | 0.20 | 0.22 | 0.15 | 0.22 |
| Height shift range | 0.24 | 0 | 0.05 | 0.02 | 0.05 | 0.03 | 0.2 | 0.06 |
| Shear range | 0.02 | 0 | 0.21 | 0.23 | 0.19 | 0.15 | 0.15 | 0.13 |
| Zoom range | 0.12 | 0 | 0.22 | 0.25 | 0.01 | 0.19 | 0.22 | 0.20 |
| Horizontal flip | ||||||||
| Vertical flip | ||||||||
| Brightness range (low) | 1.24 | 0.5 | 0.92 | 1.29 | 1.01 | 1.34 | 1.32 | 0.56 |
| Brightness range (high) | 1.28 | 0.5 | 1.08 | 1.52 | 1.67 | 1.76 | 1.46 | 1.3 |
| Loss | 0.038 | 0.032 | 0.152 | 0.107 | 0.049 | 0.029 | 0.180 | 0.049 |
| Accuracy | 99.05% | 99.40% | 95.12% | 95.00% | 98.81% | 99.29% | 99.75% | 98.21% |
| F1-Score | 99.05% | 99.40% | 95.16% | 95.00% | 98.81% | 99.29% | 99.75% | 98.21% |
| Precision | 99.05% | 99.40% | 95.36% | 95.00% | 98.81% | 99.29% | 99.75% | 98.21% |
| Recall | 99.05% | 99.40% | 95.00% | 95.00% | 98.81% | 99.29% | 99.75% | 98.21% |
| Specificity | 99.81% | 99.88% | 99.07% | 99.00% | 99.76% | 99.86% | 99.95% | 99.64% |
| AUC | 99.98% | 99.92% | 99.67% | 99.82% | 99.97% | 99.99% | 100% | 99.91% |
| IOU coefficient | 98.87% | 98.72% | 93.13% | 94.80% | 97.09% | 98.02% | 93.45% | 97.31% |
| Dice coefficient | 99.04% | 98.99% | 94.36% | 95.77% | 97.76% | 98.50% | 95.17% | 97.92% |
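The "TL learn ratio" row gives the percentage of the pre-trained backbone left trainable during fine-tuning. A minimal sketch of one plausible reading, freezing the earliest layers (the exact ratio-to-layer mapping is an assumption, not stated in this record):

```python
def trainable_mask(n_layers, learn_ratio_pct):
    """Freeze the first (100 - r)% of layers and leave the last r%
    trainable. Returns one boolean per layer (True = trainable)."""
    n_frozen = round(n_layers * (100 - learn_ratio_pct) / 100)
    return [i >= n_frozen for i in range(n_layers)]

# For a hypothetical 20-layer backbone with the 67% ratio reported
# for VGG19 above, roughly the last two thirds of the layers train.
mask = trainable_mask(20, 67)
print(sum(mask), "of", len(mask), "layers trainable")  # 13 of 20
```

Freezing early layers preserves the generic low-level filters learned on ImageNet while letting the later, more task-specific layers adapt to the handwriting images.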
Fig. 6 Summary of the NewHandPD experiments
Correlations among the best hyperparameters for the NewHandPD experiments
| Batch size | Dropout | TL learn ratio | Rotation range | Width shift range | Height shift range | Shear range | Zoom range | Horizontal flip | Vertical flip | Brightness range (low) | Brightness range (high) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Batch size | 1.000 | |||||||||||
| Dropout | 0.488 | 1.000 | ||||||||||
| TL learn ratio | − 0.037 | 0.663 | 1.000 | |||||||||
| Rotation range | 0.112 | 0.543 | 0.817 | 1.000 | ||||||||
| Width shift range | 0.456 | 0.898 | 0.661 | 0.728 | 1.000 | |||||||
| Height shift range | 0.176 | 0.562 | 0.608 | 0.408 | 0.431 | 1.000 | ||||||
| Shear range | 0.271 | 0.342 | 0.231 | 0.142 | 0.203 | − 0.308 | 1.000 | |||||
| Zoom range | 0.047 | 0.251 | 0.731 | 0.646 | 0.259 | 0.144 | 0.558 | 1.000 | ||||
| Horizontal flip | − 0.530 | − 0.365 | − 0.196 | − 0.303 | − 0.315 | 0.271 | − 0.917 | − 0.576 | 1.000 | |||
| Vertical flip | 0.070 | − 0.411 | − 0.766 | − 0.467 | − 0.230 | − 0.511 | − 0.213 | − 0.717 | 0.149 | 1.000 | ||
| Brightness range (low) | 0.149 | 0.365 | 0.512 | 0.266 | 0.293 | 0.443 | 0.371 | 0.446 | − 0.277 | − 0.177 | 1.000 | |
| Brightness range (high) | 0.686 | 0.589 | 0.333 | 0.264 | 0.569 | 0.159 | 0.616 | 0.385 | − 0.669 | − 0.023 | 0.724 | 1.000 |
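Tables like this one are ordinary Pearson correlation matrices computed across the best configurations (rows = the eight top-1 records, columns = hyperparameters); a numpy sketch on stand-in values:

```python
import numpy as np

# Rows = the eight best configurations (one per CNN structure);
# columns = hyperparameters. Stand-in values for illustration.
rng = np.random.default_rng(0)
configs = rng.random((8, 4))

corr = np.corrcoef(configs, rowvar=False)  # 4 x 4 Pearson matrix
print(np.round(corr, 3))
```

When a hyperparameter is constant across all configurations, its variance is zero and the coefficient is undefined (numpy returns NaN with a warning), which is why the horizontal-flip cells in the MDVR-KCL correlation table below read N/A.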
The top-1 records by accuracy using VGG19 on MDVR-KCL
| # | Specgram | Mel-Specgram | MFCC (SLANEY) | MFCC (HTK) | STFT |
|---|---|---|---|---|---|
| Loss Function | Poisson | Poisson | Poisson | Squared Hinge | KL Divergence |
| Batch Size | 48 | 40 | 16 | 48 | 40 |
| Dropout Ratio | 0.06 | 0.41 | 0.43 | 0.19 | 0.31 |
| TL Learn Ratio | 86% | 67% | 13% | 59% | 61% |
| Weights Optimizer | Adagrad | SGD | RMSProp Centered | Adagrad | SGD |
| Rotation Range | 29 | 29 | 41 | 41 | 35 |
| Width Shift Range | 0.14 | 0.2 | 0 | 0.09 | 0.16 |
| Height Shift Range | 0.09 | 0.14 | 0.15 | 0.05 | 0.2 |
| Shear Range | 0.01 | 0.14 | 0.22 | 0.13 | 0.14 |
| Zoom Range | 0.13 | 0.13 | 0.01 | 0.13 | 0.17 |
| Horizontal Flip | |||||
| Vertical Flip | |||||
| Brightness Range (Low) | 0.72 | 1.33 | 0.55 | 0.65 | 1.38 |
| Brightness Range (High) | 1.32 | 1.65 | 1.16 | 1.33 | 1.41 |
| Loss | 0.619 | 0.505 | 0.791 | 0.654 | 0.090 |
| Accuracy | 89.58% | 100% | 70.03% | 92.68% | 96.93% |
| F1-Score | 89.58% | 100% | 70.03% | 92.68% | 96.93% |
| Precision | 89.58% | 100% | 70.03% | 92.68% | 96.93% |
| Recall | 89.58% | 100% | 70.03% | 92.68% | 96.93% |
| Specificity | 89.58% | 100% | 70.03% | 92.68% | 96.93% |
| AUC | 96.69% | 100% | 76.72% | 95.33% | 99.55% |
| IOU Coefficient | 89.58% | 100% | 70.03% | 92.68% | 96.93% |
| Dice Coefficient | 86.85% | 99.09% | 67.41% | 93.22% | 94.85% |
Fig. 7 Summary of the MDVR-KCL experiments
Correlations among the best hyperparameters for the MDVR-KCL experiments
| Batch size | Dropout | TL learn ratio | Rotation range | Width shift range | Height shift range | Shear range | Zoom range | Horizontal flip | Vertical flip | Brightness range (low) | Brightness range (high) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Batch size | 1.000 | |||||||||||
| Dropout | − 0.743 | 1.000 | ||||||||||
| TL learn ratio | 0.923 | − 0.701 | 1.000 | |||||||||
| Rotation range | − 0.456 | 0.241 | − 0.752 | 1.000 | ||||||||
| Width shift range | 0.688 | − 0.169 | 0.816 | − 0.812 | 1.000 | |||||||
| Height shift range | − 0.485 | 0.624 | − 0.292 | − 0.130 | 0.155 | 1.000 | ||||||
| Shear range | − 0.792 | 0.886 | − 0.900 | 0.664 | − 0.519 | 0.423 | 1.000 | |||||
| Zoom range | 0.863 | − 0.445 | 0.836 | − 0.495 | 0.848 | 0.006 | − 0.578 | 1.000 | ||||
| Horizontal flip | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | 1.000 | |||
| Vertical flip | 0.667 | − 0.910 | 0.519 | 0.000 | − 0.036 | − 0.886 | − 0.703 | 0.241 | N/A | 1.000 | ||
| Brightness range (low) | 0.248 | 0.324 | 0.372 | − 0.536 | 0.817 | 0.640 | − 0.005 | 0.657 | N/A | − 0.555 | 1.000 | |
| Brightness range (high) | 0.462 | 0.186 | 0.554 | − 0.671 | 0.898 | 0.162 | − 0.179 | 0.624 | N/A | − 0.250 | 0.816 | 1.000 |
Approximate times for each ML model
| Model | Hyperparameters # | Total configurations # | With 10 folds | Approximate time (s) |
|---|---|---|---|---|
| KNN | 3 | 48 | 480 | 480 |
| SVM | 4 | 625 | 6,250 | 6,250 |
| DT | 3 | 48 | 480 | 480 |
| NB | 1 | 9 | 90 | 90 |
| Total | | | | 7,300 |
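The fit counts follow directly from the grid sizes listed in the hyperparameter-ranges table; a quick cross-check (the approximate times themselves are hardware-dependent and left out):

```python
from math import prod

# Grid sizes taken from the hyperparameter-ranges table above.
grids = {
    "KNN": {"nNeighbors": 6, "leafSize": 4, "p": 2},
    "SVM": {"degree": 5, "C": 5, "gamma": 5, "kernel": 5},
    "DT":  {"criterion": 2, "splitter": 2, "maxDepth": 12},
    "NB":  {"alpha": 9},
}
total_fits = 0
for model, sizes in grids.items():
    n = prod(sizes.values())
    total_fits += n * 10
    print(model, n, n * 10)  # configurations, and with 10-fold CV
print("total fits:", total_fits)  # 7300
```

SVM dominates the budget: 625 configurations against 48 for KNN and DT and 9 for NB.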
Related studies comparisons
| References | Best accuracy | Other metrics |
|---|---|---|
| Pereira et al. | 78.9% | – |
| Pereira et al. | 80.19% | – |
| Pereira et al. | 90.39% | – |
| Pereira et al. | 95% | – |
| Senatore et al. | 72.36% | – |
| Impedovo | 98.44% | – |
| Naseer et al. | 98.28% | 85.98% precision, 67.57% sensitivity, and 76.37% specificity |
| Kamran et al. | 99.75% (CNN-TL) | – |
| Sakar et al. | 77.5% | – |
| Caliskan et al. | 86.09% | 58.27% sensitivity and 95.39% specificity |
| Goyal et al. | 99.37% | – |
| Tuncer and Dogan | 97.62% by 1NN | 97.61% F1 |
| Zahid et al. | 99.7% | – |
| Proposed approach | 99.94% (ML) | See tables above |
| Proposed approach | 99.75% (NewHandPD) | See tables above |
| Proposed approach | 100% (MDVR-KCL) | See tables above |
Table of abbreviations
| Abbreviation | Definition |
|---|---|
| PD | Parkinson’s disease |
| CNN | Convolutional neural network |
| TL | Transfer learning |
| ML | Machine learning |
| AO | Aquila optimizer |
| GS | Grid search |
| DL | Deep learning |
| DT | Decision tree |
| SVM | Support vector machine |
| NB | Naive Bayes |
| KNN | K-Nearest neighbor |
| PSD | Parkinson speech dataset |
| OPD | Oxford Parkinson’s disease detection dataset |
| MFCC | Mel-frequency cepstral coefficients |
| STFT | Short-time Fourier transform |
| CQT | Constant-Q chromagram |
| CENS | Chroma energy normalized statistics |
| ZCR | Zero-crossing rate |
| RMSE | Root mean square energy |