Fatih Demir, Kamran Siddique, Mohammed Alswaitti, Kursat Demir, Abdulkadir Sengur.
Abstract
Parkinson's disease (PD), a slowly progressing neurodegenerative disorder, negatively affects people's daily lives. Early diagnosis is of great importance in minimizing the effects of PD. One of the most important early symptoms of PD is monotonous and distorted speech. Artificial intelligence-based approaches can help specialists and physicians detect these disorders automatically. In this study, a new approach based on multi-level feature selection was proposed to detect PD from features extracted from voice recordings of already-diagnosed cases. At the first level, feature selection was performed with the Chi-square and L1-Norm SVM algorithms (CLS). Then, the features selected by these two algorithms were combined to increase the representation power of the samples. At the last level, the most distinctive features in the combined set were selected according to feature importance weights computed with the ReliefF algorithm. In the classification stage, popular machine learning classifiers such as KNN, SVM, and DT were used, and the best performance was achieved with the KNN classifier. Moreover, the hyperparameters of the KNN classifier were selected with the Bayesian optimization algorithm, which further improved the performance of the proposed approach. The proposed approach was evaluated using 10-fold cross-validation on a dataset containing PD and normal classes, and a classification accuracy of 95.4% was achieved.
Keywords: Parkinson’s disease; multi-level feature selection; optimized KNN
Year: 2022 PMID: 35055370 PMCID: PMC8781034 DOI: 10.3390/jpm12010055
Source DB: PubMed Journal: J Pers Med ISSN: 2075-4426
Figure 1. Illustration of the proposed approach.
Figure 2. Feature importance weights of the concatenated features.
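The feature importance weights mentioned above come from the ReliefF algorithm. As a rough illustration of how such weights can be computed, the following is a minimal pure-Python sketch of a two-class ReliefF variant. It is not the authors' implementation: the function names `relieff_weights` and `top_features`, the toy data, the Manhattan neighbor distance, and the choice `k=1` are all assumptions made for this example, and it assumes every class has at least two samples.

```python
def relieff_weights(X, y, k=1):
    """Two-class ReliefF sketch: reward features that differ on the nearest
    misses (other class) and penalize those that differ on the nearest hits."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, f):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        # Manhattan distance over range-scaled features (an assumption).
        return sum(diff(a, b, f) for f in range(d))

    weights = [0.0] * d
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(X[i], X[j]))
        hits = [j for j in others if y[j] == y[i]][:k]
        misses = [j for j in others if y[j] != y[i]][:k]
        for f in range(d):
            hit_term = sum(diff(X[i], X[j], f) for j in hits) / len(hits)
            miss_term = sum(diff(X[i], X[j], f) for j in misses) / len(misses)
            weights[f] += (miss_term - hit_term) / n
    return weights


def top_features(weights, n_keep):
    """Indices of the n_keep features with the largest weights."""
    order = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
    return order[:n_keep]


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X_demo = [[0.0, 0.0], [0.1, 1.0], [0.2, 0.5],
          [0.8, 0.5], [0.9, 0.0], [1.0, 1.0]]
y_demo = [0, 0, 0, 1, 1, 1]
demo_weights = relieff_weights(X_demo, y_demo, k=1)
print(demo_weights, top_features(demo_weights, 1))
```

On this toy data the discriminative feature receives a positive weight and the uninformative one a negative weight, which is the property a weight-based selection step relies on when it keeps only the highest-ranked features.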
Figure 3. 3D feature representations for the PD and normal classes under different feature selection cases: (a) PD class, no feature selection; (b) normal class, no feature selection; (c) PD class, CLS feature selection; (d) normal class, CLS feature selection.
Accuracy (%) for the raw features and for the Chi-square, L1-Norm SVM, and ReliefF algorithms.
| Classifier | Raw Features | L1-Norm SVM | Chi-Square | ReliefF |
|---|---|---|---|---|
| DT | 81.1 | 82.3 | 81.2 | 82.5 |
| LD | 72.2 | 72.7 | 75.0 | 74.8 |
| NB | 74.6 | 75.3 | 74.9 | 75.6 |
| SVM | 85.6 | 85.8 | 85.6 | 86.7 |
| KNN | 86.9 | 87.8 | 87.7 | 88.6 |
Accuracy (%) for three different situations.
| Classifier | 752 Features | 341 Features | 220 Features (Multi-Level) |
|---|---|---|---|
| DT | 81.1 | 81.5 | 83.7 |
| LD | 72.2 | 81.0 | 82.0 |
| NB | 74.6 | 77.4 | 79.6 |
| SVM | 85.6 | 87.5 | 89.5 |
| KNN | 86.9 | 88.9 | 91.5 |
Hyperparameter search ranges.
| Hyperparameter | Search range |
|---|---|
| Distance Metric | Cityblock, Chebyshev, Correlation, Cosine, Euclidean, Hamming, Jaccard, Mahalanobis, Minkowski, Spearman |
| Number of Neighbors | 1–378 |
| Distance Weight | equal |
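To make the search space above more concrete, here is a small pure-Python sketch of tuning a KNN classifier over distance metric and number of neighbors with equal-weight voting. Note the substitutions: the source uses Bayesian optimization with 10-fold cross-validation, whereas this sketch uses an exhaustive grid search with leave-one-out error so that it stays short and deterministic. The names `METRICS`, `knn_predict`, `loo_error`, and `grid_search`, the three metrics shown, and the toy data are assumptions for illustration only.

```python
from collections import Counter
from itertools import product

# Three of the distance metrics listed in the search range.
METRICS = {
    "cityblock": lambda a, b: sum(abs(p - q) for p, q in zip(a, b)),
    "chebyshev": lambda a, b: max(abs(p - q) for p, q in zip(a, b)),
    "euclidean": lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5,
}


def knn_predict(X_train, y_train, x, k, metric):
    """Predict the label of x by majority vote among its k nearest neighbors."""
    dist = METRICS[metric]
    nearest = sorted(range(len(X_train)), key=lambda j: dist(x, X_train[j]))[:k]
    votes = Counter(y_train[j] for j in nearest)  # "equal" distance weight
    return votes.most_common(1)[0][0]


def loo_error(X, y, k, metric):
    """Leave-one-out classification error (stand-in for 10-fold CV)."""
    wrong = 0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        wrong += knn_predict(X_rest, y_rest, X[i], k, metric) != y[i]
    return wrong / len(X)


def grid_search(X, y, metrics, ks):
    """Return the (error, metric, k) tuple with the lowest error."""
    trials = [(loo_error(X, y, k, m), m, k) for m, k in product(metrics, ks)]
    return min(trials)


# Hypothetical toy data: two well-separated clusters of three points each.
X_toy = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
         [1.0, 1.0], [0.9, 0.8], [0.8, 0.9]]
y_toy = [0, 0, 0, 1, 1, 1]
best_error, best_metric, best_k = grid_search(X_toy, y_toy, list(METRICS), [1, 3, 5])
print(best_error, best_metric, best_k)
```

A Bayesian optimizer would explore the same kind of space but choose each next configuration from a model of the errors observed so far instead of trying every combination, which matters when the neighbor range is as wide as 1–378.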
Figure 4. Change of the minimum classification error of the KNN during Bayesian hyperparameter optimization.
Figure 5. Confusion matrices for three different cases: (a) the Fine KNN without feature selection; (b) the Fine KNN with the CLS feature selection; (c) the optimized KNN with the CLS feature selection.
Other performance metric scores.
| Classifier | Class | Sensitivity | Specificity | Precision | F-Score |
|---|---|---|---|---|---|
| Fine KNN | Normal | 0.64 | 0.95 | 0.80 | 0.71 |
| | PD | 0.95 | 0.64 | 0.89 | 0.92 |
| Fine KNN and | Normal | 0.82 | 0.95 | 0.84 | 0.83 |
| | PD | 0.95 | 0.82 | 0.94 | 0.94 |
| Optimized KNN | Normal | 0.92 | 0.96 | 0.90 | 0.91 |
| | PD | 0.96 | 0.92 | 0.97 | 0.97 |
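The per-class scores in this table follow from a two-class confusion matrix using the standard definitions of sensitivity, specificity, precision, and F-score. The sketch below shows those formulas in Python. The helper name `binary_metrics` and the counts are hypothetical and are not taken from the paper's confusion matrices.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard metrics for one class treated as the positive class."""
    sensitivity = tp / (tp + fn)   # recall of the positive class
    specificity = tn / (tn + fp)   # recall of the negative class
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Hypothetical counts: 100 PD samples (90 detected), 50 normal samples (40 detected).
pd_scores = binary_metrics(tp=90, fn=10, fp=10, tn=40)       # PD as positive
normal_scores = binary_metrics(tp=40, fn=10, fp=10, tn=90)   # Normal as positive
print(pd_scores)
print(normal_scores)
```

Swapping which class counts as positive swaps sensitivity and specificity, which is why each classifier's PD and Normal rows in the table mirror each other in those two columns.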
Figure 6. ROC curves and AUC values for three different cases: (a) the Fine KNN without feature selection; (b) the Fine KNN with the CLS feature selection; (c) the optimized KNN with the CLS feature selection.
Performance comparison of the proposed method with other methods.
| Methods | Accuracy (%) | Sensitivity | Specificity | Precision | F-Score |
|---|---|---|---|---|---|
| Baseline method [ | 86.00 | - | - | - | 0.840 |
| Ashour et al. [ | 93.80 | 0.840 | 0.970 | 0.915 | 0.875 |
| Demir et al. [ | 94.27 | 0.960 | 0.960 | 0.910 | 0.930 |
| Proposed Approach | 95.40 | 0.949 | 0.930 | 0.952 | 0.955 |
Performance of the proposed method with only 252 samples from the dataset.
| Methods | Accuracy (%) | Sensitivity | Specificity | Precision | F-Score |
|---|---|---|---|---|---|
| Proposed Approach | 91.67 | 0.87 | 0.94 | 0.913 | 0.918 |
Performance of the proposed method with an oversampling method.
| Methods | Accuracy (%) | Sensitivity | Specificity | Precision | F-score |
|---|---|---|---|---|---|
| Proposed Approach | 94.30 | 0.96 | 0.96 | 0.91 | 0.93 |