Ilias Tougui, Abdelilah Jilbab, Jamal El Mhamdi.
Abstract
OBJECTIVES: Parkinson's disease (PD) is the second most common neurodegenerative disorder; it affects more than 10 million people worldwide. Detecting PD usually requires a professional assessment by an expert, and investigation of the voice as a biomarker of the disease could be effective in speeding up the diagnostic process.
Keywords: Classification; Machine Learning; Parkinson Disease; Telemedicine; Voice Disorders
Year: 2020; PMID: 33190461; PMCID: PMC7674819; DOI: 10.4258/hir.2020.26.4.274
Source DB: PubMed; Journal: Healthc Inform Res; ISSN: 2093-3681
Figure 1. Cohort selection steps using the demographic survey and the medical timepoint of the records.
Voice record variables
| Variable | Description |
|---|---|
| recordid | This is a unique id for each record. |
| healthCode | This is a unique id for each participant. |
| audio_countdown.m4a | Recording of the environment for 5 seconds to verify that the microphone works. |
| audio_audio.m4a | Voice recording of “aaaah” by the participant for 10 seconds. |
| medtimepoint | This is a very important variable, which indicates when a participant records his voice: |
Final cohort dataset statistics
| | PD group | Control group |
|---|---|---|
| Number of selected recordings | 9,105 | 9,105 |
| Number of participants | 453 | 1,037 |
| Sex | | |
| Male | 280 | 836 |
| Female | 173 | 201 |
| Age (yr) | 64.50 ± 8.16 (18–85) | 53.62 ± 10.99 (18–85) |
Values are presented as number or mean ± standard deviation (min–max).
Figure 2. Feature extraction process and the machine-learning process. PD: Parkinson’s disease, SVM: support vector machine, KNN: k-nearest neighbor, XGBoost: extreme gradient boosting.
Performance of the four techniques using ANOVA and LASSO with various subsets of features
| Method | Parameter value | Number of features | Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) |
|---|---|---|---|---|---|---|---|
| ANOVA | K = 10 | 10 | Linear SVM | 74.79 | 77.78 | 71.83 | 75.43 |
| ANOVA | K = 10 | 10 | KNN | 88.67 | 88.63 | 88.71 | 88.62 |
| ANOVA | K = 10 | 10 | RF | 92.09 | 91.77 | 92.42 | 92.03 |
| ANOVA | K = 10 | 10 | XGBoost | 89.33 | 88.94 | 89.72 | 89.24 |
| ANOVA | K = 20 | 20 | Linear SVM | 74.53 | 77.72 | 71.37 | 75.23 |
| ANOVA | K = 20 | 20 | KNN | 89.49 | 89.16 | 89.82 | 89.41 |
| ANOVA | K = 20 | 20 | RF | 91.13 | 90.83 | 91.43 | 91.07 |
| ANOVA | K = 20 | 20 | XGBoost | 89.94 | 89.59 | 90.29 | 89.86 |
| ANOVA | K = 30 | 30 | Linear SVM | 74.55 | 77.53 | 71.59 | 75.20 |
| ANOVA | K = 30 | 30 | KNN | 90.95 | 90.61 | 91.28 | 90.88 |
| ANOVA | K = 30 | 30 | RF | 91.42 | 91.07 | 91.77 | 91.35 |
| ANOVA | K = 30 | 30 | XGBoost | 90.59 | 90.49 | 90.92 | 90.54 |
| LASSO | C = 0.01 | 11 | Linear SVM | 74.38 | 77.29 | 71.50 | 75.02 |
| LASSO | C = 0.01 | 11 | KNN | 89.54 | 89.54 | 89.55 | 89.50 |
| LASSO | C = 0.01 | 11 | RF | 92.30 | 92.20 | 92.40 | 92.26 |
| LASSO | C = 0.01 | 11 | XGBoost | 89.45 | 89.37 | 89.53 | 89.40 |
| LASSO | C = 0.02 | 21 | Linear SVM | 75.60 | 78.71 | 72.53 | 76.25 |
| LASSO | C = 0.02 | 21 | KNN | 91.23 | 91.16 | 91.29 | 91.19 |
| LASSO | C = 0.02 | 21 | RF | 92.29 | 92.15 | 92.43 | 92.25 |
| LASSO | C = 0.02 | 21 | XGBoost | 90.38 | 90.13 | 90.64 | 90.38 |
| LASSO | C = 0.03 | 33 | Linear SVM | 76.02 | 78.80 | 73.26 | 76.58 |
| LASSO | C = 0.03 | 33 | KNN | 92.69 | 92.38 | 92.99 | 91.59 |
| LASSO | C = 0.03 | 33 | RF | 92.09 | 91.83 | 92.35 | 92.04 |
| LASSO | C = 0.03 | 33 | XGBoost | 90.83 | 90.69 | 90.96 | 90.77 |
ANOVA: analysis of variance, LASSO: least absolute shrinkage and selection operator, SVM: support vector machine, KNN: k-nearest neighbor, RF: random forest, XGBoost: extreme gradient boosting.
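The two feature selection methods compared in this table can be sketched with scikit-learn. This is only an illustration under stated assumptions: the record does not say which implementation the authors used, so `SelectKBest` with `f_classif` stands in for the ANOVA filter, and an L1-penalized `LinearSVC` inside `SelectFromModel` stands in for the LASSO selector with its parameter C. The data are synthetic, not the mPower recordings.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_classif
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for a 138-feature voice dataset (assumption, not the study data).
X, y = make_classification(
    n_samples=400, n_features=138, n_informative=15, random_state=0
)
X = StandardScaler().fit_transform(X)

# ANOVA filter: keep the K features with the highest F-statistic between classes.
anova = SelectKBest(score_func=f_classif, k=10)
X_anova = anova.fit_transform(X, y)

# L1-penalized linear model: features whose coefficients shrink to zero are dropped.
# A smaller C means a stronger penalty, so fewer features survive.
l1_model = LinearSVC(penalty="l1", dual=False, C=0.01, max_iter=5000, random_state=0)
lasso = SelectFromModel(l1_model)
X_lasso = lasso.fit_transform(X, y)

print("ANOVA kept:", X_anova.shape[1], "features")
print("L1 selector kept:", X_lasso.shape[1], "features")
```

With the ANOVA filter the number of retained features is fixed by K, whereas with the L1 selector it follows from the penalty strength, which is consistent with the table listing 11, 21, and 33 features for C = 0.01, 0.02, and 0.03.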
Performance of baseline models
| Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) |
|---|---|---|---|---|
| Linear SVM | 76.47 | 78.60 | 74.36 | 76.88 |
| KNN | 90.22 | 89.74 | 90.70 | 90.13 |
| RF | 89.88 | 88.77 | 90.98 | 89.72 |
| XGBoost | 90.97 | 90.80 | 91.14 | 90.92 |
SVM: support vector machine, KNN: k-nearest neighbor, RF: random forest, XGBoost: extreme gradient boosting.
Optimal hyperparameters of models using random search
| Machine learning technique | Best feature selection method | Optimal hyperparameter values | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) |
|---|---|---|---|---|---|---|
| Linear SVM | Feature selection using LASSO with C = 0.03 | penalty = l2 | 76.03 | 78.81 | 73.27 | 76.59 |
| KNN | Feature selection using LASSO with C = 0.03 | n_neighbors = 1 | 94.88 | 95.08 | 94.68 | 94.87 |
| RF | Feature selection using LASSO with C = 0.01 | n_estimators = 1000 | 93.92 | 93.80 | 94.03 | 93.88 |
| XGBoost | Feature selection using LASSO with C = 0.03 | n_estimators = 1000 | 95.31 | 95.19 | 95.43 | 95.28 |
SVM: support vector machine, KNN: k-nearest neighbor, RF: random forest, XGBoost: extreme gradient boosting, LASSO: least absolute shrinkage and selection operator.
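A random search over hyperparameters, as named in this table, might look like the following sketch. It assumes scikit-learn's `RandomizedSearchCV` with 5-fold cross-validation; the search space, the number of sampled configurations, and the synthetic 33-feature data are all hypothetical choices for illustration and are not taken from the study.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for a 33-feature selected subset (assumption, not the study data).
X, y = make_classification(
    n_samples=300, n_features=33, n_informative=10, random_state=0
)

# Hypothetical search space for KNN; the record reports only the chosen n_neighbors.
param_distributions = {
    "n_neighbors": list(range(1, 16)),
    "weights": ["uniform", "distance"],
}

# Sample 8 random configurations and score each with 5-fold cross-validation.
search = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions=param_distributions,
    n_iter=8,
    cv=5,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Mean cross-validated accuracy:", round(search.best_score_, 4))
```

Random search evaluates a fixed budget of sampled configurations rather than every combination, which keeps the cost predictable when the grid is large.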
Performance of the models on unseen data
| Rank | Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) |
|---|---|---|---|---|---|
| 1 | XGBoost | 95.78 | 95.32 | 96.23 | 95.74 |
| 2 | KNN | 95.62 | 95.57 | 95.67 | 95.60 |
| 3 | RF | 94.52 | 94.19 | 94.84 | 94.47 |
| 4 | Linear SVM | 75.47 | 77.75 | 73.21 | 75.93 |
SVM: support vector machine, KNN: k-nearest neighbor, RF: random forest, XGBoost: extreme gradient boosting.
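For reference, the four reported metrics can be derived from a binary confusion matrix as in the sketch below. It assumes PD is the positive class; the function name and the confusion counts are hypothetical and are not the study's results.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity, and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall on PD)
    specificity = tn / (tn + fp)          # true negative rate (recall on controls)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical balanced test set: 100 PD recordings and 100 control recordings.
metrics = classification_metrics(tp=90, fn=10, tn=95, fp=5)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

When the two classes are equally represented, accuracy equals the mean of sensitivity and specificity. The XGBoost row is consistent with this: 95.32% and 96.23% average to about 95.78%.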
Comparison of our methodology with other studies
| Study | Dataset | Methodology | Results |
|---|---|---|---|
| Little et al. [ | They used an original dataset consisting of 195 recordings collected from 31 patients where 23 were diagnosed with PD | They detected dysphonia by discriminating HCs from PD participants, by extracting time domain and frequency domain features | They achieved an accuracy of 91.4% using SVM classifier with 10 highly uncorrelated measures. |
| Benba et al. [ | They used a dataset consisting of 17 PD patients and 17 HCs | They classified PD participants from HCs using a set of recordings recorded using a computer’s microphone, and by extracting 20 MFCC coefficients | They achieved an accuracy of 91.17% using linear SVM with 12 MFCC coefficients. |
| Hemmerling et al. [ | They used an original dataset consisting of 198 recordings collected from 66 patients where 33 were diagnosed with PD | They extracted several acoustic features, and applied Principal Component Analysis (PCA) for feature selection | They achieved an accuracy of 93.43% using linear SVM. |
| Singh and Xu [ | They randomly selected 1,000 recordings from the mPower database | They extracted MFCC coefficients using the python_speech_features library and compared different feature selection techniques | They achieved an accuracy of 99% using SVM with an RBF kernel and by selecting important features using the L1 feature selection technique. |
| This study | We used a set of 18,210 smartphone recordings from the mPower database, of which 9,105 recordings are from PD participants and 9,105 are from healthy controls | We extracted several features from the time, frequency, and cepstral domains, applied different preprocessing techniques, and used two feature selection methods, ANOVA and LASSO, to compare four different classifiers using 5-fold cross-validation | On unseen data, we achieved an accuracy, sensitivity, and specificity of 95.78%, 95.32%, and 96.23%, respectively, and an F1-score of 95.74% using XGBoost with 33 features out of 138 that were chosen using LASSO with C = 0.03. |
HC: healthy control, PD: Parkinson’s disease, SVM: support vector machine, MFCC: mel-frequency cepstral coefficients, RBF: radial basis function, LASSO: least absolute shrinkage and selection operator.