Fahmida Haque1, Mamun B I Reaz1, Muhammad E H Chowdhury2, Serkan Kiranyaz2, Sawal H M Ali1, Mohammed Alhatou3,4, Rumana Habib5, Ahmad A A Bakar1, Norhana Arsad1, Geetika Srivastava6.
Abstract
Background: Diabetic sensorimotor polyneuropathy (DSPN) is a major complication in patients with long-standing diabetes. Although machine learning (ML) is widely used and well established in disease diagnosis research, its application to DSPN diagnosis using nerve conduction studies (NCS) is very limited in the existing literature. Method: In this study, the NCS data were collected from the Diabetes Control and Complications Trial (DCCT) and its follow-up Epidemiology of Diabetes Interventions and Complications (EDIC) clinical trials. The NCS variables are median motor velocity (m/sec), median motor amplitude (mV), median motor F-wave (msec), median sensory velocity (m/sec), median sensory amplitude (μV), peroneal motor velocity (m/sec), peroneal motor amplitude (mV), peroneal motor F-wave (msec), sural sensory velocity (m/sec), and sural sensory amplitude (μV). Three different feature ranking techniques were used to analyze the performance of eight different conventional classifiers.
Year: 2022 PMID: 35510061 PMCID: PMC9061035 DOI: 10.1155/2022/9690940
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Number of samples in classes among the original, train, and test datasets.
Figure 2. Flow chart of the data processing and ML model performance analysis for DSPN severity classification using NCS data.
Baseline characteristics of the EDIC patients.
| Variable | Mean ± SD | Std. error mean | Min | Max | 95% CI lower limit | 95% CI upper limit | | | |
|---|---|---|---|---|---|---|---|---|---|
| Age (years) | 35.95 ± 6.93 | 0.20 | 20.42 | 50.99 | 35.57 | 36.34 | 0.18 | <0.05 | 43.96 |
| Diabetic duration (years) | 14.51 ± 4.92 | 0.14 | 7.08 | 26.92 | 14.24 | 14.78 | 0.12 | <0.05 | 18.40 |
| HbA1c (%) | 8.22 ± 1.39 | 0.04 | 0.00 | 14.00 | 8.14 | 8.30 | 0.10 | 0.95 | 12.54 |
| HDL cholesterol (mg/dl) | 52.559 ± 15.98 | 0.45 | 0.00 | 103.0 | 51.67 | 53.44 | 0.002 | <0.05 | 0.01 |
| LDL cholesterol (mg/dl) | 110.68 ± 36.48 | 1.03 | 0.00 | 235.0 | 108.7 | 112.7 | 0.07 | 0.21 | 5.62 |
| BMI (kg/m²) | 26.17 ± 4.05 | 0.11 | 16.63 | 39.48 | 25.94 | 26.39 | 0.04 | <0.05 | 1.56 |
| Hypertension (%) | 0.23 ± 0.42 | 0.01 | 1.00 | 0.16 | 0.20 | 0.25 | 0.16 | <0.05 | 31.75 |
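
The 95% confidence limits in this table are consistent with the usual normal approximation, mean ± 1.96 × standard error of the mean. The source does not state how the limits were computed, so the following Python sketch is only an assumption about that arithmetic; the function name `ci95` is illustrative.

```python
def ci95(mean, std_error, z=1.96):
    """Return (lower, upper) limits of a normal-approximation 95% CI."""
    half_width = z * std_error
    return mean - half_width, mean + half_width


# Age row of the table: mean 35.95 years, standard error of the mean 0.20
lower, upper = ci95(35.95, 0.20)
print(f"{lower:.2f} to {upper:.2f}")
```

For the age row this gives limits close to the tabulated 35.57 and 36.34; a difference in the last digit is what one would expect when the standard error has been rounded to two decimals.
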
Figure 3. Ranking of the NCS features using the (a) mrmr, (b) relief, and (c) fscnca feature ranking techniques.
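
Relief-style ranking scores a feature by how strongly it differs between a sample and its nearest neighbour from another class (the nearest miss), relative to how much it differs from the nearest neighbour of the same class (the nearest hit). The Python sketch below shows the basic two-class Relief update on made-up data. It is not the multi-class variant and not the implementation used in the study; the names `relief_weights`, `X`, and `y` are illustrative.

```python
import random


def relief_weights(X, y, n_iter=None, seed=0):
    """Basic two-class Relief: reward features that differ on the nearest
    miss and penalize features that differ on the nearest hit."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Feature ranges, used to scale per-feature differences into [0, 1]
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    n_iter = n_iter or n
    weights = [0.0] * d
    for _ in range(n_iter):
        i = rng.randrange(n)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n_iter
    return weights


# Toy data (made up): feature 0 separates the classes, feature 1 is noise
X = [[0.1, 0.9], [0.2, 0.1], [0.15, 0.5], [0.9, 0.8], [0.8, 0.2], [0.85, 0.4]]
y = [0, 0, 0, 1, 1, 1]
w = relief_weights(X, y)
ranking = sorted(range(len(w)), key=lambda j: w[j], reverse=True)
print(ranking)
```

Sorting the weights in descending order gives the feature ranking; a "Top k" feature set is then the first k entries of that ranking.
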
Optimized Hyperparameters of the studied ML algorithms.
| Algorithm | Tuned hyperparameters |
|---|---|
| Discriminant analysis classifier (DAC) | Discriminant type = quadratic; FillCoeffs = off |
| Ensemble classification model (EC) | Method = AdaBoostM2; Number of learning cycles = 477; Learning rate = 0.9845217666852848; Maximum number of splits = 311; Number of variables to sample = all |
| K-nearest neighbour (KNN) model | Distance = Euclidean; Number of neighbors = 1; Distance weight = inverse; Standardized = false |
| Naive Bayes classifier (NB) | Distribution names = mvmn (multivariate multinomial distribution); Kernel = normal; Support = unbounded |
| Support vector machine classifier (SVM) | Learners = SVM; Categorical predictors = all; Split criterion = deviance; Maximum number of splits = 960; Surrogate = off |
| Decision Tree (DT) | Split criterion = deviance; Maximum number of splits = 960; Surrogate = off |
| Random Forest (RF) | Number of trees = 100; Compute OOB prediction (flag to compute out-of-bag predictions) = on; Method = classification |
| Logistic Regression (LR) | lambda (regularization parameter) = 1e-4 |
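
To make the KNN settings in this table concrete (Euclidean distance, one neighbour, inverse distance weighting, no standardization), here is a minimal Python sketch of such a classifier. It is an illustration under those settings, not the study's code; the function name `knn_predict`, the two-feature inputs, and the class labels are hypothetical.

```python
import math
from collections import defaultdict


def knn_predict(train_X, train_y, query, k=1, eps=1e-12):
    """Predict a label by inverse-distance-weighted voting among the k
    nearest training points (Euclidean distance, no standardization)."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + eps)  # eps avoids division by zero
    return max(votes, key=votes.get)


# Made-up feature vectors: (velocity in m/sec, amplitude in uV)
train_X = [[50.0, 10.0], [48.0, 9.0], [38.0, 3.0], [36.0, 2.5]]
train_y = ["absent", "absent", "severe", "severe"]
print(knn_predict(train_X, train_y, [37.0, 3.2]))
```

With a single neighbour the inverse distance weight cannot change the prediction; it only matters when `k` is larger than 1.
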
Performance evaluation of different ML models using mrmr feature selection technique for NCS.
| Model | Features | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) | Error rate | MCC | Kappa | AUC |
|---|---|---|---|---|---|---|---|---|---|
| EC | Top 10 | 93.25 ± 0.95 | 91.69 ± 1.03 | 98.44 ± 0.61 | 91.77 ± 1.02 | 0.07 ± 0.01 | 0.90 | 0.82 | 1.00 |
| RF | Top 10 | 93.06 ± 0.63 | 91.61 ± 0.69 | 98.92 ± 0.59 | 91.52 ± 0.76 | 0.07 ± 0.01 | 0.89 | 0.81 | 1.00 |
| DT | Top 10 | 91.34 ± 1.60 | 89.86 ± 1.88 | 99.37 ± 0.46 | 89.46 ± 1.99 | 0.09 ± 0.02 | 0.87 | 0.77 | 0.98 |
| KNN | Top 10 | 79.47 ± 0.94 | 75.71 ± 0.89 | 91.95 ± 1.05 | 75.89 ± 1.01 | 0.21 ± 0.01 | 0.69 | 0.45 | 0.91 |
| SVM | Top 8 | 75.98 ± 1.59 | 69.29 ± 1.92 | 75.18 ± 2.06 | 72.54 ± 1.84 | 0.24 ± 0.02 | 0.64 | 0.36 | 0.96 |
| NB | Top 10 | 73.90 ± 2.02 | 72.35 ± 2.16 | 95.31 ± 1.01 | 72.43 ± 2.02 | 0.26 ± 0.02 | 0.64 | 0.30 | 0.95 |
| LR | Top 9 | 71.76 ± 1.89 | 69.45 ± 1.85 | 93.42 ± 1.22 | 69.19 ± 1.82 | 0.28 ± 0.02 | 0.60 | 0.25 | 0.95 |
| DAC | Top 9 | 70.73 ± 2.44 | 68.66 ± 2.43 | 94.11 ± 1.24 | 68.52 ± 2.24 | 0.29 ± 0.02 | 0.59 | 0.22 | 0.94 |
Performance evaluation of different ML models using relief feature selection technique for NCS.
| Model | Features | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) | Error rate | MCC | Kappa | AUC |
|---|---|---|---|---|---|---|---|---|---|
| EC | Top 10 | 93.40 ± 0.97 | 91.77 ± 1.15 | 98.44 ± 0.73 | 91.90 ± 1.11 | 0.07 ± 0.01 | 0.90 | 0.82 | 1.00 |
| RF | Top 10 | 93.25 ± 0.80 | 91.92 ± 1.04 | 99.10 ± 0.53 | 91.78 ± 1.04 | 0.07 ± 0.01 | 0.90 | 0.82 | 1.00 |
| DT | Top 10 | 91.43 ± 1.66 | 89.99 ± 1.96 | 99.40 ± 0.47 | 89.57 ± 2.05 | 0.09 ± 0.02 | 0.87 | 0.77 | 0.98 |
| KNN | Top 10 | 79.47 ± 0.94 | 75.71 ± 0.89 | 91.95 ± 1.05 | 75.89 ± 1.01 | 0.21 ± 0.01 | 0.69 | 0.45 | 0.91 |
| SVM | Top 8 | 75.98 ± 1.59 | 69.29 ± 1.92 | 75.18 ± 2.06 | 72.54 ± 1.84 | 0.24 ± 0.02 | 0.64 | 0.36 | 0.96 |
| NB | Top 10 | 73.90 ± 2.02 | 72.35 ± 2.16 | 95.31 ± 1.01 | 72.43 ± 2.02 | 0.26 ± 0.02 | 0.64 | 0.30 | 0.95 |
| LR | Top 9 | 71.76 ± 1.89 | 69.45 ± 1.85 | 93.42 ± 1.22 | 69.19 ± 1.82 | 0.28 ± 0.02 | 0.60 | 0.25 | 0.95 |
| DAC | Top 9 | 70.73 ± 2.44 | 68.66 ± 2.43 | 94.11 ± 1.24 | 68.52 ± 2.24 | 0.29 ± 0.02 | 0.59 | 0.22 | 0.94 |
Performance evaluation of different ML models using fscnca feature selection technique for NCS.
| Model | Features | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) | Error rate | MCC | Kappa | AUC |
|---|---|---|---|---|---|---|---|---|---|
| RF | Top 10 | 93.26 ± 0.91 | 91.95 ± 1.03 | 98.95 ± 0.62 | 91.80 ± 1.07 | 0.07 ± 0.01 | 0.90 | 0.82 | 1.00 |
| EC | Top 10 | 93.16 ± 0.89 | 91.49 ± 1.00 | 98.38 ± 0.78 | 91.62 ± 0.96 | 0.07 ± 0.01 | 0.89 | 0.82 | 1.00 |
| DT | Top 10 | 91.60 ± 1.95 | 90.19 ± 2.36 | 99.40 ± 0.47 | 89.78 ± 2.44 | 0.08 ± 0.02 | 0.87 | 0.78 | 0.98 |
| KNN | Top 10 | 79.47 ± 0.94 | 75.71 ± 0.89 | 91.95 ± 1.05 | 75.89 ± 1.01 | 0.21 ± 0.01 | 0.69 | 0.45 | 0.91 |
| SVM | Top 8 | 75.03 ± 1.42 | 68.17 ± 1.76 | 72.69 ± 2.70 | 71.95 ± 1.65 | 0.25 ± 0.01 | 0.63 | 0.33 | 0.95 |
| NB | Top 10 | 73.90 ± 2.02 | 72.35 ± 2.16 | 95.31 ± 1.01 | 72.43 ± 2.02 | 0.26 ± 0.02 | 0.64 | 0.30 | 0.95 |
| LR | Top 10 | 71.57 ± 1.92 | 69.15 ± 1.88 | 93.33 ± 1.27 | 68.91 ± 1.84 | 0.28 ± 0.02 | 0.59 | 0.24 | 0.95 |
| DAC | Top 10 | 70.33 ± 2.09 | 68.15 ± 2.04 | 93.96 ± 1.42 | 68.05 ± 1.95 | 0.30 ± 0.02 | 0.58 | 0.21 | 0.93 |
Figure 4. Confusion matrix of the test set for the ensemble classifier using the Top 10 features ranked by the relief technique.
Figure 5. ROC curve for the ensemble classifier using all 10 features ranked by the relief technique.
Figure 6. ROC curve for the random forest classifier using all 10 features ranked by the fscnca technique.
Performance evaluation of individual NCS features using the EC model.
| NCS features | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) | Error rate | MCC | Kappa |
|---|---|---|---|---|---|---|---|
| Peroneal motor velocity (m/sec) | 57.09 ± 1.30 | 57.00 ± 1.29 | 55.49 ± 1.16 | 56.24 ± 1.22 | 0.43 ± 0.01 | 0.42 | 0.13 |
| Median sensory velocity (m/sec) | 48.54 ± 1.22 | 48.44 ± 1.21 | 46.46 ± 1.39 | 47.43 ± 1.29 | 0.51 ± 0.01 | 0.30 | 0.27 |
| Median motor F-wave (msec) | 55.92 ± 1.62 | 55.84 ± 1.62 | 55.14 ± 1.65 | 55.48 ± 1.63 | 0.44 ± 0.02 | 0.41 | 0.15 |
| Peroneal motor amplitude (mV) | 50.27 ± 1.57 | 50.17 ± 1.56 | 47.74 ± 1.65 | 48.92 ± 1.60 | 0.50 ± 0.02 | 0.33 | 0.25 |
| Sural sensory velocity (m/sec) | 51.11 ± 1.25 | 51.00 ± 1.25 | 48.26 ± 1.55 | 49.59 ± 1.40 | 0.49 ± 0.01 | 0.34 | 0.23 |
| Median motor velocity (m/sec) | 53.53 ± 1.39 | 53.44 ± 1.38 | 51.40 ± 1.44 | 52.40 ± 1.41 | 0.46 ± 0.01 | 0.37 | 0.19 |
| Median sensory amplitude (μV) | 49.84 ± 1.65 | 49.72 ± 1.64 | 46.29 ± 2.00 | 47.94 ± 1.83 | 0.50 ± 0.02 | 0.32 | 0.25 |
| Peroneal motor F-wave (msec) | 52.92 ± 0.90 | 52.84 ± 0.91 | 51.37 ± 1.09 | 52.09 ± 0.98 | 0.47 ± 0.01 | 0.37 | 0.20 |
| Sural sensory amplitude (μV) | 50.15 ± 1.07 | 50.03 ± 1.06 | 47.30 ± 1.25 | 48.63 ± 1.15 | 0.50 ± 0.01 | 0.32 | 0.25 |
| Median motor amplitude (mV) | 42.73 ± 1.92 | 42.65 ± 1.93 | 39.93 ± 2.11 | 41.25 ± 2.03 | 0.57 ± 0.02 | 0.22 | 0.34 |
Validation of the grading model proposed by Feldman et al. [27] and this work with DCCT/EDIC ground truth using Fisher exact test.
| Severity grade | DCCT/EDIC ground truth: Non-DSPN | DCCT/EDIC ground truth: DSPN | Total no. of samples in the dataset |
|---|---|---|---|
| Absent | 2610 (100%) | 0 (0%) | 2610 |
| Mild | 219 (21%) | 815 (79%) | 1034 |
| Moderate | 0 (0%) | 1092 (100%) | 1092 |
| Severe | 8 (0.67%) | 1194 (99.33%) | 1202 |
| Total | 2837 | 3101 | 5939 |
| | | | |
| Absent | 2568 (98.4%) | 42 (1.61%) | 2610 |
| Mild | 52 (5.03%) | 982 (94.97%) | 1034 |
| Moderate | 0 (0%) | 1092 (100%) | 1092 |
| Severe | 0 (0%) | 1202 (100%) | 1202 |
| Total | 2837 | 3101 | 5939 |
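
The percentages in parentheses appear to be row-wise shares: within each severity grade, the fraction of samples that the DCCT/EDIC ground truth labels as non-DSPN or DSPN. A short Python sketch of that arithmetic, under this reading of the table; the function name `row_percentages` is illustrative.

```python
def row_percentages(non_dspn, dspn):
    """Return the percentage of a row's samples in each ground-truth group."""
    total = non_dspn + dspn
    return 100 * non_dspn / total, 100 * dspn / total


# "Mild" row of the second block: 52 non-DSPN and 982 DSPN out of 1034 samples
pct_non_dspn, pct_dspn = row_percentages(52, 982)
print(f"{pct_non_dspn:.2f}% non-DSPN, {pct_dspn:.2f}% DSPN")
```

Applied to the "Mild" row of the first block (219 and 815 samples), the same calculation gives about 21% and 79%, in line with the rounded values shown there.
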