Feng Lin, Jiarui Han, Teng Xue, Jilan Lin, Shenggen Chen, Chaofeng Zhu, Han Lin, Xianyang Chen, Wanhui Lin, Huapin Huang.
Abstract
Many studies report predictions of cognitive function, but few address patients with epilepsy; we therefore established a workflow to efficiently predict outcomes of both the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA) in outpatients with epilepsy. Data from 441 outpatients with epilepsy were included; of these, 433 patients met the 12 clinical characteristic criteria and were divided into training (n = 304) and experimental (n = 129) groups. After descriptive statistics were analyzed, cross-validation was used to select the optimal model. The random forest (RF) algorithm was combined with the redundancy analysis (RDA) algorithm; optimal feature selection and resampling were then carried out after removing linearly redundant information. The features that contributed more to multiple outcomes were selected. Finally, the external traceability of the model was evaluated using the follow-up data. The RF algorithm was the best prediction model for both MMSE and MoCA outcomes. Seven markers were screened by overlapping the top ten important features for MMSE ranked by RF modeling, the top ten for MoCA ranked by RF modeling, and the top ten for both assessments ranked by RDA. The optimal combination of features, namely sex, age, age of onset, seizure frequency, brain MRI abnormalities, epileptiform discharge in EEG, and usage of drugs, was the most efficient in predicting outcomes of MMSE, MoCA, and both assessments.
Year: 2021 PMID: 34625614 PMCID: PMC8501137 DOI: 10.1038/s41598-021-99506-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Clinical data screening process.
Features of epileptic patients.
| Features | N (%) or mean ± SD |
|---|---|
| Female | 194 (44.80%) |
| Male | 239 (55.20%) |
| Age | 26.00 ± 4.24 |
| Age of onset | 24.50 ± 2.12 |
| Frequent | 75 (17.32%) |
| Occasional | 180 (41.57%) |
| Rare | 178 (41.11%) |
| Without generalized tonic–clonic seizures | 142 (32.79%) |
| With generalized tonic–clonic seizures | 291 (67.21%) |
| Negative | 417 (96.30%) |
| Positive | 16 (3.70%) |
| Negative | 415 (95.84%) |
| Positive | 18 (4.16%) |
| Negative | 365 (84.29%) |
| Positive | 68 (15.71%) |
| Negative | 390 (90.06%) |
| Positive | 43 (9.94%) |
| Negative | 333 (76.90%) |
| Positive | 100 (23.10%) |
| Negative | 74 (17.09%) |
| Positive | 359 (82.91%) |
| Valproate sodium (1) | 136 (31.40%) |
| No drugs (2) | 23 (5.31%) |
| Others (3) | 274 (63.29%) |
Results of cross-validation for different machine learning algorithms using MMSE data.
| Model | Sensitivity | Specificity | Accuracy | Precision | Recall | AUC |
|---|---|---|---|---|---|---|
| LR | 0.68 | 0.75 | 0.73 | 0.48 | 0.68 | 0.67 |
| DT | 0.41 | 0.89 | 0.78 | 0.70 | 0.41 | 0.63 |
| RF | 0.69 | 0.80 | 0.78 | 0.63 | 0.69 | 0.72 |
| SVM | 0.60 | 0.90 | 0.83 | 0.73 | 0.60 | 0.70 |
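The columns in the table above all derive from a binary confusion matrix, which is why the Sensitivity and Recall columns are identical: both are TP / (TP + FN). A minimal sketch of the calculation follows; the labels are toy values invented for illustration (1 is assumed to mark the positive class), not the study data, and the helper names `confusion` and `metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def confusion(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, tn, fn)


def metrics(c):
    """Derive the table's columns; assumes both classes are present
    and at least one positive prediction was made."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = c.tp / (c.tp + c.fn)
    return {
        "sensitivity": sensitivity,
        "specificity": c.tn / (c.tn + c.fp),
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / (c.tp + c.fp),
        "recall": sensitivity,  # recall is sensitivity by definition
    }


# Toy labels for illustration only (not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
result = metrics(confusion(y_true, y_pred))
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With an imbalanced outcome, accuracy can look high while sensitivity stays low (compare the DT row), so the columns are best read together.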
Ranking of feature importance in the RF model (MMSE).
| Feature | CK | DIS | Mean decrease in accuracy |
|---|---|---|---|
| Age | 12.12 | 12.67 | 13.78 |
| Age of onset | 10.76 | 10.94 | 12.92 |
| Sex | 4.85 | 4.70 | 6.18 |
| Usage of drugs | 3.13 | 3.50 | 4.51 |
| History of brain trauma or surgery | 3.12 | 1.62 | 3.39 |
| Seizure type | 2.91 | 0.70 | 2.75 |
| Seizure frequency | 2.26 | − 1.35 | 1.17 |
| Epileptiform discharge in EEG | 2.34 | − 1.40 | 1.12 |
| Brain MRI abnormalities | 0.49 | 0.70 | 0.82 |
| Other brain diseases | 0.41 | 0.61 | 0.64 |
| Family history | − 0.90 | 2.39 | 0.53 |
| Status epilepticus | − 0.67 | 0.41 | − 0.25 |
CK: control check; DIS: disease (people with epilepsy).
Results of cross-validation for different machine learning algorithms using MoCA data.
| Model | Sensitivity | Specificity | Accuracy | Precision | Recall | AUC |
|---|---|---|---|---|---|---|
| LR | 0.57 | 0.76 | 0.64 | 0.84 | 0.57 | 0.62 |
| DT | 0.73 | 0.53 | 0.67 | 0.78 | 0.73 | 0.61 |
| SVM | 0.66 | 0.76 | 0.69 | 0.87 | 0.66 | 0.60 |
| RF | 0.51 | 0.81 | 0.60 | 0.88 | 0.51 | 0.71 |
Ranking of feature importance in the RF model (MoCA).
| Feature | CK | DIS | Mean decrease in accuracy |
|---|---|---|---|
| Status epilepticus | − 3.27 | − 2.31 | 3.66 |
| History of brain trauma or surgery | 0.90 | 3.32 | 3.09 |
| Seizure frequency | 1.27 | 1.97 | 2.46 |
| Sex | − 2.46 | 4.65 | 2.39 |
| Usage of drugs | 0.26 | 2.96 | 2.30 |
| Age | 2.35 | 0.6 | 1.78 |
| Epileptiform discharge in EEG | − 1.95 | − 0.87 | 1.68 |
| Family history | 1.01 | 1.52 | 1.62 |
| Brain MRI abnormalities | 1.28 | 0.13 | 0.90 |
| Age of onset | − 1.07 | 1.49 | 0.71 |
| Other brain diseases | 1.72 | − 0.56 | 0.56 |
| Seizure type | − 2.86 | 1.69 | 0.34 |
The contribution of features in the RDA model.
| Feature | RDA1 | RDA2 |
|---|---|---|
| Age | 0.86 | 0.11 |
| Age of onset | 0.69 | 0.17 |
| Sex | − 0.32 | − 0.27 |
| Usage of drugs | − 0.29 | 0.20 |
| Seizure frequency | 0.20 | − 0.30 |
| Epileptiform discharge in EEG | 0.18 | − 0.01 |
| Brain MRI abnormalities | 0.18 | − 0.38 |
| Status epilepticus | 0.13 | 0.16 |
| Seizure type | − 0.07 | − 0.23 |
| Family history | − 0.05 | 0.33 |
| History of brain trauma or surgery | − 0.01 | 0.45 |
| Other brain diseases | − 0.00 | 0.32 |
Figure 2. RDA analysis plot. The length of each arrow shows the strength of the correlation between that variable and the outcome variables: the longer the arrow, the stronger the correlation. The vertical distance between them also reflects the correlation: the smaller the distance, the stronger the correlation.
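As a rough illustration of the arrow-length reading, the sketch below computes the Euclidean length of each feature's (RDA1, RDA2) pair from the RDA contribution table. It assumes that each arrow in the plot runs from the origin to those two coordinates, which the source does not state explicitly; the variable names are hypothetical.

```python
import math

# (RDA1, RDA2) values copied from the RDA contribution table.
rda_scores = {
    "Age": (0.86, 0.11),
    "Age of onset": (0.69, 0.17),
    "Sex": (-0.32, -0.27),
    "Usage of drugs": (-0.29, 0.20),
    "Seizure frequency": (0.20, -0.30),
    "Epileptiform discharge in EEG": (0.18, -0.01),
    "Brain MRI abnormalities": (0.18, -0.38),
    "Status epilepticus": (0.13, 0.16),
    "Seizure type": (-0.07, -0.23),
    "Family history": (-0.05, 0.33),
    "History of brain trauma or surgery": (-0.01, 0.45),
    "Other brain diseases": (0.00, 0.32),
}

# Assumed arrow length: distance from the origin in the RDA1/RDA2 plane.
lengths = {name: math.hypot(x, y) for name, (x, y) in rda_scores.items()}

# Longest arrow first.
ranked = sorted(lengths.items(), key=lambda item: item[1], reverse=True)
for name, length in ranked:
    print(f"{name}: {length:.2f}")
```

Under this assumption, age and age of onset have the longest arrows. A ranking by combined length need not match the row order of the table, which follows the magnitude of RDA1 alone.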
Figure 3. Venn diagram of the selected variables. Each circle represents the variables selected by one model; the number in an overlapping region is the number of variables shared by the corresponding models, and the number in a non-overlapping region is the number of variables unique to that model (purple: MMSE; yellow: MoCA; green: RDA).
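The overlap step can be sketched with plain set intersection. The rankings below are copied from the two RF importance tables (ordered by mean decrease in accuracy) and from the RDA contribution table; treating the first ten rows of the RDA table as its "top ten" is an assumption, and the helper `top_k` is a hypothetical name.

```python
# Feature rankings as listed in the three tables, most important first.
mmse_rf = [
    "Age", "Age of onset", "Sex", "Usage of drugs",
    "History of brain trauma or surgery", "Seizure type",
    "Seizure frequency", "Epileptiform discharge in EEG",
    "Brain MRI abnormalities", "Other brain diseases",
    "Family history", "Status epilepticus",
]
moca_rf = [
    "Status epilepticus", "History of brain trauma or surgery",
    "Seizure frequency", "Sex", "Usage of drugs", "Age",
    "Epileptiform discharge in EEG", "Family history",
    "Brain MRI abnormalities", "Age of onset",
    "Other brain diseases", "Seizure type",
]
rda = [
    "Age", "Age of onset", "Sex", "Usage of drugs",
    "Seizure frequency", "Epileptiform discharge in EEG",
    "Brain MRI abnormalities", "Status epilepticus", "Seizure type",
    "Family history", "History of brain trauma or surgery",
    "Other brain diseases",
]


def top_k(ranking, k=10):
    """Return the k highest-ranked features as a set."""
    return set(ranking[:k])


# Features present in the top ten of all three rankings.
shared = top_k(mmse_rf) & top_k(moca_rf) & top_k(rda)

# Report them in MMSE importance order for readability.
selected = [feature for feature in mmse_rf if feature in shared]
print(len(selected), selected)
```

Under that assumption, the intersection contains the seven markers named in the abstract.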
Figure 4. ROC curve of the MMSE prediction model (red: the optimal combination of variables; blue: the top ten features of RDA; green: the top ten features of MMSE RF analysis; purple: the top ten features of MoCA RF analysis).
Performance of each feature combination on the validation dataset (MMSE).
| Model | Sensitivity | Specificity | Accuracy | Precision | Recall |
|---|---|---|---|---|---|
| The optimal combination of features | 0.55 | 0.90 | 0.82 | 0.61 | 0.55 |
| The top ten features of RDA | 0.76 | 0.69 | 0.70 | 0.4 | 0.76 |
| The top ten features of MMSE RF analysis | 0.52 | 0.83 | 0.76 | 0.47 | 0.52 |
| The top ten features of MoCA RF analysis | 0.62 | 0.55 | 0.56 | 0.28 | 0.62 |
MMSE Mini-Mental State Examination; MoCA Montreal Cognitive Assessment.
Figure 5. ROC curve of the MoCA prediction model (red: the optimal combination of variables; blue: the top ten features of RDA; green: the top ten features of MMSE RF analysis; purple: the top ten features of MoCA RF analysis).
Performance of each feature combination on the validation dataset (MoCA).
| Model | Sensitivity | Specificity | Accuracy | Precision | Recall |
|---|---|---|---|---|---|
| The optimal combination of features | 0.41 | 0.90 | 0.57 | 0.90 | 0.41 |
| The top ten features of RDA | 0.48 | 0.76 | 0.57 | 0.81 | 0.48 |
| The top ten features of MMSE RF analysis | 0.67 | 0.62 | 0.65 | 0.78 | 0.66 |
| The top ten features of MoCA RF analysis | 0.39 | 0.81 | 0.53 | 0.81 | 0.39 |
Probability that each candidate feature combination correctly predicts both outcomes at the same time.
| Model | (0, 0) (%) | (0, 1) (%) | (1, 0) (%) | (1, 1) (%) |
|---|---|---|---|---|
| The optimal combination of features | 38.10 | 70.00 | 21.43 | 50.00 |
| The top ten features of RDA | 38.47 | 72.97 | 13.16 | 46.67 |
| The top ten features of MMSE RF analysis | 16.67 | 28.57 | 0 | 4.88 |
| The top ten features of MoCA RF analysis | 10.00 | 39.53 | 4.55 | 9.09 |
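One way to compute a table of this kind is sketched below. It assumes that each column label (a, b) is a patient's true pair of binary outcomes for the two assessments, and that each cell is the percentage of such patients for whom both predictions are correct; the source does not define the columns, so this reading is an assumption. The data are toy values invented for illustration, and `joint_correct_rate` is a hypothetical helper.

```python
from collections import defaultdict


def joint_correct_rate(true_pairs, pred_pairs):
    """For each true outcome pair, return the percentage of cases
    in which both outcomes were predicted correctly."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for truth, pred in zip(true_pairs, pred_pairs):
        totals[truth] += 1
        if truth == pred:  # both elements of the pair must match
            hits[truth] += 1
    return {pair: 100.0 * hits[pair] / totals[pair] for pair in totals}


# Toy (first outcome, second outcome) labels, not the study data.
true_pairs = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 0)]
pred_pairs = [(0, 0), (0, 1), (0, 1), (0, 1), (1, 0), (1, 0)]

rates = joint_correct_rate(true_pairs, pred_pairs)
for pair in sorted(rates):
    print(pair, f"{rates[pair]:.2f}%")
```

Requiring both predictions to match is stricter than scoring each assessment separately, which may explain why the joint percentages are lower than the single-outcome accuracies.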