Yijun Zhao, Brian C Healy, Dalia Rotstein, Charles R G Guttmann, Rohit Bakshi, Howard L Weiner, Carla E Brodley, Tanuja Chitnis.
Abstract
OBJECTIVE: To explore the value of machine learning methods for predicting multiple sclerosis disease course.
Year: 2017 PMID: 28379999 PMCID: PMC5381810 DOI: 10.1371/journal.pone.0174866
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Baseline demographic and clinical characteristics of study sample.
| Characteristic | Value |
|---|---|
| N | 1693 |
| Number of females (%) | 1248 (73.7%) |
| Age [years, mean (SD)] | 43.88 (11.46) |
| Number self-reported white (%) | 1562 (92.3%) |
| EDSS (≤1.5 / 2–3.5 / ≥4) | 919 / 539 / 235 |
EDSS-Expanded Disability Status Scale
Fig 1. Flowchart of patient selection.
Fig 1 presents the distribution of patients after imputing the missing values. The labels on the arrows indicate the number of years of follow-up required in the training dataset and which filters were applied. The first box indicates the total number of patients assessed. Note that P (and %P) refers to the number (and percentage) of patients who meet “progression” criteria within the different subgroups.
Predictors of disease classification.
| Category | Predictors |
|---|---|
| Demographic | Visit age; Disease duration at baseline visit; Gender; Race; Ethnicity; Family history of MS; Smoking ever |
| Clinical | EDSS; Ambulation Index; Disease step; Disease category; Disease activity; Pyramidal functional status score; Cerebellar functional status score; Brainstem functional status score; Sensory functional status score; Bowel/bladder functional status score; Visual functional status score; Mental functional status score |
| MRI | BPF; T2 lesion volume |
| Additional predictors |
EDSS-Expanded Disability Status Scale, FS-functional status, BPF-brain parenchymal fraction
Predictive accuracy for different groups using 12 month visit without MRI based on ten-fold cross-validation.
| Method | Sensitivity G0 | Sensitivity G1 | Sensitivity G2 | Sensitivity G3 | Specificity G0 | Specificity G1 | Specificity G2 | Specificity G3 | Overall accuracy G0 | Overall accuracy G1 | Overall accuracy G2 | Overall accuracy G3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Logistic+bagging | 0.62 | 0.56 | 0.55 | 0.54 | 0.64 | 0.64 | 0.67 | 0.64 | 0.63 | 0.61 | 0.63 | 0.61 |
| logistic | 0.35 | 0.41 | 0.38 | 0.54 | 0.86 | 0.83 | 0.79 | 0.64 | 0.68 | 0.67 | 0.65 | 0.61 |
| SVM+bagging | 0.62 | 0.53 | 0.66 | 0.53 | 0.65 | 0.74 | 0.66 | 0.65 | 0.64 | 0.66 | 0.66 | 0.62 |
| SVM+bagging+ cost(1.5) | 0.78 | 0.76 | 0.78 | 0.61 | 0.42 | 0.43 | 0.54 | 0.61 | 0.55 | 0.55 | 0.62 | 0.61 |
| SVM+bagging+ cost(2.0) | 0.86 | 0.81 | 0.82 | 0.59 | 0.27 | 0.32 | 0.46 | 0.54 | 0.48 | 0.51 | 0.58 | 0.55 |
| SVM+bagging+ cost(2.5) | 0.96 | 0.87 | 0.80 | 0.64 | 0.05 | 0.21 | 0.43 | 0.56 | 0.38 | 0.46 | 0.55 | 0.58 |
| SVM+bagging+ cost(3.0) | 1.00 | 0.95 | 0.79 | 0.64 | 0.00 | 0.04 | 0.40 | 0.53 | 0.36 | 0.39 | 0.53 | 0.57 |
G0: entire data set; 1331 patients with 476 worsening cases
G1: patients with initial EDSS < 2; 767 patients with 292 worsening cases
G2: patients with initial EDSS ≥ 2 and < 4; 429 patients with 147 worsening cases
G3: patients with initial EDSS ≥ 4; 135 patients with 37 worsening cases
Abbreviations: EDSS-Expanded Disability Status Scale, SVM-Support vector machines
Sensitivity is defined in this study as the proportion of subjects who worsen that are correctly classified. Specificity is defined as the proportion of subjects who did not worsen that are correctly classified. Overall accuracy is defined as the proportion of all subjects who are correctly classified.
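The three definitions above can be written out directly from confusion-matrix counts. The sketch below is illustrative only: the function names and the example counts are hypothetical and do not come from the study. It also shows that overall accuracy is a prevalence-weighted mix of sensitivity and specificity, which allows a quick consistency check against the G0 row for SVM+bagging (sensitivity 0.62, specificity 0.65, 476 worsening cases among 1331 patients).

```python
def sensitivity(tp, fn):
    """Proportion of subjects who worsen that are correctly classified."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of subjects who did not worsen that are correctly classified."""
    return tn / (tn + fp)


def overall_accuracy(tp, tn, fp, fn):
    """Proportion of all subjects that are correctly classified."""
    return (tp + tn) / (tp + tn + fp + fn)


def accuracy_from_rates(sens, spec, n_worsening, n_total):
    """Recover overall accuracy from sensitivity, specificity and class sizes."""
    n_stable = n_total - n_worsening
    return (sens * n_worsening + spec * n_stable) / n_total


# Hypothetical counts, for illustration only (not study data).
print(sensitivity(tp=30, fn=20), specificity(tn=40, fp=10))

# Consistency check using the reported G0 values for SVM+bagging.
acc = accuracy_from_rates(0.62, 0.65, n_worsening=476, n_total=1331)
print(round(acc, 2))  # 0.64, the overall accuracy reported for that row
```

Because roughly two thirds of patients in G0 do not worsen, a drop in specificity pulls overall accuracy down faster than an equal gain in sensitivity raises it, which is the pattern visible in the cost-weighted rows.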
Predictive accuracy for different groups using 12 month visit with MRI based on ten-fold cross-validation.
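| Method | Sensitivity G0 | Sensitivity G1 | Sensitivity G2 | Sensitivity G3 | Specificity G0 | Specificity G1 | Specificity G2 | Specificity G3 | Overall accuracy G0 | Overall accuracy G1 | Overall accuracy G2 | Overall accuracy G3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|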
| Logistic+bagging | 0.67 | 0.64 | 0.52 | 0.50 | 0.68 | 0.67 | 0.64 | 0.59 | 0.68 | 0.66 | 0.60 | 0.57 |
| logistic | 0.55 | 0.58 | 0.37 | 0.48 | 0.78 | 0.73 | 0.66 | 0.63 | 0.69 | 0.66 | 0.57 | 0.60 |
| SVM+bagging | 0.71 | 0.72 | 0.75 | 0.48 | 0.68 | 0.67 | 0.66 | 0.59 | 0.69 | 0.69 | 0.69 | 0.56 |
| SVM+bagging+ cost(1.5) | 0.81 | 0.82 | 0.80 | 0.52 | 0.59 | 0.58 | 0.57 | 0.55 | 0.67 | 0.68 | 0.65 | 0.54 |
| SVM+bagging+ cost(2.0) | 0.85 | 0.85 | 0.81 | 0.48 | 0.53 | 0.54 | 0.50 | 0.54 | 0.65 | 0.67 | 0.60 | 0.53 |
| SVM+bagging+ cost(2.5) | 0.86 | 0.87 | 0.77 | 0.48 | 0.49 | 0.49 | 0.47 | 0.54 | 0.63 | 0.65 | 0.57 | 0.52 |
| SVM+bagging+ cost(3.0) | 0.86 | 0.87 | 0.79 | 0.52 | 0.47 | 0.46 | 0.45 | 0.60 | 0.62 | 0.63 | 0.56 | 0.58 |
G0: entire data set; 574 patients with 216 worsening cases
G1: patients with initial EDSS < 2; 368 patients with 153 worsening cases
G2: patients with initial EDSS ≥ 2 and < 4; 161 patients with 53 worsening cases
G3: patients with initial EDSS ≥ 4; 45 patients with 10 worsening cases
Abbreviations: EDSS-Expanded Disability Status Scale, SVM-Support vector machines
Sensitivity is defined in this study as the proportion of subjects who worsen that are correctly classified. Specificity is defined as the proportion of subjects who did not worsen that are correctly classified. Overall accuracy is defined as the proportion of all subjects who are correctly classified.
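The subgroup definitions in the footnotes amount to a simple threshold rule on baseline EDSS. A minimal sketch follows, assuming only the cut-offs stated above; the function name `edss_group` is hypothetical. G0 is not a separate branch because it is the union of G1, G2 and G3, which the footnote counts for this table confirm.

```python
def edss_group(initial_edss):
    """Assign a patient to G1, G2 or G3 from the baseline EDSS score."""
    if initial_edss < 2:
        return "G1"
    if initial_edss < 4:
        return "G2"
    return "G3"


# EDSS is scored in half-point steps, so 1.5 is the top of G1 and 3.5 the top of G2.
for score in (0.0, 1.5, 2.0, 3.5, 4.0, 6.5):
    print(score, edss_group(score))

# Counts taken from the footnotes of the table above (12-month visit, with MRI).
patients = {"G1": 368, "G2": 161, "G3": 45}
worsening = {"G1": 153, "G2": 53, "G3": 10}
print(sum(patients.values()), sum(worsening.values()))  # G0 totals: 574 and 216
```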
Predictive accuracy for different groups using 24 month visit without MRI based on ten-fold cross-validation.
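| Method | Sensitivity G0 | Sensitivity G1 | Sensitivity G2 | Sensitivity G3 | Specificity G0 | Specificity G1 | Specificity G2 | Specificity G3 | Overall accuracy G0 | Overall accuracy G1 | Overall accuracy G2 | Overall accuracy G3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|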
| Logistic+bagging | 0.62 | 0.60 | 0.60 | 0.60 | 0.70 | 0.69 | 0.69 | 0.71 | 0.67 | 0.66 | 0.66 | 0.67 |
| logistic | 0.47 | 0.47 | 0.51 | 0.57 | 0.86 | 0.81 | 0.81 | 0.74 | 0.73 | 0.69 | 0.72 | 0.70 |
| SVM+bagging | 0.61 | 0.58 | 0.64 | 0.54 | 0.79 | 0.77 | 0.77 | 0.62 | 0.73 | 0.71 | 0.73 | 0.60 |
| SVM+bagging+ cost(1.5) | 0.77 | 0.74 | 0.75 | 0.65 | 0.53 | 0.56 | 0.66 | 0.59 | 0.61 | 0.62 | 0.69 | 0.61 |
| SVM+bagging+ cost(2.0) | 0.81 | 0.77 | 0.76 | 0.63 | 0.45 | 0.43 | 0.60 | 0.60 | 0.57 | 0.55 | 0.65 | 0.60 |
| SVM+bagging+ cost(2.5) | 0.84 | 0.81 | 0.77 | 0.67 | 0.37 | 0.38 | 0.57 | 0.54 | 0.52 | 0.53 | 0.63 | 0.57 |
| SVM+bagging+ cost(3.0) | 0.87 | 0.82 | 0.75 | 0.68 | 0.30 | 0.33 | 0.58 | 0.59 | 0.49 | 0.51 | 0.63 | 0.61 |
G0: entire data set; 1236 patients with 409 worsening cases
G1: patients with initial EDSS < 2; 714 patients with 255 worsening cases
G2: patients with initial EDSS ≥ 2 and < 4; 397 patients with 122 worsening cases
G3: patients with initial EDSS ≥ 4; 125 patients with 32 worsening cases
Abbreviations: EDSS-Expanded Disability Status Scale, SVM-Support vector machines
Sensitivity is defined in this study as the proportion of subjects who worsen that are correctly classified. Specificity is defined as the proportion of subjects who did not worsen that are correctly classified. Overall accuracy is defined as the proportion of all subjects who are correctly classified.
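All results in these tables are based on ten-fold cross-validation. The sketch below shows one common way to build ten folds, with the worsening and non-worsening classes spread evenly across them. Stratification, the shuffling seed and the helper name `stratified_folds` are assumptions made for illustration; the text here does not say how the folds were constructed. The label vector mimics the G0 class sizes of the table above (1236 patients, 409 worsening) and contains no patient data.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Return a fold index (0..k-1) per sample, balancing each class across folds."""
    rng = random.Random(seed)
    folds = [None] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled members of this class round-robin into the k folds.
        for pos, i in enumerate(idx):
            folds[i] = pos % k
    return folds


# Synthetic labels with the G0 class sizes: 1 = worsening, 0 = not worsening.
labels = [1] * 409 + [0] * (1236 - 409)
folds = stratified_folds(labels)

for f in range(10):
    test_idx = [i for i, fold in enumerate(folds) if fold == f]
    train_idx = [i for i, fold in enumerate(folds) if fold != f]
    # A classifier would be fit on train_idx and scored on test_idx here.
    n_worsening = sum(labels[i] for i in test_idx)
    print(f, len(train_idx), len(test_idx), n_worsening)
```

Each patient appears in exactly one test fold, so sensitivity, specificity and overall accuracy can be pooled over the ten held-out folds.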
Predictive accuracy for different groups using 24 month visit data with MRI based on ten-fold cross-validation.
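| Method | Sensitivity G0 | Sensitivity G1 | Sensitivity G2 | Sensitivity G3 | Specificity G0 | Specificity G1 | Specificity G2 | Specificity G3 | Overall accuracy G0 | Overall accuracy G1 | Overall accuracy G2 | Overall accuracy G3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|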
| Logistic+bagging | 0.59 | 0.51 | 0.46 | 0.70 | 0.66 | 0.61 | 0.64 | 0.69 | 0.63 | 0.58 | 0.60 | |
| logistic | 0.59 | 0.57 | 0.51 | 0.58 | 0.78 | 0.70 | 0.73 | 0.61 | 0.71 | 0.65 | 0.66 | 0.61 |
| SVM+bagging | 0.65 | 0.65 | 0.74 | 0.30 | 0.74 | 0.74 | 0.76 | 0.46 | 0.71 | 0.70 | 0.75 | 0.43 |
| SVM+bagging+ cost(1.5) | 0.79 | 0.81 | 0.77 | 0.38 | 0.59 | 0.56 | 0.69 | 0.48 | 0.67 | 0.66 | 0.72 | 0.46 |
| SVM+bagging+ cost(2.0) | 0.81 | 0.82 | 0.82 | 0.40 | 0.56 | 0.54 | 0.62 | 0.45 | 0.65 | 0.65 | 0.68 | 0.44 |
| SVM+bagging+ cost(2.5) | 0.82 | 0.84 | 0.81 | 0.40 | 0.54 | 0.52 | 0.61 | 0.40 | 0.64 | 0.65 | 0.68 | 0.40 |
| SVM+bagging+ cost(3.0) | 0.82 | 0.84 | 0.80 | 0.28 | 0.51 | 0.48 | 0.61 | 0.41 | 0.63 | 0.63 | 0.67 | 0.38 |
G0: entire data set; 574 patients with 212 worsening cases
G1: patients with initial EDSS < 2; 370 patients with 151 worsening cases
G2: patients with initial EDSS ≥ 2 and < 4; 159 patients with 51 worsening cases
G3: patients with initial EDSS ≥ 4; 45 patients with 10 worsening cases
Abbreviations: EDSS-Expanded Disability Status Scale, SVM-Support vector machines
Sensitivity is defined in this study as the proportion of subjects who worsen that are correctly classified. Specificity is defined as the proportion of subjects who did not worsen that are correctly classified. Overall accuracy is defined as the proportion of all subjects who are correctly classified.
Comparison of predictive accuracy for Group 2 (G2) using 12 month and 24 month clinical and MRI data based on ten-fold cross-validation.
| Cost | Sensitivity (worsening EDSS class), 1Y | Sensitivity (worsening EDSS class), 2Y | Specificity (non-worsening EDSS class), 1Y | Specificity (non-worsening EDSS class), 2Y | Overall accuracy, 1Y | Overall accuracy, 2Y |
|---|---|---|---|---|---|---|
| 1 | 0.75 | 0.74 | 0.66 | 0.76 | 0.69 | 0.75 |
| 1.5 | 0.80 | 0.77 | 0.57 | 0.69 | 0.65 | 0.72 |
| 2 | 0.81 | 0.82 | 0.50 | 0.62 | 0.60 | 0.68 |
| 2.5 | 0.77 | 0.81 | 0.47 | 0.61 | 0.57 | 0.68 |
| 3 | 0.79 | 0.80 | 0.45 | 0.61 | 0.56 | 0.67 |
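The comparison is easier to read as differences between the 24-month (2Y) and 12-month (1Y) columns. The snippet below copies the values from the table above and prints the 2Y minus 1Y change per cost setting; the variable names are hypothetical and the differencing is a reading aid, not an analysis reported by the authors.

```python
# cost -> (sens_1y, sens_2y, spec_1y, spec_2y, acc_1y, acc_2y), copied from the table
g2_results = {
    1.0: (0.75, 0.74, 0.66, 0.76, 0.69, 0.75),
    1.5: (0.80, 0.77, 0.57, 0.69, 0.65, 0.72),
    2.0: (0.81, 0.82, 0.50, 0.62, 0.60, 0.68),
    2.5: (0.77, 0.81, 0.47, 0.61, 0.57, 0.68),
    3.0: (0.79, 0.80, 0.45, 0.61, 0.56, 0.67),
}

gains = {}
for cost, (s1, s2, p1, p2, a1, a2) in g2_results.items():
    # Round to two decimals to match the precision of the reported values.
    gains[cost] = (round(s2 - s1, 2), round(p2 - p1, 2), round(a2 - a1, 2))
    print(cost, gains[cost])
```

In every row the 2Y overall accuracy is higher than the 1Y value (by 0.06 to 0.11), and the change comes mainly from specificity (0.10 to 0.16), while sensitivity moves by at most 0.04 in either direction.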