Marco TK Law, Anthony L Traboulsee, David KB Li, Robert L Carruthers, Mark S Freedman, Shannon H Kolind, Roger Tam.
Abstract
BACKGROUND: Enhanced prediction of progression in secondary progressive multiple sclerosis (SPMS) could improve clinical trial design. Machine learning (ML) algorithms are methods for training predictive models with minimal human intervention.
Keywords: Artificial intelligence; decision support techniques; disease progression; machine learning; prognosis; secondary progressive multiple sclerosis
Year: 2019 PMID: 31723436 PMCID: PMC6836306 DOI: 10.1177/2055217319885983
Source DB: PubMed Journal: Mult Scler J Exp Transl Clin ISSN: 2055-2173
Baseline predictor characteristics of the study sample.
| | CDP+ | CDP− | Overall |
|---|---|---|---|
| Demographic features | | | |
| # of females | 74 (64.3%) | 237 (64.1%) | 311 (64.1%) |
| Mean age [years] (SD) | 50.3 (8.2) | 51.1 (7.9) | 50.9 (8.0) |
| Mean duration[a] | 9.1 (4.4) | 9.3 (5.1) | 9.3 (5.0) |
| Clinical features | | | |
| Median EDSS (25th, 75th %tile) | 6.0 (4.5, 6.0) | 6.0 (4.5, 6.5) | 6.0 (4.5, 6.5) |
| Mean T25W[b] | 0.08 (1.52) | 0.05 (1.54) | 0.06 (1.54) |
| Mean 9HP[b] | −0.02 (0.93) | 0.07 (0.95) | 0.05 (0.95) |
| Mean PASAT[b] | 0.05 (1.02) | 0.01 (1.00) | 0.02 (1.01) |
| MRI biomarkers | | | |
| Median T2LV [mm³] (25th, 75th %tile) | 10403.9 (3392.5, 19796.4) | 9012.0 (3730.3, 19889.3) | 9321.4 (3621.6, 19872.8) |
| Mean BPF (SD) | 0.7559 (0.0473) | 0.7520 (0.0474) | 0.7530 (0.0476) |
a Disease duration (time since first MS diagnosis).
b Standardized to the Task Force Dataset.[14]
Note. Boldface highlights statistically significant findings (p < 0.05).
Figure 1. Training receiver operating characteristic curve for individual and ensemble models using logistic regression, linear SVM, and decision tree algorithms.
Figure 2. Validation receiver operating characteristic curve for individual and ensemble models using logistic regression, linear SVM, and decision tree algorithms.
Area under the curve (AUC) of individual and ensemble models constructed using logistic regression, SVM, DT algorithms, and comparisons to other models.
Comparison-model columns give the % AUC difference[a] (P-value[b]) [CI] relative to the reference model in each row.

| Reference model | AUC, % | AUC, SD | ensLR | LSVM | ensLSVM | DT | RF | AdB |
|---|---|---|---|---|---|---|---|---|
| LR | 49.5 | 3.1 | 1.7 (0.595) [−1.9, 5.3] | 2.9 (0.107) [−2.6, 8.4] | 1.6 (0.612) [−2.2, 5.3] | | | |
| ensLR | 51.1 | 2.7 | | 1.2 (0.703) [−2.2, 4.7] | −0.1 (0.965) [−3.4, 3.1] | | | |
| LSVM | 52.4 | 3.1 | | | −1.4 (0.653) [−5.1, 2.4] | | | |
| ensLSVM | 51.0 | 2.7 | | | | | | |
| DT | 61.8 | 3.0 | | | | | −1.1 (0.460) [−6.6, 4.4] | −1.6 (0.487) [−6.6, 3.4] |
| RF | 60.7 | 3.1 | | | | | | −0.5 (0.843) [−5.4, 4.3] |
| AdB | 60.2 | 3.1 | | | | | | |
a Difference is comparison model AUC minus reference model AUC.
b P-value obtained using DeLong’s algorithm for comparing AUC.[20,21]
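Footnote a defines each entry as the comparison model AUC minus the reference model AUC. The following is a minimal Python sketch of that arithmetic only, using the rounded AUC values reported in the table. The function name `auc_differences` is illustrative and not from the source. Because the inputs are already rounded, a result can differ from the published entry by 0.1 (for example, 1.6 here against the reported 1.7 for ensLR versus LR). The sketch does not reproduce the DeLong P-values or the intervals, which require the per-subject model outputs.

```python
# AUC (%) per model, copied from the table above.
auc = {
    "LR": 49.5, "ensLR": 51.1, "LSVM": 52.4, "ensLSVM": 51.0,
    "DT": 61.8, "RF": 60.7, "AdB": 60.2,
}


def auc_differences(auc_by_model):
    """Return {(reference, comparison): comparison AUC - reference AUC}.

    Only pairs where the comparison model is listed after the reference
    model are produced, matching the upper-triangular table layout.
    """
    names = list(auc_by_model)
    return {
        (ref, comp): round(auc_by_model[comp] - auc_by_model[ref], 1)
        for i, ref in enumerate(names)
        for comp in names[i + 1:]
    }


diffs = auc_differences(auc)
for (ref, comp), d in diffs.items():
    print(f"{comp} vs {ref}: {d:+.1f}")
```

Computing the differences from one dictionary keeps the sign convention (comparison minus reference) in a single place, which is the easiest part of such a table to get backwards.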
Sensitivity performance at optimal classification thresholds of individual and ensemble models constructed using logistic regression, LSVM, DT algorithms, and comparisons to other models.
Comparison-model columns give the % sensitivity difference[a] (P-value[b]) [CI] relative to the reference model in each row.

| Reference model | Sensitivity, % | Sensitivity, SD | ensLR | LSVM | ensLSVM | DT | RF | AdB |
|---|---|---|---|---|---|---|---|---|
| LR | 49.6 | 4.7 | 4.3 (0.377) [−5.1, 13.8] | | −3.5 (0.500) [−13.4, 6.4] | 8.7 (0.193) [−4.2, 21.6] | 9.6 (0.162) [−3.6, 22.8] | 3.5 (0.576) [−8.6, 15.5] |
| ensLR | 53.9 | 4.6 | | | −7.8 (0.133) [−17.8, 2.2] | 4.3 (0.512) [−8.5, 17.2] | 5.2 (0.427) [−7.5, 18.0] | −0.9 (0.896) [−13.7, 12.0] |
| LSVM | 68.7 | 4.3 | | | − | −10.4 (0.111) [−23.0, 2.2] | −9.6 (0.141) [−22.1, 3.0] | − |
| ensLSVM | 46.1 | 4.6 | | | | 12.2 (0.053) [0.1, 24.3] | | 7.0 (0.262) [−5.0, 18.9] |
| DT | 58.3 | 4.6 | | | | | 0.9 (0.815) [−6.2, 7.9] | −5.2 (0.265) [−14.2, 3.8] |
| RF | 59.1 | 4.6 | | | | | | −6.1 (0.200) [−15.2, 3.0] |
| AdB | 53.0 | 4.7 | | | | | | |
a Difference is comparison model sensitivity minus reference model sensitivity.
b P-value obtained using the McNemar χ² test.[22]
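The McNemar χ² test named in footnote b compares two classifiers on the same subjects using only the discordant pairs. A minimal sketch follows, assuming the uncorrected form of the statistic (the source does not say whether a continuity correction was applied). The function name and the counts `b` and `c` are hypothetical and for illustration only; they are not study data.

```python
import math


def mcnemar_chi2(b, c):
    """McNemar chi-square test without continuity correction.

    b: subjects classified correctly by the reference model only
    c: subjects classified correctly by the comparison model only
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence of a difference
    stat = (b - c) ** 2 / (b + c)
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical discordant counts among progressing (CDP+) subjects.
stat, p = mcnemar_chi2(b=20, c=10)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

For sensitivity the pairs would be restricted to CDP+ subjects, and for specificity to CDP− subjects, since each metric is defined on one class only.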
Specificity performance at optimal classification thresholds of individual and ensemble models constructed using logistic regression, LSVM, DT algorithms, and comparisons to other models.
Comparison-model columns give the % specificity difference[a] (P-value[b]) [CI] relative to the reference model in each row.

| Reference model | Specificity, % | Specificity, SD | ensLR | LSVM | ensLSVM | DT | RF | AdB |
|---|---|---|---|---|---|---|---|---|
| LR | 51.1 | 2.6 | −2.7 (0.355) [−8.4, 3.0] | − | 4.9 (0.081) [−0.57, 10.3] | | | |
| ensLR | 48.4 | 2.6 | | − | | | | |
| LSVM | 37.0 | 2.5 | | | | | | |
| ensLSVM | 55.9 | 2.6 | | | | 6.2 (0.083) [−0.8, 13.2] | 5.1 (0.159) [−2.0, 12.2] | 6.5 (0.066) [−0.4, 13.4] |
| DT | 62.2 | 2.5 | | | | | −1.1 (0.500) [−4.2, 2.0] | 0.3 (0.920) [−4.9, 5.5] |
| RF | 61.1 | 2.5 | | | | | | 1.4 (0.621) [−4.0, 6.7] |
| AdB | 62.4 | 2.5 | | | | | | |
a Difference is comparison model specificity minus reference model specificity.
b P-value obtained using the McNemar χ² test.[22]
Positive predictive value, relativity to other models, and change in pre- to post-positive test probabilities at optimal classification thresholds of individual and ensemble models constructed using logistic regression, LSVM, DT algorithms, and comparisons to other models.
Comparison-model columns give the relative PPV[a] (P-value[b]) [CI] against the reference model in each row; the last column gives the pre- to post-positive test probability change (P-value[c]).

| Reference model | PPV, % | PPV, SD | ensLR | LSVM | ensLSVM | DT | RF | AdB | Pre- to post-positive test probability |
|---|---|---|---|---|---|---|---|---|---|
| LR | 23.9 | 1.9 | 1.02 (0.780) [0.88, 1.20] | 1.06 (0.328) [0.95, 1.18] | 1.02 (0.790) [0.86, 1.23] | | | | 0.2 (0.899) |
| ensLR | 24.5 | 2.0 | | 1.03 (0.679) [0.88, 1.21] | 1.00 (0.989) [0.84, 1.20] | | | | 0.8 (0.696) |
| LSVM | 25.3 | 2.8 | | | 0.97 (0.711) [0.82, 1.14] | | | | 1.6 (0.559) |
| ensLSVM | 24.5 | 1.7 | | | | | | | 0.8 (0.636) |
| DT | 32.4 | 2.0 | | | | | 0.99 (0.858) [0.90, 1.09] | 0.94 (0.448) [0.81, 1.10] | |
| RF | 32.1 | 2.1 | | | | | | 0.95 (0.521) [0.82, 1.11] | |
| AdB | 30.5 | 1.6 | | | | | | | |
a Relative PPV is the comparison model PPV divided by the reference model PPV.
b P-value obtained using Moskowitz and Pepe’s algorithm.[23]
c P-value obtained using a one-sample test of proportion comparing the reference model with the positive prevalence of 23.7%.
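Footnote c describes a one-sample test of a proportion against the positive prevalence of 23.7%. A minimal sketch of one common form of that test, the normal-approximation z-test, is given here; the source does not state which variant was used, so this choice is an assumption. The sample size `n_predicted_positive` is hypothetical because the number of positive predictions per model is not reported here, so the printed P-value is not expected to match the table.

```python
import math


def one_sample_proportion_test(p_hat, p0, n):
    """Two-sided z-test of an observed proportion against a fixed value.

    p_hat: observed proportion (e.g. a model's PPV)
    p0:    null proportion (e.g. the positive prevalence)
    n:     number of observations behind p_hat
    Returns (z, p_value) under the normal approximation.
    """
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail area
    return z, p_value


prevalence = 0.237           # positive prevalence reported in footnote c
ppv_lr = 0.239               # PPV of the LR model from the table
n_predicted_positive = 250   # hypothetical count, not from the source

z, p = one_sample_proportion_test(ppv_lr, prevalence, n_predicted_positive)
print(f"change = {100 * (ppv_lr - prevalence):.1f} points, z = {z:.3f}, p = {p:.3f}")
```

The change in percentage points (PPV minus prevalence) is the quantity shown in the last column of the table; a PPV close to the prevalence means a positive prediction adds little information.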
Negative predictive value, relativity to other models, and change in pre- to post-negative test probabilities at optimal classification thresholds of individual and ensemble models constructed using logistic regression, LSVM, DT algorithms, and comparisons to other models.
Comparison-model columns give the relative NPV[a] (P-value[b]) [CI] against the reference model in each row; the last column gives the pre- to post-negative test probability change (P-value[c]).

| Reference model | NPV, % | NPV, SD | ensLR | LSVM | ensLSVM | DT | RF | AdB | Pre- to post-negative test probability |
|---|---|---|---|---|---|---|---|---|---|
| LR | 76.5 | 1.9 | 1.01 (0.758) [0.96, 1.06] | 1.03 (0.177) [0.98, 1.09] | 1.01 (0.826) [0.96, 1.06] | | | 1.06 (0.054) [1.00, 1.12] | 0.2 (0.905) |
| ensLR | 77.2 | 1.8 | | 1.03 (0.472) [0.96, 1.10] | 1.00 (0.922) [0.95, 1.05] | | | 1.05 (0.126) [0.99, 1.12] | 0.9 (0.628) |
| LSVM | 79.2 | 1.4 | | | 0.97 (0.371) [0.91, 1.03] | 1.04 (0.240) [0.97, 1.12] | 1.05 (0.234) [0.97, 1.12] | 1.02 (0.527) [0.95, 1.10] | |
| ensLSVM | 77.0 | 2.1 | | | | | | 1.05 (0.068) [1.00, 1.11] | 0.7 (0.749) |
| DT | 82.7 | 1.8 | | | | | 1.00 (0.969) [0.97, 1.03] | 0.98 (0.312) [0.94, 1.02] | |
| RF | 82.8 | 1.7 | | | | | | 0.98 (0.311) [0.94, 1.02] | |
| AdB | 81.1 | 1.9 | | | | | | | |
a Relative NPV is the comparison model NPV divided by the reference model NPV.
b P-value obtained using Moskowitz and Pepe’s algorithm.[23]
c P-value obtained using a one-sample test of proportion comparing the reference model with the negative prevalence of 76.3%.
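The predictive values in the PPV and NPV tables are tied to sensitivity, specificity, and prevalence through Bayes' rule. The sketch below illustrates that standard relationship; it is not a description of how the authors computed their values. It uses the DT model's reported sensitivity (58.3%) and specificity (62.2%) with the reported positive prevalence (23.7%). Because these inputs are rounded, the output should only approximate the tabulated PPV (32.4%) and NPV (82.7%). The function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    All arguments and results are proportions in [0, 1].
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# DT model values as reported in the sensitivity and specificity tables.
ppv, npv = predictive_values(sensitivity=0.583, specificity=0.622, prevalence=0.237)
print(f"PPV = {100 * ppv:.1f}%, NPV = {100 * npv:.1f}%")
```

With a prevalence of 23.7%, a classifier that performs at chance has a PPV near 23.7% and an NPV near 76.3%, which is why the tables compare each model against those two baselines.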
Contribution of predictors on the training of logistic regression, ensemble SVM, random forest, and AdaBoost models.
Values are the mean % feature contribution to algorithm training[a]. Age, Sex, and Duration are demographic features; EDSS, T25W, 9HP, and PASAT are clinical features; T2LV and BPF are MRI features.

| Reference model | Age | Sex | Duration | EDSS | T25W | 9HP | PASAT | T2LV | BPF |
|---|---|---|---|---|---|---|---|---|---|
| LR | 8.7 (5.2) | 9.8 (6.4) | 4.6 (3.5) | 25.4 (5.3) | 2.6 (2.9) | 17.7 (6.5) | 5.8 (5.8) | 7.6 (5.3) | 17.6 (6.9) |
| ensLR | 9.2 (5.9) | 9.6 (9.7) | 4.5 (2.7) | 28.6 (4.7) | 2.2 (1.5) | 23.0 (6.1) | 8.0 (5.3) | 7.1 (4.3) | 7.8 (5.1) |
| LSVM | 7.5 (3.7) | 10.6 (6.2) | 5.4 (4.1) | 26.2 (4.6) | 2.4 (3.1) | 18.1 (5.4) | 6.9 (5.6) | 6.2 (3.7) | 16.6 (6.5) |
| ensLSVM | 7.9 (3.9) | 11.5 (21.9) | 5.5 (2.8) | 17.3 (8.9) | 1.4 (0.6) | 22.4 (8.5) | 5.7 (3.9) | 10.0 (9.0) | 18.3 (5.9) |
| DT | 10.0 (8.7) | 0.8 (2.5) | 7.4 (7.7) | 24.6 (8.4) | 30.2 (8.7) | 8.5 (8.6) | 3.5 (5.4) | 9.9 (5.8) | 5.2 (4.4) |
| RF | 10.6 (7.5) | 0.4 (1.3) | 7.7 (8.3) | 23.3 (7.1) | 25.7 (9.2) | 8.7 (9.1) | 5.8 (5.1) | 12.1 (6.3) | 5.6 (3.3) |
| AdB | 11.8 (4.5) | 1.0 (1.8) | 8.3 (4.5) | 15.0 (5.4) | 18.3 (5.7) | 14.7 (6.7) | 8.5 (6.2) | 10.6 (6.0) | 11.8 (3.6) |
a Mean of feature contribution to model training across 10-fold cross-validation.
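Footnote a says the contributions are averaged across the cross-validation folds, and each row of the table sums to roughly 100%. One way such percentages could be produced is sketched below. It assumes, without confirmation from the source, that each fold yields one importance weight per feature, that absolute weights are normalised to sum to 100 within a fold, and that the bracketed figure is a sample standard deviation. The weights are hypothetical, and three folds of three features stand in for the ten folds and nine predictors.

```python
from statistics import mean, stdev


def percent_contributions(weights):
    """Convert one fold's feature weights to percentages summing to 100."""
    total = sum(abs(w) for w in weights)
    if total == 0:
        raise ValueError("all weights are zero; contributions are undefined")
    return [100.0 * abs(w) / total for w in weights]


def summarize_folds(fold_weights):
    """Return a (mean, sample SD) pair of % contribution for each feature."""
    per_fold = [percent_contributions(w) for w in fold_weights]
    # zip(*per_fold) regroups the values by feature instead of by fold.
    return [(mean(col), stdev(col)) for col in zip(*per_fold)]


features = ["EDSS", "T25W", "BPF"]
# Hypothetical per-fold weights (e.g. model coefficients), not study data.
folds = [
    [0.8, 0.1, 0.5],
    [0.7, -0.2, 0.6],
    [0.9, 0.1, -0.4],
]

summary = summarize_folds(folds)
for name, (avg, sd) in zip(features, summary):
    print(f"{name}: {avg:.1f} ({sd:.1f})")
```

Normalising within each fold before averaging keeps the per-feature means summing to 100, which matches the row totals in the table.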
Figure 3. Plot of predictor contribution (with 95% confidence intervals) to independent and ensemble model training using logistic regression, linear SVM, and decision tree algorithms.