V Wottschel, D C Alexander, P P Kwok, D T Chard, M L Stromillo, N De Stefano, A J Thompson, D H Miller, O Ciccarelli.
Abstract
We aimed to determine whether machine learning techniques, such as support vector machines (SVMs), can predict the occurrence of a second clinical attack, which leads to the diagnosis of clinically-definite Multiple Sclerosis (CDMS), in patients with a clinically isolated syndrome (CIS), on the basis of each patient's lesion features and clinical/demographic characteristics. Seventy-four patients were scanned at onset of CIS and clinically reviewed after one and three years. CDMS was used as the gold standard against which SVM classification accuracy was tested. Radiological features related to lesional characteristics on conventional MRI were defined a priori and used in combination with clinical/demographic features in an SVM. Forward recursive feature elimination with 100 bootstraps and leave-one-out cross-validation was used to find the most predictive feature combinations. 30 % and 44 % of patients developed CDMS within one and three years, respectively. On average over all bootstraps, the SVMs correctly predicted the presence (or the absence) of CDMS in 71.4 % of patients (sensitivity/specificity: 77 %/66 %) at 1 year, and in 68 % (60 %/76 %) at 3 years. Combinations of features consistently gave a higher accuracy in predicting outcome than any single feature. Machine-learning-based classifications can be used to provide an "individualised" prediction of conversion to MS from subjects' baseline scans and clinical characteristics, with potential to be incorporated into routine clinical practice.
Keywords: Clinically isolated syndrome; MRI; Multiple Sclerosis; Support vector machines
Year: 2014 PMID: 25610791 PMCID: PMC4297887 DOI: 10.1016/j.nicl.2014.11.021
Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881
Demographic and clinical characteristics of patients with CIS and at least one lesion at baseline.
| | CIS at 1-year follow-up (total no. = 74) | CIS at 3-year follow-up (total no. = 70) |
|---|---|---|
| Gender (F/M) | 49/25 | 47/23 |
| Age, mean, median (range), years | 33.1, 34 (19–49) | 33.2, 34 (19–49) |
| EDSS, median (range) | 1 (0–8) | 1 (0–8) |
| Type of onset, no. (number of converters) | Brainstem/cerebellum = 6 (1) | Brainstem/cerebellum = 5 (1) |
| | Spinal cord = 4 (4) | Spinal cord = 4 (4) |
| | Optic neuritis = 64 (17) | Optic neuritis = 61 (26) |
| | Others = 0 (0) | Others = 0 (0) |
| No. of patients by number of lesions | Up to 3 lesions = 14 | Up to 3 lesions = 13 |
| | More than 3 and up to 10 lesions = 23 | More than 3 and up to 10 lesions = 23 |
| | More than 10 lesions = 37 | More than 10 lesions = 34 |
| Converters at follow-up, no. (%) | 22 (30 %) | 31 (44 %) |
Fig. 1. Example of T2 and PD weighted images and corresponding binary lesion mask. Axial T2 weighted image (left) and proton density (PD) weighted image (centre), showing hyperintense white matter lesions; the corresponding binary lesion mask (right) was used to obtain the lesion features entered into the SVM analysis.
Fig. 2. Illustration of one permutation within a leave-one-out cross-validation using support vector machines. Training phase: data points with known labels are used to create an optimal separating hyperplane (OSH). Testing phase: a previously unseen data point (grey) is assigned a label (converter) based on its position relative to the OSH.
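The leave-one-out procedure described in Fig. 2 can be sketched in a few lines of Python. This is an illustrative sketch only: a nearest-centroid rule is used as a dependency-free stand-in for the SVM, and the feature values, labels and function names are hypothetical, not study data or the authors' code.

```python
from statistics import mean


def nearest_centroid_fit(X, y):
    """Training phase (stand-in for the SVM): one centroid per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Testing phase: assign the label of the closest centroid."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, test on the held-out one."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = nearest_centroid_fit(X_train, y_train)
        if nearest_centroid_predict(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: two features per subject, label 1 = converter.
X = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [3.0, 4.0], [3.2, 3.9], [2.9, 4.2]]
y = [0, 0, 0, 1, 1, 1]
loo_acc = leave_one_out_accuracy(X, y)
```

Each subject is classified by a model that never saw that subject during training, so the resulting accuracy estimates performance on unseen individuals.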
The combination of features giving the highest accuracy in predicting conversion to CDMS at one and three years, estimated by forward recursive feature elimination (RFE). Accuracy, sensitivity, specificity, PPV and NPV are averages over 100 bootstraps.
| | 1 year | 3 years |
|---|---|---|
| Lesion count | ● | |
| Lesion load | ● | |
| Average lesion PD intensity | ● | |
| Average lesion T2 intensity | ||
| Average distance of lesions from the centre of the brain | ● | |
| Presence of lesions in proximity of the centre of the brain | ||
| Shortest horizontal distance of a lesion from the vertical axis | ● | |
| Lesion size profile | ||
| Type of presentation | ● | |
| Age | ● | |
| Gender | ● | |
| EDSS at onset | ● | |
| Polynomial degree | 4 | 1 |
| Accuracy (%) | 71.4 | 68.0 |
| Range (%) | 52–84 | 61–74 |
| 95 % CI | 58–82 | 61–73 |
| Sensitivity (%) | 77 | 60 |
| Specificity (%) | 66 | 76 |
| PPV (%) | 70 | 72 |
| NPV (%) | 74 | 65 |
CI = confidence interval; PPV = positive predictive value; NPV = negative predictive value.
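For reference, the performance measures in the table follow the standard definitions based on a confusion matrix. The sketch below uses hypothetical counts, not study data; the table's figures are averages over 100 bootstraps, so they need not correspond to any single confusion matrix.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts.

    tp/fn: converters predicted as converters / non-converters
    tn/fp: non-converters predicted as non-converters / converters
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # fraction of converters detected
        "specificity": tn / (tn + fp),  # fraction of non-converters detected
        "ppv": tp / (tp + fp),          # predicted converters who convert
        "npv": tn / (tn + fn),          # predicted non-converters who do not
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
m = diagnostic_metrics(tp=15, fp=10, tn=40, fn=5)
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of converters in the sample being classified.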
Fig. 3. Accuracies of forward RFE for 1-year prediction. Plot showing the development of accuracies after recursively adding features in order to find the most predictive combination for conversion within 1 year.
Fig. 4. Accuracies of forward RFE for 3-year prediction. Plot showing the development of accuracies after recursively adding features in order to find the most predictive combination for conversion within 3 years.
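The idea of recursively adding features and tracking accuracy, as plotted in Figs. 3 and 4, can be sketched as a greedy forward search. This is a generic sketch under stated assumptions, not the authors' implementation: `score` is a hypothetical callable that would, in practice, return a cross-validated accuracy for a feature subset, and `toy_score` and the feature names are made up for illustration.

```python
def forward_selection(features, score):
    """Greedily add the feature that gives the highest score at each step.

    Returns the best subset seen, its score, and the per-step history
    (the sequence of subsets and scores that an accuracy plot would show).
    """
    selected, history = [], []
    remaining = list(features)
    while remaining:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), score(selected)))
    # max() keeps the first maximum, i.e. the smallest best-scoring subset.
    best_subset, best_score = max(history, key=lambda h: h[1])
    return best_subset, best_score, history


def toy_score(subset):
    """Hypothetical scorer: 'a' and 'b' help, any other feature hurts."""
    gain = {"a": 0.10, "b": 0.05}
    return 0.5 + sum(gain.get(f, -0.02) for f in subset)


best_subset, best_score, history = forward_selection(["a", "b", "c"], toy_score)
```

In this toy run the search adds `a`, then `b`, then `c`; the score peaks after the second step, so the two-feature subset is returned.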
Fig. 5. Performance of single features vs. feature combination. Bar plot showing the classification accuracy of all individual features vs. the best combination of features obtained with SVMs.