Richard McKinley, Rik Wepfer, Lorenz Grunder, Fabian Aschwanden, Tim Fischer, Christoph Friedli, Raphaela Muri, Christian Rummel, Rajeev Verma, Christian Weisstanner, Benedikt Wiestler, Christoph Berger, Paul Eichinger, Mark Muhlau, Mauricio Reyes, Anke Salmen, Andrew Chan, Roland Wiest, Franca Wagner.
Abstract
The detection of new or enlarged white-matter lesions is a vital task in the monitoring of patients undergoing disease-modifying treatment for multiple sclerosis. However, the definition of 'new or enlarged' is not fixed, and lesion counting is known to be highly subjective, with a high degree of inter- and intra-rater variability. Automated methods for lesion quantification, if accurate enough, hold the potential to make the detection of new and enlarged lesions consistent and repeatable. However, the majority of lesion segmentation algorithms are not evaluated for their ability to separate radiologically progressive from radiologically stable patients, despite this being a pressing clinical use-case. In this paper, we explore the ability of a deep learning segmentation classifier to separate stable from progressive patients by lesion volume and lesion count, and find that neither measure provides a good separation. Instead, we propose a method for identifying lesion changes of high certainty, and establish on an internal dataset of longitudinal multiple sclerosis cases that this method is able to separate progressive from stable time-points with a very high level of discrimination (AUC = 0.999), while changes in lesion volume are much less able to perform this separation (AUC = 0.71). Validation of the method on two external datasets confirms that the method is able to generalize beyond the setting in which it was trained, achieving accuracies of 75% and 85% in separating stable and progressive time-points.
Keywords: Deep Learning; Longitudinal Imaging; MRI; Multiple Sclerosis
Year: 2019 PMID: 31927500 PMCID: PMC6953959 DOI: 10.1016/j.nicl.2019.102104
Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881
Fig. 1. The DeepSCAN architecture used in this paper for lesion and brain-structure segmentation.
Fig. 2. Receiver operating characteristic curves for the detection of lesion progression using DeepSCAN, on our internal validation set, via absolute lesion volume change (AUC = 0.70), relative volume change (AUC = 0.71), lesion count change (AUC = 0.51), the proposed method using a score margin of 0.45 (AUC = 0.77) and the proposed method using an uncertainty threshold of 0.05 (AUC ≈ 1). The star on each curve represents a cutoff where the patient is labelled as stable if the considered metric is less than or equal to zero.
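As an illustration of how the AUC values quoted for Fig. 2 can be read, the sketch below computes AUC as the probability that a randomly chosen progressive time-point receives a higher metric value than a randomly chosen stable one (ties count half), and applies the "greater than zero" cutoff marked by the star. The function names and the six metric values are hypothetical and are not taken from the paper's data.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of progressive (True) and stable (False) cases."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one progressive and one stable case")
    # Each (progressive, stable) pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def label_at_zero(scores):
    """Star cutoff in Fig. 2: stable if the metric is <= 0, progressive otherwise."""
    return [s > 0 for s in scores]


# Hypothetical per-time-point metric values (e.g. a lesion-change volume) and labels.
scores = [0.0, 0.0, 1.2, 0.0, 3.4, 0.7]
labels = [False, False, False, False, True, True]

auc = roc_auc(scores, labels)
predicted = label_at_zero(scores)
print(f"AUC = {auc:.3f}", predicted)
```

In this toy data one stable time-point has a positive metric value, so the zero cutoff yields one false positive even though most pairs are ranked correctly.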
Ability to distinguish progressive vs stable MS at thresholds corresponding to no lesion change, on the internal test set, showing the number of true negatives (TN), false positives (FP), false negatives (FN) and true positives (TP), together with accuracy, sensitivity (recall), positive predictive value (PPV) and false positive rate (FPR). Metrics are shown for the label-flip method (Confidence method) and the margin-based method (Margin method), together with new lesion volume, lesion volume change and lesion count change.
| Criterion | TN | FP | FN | TP | Accuracy | Sensitivity | PPV | FPR |
|---|---|---|---|---|---|---|---|---|
| Confidence method > 0 | 74 | 9 | 0 | 13 | 0.91 | 1.00 | 0.59 | 0.11 |
| Margin method > 0 | 83 | 0 | 6 | 7 | 0.94 | 0.54 | 1.00 | 0.00 |
| New lesion volume > 0 | 8 | 75 | 0 | 13 | 0.22 | 1.00 | 0.15 | 0.90 |
| Volume change > 0 | 41 | 42 | 4 | 9 | 0.52 | 0.69 | 0.18 | 0.51 |
| Lesion count change > 0 | 50 | 33 | 8 | 5 | 0.57 | 0.38 | 0.13 | 0.40 |
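The derived columns in the table above follow from the four counts in each row. The sketch below recomputes them using the standard definitions; the counts are copied from the table, while the class and variable names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tn: int
    fp: int
    fn: int
    tp: int

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tn + self.fp + self.fn + self.tp)

    @property
    def sensitivity(self) -> float:
        # Fraction of progressive time-points that were flagged.
        return self.tp / (self.tp + self.fn)

    @property
    def ppv(self) -> float:
        # Fraction of flagged time-points that were truly progressive.
        return self.tp / (self.tp + self.fp)

    @property
    def fpr(self) -> float:
        # Fraction of stable time-points that were flagged.
        return self.fp / (self.fp + self.tn)


# Counts (TN, FP, FN, TP) as listed in the table.
rows = {
    "Confidence method > 0": Confusion(74, 9, 0, 13),
    "Margin method > 0": Confusion(83, 0, 6, 7),
    "New lesion volume > 0": Confusion(8, 75, 0, 13),
    "Volume change > 0": Confusion(41, 42, 4, 9),
    "Lesion count change > 0": Confusion(50, 33, 8, 5),
}

for name, c in rows.items():
    print(f"{name:26s} acc={c.accuracy:.2f} sens={c.sensitivity:.2f} "
          f"ppv={c.ppv:.2f} fpr={c.fpr:.2f}")
```

Each row sums to 96 time-points, of which 13 are progressive and 83 stable. This is why a criterion that flags almost every time-point (new lesion volume > 0) reaches a sensitivity of 1.00 with an accuracy of only 0.22.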
Performance of the confidence-based method on the three datasets studied in this paper, showing Accuracy, Sensitivity, and Positive Predictive Value (PPV).
| Dataset | Accuracy | Sensitivity | PPV |
|---|---|---|---|
| Zurich | 0.75 | 0.60 | 0.84 |
| Munich | 0.85 | 0.72 | 1.00 |
| Bern | 0.91 | 1.00 | 0.59 |
Fig. 3. Two time-points from the external dataset, showing a missed new lesion. (A) Coregistered FLAIR, (B) lesion segmentations, (C) label-flip maps. The new lesion is correctly detected by DeepSCAN at TP2, but is not labelled as a confident new lesion. Small, faint lesions are more likely to be labelled as uncertain than large, clear lesions.
Fig. 4. Two time-points from the external dataset, showing a missed new periventricular lesion. (A) Coregistered FLAIR, (B) lesion segmentations, (C) label-flip maps. The lesion is detected by DeepSCAN at TP2, but the label at the location of the new lesion is uncertain at TP1. Owing to the similar appearance of periventricular lesions and subependymal gliosis, label confidence is typically low in this region.
Fig. 5. A case from the Zurich dataset. Top row: FLAIR imaging at baseline and three subsequent time-points. A: FLAIR images with lesion masks as provided by the DeepSCAN classifier. B: FLAIR images with masks indicating naive lesion change (lesion is absent at the previous time-point but present at the current time-point). Time-points 3 and 4 show new lesion tissue due to differences in imaging, rather than genuine lesion growth. C: Regions where the DeepSCAN flip probability is > 0.05, highlighted in blue. D: Confident new lesion tissue maps as provided by the method, showing correctly detected new lesion tissue at time-point 2, and no change at time-points 3 and 4.
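The relationship between panels B, C and D of Fig. 5 can be sketched as a voxel-wise mask operation. This is a minimal illustration, not the authors' implementation: it assumes that a voxel counts as confident new lesion tissue when it shows naive lesion change and its flip probability is at most 0.05 at both time-points. The function name, argument names and toy values are hypothetical; only the 0.05 threshold comes from the captions.

```python
import numpy as np

FLIP_THRESHOLD = 0.05  # uncertainty threshold quoted in the Fig. 2 and Fig. 5 captions


def confident_new_lesion(seg_prev, seg_curr, flip_prev, flip_curr,
                         threshold=FLIP_THRESHOLD):
    """Return a boolean mask of confident new lesion voxels (assumed rule)."""
    seg_prev = np.asarray(seg_prev, dtype=bool)
    seg_curr = np.asarray(seg_curr, dtype=bool)
    # Panel B: lesion absent at the previous time-point, present at the current one.
    naive_change = ~seg_prev & seg_curr
    # Complement of panel C: the label is certain at both time-points.
    certain = (np.asarray(flip_prev) <= threshold) & (np.asarray(flip_curr) <= threshold)
    # Panel D (assumption): keep only naive changes whose labels are certain.
    return naive_change & certain


# Toy 1-D "image" of four voxels with hypothetical values.
seg_prev = [0, 0, 0, 1]
seg_curr = [1, 1, 0, 1]
flip_prev = [0.01, 0.30, 0.01, 0.02]
flip_curr = [0.02, 0.01, 0.01, 0.01]

mask = confident_new_lesion(seg_prev, seg_curr, flip_prev, flip_curr)
is_progressive = bool(mask.any())  # "Confidence method > 0" style decision
print(mask.tolist(), is_progressive)
```

In the toy example the second voxel changes label between time-points but is uncertain at the earlier one, so it is excluded, which mirrors the missed lesions described in Figs. 3 and 4.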