M M Weeda, I Brouwer, M L de Vos, M S de Vries, F Barkhof, P J W Pouwels, H Vrenken.
Abstract
PURPOSE: Accurate lesion segmentation is important for measurements of lesion load and atrophy in subjects with multiple sclerosis (MS). International MS lesion challenges show a preference for convolutional neural network (CNN) strategies, such as nicMSlesions. However, because the software is trained on fairly homogeneous training data, we aimed to test the performance of nicMSlesions in an independent dataset, against manual and other automatic lesion segmentations, to determine whether this method is suitable for larger, multi-center studies.
Keywords: Automatic lesion segmentation; Convolutional neural networks; MRI; Multiple sclerosis
Year: 2019 PMID: 31734527 PMCID: PMC6861662 DOI: 10.1016/j.nicl.2019.102074
Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881
Overview of the different lesion segmentation methods investigated in this study for their performance relative to manual segmentation.
| Supervision | Training | Optimization | Method |
|---|---|---|---|
| no | no | no | (1) LesionTOADS |
| yes | no | no | (2) LST-LPA default; (3) nicMSlesions default |
| yes | no | yes | (4) LST-LPA adjusted-threshold |
| yes | yes ( | yes | (5) nicMSlesions optimized; (6) BIANCA |
| yes | yes ( | no | (7) nicMSlesions single-subject |
Fig. 1. Example lesion segmentation shown for the different lesion segmentation methods on the original 3D FLAIR image of a 36-year-old female with RRMS: manual (red), LesionTOADS (green), BIANCA (yellow), LST-LPA default (blue), LST-LPA adjusted-threshold (turquoise), nicMSlesions default (pink), and nicMSlesions optimized (purple). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Volumetric and spatial accuracy of the various grouped lesion segmentation methods compared with manual segmentation in the fourteen subjects with RRMS.
| RRMS | Lesion volume | ICC absolute agreement | ICC consistency | True positive volume | True negative volume | False positive volume | False negative volume | Sensitivity | 1–Specificity (10E-3) | SI to manual |
|---|---|---|---|---|---|---|---|---|---|---|
| | 7.91 (4.26–10.15) | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| | 3.67 | 0.140 (−0.119–0.503) | 0.300 (−0.253–0.705) | 2.78 (1.18–3.28) | 11,092 (11,090–11,095) | 0.72 (0.48–1.11) | 5.07 (3.20–6.20) | 0.312 (0.230–0.405) | 0.065 (0.043–0.100) | 0.444 (0.336–0.542) |
| | 4.40 | 0.73 (−0.059–0.939) | 0.947 (0.845–0.983) | 3.21 (1.48–5.81) | 11,091 (11,089–11,096) | 1.06 (0.44–2.20) | 4.02 (2.94–5.80) | 0.414 (0.298–0.520) | 0.096 (0.039–0.198) | 0.528 (0.425–0.581) |
| | 7.64 (3.82–11.15) | 0.917 (0.769–0.972) | 0.917 (0.764–0.973) | 4.46 (2.28–7.16) | 11,089 (11,086–11,095) | 2.95 (1.62–5.48) | 2.72 (2.07–4.40) | 0.592 (0.431–0.637) | 0.266 (0.145–0.494) | 0.568 (0.481–0.607) |
| | 8.91 | 0.872 (0.506–0.962) | 0.911 (0.747–0.971) | 4.26 (2.23–6.18) | 11,089 (11,085–11,094) | 4.75 (2.91–6.40) | 3.13 (1.81–4.85) | 0.553 (0.429–0.641) | 0.428 (0.262–0.577) | 0.490 (0.424–0.586) |
| | 7.39 (3.25–9.93) | 0.975 (0.928–0.992) | 0.975 (0.925–0.992) | 5.41 (2.22–7.11) | 11,091 (11,088–11,096) | 2.00 (1.27–3.05) | 2.32 (2.02–3.61) | 0.698 (0.536–0.736) | 0.180 (0.114–0.275) | 0.660 (0.613–0.716) |
| | 5.33 | 0.746 (0.189–0.893) | 0.854 (0.809–0.889) | 3.57 (1.77–5.29) | 11,091 (11,089–11,096) | 1.54 (0.95–2.60) | 3.59 (2.55–4.93) | 0.501 (0.401–0.591) | 0.139 (0.06–0.235) | 0.568 (0.490–0.638) |
| | 7.54 (4.37–10.25) | 0.968 (0.905–0.990) | 0.966 (0.898–0.989) | 5.11 (2.67–6.79) | 11,090 (11,087–11,095) | 2.93 (1.92–3.78) | 2.67 (2.08–3.75) | 0.639 (0.521–0.686) | 0.264 (0.173–0.341) | 0.643 (0.514–0.675) |
Volumes are shown as median with interquartile range (first and third quartiles); intraclass correlation coefficients (ICC) are shown with 95% confidence intervals. Note that the positive and negative volumes are extracted from the native FLAIR image.
nicMSlesions single-subject output is an average over all fourteen variants (also see Section 3.4). Statistics from repeated measures ANOVA with post-hoc pairwise Wilcoxon Signed Ranks test; * p < 0.05; ** p < 0.01; *** p < 0.001.
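The spatial measures in this table (true/false positive and negative volumes, sensitivity, 1–specificity, and the similarity index to manual) can all be derived from a voxel-wise comparison of two binary lesion masks. The following is a minimal sketch of that computation, not the authors' pipeline: it assumes the standard confusion-matrix definitions and the usual Dice formula, and the function name `overlap_metrics`, the `voxel_volume_ml` parameter, and the toy masks are hypothetical. The 10E-3 scaling used in the table column is not applied here.

```python
import numpy as np


def overlap_metrics(auto_mask, manual_mask, voxel_volume_ml=1.0):
    """Voxel-wise agreement between an automated and a manual binary mask.

    Hypothetical helper: assumes both masks are on the same voxel grid.
    """
    auto = np.asarray(auto_mask, dtype=bool)
    manual = np.asarray(manual_mask, dtype=bool)
    if auto.shape != manual.shape:
        raise ValueError("masks must have the same shape")

    # Confusion-matrix voxel counts, with manual as the reference.
    tp = np.count_nonzero(auto & manual)
    fp = np.count_nonzero(auto & ~manual)
    fn = np.count_nonzero(~auto & manual)
    tn = np.count_nonzero(~auto & ~manual)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "tp_volume": tp * voxel_volume_ml,
        "tn_volume": tn * voxel_volume_ml,
        "fp_volume": fp * voxel_volume_ml,
        "fn_volume": fn * voxel_volume_ml,
        "sensitivity": ratio(tp, tp + fn),
        "one_minus_specificity": ratio(fp, fp + tn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy 4x4x4 example (made-up data): the automated mask covers the whole
# manual lesion plus four extra voxels.
manual = np.zeros((4, 4, 4), dtype=bool)
manual[1:3, 1:3, 1:3] = True
auto = np.zeros((4, 4, 4), dtype=bool)
auto[1:3, 1:3, 1:4] = True

metrics = overlap_metrics(auto, manual)
print(metrics)
```

Because the table reports medians across subjects, the tabulated sensitivity and similarity index need not equal the values obtained by plugging the tabulated median volumes into these formulas.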
Fig. 2. Scatter plots of the automated segmentation lesion volumes versus manual lesion volumes obtained from LesionTOADS (green), LST-LPA default (blue), LST-LPA adjusted-threshold (turquoise), nicMSlesions default (pink), nicMSlesions optimized (purple), and BIANCA (yellow). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Volumetric and spatial accuracy of the different single-subject configurations of nicMSlesions, each evaluated on the remaining thirteen subjects.
| Manual volume (ml) of single-subject used for training i.e., | Median manual lesion volume (ml) | Median automated lesion volume (ml) | ICC absolute agreement | ICC consistency | Median Dice's similarity index to manual labels |
|---|---|---|---|---|---|
| 2.68 # | 8.73 (6.40–10.25) | 2.94 ** (2.45–4.24) | 0.131 (−0.088–0.499) | 0.396 (−0.201–0.779) | 0.363 (0.311–0.417) |
| 2.72 | 8.58 (5.27–10.20) | 8.18 (5.17–9.71) | 0.760 (0.397–0.919) | 0.758 (0.379–0.919) | 0.521 (0.462–0.581) |
| 4.19 | 8.58 (5.27–10.20) | 7.88 (4.63–10.79) | 0.941 (0.819–0.982) | 0.937 (0.806–0.980) | 0.574 (0.500–0.640) |
| 4.29 | 8.58 (5.22–10.20) | 8.02 (4.94–10.24) | 0.973 (0.914–0.992) | 0.971 (0.908–0.991) | 0.595 (0.564–0.663) |
| 6.26 | 8.58 (4.34–10.20) | 5.44 ** (2.44–7.20) | 0.753 (−0.071–0.944) | 0.927 (0.779–0.977) | 0.596 (0.517–0.620) |
| 6.85 | 8.58 (4.34–10.20) | 5.48 ** (2.87–7.59) | 0.856 (−0.014–0.969) | 0.955 (0.861–0.986) | 0.562 (0.520–0.635) |
| 7.24 | 8.58 (4.34–10.20) | 6.09 ** (2.78–7.74) | 0.844 (−0.029–0.967) | 0.953 (0.853–0.985) | 0.599 (0.557–0.668) |
| 8.58 | 7.24 (4.24–10.20) | 6.15 ** (3.07–8.15) | 0.930 (0.284–0.985) | 0.972 (0.910–0.991) | 0.595 (0.529–0.663) |
| 8.88 | 7.24 (4.24–10.20) | 4.68 ** (2.31–6.50) | 0.727 (−0.072–0.938) | 0.930 (0.787–0.978) | 0.575 (0.439–0.630) |
| 9.22 | 7.24 (4.24–10.20) | 5.50 ** (2.19–7.34) | 0.819 (−0.052–0.963) | 0.963 (0.883–0.989) | 0.535 (0.479–0.653) |
| 10.09 | 7.24 (4.24–9.76) | 5.45 ** (2.39–7.23) | 0.855 (−0.034–0.970) | 0.962 (0.881–0.988) | 0.590 (0.513–0.648) |
| 10.31 | 7.24 (4.24–9.66) | 6.34 ** (2.58–8.09) | 0.885 (0.073–0.975) | 0.959 (0.870–0.987) | 0.606 (0.524–0.676) |
| 14.59 | 7.24 (4.24–9.66) | 4.81 ** (2.37–6.68) | 0.696 (−0.073–0.929) | 0.923 (0.769–0.976) | 0.571 (0.473–0.623) |
| 15.79 | 7.24 (4.24–9.66) | 3.86 ** (2.01–4.94) | 0.593 (−0.066–0.896) | 0.907 (0.723–0.971) | 0.540 (0.427–0.561) |
Abbreviations: IQR, interquartile range, given as the first and third quartiles. Statistics from Wilcoxon Signed Ranks test; * p < 0.05; ** p < 0.01; *** p < 0.001.
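Several rows in this table pair a high ICC for consistency with a clearly lower ICC for absolute agreement. That pattern is what a systematic offset between automated and manual volumes produces: consistency ignores a constant shift between the two measurements, whereas absolute agreement penalizes it. The sketch below illustrates the difference. The source does not state which ICC model was used, so the two-way, single-measure forms (often written ICC(A,1) and ICC(C,1)) are an assumption, and the function name `icc_two_way` and the toy volumes are hypothetical.

```python
import numpy as np


def icc_two_way(ratings):
    """Two-way, single-measure ICCs for an (n subjects x k raters) array.

    Returns (absolute_agreement, consistency). Assumed ICC forms; the
    source does not specify the model it used.
    """
    x = np.asarray(ratings, dtype=float)
    if x.ndim != 2 or x.shape[0] < 2 or x.shape[1] < 2:
        raise ValueError("need a 2-D array with at least 2 rows and 2 columns")
    n, k = x.shape
    grand = x.mean()

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), error.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    absolute = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    return absolute, consistency


# Toy volumes (made-up data): the automated method tracks the manual
# volumes perfectly but underestimates each one by a constant 1.5 ml.
manual_ml = np.array([2.0, 4.0, 6.0, 8.0])
auto_ml = manual_ml - 1.5

absolute, consistency = icc_two_way(np.column_stack([manual_ml, auto_ml]))
print(f"absolute agreement: {absolute:.3f}, consistency: {consistency:.3f}")
```

In this toy case the consistency ICC is 1 up to floating-point error, while the absolute-agreement ICC is lower because of the constant underestimation.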
Fig. 3. Box-and-whiskers plot (min-to-max, line at median) showing Dice's similarity index (SI) in comparison to the manual lesion segmentation for LesionTOADS (green), LST-LPA default (blue), LST-LPA adjusted-threshold (turquoise), nicMSlesions default (pink), nicMSlesions optimized (purple), BIANCA (yellow), and the fourteen different nicMSlesions single-subject variants (orange). Horizontal dotted lines indicate the medians of the other automated lesion segmentation methods. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Volumetric accuracy of the various grouped lesion segmentation methods compared with manual segmentation in the five healthy control subjects.
| HC | Mean lesion volume ± SD (range min-to-max) |
|---|---|
| | 0.81 ± 0.23 (0.45–1.08) |
| | 0.40 ± 0.22 (0.07–0.67) |
| | 1.36 ± 0.11 (0.40–2.49) |
| | 2.23 ± 1.77 (0.00–5.18) |
| | 0.32 ± 0.43 (0.00–1.53) |
| | 0.27 ± 0.34 (0.00–0.94) |
| | 1.81 ± 0.83 (0.86–3.38) |