Christine Egger1, Roland Opfer2, Chenyu Wang3, Timo Kepp4, Maria Pia Sormani5, Lothar Spies4, Michael Barnett3, Sven Schippling1.
Abstract
INTRODUCTION: Magnetic resonance imaging (MRI) has become key in the diagnosis and disease monitoring of patients with multiple sclerosis (MS). Both T2 lesion load and gadolinium (Gd)-enhancing T1 lesions represent important endpoints in MS clinical trials, serving as surrogates of clinical disease activity. T2 and fluid-attenuated inversion recovery (FLAIR) lesion quantification is still performed manually or in a semi-automated fashion, largely because of methodological constraints, although strong efforts have been made to enable automated quantitative lesion segmentation. In 2012, Schmidt and co-workers published an algorithm for FLAIR sequences. The aim of this study was to apply the Schmidt algorithm to an independent data set and to compare automated segmentation with the inter-rater variability of three independent, experienced raters.
Keywords: Automated lesion segmentation; Dice coefficient; Fluid-attenuated inversion recovery; Inter-rater variability; Magnetic resonance imaging; Multiple sclerosis
Year: 2016 PMID: 28018853 PMCID: PMC5175993 DOI: 10.1016/j.nicl.2016.11.020
Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881
Fig. 1. Mean Dice coefficients (over n = 50 data sets) as a function of threshold (0–0.8) and kappa (0.2, 0.3, and 0.4) for LGA SPM8/SPM12 and LPA SPM12. For each data set and each algorithm, the mean of the Dice coefficients with respect to the three raters was computed.
Mean and standard deviation of dice coefficients (DC).
| | All | < 5 ml | 5–15 ml | > 15 ml |
|---|---|---|---|---|
| Ham-Syd | 0.67 (0.12) | 0.63 (0.15) | 0.69 (0.06) | 0.74 (0.08) |
| Ham-Zur | 0.66 (0.13) | 0.61 (0.16) | 0.68 (0.06) | 0.74 (0.07) |
| Zur-Syd | 0.67 (0.12) | 0.63 (0.14) | 0.69 (0.07) | 0.72 (0.09) |
| Raters | 0.66 (0.12) | 0.62 (0.14) | 0.69 (0.06) | 0.73 (0.08) |
| Raters-LGA SPM 8 | 0.60 (0.15) | 0.53 (0.16) | 0.65 (0.08) | 0.70 (0.09) |
| Raters-LGA SPM 12 | 0.53* (0.16) | 0.45* (0.18) | 0.59* (0.09) | 0.63 (0.10) |
| Raters-LPA | 0.57* (0.16) | 0.49* (0.17) | 0.63 (0.10) | 0.68 (0.11) |
Means and standard deviations (in brackets) of DC between lesion masks generated by manual or automated segmentation. Values marked with an asterisk are DC that differ significantly (p = 0.05) from the DC of the manual raters ("Raters"). In columns 3–5 the same analysis was performed but restricted to groups with different total lesion volumes.
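The Dice coefficient reported above is the standard overlap measure DC = 2|A ∩ B| / (|A| + |B|) between two binary lesion masks. The following is a minimal sketch of that computation and of the per-algorithm averaging across the three raters described for Fig. 1. It assumes masks are supplied as flat sequences of 0/1 voxel labels; the function names and the handling of two empty masks are illustrative choices, not taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice coefficient between two binary masks (flat sequences of 0/1)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled as lesion in both masks.
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        # Assumption: the coefficient is undefined when both masks are empty.
        return float("nan")
    return 2.0 * overlap / (size_a + size_b)


def mean_dice_to_raters(algorithm_mask, rater_masks):
    """Mean Dice coefficient of one automated mask against several rater masks."""
    scores = [dice_coefficient(algorithm_mask, rater) for rater in rater_masks]
    return sum(scores) / len(scores)
```

For one data set, `mean_dice_to_raters(mask, [rater_1, rater_2, rater_3])` would give the per-algorithm value that is then averaged over all data sets.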
Median and 95th percentiles of absolute volume differences in ml.
| | All | < 5 ml | 5–15 ml | > 15 ml |
|---|---|---|---|---|
| Ham-Syd | 0.64 (5.45) | 0.27 (2.57) | 0.86 (3.82) | 3.75 (16.94) |
| Ham-Zur | 0.75 (3.13) | 0.47 (2.12) | 1.02 (3.70) | 1.65 (3.29) |
| Zur-Syd | 0.40 (4.49) | 0.28 (1.25) | 0.39 (3.13) | 1.97 (15.61) |
| Raters | 0.66 (3.63) | 0.39 (1.85) | 0.76 (2.71) | 2.63 (11.29) |
| Raters-LGA SPM 8 | 0.68 (7.13) | 0.38 (3.14) | 0.93 (3.75) | 2.94 (9.53) |
| Raters-LGA SPM 12 | 0.93 (7.61) | 0.47 (3.67) | 1.55 (5.16) | 5.56 (16.15) |
| Raters-LPA | 0.85 (8.13) | 0.49 (2.27) | 1.40 (5.05) | 3.77 (16.58) |
Medians and 95th percentiles (in brackets) of absolute volume differences in ml. Values marked with an asterisk differ significantly (p = 0.05) from the differences between the manual raters ("Raters"). In columns 3–5 the same analysis was performed but restricted to groups with different levels of total lesion volume.
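As an illustration of how such a summary can be derived, the sketch below computes the median and 95th percentile of absolute volume differences between two sets of per-scan lesion volumes. The percentile uses linear interpolation between order statistics, which is an assumption: the source does not state which percentile definition was used, and the function names are hypothetical.

```python
import statistics


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def summarize_abs_differences(volumes_a, volumes_b):
    """Median and 95th percentile of |a - b| over paired lesion volumes in ml."""
    if len(volumes_a) != len(volumes_b):
        raise ValueError("volume lists must be paired scan by scan")
    diffs = [abs(a - b) for a, b in zip(volumes_a, volumes_b)]
    return statistics.median(diffs), percentile(diffs, 95)
```

The same summary applies to any paired per-scan quantity, for example lesion counts instead of volumes.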
Fig. 2. Precision of three independent manual raters regarding total lesion volumes [log(ml)] visualized by ICC (absolute agreement) (A) and Bland-Altman plots for each pair (B–D). Total lesion volumes [ml] are shown as logarithmic values.
Fig. 3. Precision of three automated segmentation tools regarding total lesion volumes [log(ml)], visualized by Bland-Altman plots for each pair (A, C, E) and ICC (absolute agreement) (B, D, F). Total lesion volumes are shown as logarithmic values and are compared to the averaged values of three manual raters.
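The quantities plotted in a Bland-Altman comparison of log-transformed volumes can be sketched as follows: for each scan, the mean and the difference of the two log volumes, plus the bias (mean difference) and the conventional limits of agreement at bias ± 1.96 standard deviations. The natural logarithm and the sample standard deviation are assumptions here; the captions do not specify the log base or the exact limits used, and the function name is illustrative.

```python
import math
import statistics


def bland_altman_log(volumes_a, volumes_b):
    """Bland-Altman statistics for paired, strictly positive volumes on a log scale."""
    if len(volumes_a) != len(volumes_b) or len(volumes_a) < 2:
        raise ValueError("need at least two paired measurements")
    # Assumption: natural logarithm of the volume in ml.
    log_a = [math.log(v) for v in volumes_a]
    log_b = [math.log(v) for v in volumes_b]
    means = [(a + b) / 2.0 for a, b in zip(log_a, log_b)]  # x-axis values
    diffs = [a - b for a, b in zip(log_a, log_b)]          # y-axis values
    bias = statistics.mean(diffs)
    spread = statistics.stdev(diffs)  # sample standard deviation
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower_limit": bias - 1.96 * spread,
        "upper_limit": bias + 1.96 * spread,
    }
```

On a log scale a constant difference corresponds to a constant ratio between the two measurements, which is one common reason for transforming volumes that span several orders of magnitude.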
Sensitivity and false positive rate.
| | Sensitivity | False positive rate |
|---|---|---|
| Manual rater | 0.676 | 0.324 |
| LPA SPM12 | 0.628 | 0.457 |
| LGA SPM8 | 0.566 | 0.335 |
| LGA SPM12 | 0.505 | 0.393 |
Averaged sensitivity and false positive rate of all six possible manual rater pairings (“Manual rater”; each rater was taken as ground truth for each comparison) and of the algorithm-to-rater comparisons (only manual raters were taken as ground truth).
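A minimal sketch of the pairing scheme described above is given below. It assumes a voxel-wise comparison, with sensitivity defined as TP / (TP + FN) and the false positive rate defined as FP / (TP + FP), the fraction of segmented voxels absent from the reference. Both definitions are assumptions, since the excerpt does not state them. Under this definition the two averages over all ordered rater pairings sum to 1, which is consistent with the "Manual rater" row (0.676 + 0.324) but does not confirm it.

```python
from itertools import permutations


def sensitivity_and_fpr(test_mask, reference_mask):
    """Sensitivity and false positive rate of test_mask against reference_mask."""
    pairs = list(zip(test_mask, reference_mask))
    tp = sum(1 for t, r in pairs if t and r)
    fp = sum(1 for t, r in pairs if t and not r)
    fn = sum(1 for t, r in pairs if r and not t)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Assumed definition: share of segmented voxels not present in the reference.
    false_positive_rate = fp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, false_positive_rate


def average_over_rater_pairings(rater_masks):
    """Average both metrics over every ordered pairing (test, reference)."""
    results = [
        sensitivity_and_fpr(test, reference)
        for test, reference in permutations(rater_masks, 2)
    ]
    mean_sensitivity = sum(s for s, _ in results) / len(results)
    mean_fpr = sum(f for _, f in results) / len(results)
    return mean_sensitivity, mean_fpr
```

With three raters, `permutations(rater_masks, 2)` yields the six ordered pairings mentioned in the caption, each rater serving as ground truth twice.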
Median and 95th percentiles of absolute differences in lesion numbers.
| | All (n = 50) |
|---|---|
| Ham-Syd | 8.00 (49.00) |
| Ham-Zur | 5.50 (20.00) |
| Zur-Syd | 6.00 (52.00) |
| Raters | 8.00 (40.00) |
| Raters-LGA SPM 8 | 8.67 (54.00) |
| Raters-LGA SPM 12 | 10.50* (63.00) |
| Raters-LPA | 16.83* (45.67) |
Median and 95th percentiles (in brackets) of absolute differences in lesion numbers. The numbers marked with an asterisk indicate the values which are significantly (p = 0.05) different from the differences between the manual raters ("Raters").