Francesco La Rosa, Erin S Beck, Josefina Maranzano, Ramona-Alexandra Todea, Peter van Gelderen, Jacco A de Zwart, Nicholas J Luciano, Jeff H Duyn, Jean-Philippe Thiran, Cristina Granziera, Daniel S Reich, Pascal Sati, Meritxell Bach Cuadra.
Abstract
Manually segmenting multiple sclerosis (MS) cortical lesions (CLs) is extremely time-consuming, and past studies have shown only moderate inter-rater reliability. To accelerate this task, we developed a deep-learning-based framework (CLAIMS: Cortical Lesion AI-Based Assessment in Multiple Sclerosis) for the automated detection and classification of MS CLs with 7 T MRI. Two 7 T datasets, acquired at different sites, were considered. The first consisted of 60 scans that include 0.5 mm isotropic MP2RAGE acquired four times (MP2RAGE×4), 0.7 mm MP2RAGE, 0.5 mm T2*-weighted GRE, and 0.5 mm T2*-weighted EPI. The second dataset consisted of 20 scans including only 0.75 × 0.75 × 0.9 mm³ MP2RAGE. CLAIMS was first evaluated using sixfold cross-validation with single and multi-contrast 0.5 mm MRI input. Second, the performance of the model was tested on 0.7 mm MP2RAGE images after training with either 0.5 mm MP2RAGE×4, 0.7 mm MP2RAGE, or alternating the two. Third, its generalizability was evaluated on the second, external dataset and compared with a state-of-the-art technique based on partial volume estimation and topological constraints (MSLAST). CLAIMS trained only with MP2RAGE×4 achieved results comparable to those of the multi-contrast model, reaching a CL true positive rate of 74% with a false positive rate of 30%. Detection rate was excellent for leukocortical and subpial lesions (83% and 70%, respectively), whereas it reached 53% for intracortical lesions. The correlation between disability measures and CL count was similar for manual and CLAIMS lesion counts. When a domain-scanner adaptation approach was applied and CLAIMS was tested on the second dataset, its performance was superior to that of MSLAST for a minimum lesion volume of 6 μL (lesion-wise detection rate of 71% versus 48%). The proposed framework outperforms previous state-of-the-art methods for automated CL detection across scanners and protocols.
In the future, CLAIMS may be useful to support clinical decisions at 7 T MRI, especially in the diagnosis and differential diagnosis of MS patients.
Keywords: 7 T; cortical lesions; deep learning; detection; multiple sclerosis; ultra-high-field MRI
Year: 2022 PMID: 35297114 PMCID: PMC9539569 DOI: 10.1002/nbm.4730
Source DB: PubMed Journal: NMR Biomed ISSN: 0952-3480 Impact factor: 4.478
FIGURE 1. Examples of the three CL types identified in Dataset A shown in the three MRI images considered
FIGURE 2. Scheme of the proposed multi-contrast CNN architecture inspired by the 3D U-Net. The CNN takes as input 3D patches of size 96 × 96 × 96 of the different MRI contrasts (red frame) and provides as output a CL detection and classification into two classes. Input channel dropout is applied when the two T2* contrasts are considered (green frame)
FIGURE 3. Examples of CLs identified by the experts in the MP2RAGE×4 0.5 mm and T2* GRE, and retrospectively seen in the MP2RAGE 0.7 mm. A subpial lesion is marked by a blue mask, and an intracortical lesion is marked by a yellow mask
FIGURE 4. Visual results showing CL detection with the single input 0.5 mm MP2RAGE×4 model. Left: an intracortical lesion manually segmented (green) that was correctly detected by the automated method (red). Right: a similar example for a leukocortical lesion. GT, ground truth
Median metrics and Cohen's kappa coefficient for the different input contrasts obtained with a sixfold cross‐validation over the 60 cases from Institution A. LTPR, LFPR, and classification accuracy are computed on a lesion level, whereas LTPR, LFPR, DSC, and VD are considered on a patient‐wise level. Bold, the best result for each metric
| Input | LTPR (lesion-wise) | LFPR (lesion-wise) | Classification accuracy | LTPR (patient-wise) | LFPR (patient-wise) | DSC | VD | Cohen's kappa |
|---|---|---|---|---|---|---|---|---|
| 0.5 mm MP2RAGE×4, | | | | | 0.33 | 0.47 | 0.87 | 0.48 |
| 0.5 mm MP2RAGE×4, | 0.73 | 0.36 | 0.85 | 0.72 | 0.38 | 0.36 | 3.01 | 0.46 |
| 0.5 mm MP2RAGE×4, | 0.71 | 0.38 | 0.82 | 0.70 | 0.41 | 0.35 | 3.22 | 0.45 |
| 0.5 mm MP2RAGE×4 | | | | | | | | |
| | 0.36 | 0.45 | 0.82 | 0.31 | 0.50 | 0.18 | 4.51 | 0.22 |
| | 0.32 | 0.55 | 0.81 | 0.30 | 0.57 | 0.16 | 4.57 | 0.18 |
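The lesion-wise rates and the Dice coefficient reported in this table can be illustrated with a minimal sketch. The definitions below are assumptions, since this record does not spell them out: LTPR is taken as the fraction of reference lesions that were detected, LFPR as the fraction of predicted lesions without a reference match, and DSC as the usual voxel overlap measure. The function names and the toy inputs are hypothetical.

```python
def lesion_detection_rates(n_true_positive, n_false_negative, n_false_positive):
    """Return (LTPR, LFPR) from lesion counts.

    Assumed definitions (not stated in the record above):
      LTPR = TP / (TP + FN)  fraction of reference lesions that were detected
      LFPR = FP / (TP + FP)  fraction of predicted lesions with no reference match
    """
    n_reference = n_true_positive + n_false_negative
    n_predicted = n_true_positive + n_false_positive
    ltpr = n_true_positive / n_reference if n_reference else 0.0
    lfpr = n_false_positive / n_predicted if n_predicted else 0.0
    return ltpr, lfpr


def dice_coefficient(reference_voxels, predicted_voxels):
    """Dice similarity coefficient between two collections of voxel coordinates."""
    reference_voxels = set(reference_voxels)
    predicted_voxels = set(predicted_voxels)
    total = len(reference_voxels) + len(predicted_voxels)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    overlap = len(reference_voxels & predicted_voxels)
    return 2.0 * overlap / total


# Toy numbers, not taken from the study.
ltpr, lfpr = lesion_detection_rates(3, 1, 1)
dsc = dice_coefficient({(0, 0, 0), (0, 0, 1)}, {(0, 0, 1), (0, 0, 2)})
```

How a predicted lesion is matched to a reference lesion (for example, any voxel overlap versus a minimum overlap) is a separate design choice that this sketch leaves to the caller.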
FIGURE 5. Violin plots of the LTPR and LFPR for different input models evaluated with a sixfold cross-validation over the 60 subjects of Institution A. Each dot represents a subject
FIGURE 6. Correlation between the manual CL count and the one provided automatically by CLAIMS (best model trained with MP2RAGE×4). The solid lines show the linear regression model between the two measures along with its 95% confidence interval. The dashed lines indicate the expected lesion count estimates. The concordance correlation coefficient (CCC) between manual and automatic lesion counts is reported in the legend and shows substantial agreement
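For readers unfamiliar with the CCC, the following is a minimal sketch of Lin's concordance correlation coefficient, assuming that is the estimator meant here and using population (1/n) moments. The function name is hypothetical and the inputs are toy values, not study data.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient with population (1/n) moments."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2.0 * cov_xy / denominator


# A constant offset of one lesion lowers the CCC even though the
# two counts are perfectly correlated (toy values).
ccc_offset = concordance_correlation([1, 2, 3], [2, 3, 4])
```

Unlike a plain correlation coefficient, the CCC penalizes systematic over- or under-counting through the squared difference of the means in the denominator.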
Spearman correlation coefficient ρ (and its associated p-value) computed between four disability measures and the manual and automated CL counts. Both counts show a moderate correlation for all four measures
| | CLAIMS CL number: ρ | CLAIMS CL number: p | Ground truth CL number: ρ | Ground truth CL number: p |
|---|---|---|---|---|
| Ground truth CL number | 0.86 | <0.0001 | — | — |
| EDSS | 0.45 | 0.0003 | 0.43 | 0.0008 |
| 25TW | 0.42 | 0.0008 | 0.43 | 0.0008 |
| 9HPT | 0.50 | <0.0001 | 0.40 | 0.0014 |
| SDMT | −0.58 | <0.0001 | −0.53 | <0.0001 |
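The ρ values in this table are rank correlations. As a minimal sketch, Spearman's ρ can be computed as the Pearson correlation of average ranks (ties share the mean of their positions). The helper names are hypothetical, the inputs are toy values, and the p-value is not computed here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Toy values with one tie in the second sequence.
rho = spearman_rho([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])
```

A negative ρ, as reported for SDMT, simply means that higher lesion counts go with lower scores on that measure.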
FIGURE 7. Bland–Altman plot (reference − prediction) of the manually and automatically segmented CL volumes. The solid green line shows the mean difference, whereas the dotted red lines show the ±1.96 SD limits around the mean difference
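The quantities drawn in such a plot can be sketched as follows, assuming the usual construction: differences are taken as reference minus prediction, and the limits are the mean difference ± 1.96 times the sample standard deviation of the differences. The function name and the toy volumes are hypothetical.

```python
from statistics import fmean, stdev


def bland_altman_limits(reference, prediction):
    """Return (mean difference, lower limit, upper limit) for paired measurements.

    Differences are reference - prediction; limits are mean +/- 1.96 * sample SD.
    """
    if len(reference) != len(prediction) or len(reference) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    differences = [r - p for r, p in zip(reference, prediction)]
    mean_difference = fmean(differences)
    sd_difference = stdev(differences)  # sample SD, n - 1 in the denominator
    half_width = 1.96 * sd_difference
    return mean_difference, mean_difference - half_width, mean_difference + half_width


# Toy volumes (arbitrary units), not study data.
mean_diff, lower, upper = bland_altman_limits([10, 12, 14], [9, 12, 12])
```

A positive mean difference under this sign convention indicates that the automated volumes are, on average, smaller than the manual ones.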
Comparison of lesion detection rates on a patient‐wise level for the different models
| | MP2RAGE×4 | | | | | | 3 contrasts | |
|---|---|---|---|---|---|---|---|---|
| CL type | Detected, n (%) | Per patient, median (range, IQR) | Detected, n (%) | Per patient, median (range, IQR) | Detected, n (%) | Per patient, median (range, IQR) | Detected, n (%) | Per patient, median (range, IQR) |
| All | 1649 (74%) | 14 (0–163, 36) | 748 (36%) | 6 (0–141, 10) | 692 (32%) | 5 (0–128, 11) | 1648 (74%) | 14 (0–151, 31) |
| Leukocortical | 672 (83%) | 5 (0–54, 13) | 268 (63%) | 4 (0–39, 12) | 254 (59%) | 4 (0–36, 11) | 656 (82%) | 6 (0–48, 13) |
| Intracortical | 83 (53%) | 1 (0–17, 2) | 25 (15%) | 0 (0–5, 1) | 19 (12%) | 0 (0–2, 0) | 76 (49%) | 1 (0–17, 2) |
| Subpial | 894 (69%) | 6 (0–96, 8) | 455 (48%) | 4 (0–92, 6) | 421 (41%) | 5 (0–81, 4) | 916 (70%) | 7 (0–98, 9) |
NS: non‐significant.
p < 0.05.
p < 0.01.
p < 0.001.
FIGURE 8. Lesion-wise CL detection rate for the three CL types considered over the 60 subjects of Institution A
Metrics obtained for models trained with different inputs (listed in the first column) and tested on the MP2RAGE 0.7 mm images. Bold, the best result for each metric
| Training images | LTPR (lesion-wise) | LFPR (lesion-wise) | Classification accuracy (lesion-wise) | Dice (patient-wise) | VD (patient-wise) |
|---|---|---|---|---|---|
| MP2RAGE 0.7 mm | 0.52 | 0.39 | | 0.25 | 1.10 |
| MP2RAGE×4 0.5 mm | 0.35 | 0.41 | 0.81 | 0.16 | 1.29 |
| MP2RAGE×4 0.5 mm and MP2RAGE 0.7 mm | | | | | |
FIGURE 9. False positive and detection rate in a pure testing scenario on the Institution B dataset for CLAIMS, CLAIMS domain-adapted (CLAIMS_DA), and MSLAST. Different minimum lesion volumes are considered. N refers to the number of CLs in the ground truth for each minimum lesion volume
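A minimum lesion volume translates into a voxel count that depends on the acquisition resolution. The sketch below shows the arithmetic for the 6 μL threshold and the two voxel sizes quoted in the abstract, using 1 mm³ = 1 μL. The function name is hypothetical, and treating the threshold as inclusive (volume at least the minimum) is an assumption.

```python
import math


def min_lesion_voxels(min_volume_ul, voxel_size_mm):
    """Smallest voxel count whose total volume reaches min_volume_ul.

    voxel_size_mm is (dx, dy, dz) in millimetres; 1 mm^3 equals 1 microlitre.
    The inclusive threshold is an assumption of this sketch.
    """
    dx, dy, dz = voxel_size_mm
    voxel_volume_ul = dx * dy * dz
    if voxel_volume_ul <= 0:
        raise ValueError("voxel dimensions must be positive")
    # Round first so floating-point noise cannot push an exact ratio upward.
    return math.ceil(round(min_volume_ul / voxel_volume_ul, 9))


# Voxel sizes quoted in the abstract.
voxels_second_dataset = min_lesion_voxels(6, (0.75, 0.75, 0.9))  # 12 voxels
voxels_half_mm = min_lesion_voxels(6, (0.5, 0.5, 0.5))           # 48 voxels
```

The same physical threshold therefore corresponds to about four times fewer voxels at 0.75 × 0.75 × 0.9 mm³ than at 0.5 mm isotropic resolution.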