Tom Finck1, Hongwei Li2, Sarah Schlaeger1, Lioba Grundl1, Nico Sollmann1,3, Benjamin Bender4, Eva Bürkle4, Claus Zimmer1, Jan Kirschke1, Björn Menze2, Mark Mühlau5,6, Benedikt Wiestler1,2.
Abstract
Generative adversarial networks (GANs) can synthesize high-contrast MRI from lower-contrast input. Targeted translation of parenchymal lesions in multiple sclerosis (MS), as well as visualization of model confidence, further augment their utility, provided that the GAN generalizes reliably across different scanners. Here, we investigate the generalizability of a refined GAN for synthesizing high-contrast double inversion recovery (DIR) images and propose the use of uncertainty maps to further enhance its clinical utility and trustworthiness. A GAN was trained to synthesize DIR from input fluid-attenuated inversion recovery (FLAIR) and T1w of 50 MS patients (training data). In another 50 patients (test data), two blinded readers (R1 and R2) independently quantified lesions in synthetic DIR (synthDIR), acquired DIR (trueDIR) and FLAIR. Of the 50 test patients, 20 were acquired on the same scanner as training data (internal data), while 30 were scanned on different scanners with heterogeneous field strengths and protocols (external data). Lesion-to-background ratios (LBR) for MS lesions vs. normal appearing white matter, as well as image quality parameters, were calculated. Uncertainty maps were generated to visualize model confidence. Significantly more MS-specific lesions were found in synthDIR compared to FLAIR (R1: 26.7 ± 2.6 vs. 22.5 ± 2.2, p < 0.0001; R2: 22.8 ± 2.2 vs. 19.9 ± 2.0, p = 0.0005). While trueDIR remained superior to synthDIR in R1 [28.6 ± 2.9 vs. 26.7 ± 2.6 (p = 0.0021)], both sequences showed comparable lesion conspicuity in R2 [23.3 ± 2.4 vs. 22.8 ± 2.2 (p = 0.98)]. Importantly, improvements in lesion counts were similar in internal and external data. Measurements of LBR confirmed that lesion-focused GAN training significantly improved lesion conspicuity. The use of uncertainty maps furthermore helped discriminate between MS lesions and artifacts.
In conclusion, this multicentric study confirms the external validity of a lesion-focused deep-learning tool aimed at MS imaging. When implemented, uncertainty maps promise to increase the trustworthiness of synthetic MRI.
Keywords: artificial intelligence (AI); deep learning – artificial neural network (DL-ANN); double inversion recovery (DIR); magnetic resonance imaging; multiple sclerosis; neuroradiology; synthetic MRI
Year: 2022 PMID: 35557607 PMCID: PMC9087732 DOI: 10.3389/fnins.2022.889808
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 5.152
Data from center 1 was acquired on the same hardware as training data and thus considered to be of known structure (= internal data). In analogy, data from centers 2 and 3 were acquired on different hardware and considered to be of unknown structure (= external data).
| Data class (number of image sets) | Classes for study evaluation |
| --- | --- |
| Training data ( | |
| Test data from (1) ( | Internal data (Known data structure) |
| Test data from (2) ( | External data (Unknown data structure) |
| Test data from (3) ( | External data (Unknown data structure) |
FIGURE 1. Architecture, training process, and inference of the image synthesis task. The image generator G uses the combination of FLAIR and T1w as input to generate synthDIR. The additional supervision from the lesion maps in the training stage drives an enhanced translation of MS-specific lesions (lesion attention). Feedback on the similarity between synthDIR and trueDIR is given by the discriminator D and a structural similarity loss function; it is used to update the network weights until the loss for discerning the two images is minimal. During the inference stage, the trained generator G can generate the synthDIR and an uncertainty map showing the confidence of the output for each voxel. Uncertainty maps are calculated from the voxel-wise variances in signal intensities, as explained in the section “Materials and Methods”.
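To make the variance-based uncertainty map more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that several synthDIR outputs are available for the same input (for example from repeated stochastic forward passes of the generator; how these are obtained is an assumption, as the caption only states that voxel-wise variances are used). The function name `uncertainty_map` and the toy data are hypothetical.

```python
import numpy as np


def uncertainty_map(samples):
    """Return (mean image, voxel-wise variance) for a stack of outputs.

    `samples` has shape (n_samples, *volume_shape). Each entry is assumed
    to be one synthesized image of the same input (hypothetical setup).
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.ndim < 2 or samples.shape[0] < 2:
        raise ValueError("need at least two samples stacked along axis 0")
    # The mean over samples serves as the output image; the variance over
    # samples serves as the per-voxel uncertainty.
    return samples.mean(axis=0), samples.var(axis=0)


# Toy example: four identical 4x4 "slices" with one stable bright voxel.
base = np.zeros((4, 4))
base[1, 1] = 1.0
samples = np.tile(base, (4, 1, 1))
# One voxel flips between passes, mimicking an unstable (artifact-like) signal.
samples[:, 2, 2] = [0.0, 1.0, 0.0, 1.0]

mean_img, var_map = uncertainty_map(samples)
most_uncertain = tuple(int(i) for i in np.unravel_index(np.argmax(var_map), var_map.shape))
```

In this toy case the stable voxel has zero variance, while the flipping voxel carries the highest variance, which is the pattern one would expect an artifact to show in such a map.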
FIGURE 2. Exemplary images of FLAIR, trueDIR, and synthDIR for all centers and scanners.
Lesion counts for all locations and both readers.
| Comparison | All specific | p | PV lesions | p | JC lesions | p | IT lesions | p | SC lesions | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Reader 1 (R1) | | | | | | | | | | |
| FLAIR vs. synthDIR | 22.5 ± 2.2 vs. 26.7 ± 2.6 | <0.0001 | 12.0 ± 1.2 vs. 13.9 ± 1.4 | <0.0001 | 8.7 ± 1.2 vs. 10.8 ± 1.5 | <0.0001 | 1.9 ± 0.4 vs. 2.2 ± 0.4 | 0.043 | 10.6 ± 1.3 vs. 10.4 ± 1.2 | 0.82 |
| FLAIR vs. trueDIR | 22.5 ± 2.2 vs. 28.6 ± 2.9 | <0.0001 | 12.0 ± 1.2 vs. 13.9 ± 1.4 | <0.0001 | 8.7 ± 1.2 vs. 12.3 ± 1.7 | <0.0001 | 1.9 ± 0.4 vs. 2.4 ± 0.4 | 0.0002 | 10.6 ± 1.3 vs. 10.9 ± 1.4 | 0.36 |
| SynthDIR vs. trueDIR | 26.7 ± 2.6 vs. 28.6 ± 2.9 | 0.0021 | 13.9 ± 1.4 vs. 13.9 ± 1.4 | 0.91 | 10.8 ± 1.5 vs. 12.3 ± 1.7 | <0.0001 | 2.2 ± 0.4 vs. 2.4 ± 0.4 | 0.33 | 10.4 ± 1.2 vs. 10.9 ± 1.4 | 0.66 |
| Reader 2 (R2) | | | | | | | | | | |
| FLAIR vs. synthDIR | 19.9 ± 2.0 vs. 22.8 ± 2.2 | 0.0005 | 10.5 ± 1.0 vs. 12.4 ± 1.1 | 0.0004 | 7.8 ± 1.2 vs. 8.5 ± 1.3 | 0.18 | 1.5 ± 0.3 vs. 1.9 ± 0.3 | 0.024 | 13.5 ± 1.9 vs. 10.5 ± 1.5 | <0.0001 |
| FLAIR vs. trueDIR | 19.9 ± 2.0 vs. 23.3 ± 2.4 | <0.0001 | 10.5 ± 1.0 vs. 12.2 ± 1.2 | 0.0014 | 7.8 ± 1.2 vs. 9.7 ± 1.5 | 0.0028 | 1.5 ± 0.3 vs. 1.5 ± 0.3 | 0.99 | 13.5 ± 1.9 vs. 10.5 ± 1.6 | <0.0001 |
| SynthDIR vs. trueDIR | 22.8 ± 2.2 vs. 23.3 ± 2.4 | 0.98 | 12.4 ± 1.1 vs. 12.2 ± 1.2 | 0.26 | 8.5 ± 1.3 vs. 9.7 ± 1.5 | 0.068 | 1.9 ± 0.3 vs. 1.5 ± 0.3 | 0.03 | 10.5 ± 1.5 vs. 10.5 ± 1.6 | 0.70 |
PV, periventricular; JC, juxtacortical; IT, infratentorial; SC, subcortical; FLAIR, fluid-attenuated inversion recovery; trueDIR, real double inversion recovery; synthDIR, synthetic double inversion recovery.
Counts of MS-specific lesions for FLAIR, trueDIR, and synthDIR as a function of data source.
| Comparison | All | p | Internal data | p | External data | p |
| --- | --- | --- | --- | --- | --- | --- |
| Reader 1 (R1) | | | | | | |
| FLAIR vs. synthDIR | 22.5 ± 2.2 vs. 26.7 ± 2.6 | <0.0001 | 22.2 ± 3.6 vs. 26.6 ± 4.3 | 0.0029 | 22.6 ± 2.8 vs. 27.1 ± 3.4 | <0.0001 |
| FLAIR vs. trueDIR | 22.5 ± 2.2 vs. 28.6 ± 2.9 | <0.0001 | 22.2 ± 3.6 vs. 27.9 ± 4.6 | 0.0001 | 22.6 ± 2.8 vs. 28.9 ± 3.7 | <0.0001 |
| SynthDIR vs. trueDIR | 26.7 ± 2.6 vs. 28.6 ± 2.9 | 0.0021 | 26.6 ± 4.3 vs. 27.9 ± 4.6 | 0.086 | 27.1 ± 3.4 vs. 28.9 ± 3.7 | 0.011 |
| Reader 2 (R2) | | | | | | |
| FLAIR vs. synthDIR | 19.9 ± 2.0 vs. 22.8 ± 2.2 | 0.0005 | 16.6 ± 2.6 vs. 18.1 ± 2.6 | 0.27 | 21.5 ± 2.6 vs. 25.1 ± 2.9 | 0.0007 |
| FLAIR vs. trueDIR | 19.9 ± 2.0 vs. 23.3 ± 2.4 | <0.0001 | 16.6 ± 2.6 vs. 18.6 ± 2.7 | 0.027 | 21.5 ± 2.6 vs. 25.6 ± 3.3 | 0.0001 |
| SynthDIR vs. trueDIR | 22.8 ± 2.2 vs. 23.3 ± 2.4 | 0.98 | 18.1 ± 2.6 vs. 18.6 ± 2.7 | 0.87 | 25.1 ± 2.9 vs. 25.6 ± 3.3 | 0.90 |
FLAIR, fluid-attenuated inversion recovery; trueDIR, real double inversion recovery; synthDIR, synthetic double inversion recovery.
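As a small worked illustration of how the internal and external results can be compared, the sketch below computes the relative increase in mean MS-specific lesion counts from FLAIR to synthDIR, using the mean values from the table above. The relative gain is a derived quantity for illustration only and is not a statistic reported in the source; the helper name `relative_gain` is hypothetical.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Percentage increase of `improved` over `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (improved - baseline) / baseline


# Mean MS-specific lesion counts (FLAIR, synthDIR) copied from the table above.
mean_counts = {
    ("R1", "internal"): (22.2, 26.6),
    ("R1", "external"): (22.6, 27.1),
    ("R2", "internal"): (16.6, 18.1),
    ("R2", "external"): (21.5, 25.1),
}

gains = {
    key: relative_gain(flair, synth)
    for key, (flair, synth) in mean_counts.items()
}

for (reader, source), gain in gains.items():
    print(f"{reader} {source}: {gain:.1f}% more lesions in synthDIR than in FLAIR")
```

Because this uses group means only, it ignores the paired, per-patient structure of the data and cannot replace the significance tests in the table.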
FIGURE 3. Uncertainty maps provide relevant information regarding the validity of voxel-to-voxel translation; increases in uncertainty are scaled from blue to green. Circled in red (patients 1–4) are hyperintensities in synthDIR without a correlate in trueDIR. They are easily recognized as areas of high variance in the corresponding uncertainty maps, which allows them to be identified as artifacts of the synthesis task. True-positive lesions, on the other hand, are readily identified as regions with either no (patient 1 – green circle in synthDIR) or low (patient 4 – green circle in synthDIR) values of uncertainty. Hence, uncertainty maps facilitate the interpretation of synthDIR and decision-making on the veracity of lesions.
Image-wise (SSIM) and voxel-wise (PSNR) comparative metrics for synthDIR and trueDIR.
| | SSIM (trueDIR – synthDIR) | PSNR (dB) (trueDIR – synthDIR) | LBR FLAIR | LBR trueDIR | LBR synthDIR | LBR synthDIR w/o LFL |
| --- | --- | --- | --- | --- | --- | --- |
| All | 0.954 ± 0.016 | 27.2 ± 2.2 | 1.52 ± 0.49 | 2.86 ± 0.65 | 2.80 ± 0.67 | 2.69 ± 0.66 |
| Internal data | 0.967 ± 0.012 | 29.2 ± 1.64 | 1.45 ± 0.06 | 2.80 ± 0.33 | 2.86 ± 0.34 | 2.68 ± 0.30 |
| External data (2) | 0.941 ± 0.010 | 25.8 ± 1.12 | 1.65 ± 0.12 | 3.01 ± 0.41 | 3.35 ± 0.50 | 3.31 ± 0.45 |
| External data (3) | 0.950 ± 0.012 | 25.6 ± 1.08 | 1.46 ± 0.86 | 2.78 ± 1.00 | 2.19 ± 0.56 | 2.07 ± 0.50 |
LBR are given for FLAIR, trueDIR, synthDIR, as well as for synthDIR generated by a GAN iteration without the lesion-focused loss function (synthDIR w/o LFL). Results are given for internal data, as well as external data (2) and (3). SSIM, structural similarity index measure; PSNR, peak signal-to-noise ratio; LBR, lesion-to-background ratio; LFL, lesion-focused loss; trueDIR, real double inversion recovery; synthDIR, synthetic double inversion recovery.
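For readers unfamiliar with PSNR, the following minimal sketch shows the standard definition, PSNR = 10 · log10(MAX² / MSE), applied to a reference image (here standing in for trueDIR) and a test image (standing in for synthDIR). The intensity normalization and the choice of `data_range` used in the study are not stated here, so both are assumptions; the function name `psnr` and the toy arrays are hypothetical.

```python
import numpy as np


def psnr(reference, test, data_range=None):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    `data_range` is the assumed maximum possible intensity span; if omitted,
    it is taken from the reference image (an assumption of this sketch).
    """
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    if reference.shape != test.shape:
        raise ValueError("images must have the same shape")
    if data_range is None:
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))


# Toy example: intensities in [0, 1] with a uniform offset of 0.1.
true_img = np.zeros((2, 2))
true_img[0, 0] = 1.0
synth_img = true_img + 0.1

value_db = psnr(true_img, synth_img, data_range=1.0)
```

With a mean squared error of 0.01 and a data range of 1, the toy example evaluates to about 20 dB. SSIM is a window-based measure and is usually computed with an established library routine rather than by hand.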
FIGURE 4. Lesion-to-background ratios for FLAIR, trueDIR, and synthDIR. Additionally, LBR was calculated for synthDIR generated by a GAN-iteration without the lesion-focused loss (synthDIR no LFL). Of note, LBR was significantly higher in synthDIR compared to synthDIR without LFL, confirming the hypothesis that domain knowledge can be improved through LFL. While there was no significant difference in LBR between synthDIR and trueDIR (p = 0.41), LBR of synthDIR without LFL remained inferior to the LBR of trueDIR (p = 0.032). LBR, lesion-to-background ratio; LFL, lesion-focused loss.
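A lesion-to-background ratio can be sketched as follows. The abstract describes LBR as comparing MS lesions against normal appearing white matter (NAWM); the exact formula is not given here, so this sketch assumes the ratio of mean signal intensity within a lesion mask to mean signal intensity within a NAWM mask. The function name, the masks, and the toy intensities are hypothetical.

```python
import numpy as np


def lesion_to_background_ratio(image, lesion_mask, nawm_mask):
    """Mean lesion intensity divided by mean NAWM intensity (assumed definition)."""
    image = np.asarray(image, dtype=np.float64)
    lesion_mask = np.asarray(lesion_mask, dtype=bool)
    nawm_mask = np.asarray(nawm_mask, dtype=bool)
    if image.shape != lesion_mask.shape or image.shape != nawm_mask.shape:
        raise ValueError("image and masks must have the same shape")
    if not lesion_mask.any() or not nawm_mask.any():
        raise ValueError("both masks must contain at least one voxel")
    background = image[nawm_mask].mean()
    if background == 0:
        raise ZeroDivisionError("mean NAWM intensity is zero")
    return float(image[lesion_mask].mean() / background)


# Toy slice: NAWM at intensity 100 with a 2x2 lesion at intensity 280.
image = np.full((4, 4), 100.0)
lesion_mask = np.zeros((4, 4), dtype=bool)
lesion_mask[1:3, 1:3] = True
image[lesion_mask] = 280.0
nawm_mask = ~lesion_mask

lbr = lesion_to_background_ratio(image, lesion_mask, nawm_mask)
```

Under this definition, a higher value means the lesion stands out more clearly from the surrounding white matter, which is how the comparison between synthDIR with and without LFL can be read.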