Benjamin Thyreau1, Yasuko Tatewaki2,3, Liying Chen1, Yuji Takano1,4, Naoki Hirabayashi5, Yoshihiko Furuta5, Jun Hata5, Shigeyuki Nakaji6, Tetsuya Maeda7, Moeko Noguchi-Shinohara8, Masaru Mimura9, Kenji Nakashima10, Takaaki Mori11, Minoru Takebayashi12, Toshiharu Ninomiya5, Yasuyuki Taki1,2,3.
Abstract
White matter lesions (WML) commonly occur in older brains and are quantifiable on MRI, where they are often used as a biomarker in aging research. Although algorithms are regularly proposed that identify these lesions from T2 fluid-attenuated inversion recovery (FLAIR) sequences, none so far can estimate lesions directly from T1-weighted images with acceptable accuracy. Since 3D T1 is a versatile, higher-resolution sequence, it could be beneficial to obtain the distribution of WML directly from it. However, a serious difficulty, for algorithms and humans alike, lies in the ambiguities of brain signal intensity in T1 images. This manuscript shows that a cross-domain ConvNet (convolutional neural network) approach can help solve this problem. Still, this is non-trivial, as it would appear to require a large and varied dataset (for robustness) labelled at the same high resolution (for spatial accuracy). Instead, our model was trained on two-dimensional FLAIR images with a loss function designed to handle the super-resolution requirement. Crucially, we leveraged a very large training set for this task: the recently assembled, multi-site Japan Prospective Studies Collaboration for Aging and Dementia (JPSC-AD) cohort. We describe the two-step procedure we followed to handle such a large number of imperfectly labeled samples. A large-scale accuracy evaluation against FreeSurfer 7, together with a further visual expert rating, showed that WML segmentation from our ConvNet was consistently better. Finally, we made a directly usable software program based on the trained ConvNet model available at https://github.com/bthyreau/deep-T1-WMH.
Year: 2022 PMID: 35524684 PMCID: PMC9374893 DOI: 10.1002/hbm.25899
Source DB: PubMed Journal: Hum Brain Mapp ISSN: 1065-9471 Impact factor: 5.399
FIGURE 1. The two‐step learning process. The 2D FLAIR model (top) is trained to output a consistent FLAIR mask, initially based on the LST algorithm but with some further manual corrections. The T1 model (bottom) aims to learn a T1 WMH segmentation from those 2D‐FLAIR masks, learning across modality and resolution. It also learns a cortical ribbon as a secondary joint task. Both steps, including the relevant preprocessing, use the same, single training dataset.
Topology of the ConvNets used throughout this study.
T1‐WMH net
Input: a T1 image in the MNI workspace.
Conv3(1, 24), InstNorm, ReLU, Conv3(24, 64), ReLU (block0)
MaxPool, Conv3(64, 64), InstNorm, ReLU, Conv3(64, 64), ReLU (block1)
MaxPool, Conv3(64, 64), InstNorm, ReLU, Conv3(64, 64), ReLU (block2)
MaxPool, Conv3(64, 64), InstNorm, ReLU, Conv3(64, 64), ReLU (block3)
MaxPool, Conv3(64, 64), InstNorm, ReLU, Conv3(64, 64), ReLU (block4)
MaxPool, Conv3(64, 128), InstNorm, ReLU, Conv3(128, 64), ReLU
Unpool, Conv3(64, 64), InstNorm, ReLU, Conv3(64, 64), ReLU
Unpool, Conv3(64, 64), InstNorm, ReLU, Sum block3, Conv3(64, 64), ReLU
Unpool, Conv3(64, 64), InstNorm, ReLU, Sum block2, Conv3(64, 64), ReLU
Unpool, Conv3(64, 64), InstNorm, ReLU, Sum block1, Conv3(64, 64), ReLU
Conv3(64, 64), InstNorm, ReLU, Sum block0, Conv3(64, 64), ReLU
Conv3(64, 24), ReLU, Conv1(24, 2), Sigmoid
Output: WMH and cortex masks.

T1‐ROI net
Input: a T1 image in the MNI workspace.
Conv3(1, 12), ReLU (block0)
MaxPool, Conv3(12, 16), ReLU, Conv1(16, 16), ReLU (block1)
MaxPool, Conv3(16, 16), ReLU, Conv1(16, 16), ReLU (block2)
MaxPool, Conv3(16, 16), ReLU, Conv1(16, 16), ReLU (block3)
MaxPool, Conv3(16, 16), ReLU, Conv1(16, 16), ReLU (block4)
MaxPool, Conv3(16, 16), ReLU, Conv1(16, 16), InstNorm, ReLU
Unpool, Conv3(16, 16), InstNorm, ReLU, Sum block4, Conv3(16, 16), ReLU
Unpool, Conv3(16, 16), InstNorm, ReLU, Sum block3, Conv3(16, 16), ReLU
Unpool, Conv3(16, 16), InstNorm, ReLU, Sum block2, Conv3(16, 16), ReLU
Unpool, Conv3(16, 16), InstNorm, ReLU, Sum block1, Conv3(16, 12), ReLU
Unpool, Conv3(12, 12), ReLU, Sum block0, Conv1(12, 12), ReLU
Conv3(12, 8), ReLU, Conv1(8, 4), Softmax
Output: four label maps.

FLAIR net (WMH or ROI)
Input: a FLAIR image in its native space.
Conv3(1, 12), InstNorm, ReLU, Conv1(12, 12), ReLU (block0)
MaxPool, Conv3(12, 16), ReLU, Conv3(16, 16), ReLU (block1)
MaxPool, Conv3(16, 16), ReLU, Conv3(16, 16), ReLU (block2)
MaxPool, Conv3(16, 16), ReLU, Conv3(16, 16), ReLU (block3)
MaxPool, Conv3(16, 16), ReLU, Conv3(16, 16), ReLU (block4)
MaxPool, Conv3(16, 16), ReLU, Conv1(16, 16), InstNorm, ReLU
Unpool, Conv3(16, 16), InstNorm, ReLU, Sum block4, Conv3(16, 16), ReLU
Unpool, Conv3(16, 16), InstNorm, ReLU, Sum block3, Conv3(16, 16), ReLU
Unpool, Conv3(16, 16), InstNorm, ReLU, Sum block2, Conv3(16, 16), ReLU
Unpool, Conv3(16, 16), InstNorm, ReLU, Sum block1, Conv3(16, 12), ReLU
Unpool, Conv3(12, 12), InstNorm, ReLU, Sum block0, Conv1(12, 12), ReLU
Conv3(12, 8), ReLU, Conv1(8, 3), Sigmoid
Output: WMH or ROIs, depending on the loaded parameters.
Note: Starting from block 2, Unpool operators halve dimensions in axial planes only.
Note: All networks are U‐shaped with skip‐connections. Their main difference is the number of convolution kernels, which were hand‐designed as a trade‐off between training speed, model capacity, and resolution. Conv3 and Conv1 refer to the convolution kernel size (3 × 3 × 3 or 1 × 1 × 1 voxels). MaxPool operators halve the dimensions and return indices, which are reused by the Unpool operators.
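As a concrete illustration, the encoder–decoder pattern described in the table (MaxPool returning indices that the matching Unpool reuses, with skip connections merged by summation) can be sketched in PyTorch. The channel counts and depth below are reduced for brevity and are not the paper's exact networks:

```python
import torch
import torch.nn as nn

class TinyUNet3D(nn.Module):
    """Minimal sketch of the table's U-shaped topology: Conv3 = 3x3x3 conv,
    Conv1 = 1x1x1 conv; MaxPool returns indices reused by Unpool; skip
    connections are merged by summation ("Sum blockN")."""
    def __init__(self):
        super().__init__()
        self.block0 = nn.Sequential(
            nn.Conv3d(1, 12, 3, padding=1), nn.InstanceNorm3d(12), nn.ReLU())
        self.pool = nn.MaxPool3d(2, return_indices=True)
        self.bottleneck = nn.Sequential(
            nn.Conv3d(12, 12, 3, padding=1), nn.ReLU())
        self.unpool = nn.MaxUnpool3d(2)
        self.dec = nn.Sequential(
            nn.Conv3d(12, 12, 3, padding=1), nn.InstanceNorm3d(12), nn.ReLU())
        # Conv1 head with Sigmoid, as in the T1-WMH net's two output masks
        self.head = nn.Sequential(nn.Conv3d(12, 2, 1), nn.Sigmoid())

    def forward(self, x):
        b0 = self.block0(x)
        p, idx = self.pool(b0)         # MaxPool halves dimensions, keeps indices
        b = self.bottleneck(p)
        u = self.unpool(b, idx)        # Unpool reuses the pooling indices
        d = self.dec(u) + b0           # skip-connection by summation
        return self.head(d)

net = TinyUNet3D()
out = net(torch.zeros(1, 1, 8, 8, 8))
print(out.shape)  # torch.Size([1, 2, 8, 8, 8])
```

The summation-based skips (rather than channel concatenation) keep the channel count constant through the decoder, which is consistent with the fixed 64- or 16-channel widths listed in the table.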
FIGURE 2. Illustration of the sampling performed during the loss computation. The training signal for each voxel of the target image, in FLAIR space, is back‐projected into a corresponding voxel of the T1 output space, possibly through an augmentation transform, then back‐propagated further up through the model's convolutional layers. The T1 voxels that are not directly linkable to a FLAIR signal are not affected (in particular, they do not receive an interpolated signal).
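The sampling idea in the figure can be sketched as a spatially sparse loss: predictions are gathered only at the T1 voxels that a FLAIR target voxel back-projects onto, so all other T1 voxels receive no gradient at all. The function name, argument layout, and index precomputation below are illustrative assumptions, not the paper's code:

```python
import torch
import torch.nn.functional as F

def sparse_crossres_loss(pred_t1, target_flair, t1_coords):
    """Hypothetical sketch of a spatially sparse feedback-signal loss.
    pred_t1: (D, H, W) sigmoid predictions on the high-resolution T1 grid.
    target_flair: (N,) binary labels, one per FLAIR target voxel.
    t1_coords: (N, 3) integer T1 voxel indices, one per FLAIR voxel,
    assumed precomputed from the FLAIR-to-T1 (and augmentation) transform."""
    z, y, x = t1_coords.T
    sampled = pred_t1[z, y, x]  # gather predictions at linked voxels only
    return F.binary_cross_entropy(sampled, target_flair)

pred = torch.full((16, 16, 16), 0.5, requires_grad=True)
coords = torch.tensor([[0, 0, 0], [8, 8, 8]])   # two linked voxels
labels = torch.tensor([1.0, 0.0])
loss = sparse_crossres_loss(pred, labels, coords)
loss.backward()
# Gradient is nonzero only at the two linked voxels; no interpolated
# signal reaches the rest of the T1 grid.
print(int((pred.grad != 0).sum()))  # 2
```

This contrasts with a standard voxelwise loss on an upsampled target, where every T1 voxel would receive an interpolated (and therefore artifact-prone) training signal, as illustrated later in Figure 5.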
FIGURE 3. Two separate ConvNets were tasked with generating relevant regions of interest (ROIs), shown as color shades for 2D FLAIR (top, greens) and for T1 images (bottom, blues). The FLAIR model, used only as an intermediate step, was approximate because of the poor resolution and the lack of clearly visible structures. The final model, on T1, was intended to classify lesions (identified and depicted as yellow patches). Still, the hard borders of the ROIs leave the current aggregation method open to improvement.
FIGURE 4. Methods overview. On this coronal slice, a periventricular lesion expands toward the insular cortex. The lesion is visible on the original 2D‐FLAIR image (a), and on the higher‐resolution T1 image (b), although with less contrast. The LST algorithm outcome (c) correctly identified the lesion on the FLAIR image. Our trained ConvNet successfully identified a similar area (d), although at the higher T1 resolution, despite the contrast ambiguity. However, FreeSurfer (e) misclassified part of the lesion as the putamen, while SPM (f) wrongly included it in its gray‐matter mask.
FIGURE 5. Illustration of the effect of two loss functions, used here for a cross‐contrast transformation experiment. In this example, a ConvNet was trained to change an input T1‐weighted image (top) into its corresponding FLAIR contrast, based on 2048 training pairs. The MRI sequences used for the target FLAIR contrast have a lower native vertical resolution. Using a standard voxelwise loss function (middle), the model over‐learned the interpolation artifacts. However, using the spatially sparse feedback‐signal loss (bottom), the model successfully learned the contrast transform while maintaining the resolution.
FIGURE 6. Agreement between the T1‐based segmentation methods and the FLAIR segmentation method (top). The graph shows the distributions of the overlap as a DICE metric calculated in the resampled FLAIR space and ordered by lesion volume. Colors encode the two methods, and the filled areas contain 90% of the subjects in each volume bin. The mean/SD DICE is 0.33/0.18 (FS7) and 0.55/0.19 (ours). Selected MRI cases (bottom), corresponding to the numbers on the plot, illustrate different patterns of overlap.
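The DICE overlap reported in the figure follows the standard definition: twice the intersection of two binary masks divided by the sum of their sizes. A minimal NumPy sketch (the paper computes it in the resampled FLAIR space; the example masks here are toy data):

```python
import numpy as np

def dice(a, b):
    """Standard Dice overlap between two binary masks (1 = lesion voxel).
    Returns 1.0 for two empty masks, by convention."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * inter / denom if denom else 1.0

m1 = np.array([[1, 1, 0], [0, 1, 0]])   # e.g. FLAIR-derived mask
m2 = np.array([[1, 0, 0], [0, 1, 1]])   # e.g. T1-derived mask, resampled
print(round(dice(m1, m2), 3))  # 2*2 / (3+3) = 0.667
```

A Dice of 0.55, as obtained here for small, thin structures such as WMH across modalities and resolutions, is substantially better than the 0.33 of the FreeSurfer 7 baseline.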
FIGURE 7. Results of the human expert evaluation. The bar chart shows the number of times a method's result was considered the best (or, for negative values, the worst) by the two raters (sub‐columns). The darkest areas represent segmentations on which the two raters' judgments agreed. The results may not sum exactly to 200, as ties were allowed, albeit discouraged.