Hongwei Li, Aurore Menegaux, Benita Schmitz-Koep, Antonia Neubauer, Felix J B Bäuerlein, Suprosanna Shit, Christian Sorg, Bjoern Menze, Dennis Hedderich.
Abstract
In the last two decades, neuroscience has produced intriguing evidence for a central role of the claustrum in mammalian forebrain structure and function. However, relatively few in vivo studies of the claustrum exist in humans. A reason for this may be the delicate and sheet-like structure of the claustrum lying between the insular cortex and the putamen, which makes it not amenable to conventional segmentation methods. Recently, Deep Learning (DL) based approaches have been successfully introduced for automated segmentation of complex, subcortical brain structures. In the following, we present a multi-view DL-based approach to segment the claustrum in T1-weighted MRI scans. We trained and evaluated the proposed method in 181 individuals, using bilateral manual claustrum annotations by an expert neuroradiologist as reference standard. Cross-validation experiments yielded median volumetric similarity, robust Hausdorff distance, and Dice score of 93.3%, 1.41 mm, and 71.8%, respectively, representing equal or superior segmentation performance compared to human intra-rater reliability. The leave-one-scanner-out evaluation showed good transferability of the algorithm to images from unseen scanners at slightly inferior performance. Furthermore, we found that DL-based claustrum segmentation benefits from multi-view information and requires a sample size of around 75 MRI scans in the training set. We conclude that the developed algorithm allows for robust automated claustrum segmentation and thus yields considerable potential for facilitating MRI-based research of the human claustrum. The software and models of our method are made publicly available.
Keywords: MRI; claustrum; deep learning; image segmentation; multi-view
Year: 2021 PMID: 34520080 PMCID: PMC8596988 DOI: 10.1002/hbm.25655
Source DB: PubMed Journal: Hum Brain Mapp ISSN: 1065-9471 Impact factor: 5.038
Characteristics of the dataset in this study
| Datasets | Scanner name | Voxel size (mm³) | Number of subjects |
|---|---|---|---|
| Bonn‐1 | Philips Achieva 3 T | 1.00 × 1.00 × 1.00 | 15 |
| Bonn‐2 | Philips Ingenia 3 T | 1.00 × 1.00 × 1.00 | 46 |
| Munich‐1 | Philips Achieva 3 T | 1.00 × 1.00 × 1.00 | 103 |
| Munich‐2 | Philips Ingenia 3 T | 1.00 × 1.00 × 1.00 | 17 |
Note: The dataset consists of 181 subjects from four scanners and two centers.
FIGURE 1 Examples of axial (a, b) and coronal (c, d) MR slices with corresponding manual annotation of the claustrum structure (in b and d) by a neuroradiologist
FIGURE 2 (a) A schematic view of the proposed segmentation method, which uses multi-view fully convolutional networks to segment the 3D claustrum jointly; (b) the 2D convolutional network architecture for each view (i.e., axial and coronal). It takes raw image slices as input and predicts the corresponding segmentation maps. The network consists of several nonlinear computational layers in a shrinking part (left side) and an expansive part (right side) to extract semantic features of the claustrum structure
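The per-view network described above is a fully convolutional encoder-decoder. As a toy illustration of the shape flow through the shrinking and expansive parts only (the helper names `max_pool2` and `upsample2` are ours, not from the paper's released code, and no learned convolutions are included), consider:

```python
import numpy as np

def max_pool2(x):
    """Shrinking step: 2x2 max pooling halves the spatial resolution."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:h * 2, :w * 2].reshape(h, 2, w, 2).max(axis=(1, 3))

def upsample2(x):
    """Expansive step: nearest-neighbour upsampling doubles the resolution."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

# Shape flow through one shrink/expand level, as sketched in Figure 2b.
slice_2d = np.random.rand(256, 256)            # one axial or coronal T1w slice
encoded = max_pool2(slice_2d)                  # (128, 128) coarse features
decoded = upsample2(encoded)                   # back to (256, 256)
skip = np.stack([slice_2d, decoded], axis=-1)  # skip connection: concatenate
print(encoded.shape, decoded.shape, skip.shape)
```

In the actual network, each level additionally applies learned convolutional layers before pooling and after upsampling; the sketch only shows why encoder and decoder resolutions match so that skip connections can be concatenated.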
Segmentation performance (median [IQR]) of the single-view and multi-view approaches

| Metrics | Axial (A) | Coronal (C) | Sagittal (S) | A + C | A + C + S | A + C vs. A | A + C vs. C | A + C vs. A + C + S |
|---|---|---|---|---|---|---|---|---|
| VS (%) | 94.4 [90.1, 96.7] | 94.7 [90.4, 97.3] | 79.1 [73.5, 86.4] | 93.3 [89.6, 96.9] | 92.9 [89.6, 96.5] | .636 | .231 | |
| HD95 ↓ (mm) | 1.73 [1.41, 2.24] | 1.41 [1.41, 2.0] | 3.21 [2.24, 3.61] | 1.41 [1.41, 1.79] | 1.73 [1.41, 1.84] | | | |
| DSC (%) | 69.7 [66.0, 72.4] | 70.0 [67.2, 73.2] | 55.2 [45.7, 63.1] | 71.8 [68.7, 74.6] | 71.0 [68.5, 74.3] | **<.001** | | |

Note: Values in bold denote statistical significance; the last three columns give p-values for the pairwise comparisons. The combination of axial and coronal views is superior to the individual views. Note that we used equal weights for each view in the multi-view ensemble model.
Abbreviations: A, axial; C, coronal; DSC, Dice similarity coefficient; HD95, 95th percentile of the Hausdorff distance; S, sagittal; VS, volumetric similarity.
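Per the note above, the multi-view ensemble averages the per-view outputs with equal weights. A minimal sketch of such fusion, assuming each view's network output has already been resampled into a common 3D space (the function name, signature, and 0.5 threshold are our assumptions, not the authors' released code):

```python
import numpy as np

def multiview_ensemble(prob_axial, prob_coronal, weights=(0.5, 0.5)):
    """Fuse per-view claustrum probability volumes by weighted averaging,
    then threshold the fused probabilities at 0.5 to get a binary mask."""
    fused = weights[0] * prob_axial + weights[1] * prob_coronal
    return (fused >= 0.5).astype(np.uint8)

# Toy probability volumes standing in for the two per-view network outputs.
rng = np.random.default_rng(0)
p_ax, p_co = rng.random((4, 4, 4)), rng.random((4, 4, 4))
mask = multiview_ensemble(p_ax, p_co)  # binary 3D claustrum mask
```

The same pattern extends to three views (A + C + S) by passing three volumes and three weights; per the table, adding the sagittal view did not improve performance here.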
FIGURE 4 Segmentation results of the best case and the worst case in terms of DSC. In the predicted segmentation masks, the red pixels represent true positives, the green ones represent false negatives, and the yellow ones represent false positives
FIGURE 3 Segmentation results of 5-fold cross-validation on the 181 scans across four scanners: Bonn-Achieva, Bonn-Ingenia, Munich-Achieva, and Munich-Ingenia. Each box plot summarizes the segmentation performance with respect to one specific evaluation metric
Performance comparison of manual and AI‐based segmentations on 20 subjects with Wilcoxon signed‐rank test
| Metrics | Manual segmentation [median, IQR] | AI-based segmentation [median, IQR] | p-value |
|---|---|---|---|
| VS (%) | 94.9 [91.4, 97.6] | 94.3 [89.6, 96.7] | .821 |
| HD95 (mm) | 2.24 [2.0, 2.55] | 1.41 [1.41, 2.24] | |
| DSC (%) | 68.9 [64.2, 70.9] | 71.7 [67.8, 73.5] | |

Note: AI-based segmentation performance was equal or superior to that of the human expert.
Abbreviations: DSC, Dice similarity coefficient; HD95, 95th percentile of Hausdorff Distance; VS, volumetric similarity.
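The three metrics used throughout these tables can be computed from binary masks roughly as follows. A sketch, with one stated simplification: HD95 here is taken over the full voxel sets via distance transforms rather than over extracted surface voxels, which dedicated evaluation toolkits may handle differently:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def dsc(a, b):
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def volumetric_similarity(a, b):
    """VS = 1 - |V_a - V_b| / (V_a + V_b)."""
    va, vb = a.sum(), b.sum()
    return 1.0 - abs(va - vb) / (va + vb)

def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """95th-percentile symmetric distance between two binary masks,
    computed with Euclidean distance transforms (voxel-set variant)."""
    da = distance_transform_edt(~a.astype(bool), sampling=spacing)
    db = distance_transform_edt(~b.astype(bool), sampling=spacing)
    d_ab = da[b.astype(bool)]  # distance from each voxel of b to mask a
    d_ba = db[a.astype(bool)]  # distance from each voxel of a to mask b
    return np.percentile(np.hstack([d_ab, d_ba]), 95)
```

With 1 mm isotropic voxels (as in the dataset table), `spacing=(1.0, 1.0, 1.0)` makes HD95 directly comparable to the millimetre values reported above.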
FIGURE 5 Segmentation results of leave-one-scanner-out evaluation on the four scanners. Each sub-figure summarizes the segmentation performance on the testing scans from the four scanners with respect to one metric. For example, the boxplot named Bonn-Achieva in the left sub-figure shows the distribution of segmentation results on scanner Bonn-Achieva (scanner 1) when using data from the other three scanners to train the AI model
Statistical analysis of leave-one-scanner-out and k-fold cross-validation segmentation results

| Metrics | Leave-one-scanner-out (mean ± SD) | k-fold cross-validation (mean ± SD) | p-value |
|---|---|---|---|
| VS (%) | 91.9 ± 6.2 | 92.2 ± 5.7 | .268 |
| HD95 (mm) ↓ | 1.86 ± 0.58 | 1.76 ± 0.51 | |
| DSC (%) | 68.3 ± 5.0 | 69.5 ± 5.3 | |

Note: Values in bold denote statistical significance. Statistically significant differences were observed for HD95 and DSC, indicating that testing on unseen scanners slightly degrades segmentation performance.
Abbreviations: DSC, Dice similarity coefficient; HD95, 95th percentile of Hausdorff Distance; VS, volumetric similarity.
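The leave-one-scanner-out protocol compared above can be sketched as a simple split generator over (subject, scanner) pairs. The scanner labels follow the dataset table; the subject pairing below is toy placeholder data, and the training/evaluation calls themselves are omitted:

```python
# Hold out all scans from one scanner, train on the remaining three,
# and repeat once per scanner.
scanners = ["Bonn-Achieva", "Bonn-Ingenia", "Munich-Achieva", "Munich-Ingenia"]
scans = [(f"sub-{i:03d}", scanners[i % 4]) for i in range(12)]  # toy data

def loso_splits(scans, scanners):
    """Yield (held-out scanner, training scans, testing scans) per fold."""
    for held_out in scanners:
        train = [s for s in scans if s[1] != held_out]
        test = [s for s in scans if s[1] == held_out]
        yield held_out, train, test

for held_out, train, test in loso_splits(scans, scanners):
    print(held_out, len(train), len(test))
```

Unlike k-fold cross-validation, every test fold here comes entirely from a scanner the model never saw during training, which is why the table above reports slightly worse HD95 and DSC for this setting.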
FIGURE 6 Segmentation performance on the validation set when gradually increasing the percentage of the training data in steps of 10%. Only a marginal improvement on the validation set was observed when >50% of the training set was used