Rui Hua1,2, Quan Huo2, Yaozong Gao2, He Sui3, Bing Zhang4, Yu Sun1, Zhanhao Mo3, Feng Shi2.
Abstract
In this work, we propose a novel cascaded V-Nets method to segment brain tumor substructures in multimodal brain magnetic resonance imaging. Although V-Net has been successfully used in many segmentation tasks, we demonstrate that its performance can be further enhanced by using a cascaded structure and an ensemble strategy. Briefly, our baseline V-Net consists of four levels with encoding and decoding paths and intra- and inter-path skip connections. Focal loss is chosen to improve performance on hard samples as well as to balance the positive and negative samples. We further propose three preprocessing pipelines for multimodal magnetic resonance images to train different models. Ensembling the segmentation probability maps obtained from these models further improves the segmentation result. In addition, we propose to segment the whole tumor first and then divide it into tumor necrosis, edema, and enhancing tumor. On the BraTS 2018 online validation set, the method achieves average Dice scores of 0.9048, 0.8364, and 0.7748 for whole tumor, tumor core, and enhancing tumor, respectively. The corresponding values for the BraTS 2018 online testing set are 0.8761, 0.7953, and 0.7364. We also evaluate the proposed method on two additional data sets from local hospitals comprising 28 subjects each, and the best results are 0.8635, 0.8036, and 0.7217, respectively. We further predict patient overall survival by ensembling multiple classifiers for long, mid, and short survival groups, and achieve an accuracy of 0.519, a mean squared error of 367240, and a Spearman correlation coefficient of 0.168 on the BraTS 2018 online testing set.
Keywords: V-Net; brain tumor; deep learning; magnetic resonance imaging; multimodal; segmentation
Year: 2020 PMID: 32116623 PMCID: PMC7033427 DOI: 10.3389/fncom.2020.00009
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 2.380
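The abstract states that focal loss is used to emphasize hard samples and to balance positive and negative samples. The following is a minimal per-voxel sketch of the standard focal loss formula, not the authors' implementation. The values `gamma=2.0` and `alpha=None` are assumptions, since the text here does not give the hyperparameters, and the function name `focal_loss` is illustrative.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None, eps=1e-7):
    """Focal loss for a single voxel.

    probs  : class probabilities for the voxel (e.g. a softmax output)
    target : index of the true class
    gamma  : focusing parameter (assumed value, not taken from the paper)
    alpha  : optional per-class weights (assumed, not taken from the paper)
    """
    # Clamp the true-class probability to avoid log(0).
    p_t = min(max(probs[target], eps), 1.0)
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t)^gamma shrinks the loss of well-classified voxels.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct voxel contributes far less than a misclassified one.
easy = focal_loss([0.05, 0.95], target=1)
hard = focal_loss([0.70, 0.30], target=1)

# With gamma = 0 and no class weights, focal loss reduces to cross-entropy.
cross_entropy_hard = focal_loss([0.70, 0.30], target=1, gamma=0.0)

print(easy, hard, cross_entropy_hard)
```

Setting `alpha` to larger weights for rare classes is one common way to address the positive/negative imbalance mentioned above.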
Figure 1. The flow chart of the preprocessing procedures.
Figure 2. The architecture of the used V-Net.
Detailed parameters of the V-Net shown in Figure 2.
| Block | Operation | Input dimensions | Output dimensions |
| Input block | Conv(k = 3, | 96 × 96 × 96 × 4 | 96 × 96 × 96 × 16 |
| Down block 1 | Conv(k = 2, | 96 × 96 × 96 × 16 | 48 × 48 × 48 × 32 |
| | Conv(k = 3, | 48 × 48 × 48 × 32 | – |
| | (input+output) + ReLU* | 48 × 48 × 48 × 32 | – |
| Down block 2 | Conv(k = 2, | 48 × 48 × 48 × 32 | 24 × 24 × 24 × 64 |
| | Conv block × 2* | 24 × 24 × 24 × 64 | – |
| | (input+output) + ReLU* | 24 × 24 × 24 × 64 | – |
| Down block 3 | Conv(k = 2, | 24 × 24 × 24 × 64 | 12 × 12 × 12 × 128 |
| | Conv block × 3* | 12 × 12 × 12 × 128 | – |
| | (input+output) + ReLU* | 12 × 12 × 12 × 128 | – |
| Down block 4 | Conv(k = 2, | 12 × 12 × 12 × 128 | 6 × 6 × 6 × 256 |
| | Conv block × 3* | 6 × 6 × 6 × 256 | – |
| | (input+output) + ReLU* | 6 × 6 × 6 × 256 | – |
| Up block 1 | Conv(k = 2, | 6 × 6 × 6 × 256 | 12 × 12 × 12 × 128 |
| | Cat(output, skip)* | 12 × 12 × 12 × 128 | 12 × 12 × 12 × 256 |
| | Conv block × 3* | 12 × 12 × 12 × 256 | – |
| | (input+output) + ReLU* | 12 × 12 × 12 × 256 | – |
| Up block 2 | Conv(k = 2, | 12 × 12 × 12 × 256 | 24 × 24 × 24 × 64 |
| | Cat(output+skip)* | 24 × 24 × 24 × 64 | 24 × 24 × 24 × 128 |
| | Conv block × 3* | 24 × 24 × 24 × 128 | – |
| | (input+output) + ReLU* | 24 × 24 × 24 × 128 | – |
| Up block 3 | Conv(k = 2, | 24 × 24 × 24 × 128 | 48 × 48 × 48 × 32 |
| | Cat(output+skip)* | 48 × 48 × 48 × 32 | 48 × 48 × 48 × 64 |
| | Conv(k = 3, | 48 × 48 × 48 × 64 | – |
| | Conv(k = 3, | 48 × 48 × 48 × 64 | – |
| | (input+output) + ReLU* | 48 × 48 × 48 × 64 | – |
| Up block 4 | Conv(k = 2, | 48 × 48 × 48 × 64 | 96 × 96 × 96 × 16 |
| | Cat(output+skip)* | 96 × 96 × 96 × 16 | 96 × 96 × 96 × 32 |
| | Conv(k = 3, | 96 × 96 × 96 × 32 | – |
| | (input+output) + ReLU* | 96 × 96 × 96 × 32 | – |
| Out block | Conv(k = 1, | 96 × 96 × 96 × 32 | 96 × 96 × 96 × 4 |
| | Softmax | 96 × 96 × 96 × 4 | 96 × 96 × 96 × 1 |
Each Conv sub-block contains three convolution layers: Conv1 (k = 1, p = 0, s = 1), Conv2 (k = 3, p = 1, s = 1), and Conv3 (k = 1, p = 0, s = 1). k, kernel size; p, padding; s, stride. The symbol “–” means the output dimensions are the same as the input dimensions.
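The spatial sizes in the table can be checked with the usual convolution output-size arithmetic. The sketch below is only a consistency check, not the network itself. The sub-block parameters come from the footnote above; the stride of 2 and padding of 0 for the k = 2 convolutions, and the use of a transposed convolution for upsampling, are assumptions because those entries are truncated in the table.

```python
def conv_out_size(size, k, p, s):
    """Spatial output size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * p - k) // s + 1


def transposed_conv_out_size(size, k, p, s):
    """Spatial output size of a transposed convolution
    (no output padding, no dilation)."""
    return (size - 1) * s - 2 * p + k


# (k, p, s) of Conv1, Conv2, Conv3 in each Conv sub-block, from the footnote.
SUB_BLOCK = [(1, 0, 1), (3, 1, 1), (1, 0, 1)]


def through_sub_block(size):
    """Apply the three sub-block layers; the spatial size should not change."""
    for k, p, s in SUB_BLOCK:
        size = conv_out_size(size, k, p, s)
    return size


# Encoder: four downsampling steps starting from a 96-voxel edge.
# ASSUMPTION: the k = 2 convolutions use stride 2 and padding 0.
encoder = [96]
for _ in range(4):
    halved = conv_out_size(encoder[-1], k=2, p=0, s=2)
    encoder.append(through_sub_block(halved))

# Decoder: four upsampling steps back to the input size.
# ASSUMPTION: upsampling is a transposed convolution with k = 2, stride 2.
decoder = [encoder[-1]]
for _ in range(4):
    decoder.append(transposed_conv_out_size(decoder[-1], k=2, p=0, s=2))

print(encoder)  # [96, 48, 24, 12, 6]
print(decoder)  # [6, 12, 24, 48, 96]
```

Under these assumptions the edge lengths match the table: 96, 48, 24, 12, and 6 on the way down, and the reverse on the way up.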
Figure 3. The proposed framework of cascaded V-Nets for brain tumor segmentation.
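The two ideas behind the framework, ensembling probability maps and segmenting the whole tumor before dividing it into substructures, can be sketched on a toy list of voxels. This is a simplified illustration under stated assumptions, not the published pipeline: plain averaging as the ensemble rule, the 0.5 threshold, the label values 1 to 3, and all function names are hypothetical.

```python
def ensemble_average(prob_maps):
    """Average per-voxel class probabilities across models.

    prob_maps: one entry per model; each entry is a list of voxels,
    and each voxel is a list of class probabilities.
    """
    n_models = len(prob_maps)
    return [
        [sum(class_probs) / n_models for class_probs in zip(*voxel_probs)]
        for voxel_probs in zip(*prob_maps)
    ]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def cascade_labels(whole_tumor_prob, substructure_probs, threshold=0.5):
    """Stage 1 gates stage 2: voxels outside the whole-tumor mask get 0,
    voxels inside get the most likely substructure label.
    ASSUMPTION: labels 1, 2, 3 = necrosis, edema, enhancing tumor."""
    return [
        argmax(sub) + 1 if wt >= threshold else 0
        for wt, sub in zip(whole_tumor_prob, substructure_probs)
    ]


# Stage 1: two hypothetical models, four voxels, [background, tumor].
stage1_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.1, 0.9]]
stage1_b = [[0.7, 0.3], [0.7, 0.3], [0.1, 0.9], [0.3, 0.7]]
wt_prob = [voxel[1] for voxel in ensemble_average([stage1_a, stage1_b])]

# Stage 2: hypothetical substructure probabilities for the same voxels.
stage2 = [[0.2, 0.5, 0.3], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]

labels = cascade_labels(wt_prob, stage2)
print(labels)  # [0, 0, 3, 1]
```

The second voxel shows the effect of ensembling: one model calls it tumor (0.6), but the averaged probability (0.45) falls below the threshold, so it stays background.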
Selected features in the training data for the prediction of patient overall survival.
| Feature | Number of features |
| Age | 1 |
| Volume of whole brain | 1 |
| Volume of whole tumor | 1 |
| Volumes of three tumor substructures | 3 |
| Ratio of the whole tumor in whole brain | 1 |
| Ratios of three tumor substructures in whole tumor | 3 |
| Extent of lesion in x, y, z directions | 3 |
| Center coordinates of the whole tumor | 3 |
| Means and variances of three tumor substructures in four MR modalities | 24 |
| First order statistics features of three tumor substructures | 411 |
| Shape-based features of three tumor substructures | 78 |
| Gray level co-occurrence matrix features of three tumor substructures | 180 |
| Gray level run length matrix features of three tumor substructures | 96 |
| Neighbouring gray tone difference matrix features of three tumor substructures | 96 |
| Gray level dependence matrix features of three tumor substructures | 84 |
Figure 4. The comparison of segmentation results and ground truth on representative cases from the local testing set and two clinical testing sets. (A) The segmentation results and ground truth from the local testing set. (B) The segmentation results and ground truth from the clinical testing set of China-Japan Union Hospital of Jilin University. (C) The segmentation results and ground truth from the clinical testing set of Affiliated Drum Tower Hospital of Nanjing University Medical School.
Dice, sensitivity, and specificity measurements of the proposed method on local testing set.
| Metric | Whole tumor | Tumor core | Enhancing tumor |
| Dice mean ± SD | 0.8505 ± 0.0972 | 0.7842 ± 0.1919 | 0.7426 ± 0.2080 |
| Sensitivity mean ± SD | 0.9180 ± 0.1091 | 0.7596 ± 0.2199 | 0.7174 ± 0.2337 |
| Specificity mean ± SD | 0.9981 ± 0.0012 | 0.9996 ± 0.0008 | 0.9997 ± 0.0003 |
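For reference, the three overlap measures reported in this table can be computed from a binary prediction and a binary ground truth as follows. This is a minimal sketch on flattened voxel lists. Returning 1.0 when a denominator is empty is an assumed convention, and the function name `overlap_metrics` is illustrative.

```python
def overlap_metrics(pred, truth):
    """Dice, sensitivity and specificity for two equal-length
    sequences of 0/1 voxel labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)

    # ASSUMPTION: an empty denominator counts as a perfect score.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    specificity = tn / (tn + fp) if (tn + fp) else 1.0
    return dice, sensitivity, specificity


# Toy example: 8 voxels, 3 of them tumor in the ground truth.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]

dice, sens, spec = overlap_metrics(pred, truth)
print(round(dice, 4), round(sens, 4), round(spec, 4))  # 0.6667 0.6667 0.8
```

Because background voxels vastly outnumber tumor voxels in a brain volume, specificity stays close to 1 even when Dice is moderate, which is consistent with the values in the table.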
Dice, sensitivity, specificity, and Hausdorff95 measurements of the proposed method on BraTS 2018 validation set.
| Metric | Whole tumor | Tumor core | Enhancing tumor |
| Dice mean ± SD | 0.9048 ± 0.0648 | 0.8364 ± 0.1609 | 0.7768 ± 0.2355 |
| Sensitivity mean ± SD | 0.9146 ± 0.0949 | 0.8453 ± 0.1781 | 0.8166 ± 0.2382 |
| Specificity mean ± SD | 0.9945 ± 0.0041 | 0.9971 ± 0.0041 | 0.9977 ± 0.0032 |
| Hausdorff95 mean ± SD (mm) | 5.1759 ± 7.3622 | 6.2780 ± 7.7681 | 3.5123 ± 4.5407 |
Figure 5. The distribution of Dice scores for whole tumor, tumor core, and enhancing tumor in the ablation experiments. (A) Bar plot of Dice scores for whole tumor. The difference between the baseline V-Net architecture and our proposed architecture is statistically significant (p = 0.011). (B) Bar plot of Dice scores for tumor core. (C) Bar plot of Dice scores for enhancing tumor. The height of each bar indicates the mean Dice score, and the error bars indicate the standard deviation.
Dice and Hausdorff95 measurements of the proposed method on BraTS 2018 testing set.
| Metric | Whole tumor | Tumor core | Enhancing tumor |
| Dice mean ± SD | 0.8761 ± 0.1247 | 0.7953 ± 0.2543 | 0.7364 ± 0.2592 |
| Hausdorff95 mean ± SD (mm) | 7.0514 ± 11.5935 | 6.7262 ± 11.8852 | 3.9217 ± 6.1934 |
The prediction of patient overall survival on BraTS 2018 testing set.
| Metric | Value |
| Accuracy | 0.519 |
| Mean squared error (MSE) | 367239.974 |
| Median squared error (MedianSE) | 38416 |
| Standard deviation of squared error | 945593.877 |
| SpearmanR | 0.168 |
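The metrics in this table can be reproduced from predicted and actual survival times along the following lines. This is a sketch with hypothetical data, not the challenge's evaluation code. The class thresholds of 300 and 450 days (roughly 10 and 15 months) and the use of the population standard deviation are assumptions, as the text here does not define them.

```python
import math
import statistics


def survival_class(days, short_max=300.0, long_min=450.0):
    """Map survival in days to a group. ASSUMPTION: thresholds in days."""
    if days < short_max:
        return "short"
    if days > long_min:
        return "long"
    return "mid"


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            result[idx] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.
    Undefined (division by zero) if either input is constant."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def survival_metrics(predicted, actual):
    """Accuracy over the three groups plus squared-error statistics."""
    sq_err = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    hits = sum(survival_class(p) == survival_class(a)
               for p, a in zip(predicted, actual))
    return {
        "accuracy": hits / len(actual),
        "mse": statistics.fmean(sq_err),
        "median_se": statistics.median(sq_err),
        "std_se": statistics.pstdev(sq_err),  # ASSUMPTION: population SD
        "spearman": spearman(predicted, actual),
    }


# Hypothetical survival times in days for four patients.
predicted = [200, 400, 500, 350]
actual = [150, 420, 700, 500]
metrics = survival_metrics(predicted, actual)
print(metrics["accuracy"], metrics["mse"], metrics["median_se"])
```

Because the errors are squared, the MSE is expressed in days squared, which explains the magnitude of the values in the table; the reported median squared error of 38416 corresponds to an error of 196 days.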
Dice measurements of the proposed method on clinical testing set.
| # of subjects | 28 | 28 |
| Image resolution (mm³) | 0.6 × 0.6 × 6 | 0.67 × 0.67 × 0.67 |
| WT Dice mean ± SD | 0.8635 ± 0.0838 | 0.8692 ± 0.1307 |
| TC Dice mean ± SD | 0.8036 ± 0.1476 | 0.6786 ± 0.3093 |
| ET Dice mean ± SD | 0.7217 ± 0.1968 | 0.7054 ± 0.3557 |