Axel Largent1, Josepheen De Asis-Cruz1, Kushal Kapse1, Scott D Barnett1, Jonathan Murnick1, Sudeepta Basu1, Nicole Andersen1, Stephanie Norman1, Nickie Andescavage1,2, Catherine Limperopoulos1,3,4.
Abstract
Post-hemorrhagic hydrocephalus (PHH) is a severe complication of intraventricular hemorrhage (IVH) in very preterm infants. PHH monitoring and treatment decisions rely heavily on manual, subjective two-dimensional measurements of the ventricles. Automatic and reliable three-dimensional (3D) measurements of the ventricles may provide a more accurate assessment of PHH and lead to improved monitoring and treatment decisions. Automatic segmentation of the ventricles is one way to obtain these 3D measurements accurately and efficiently. However, this segmentation is challenging because ventricular shape varies widely in preterm infants diagnosed with PHH. This study aims to (a) propose a Bayesian U-Net method using 3D spatial concrete dropout for automatic brain segmentation (with uncertainty assessment) of preterm infants with PHH; and (b) compare the Bayesian method to three reference methods: DenseNet, U-Net, and ensemble learning using DenseNets and U-Nets. A total of 41 T2-weighted MRIs from 27 preterm infants were manually segmented into lateral ventricles, external CSF, white and cortical gray matter, brainstem, and cerebellum. These segmentations were used as ground truth for model evaluation. All methods were trained and evaluated using 4-fold cross-validation and segmentation endpoints, with additional uncertainty endpoints for the Bayesian method. In the lateral ventricles, the mean Dice scores for the DenseNet, U-Net, ensemble learning, and Bayesian U-Net methods were 0.814 ± 0.213, 0.944 ± 0.041, 0.942 ± 0.042, and 0.948 ± 0.034, respectively. Uncertainty endpoint values for the Bayesian U-Net were mean recall = 0.953 ± 0.037, mean negative predictive value = 0.998 ± 0.005, mean accuracy = 0.906 ± 0.032, and mean AUC = 0.949 ± 0.031. To conclude, the Bayesian U-Net showed the best segmentation results among all methods and provided accurate uncertainty maps. This method may be used in clinical practice for automatic brain segmentation of preterm infants with PHH, and lead to better PHH monitoring and more informed treatment decisions.
Keywords: Bayesian deep learning; Monte Carlo dropout; automatic brain segmentation; post-hemorrhagic hydrocephalus; preterm infants; uncertainty assessment
Year: 2022 PMID: 35023255 PMCID: PMC8933325 DOI: 10.1002/hbm.25762
Source DB: PubMed Journal: Hum Brain Mapp ISSN: 1065-9471 Impact factor: 5.038
Demographics of the subjects
| Demographics | |
|---|---|
| Gestational age (at birth) | 26.75 ± 2.99 weeks |
| Birth weight | 965.00 ± 477.13 grams |
| Post‐conceptual age (at MRI scan) | 36.83 ± 4.09 weeks |
| IVH grade | Grade 2 = 10 scans; Grade 3 = 9 scans; Grade 4 = 22 scans |
| Cerebellar hemorrhage | 27 scans |
| Shunt insertion (before scanning) | 1 scan |
| Ventricular access devices insertion (before scanning) | 8 scans |
| Porencephaly | 5 scans |
| Polymicrogyria | 2 scans |
| Callosal hypogenesis | 2 scans |
| Periventricular leukomalacia | 2 scans |
| Cervical cord syrinx | 2 scans |
| Punctate hemorrhage | 3 scans |
Note: Gestational age, birth weight, and postconceptual age are presented as mean ± SD.
Total number of parameters for each deep learning method
| Deep learning methods | Number of parameters |
|---|---|
| DenseNet | Network = 30,617,887 |
| U‐Net | Network = 111,693,063 |
| Ensemble learning using several DenseNets and U‐Nets | Network = 284,623,164 |
| Bayesian U‐Net using 3D spatial dropout ( | Network = 111,698,695; Stochastic pass = 6 |
| Bayesian U‐Net using 3D spatial concrete dropout | Network = 111,698,702; Stochastic pass = 6 |
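The "Stochastic pass = 6" entries indicate that each Bayesian U‐Net is evaluated with six stochastic forward passes. The sketch below illustrates one common way to combine such passes for a single voxel: average the class probabilities, take the most probable class, and use the predictive entropy as an uncertainty value. The function name, the example probabilities, and the choice of entropy as the uncertainty measure are assumptions for illustration; this excerpt does not state which measure the authors used.

```python
import math

def mc_dropout_summary(passes):
    """Combine T stochastic forward passes for one voxel.

    `passes` is a list of T class-probability vectors.
    Returns (mean probabilities, predicted class index, predictive entropy).
    """
    n_passes = len(passes)
    n_classes = len(passes[0])
    mean_probs = [sum(p[c] for p in passes) / n_passes for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: mean_probs[c])
    # Entropy of the averaged distribution: larger when the passes disagree.
    entropy = -sum(q * math.log(q) for q in mean_probs if q > 0.0)
    return mean_probs, label, entropy

# Hypothetical probabilities for one voxel, three classes, six passes.
agreeing = [[0.9, 0.05, 0.05]] * 6
disagreeing = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.5, 0.4, 0.1],
               [0.2, 0.7, 0.1], [0.8, 0.1, 0.1], [0.3, 0.6, 0.1]]

_, label_a, entropy_a = mc_dropout_summary(agreeing)
_, label_d, entropy_d = mc_dropout_summary(disagreeing)
```

When the six passes agree, the entropy stays low; when they disagree, it rises, which is the behavior an uncertainty map is meant to expose.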
FIGURE 1 Architecture of the Bayesian U‐Net method using 3D spatial dropout. All investigated Bayesian deep learning methods used a 3D U‐Net as backbone. This 3D U‐Net was also used as a reference method. Its architecture differs from those of the Bayesian methods only in that its convolutional blocks do not use dropout.
Segmentation endpoint values for all Bayesian U‐Net methods and volumes‐of‐interest
| | Lateral ventricles | External CSF | White matter | Cortical gray matter | Cerebellum | Brainstem |
|---|---|---|---|---|---|---|
| Dice score (ratio) | | | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout | | | 0.900* (± 0.054) | 0.840* (± 0.085) | 0.907 (± 0.061) | 0.900* (± 0.033) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.942 (± 0.052) | 0.820 (± 0.112) | | | 0.904 (± 0.061) | |
| Bayesian U‐Net using 3D spatial dropout ( | 0.946 (± 0.411) | 0.818 (± 0.115) | 0.896* (± 0.058) | 0.839* (± 0.087) | | 0.902* (± 0.037) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.911+ (± 0.132) | 0.819+ (± 0.112) | 0.893*+ (± 0.057) | 0.835* (± 0.085) | 0.897+ (± 0.074) | 0.898 (± 0.042) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.935*+ (± 0.061) | 0.821 (± 0.113) | 0.893*+ (± 0.058) | 0.831* (± 0.097) | 0.900 (± 0.075) | 0.901* (± 0.036) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.922*+ (± 0.105) | 0.814+ (± 0.119) | 0.892*+ (± 0.057) | 0.832* (± 0.090) | 0.892* (± 0.091) | 0.899* (± 0.037) |
| 95th Hausdorff distance (mm) | | | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout | | | 1.226 (± 0.849) | 1.130* (± 0.841) | 2.365 (± 5.064) | 2.087 (± 3.868) |
| Bayesian U‐Net using 3D spatial dropout ( | 3.108 (± 8.631) | 2.837 (± 4.839) | | | | |
| Bayesian U‐Net using 3D spatial dropout ( | 2.961 (± 8.540) | 2.964 (± 5.118) | 1.240* (± 0.748) | 1.147* (± 0.878) | 2.598 (± 5.489) | 2.565* (± 4.975) |
| Bayesian U‐Net using 3D spatial dropout ( | 4.835*+ (± 10.280) | 2.053 (± 1.935) | 1.278*+ (± 0.774) | 1.133* (± 0.827) | 1.759 (± 1.047) | 2.133* (± 3.963) |
| Bayesian U‐Net using 3D spatial dropout ( | 3.120+ (± 8.187) | 3.097 (± 6.048) | 1.261*+ (± 0.761) | 1.161* (± 0.848) | 1.762 (± 1.359) | 2.058 (± 3.955) |
| Bayesian U‐Net using 3D spatial dropout ( | 4.694*+ (± 11.853) | 3.361 (± 6.162) | 1.315*+ (± 0.747) | 1.223* (± 0.906) | 1.886*+ (± 1.142) | 2.150* (± 3.965) |
| Average symmetric surface distance (mm) | | | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout | 0.371 (± 0.302) | 0.458 (± 0.390) | 0.346 (± 0.194) | 0.293* (± 0.162) | 0.651 (± 0.613) | 0.507* (± 0.395) |
| Bayesian U‐Net using 3D spatial dropout ( | | | | | | |
| Bayesian U‐Net using 3D spatial dropout ( | 0.482 (± 0.882) | 0.526 (± 0.557) | 0.373* (± 0.215) | 0.327* (± 0.201) | 0.676 (± 0.742) | 0.487* (± 0.400) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.790 (± 1.271) | 0.461 (± 0.377) | 0.369*+ (± 0.201) | 0.318*+ (± 0.175) | 0.574 (± 0.332) | 0.460 (± 0.344) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.537+ (± 0.827) | 0.554 (± 0.674) | 0.378*+ (± 0.207) | 0.322*+ (± 0.184) | 0.594 (± 0.437) | 0.500* (± 0.385) |
| Bayesian U‐Net using 3D spatial dropout ( | 1.04*+ (± 2.615) | 0.594 (± 0.770) | 0.395*+ (± 0.221) | 0.368*+ (± 0.287) | 0.631* (± 0.457) | 0.530* (± 0.480) |
| Absolute volume difference (cm³) | | | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout | 1.594 (± 1.893) | | 7.917 (± 7.955) | 6.785 (± 6.389) | | 0.410* (± 0.353) |
| Bayesian U‐Net using 3D spatial dropout ( | 1.691 (± 2.001) | 8.507+ (± 7.952) | | | 1.088+ (± 1.191) | |
| Bayesian U‐Net using 3D spatial dropout ( | | 8.858+ (± 8.916) | 9.280* (± 11.135) | 7.893* (± 9.265) | 0.930 (± 1.084) | 0.317* (± 0.269) |
| Bayesian U‐Net using 3D spatial dropout ( | 3.179 (± 6.103) | 8.131+ (± 8.429) | 8.946* (± 9.209) | 7.204 (± 6.858) | 1.313*+ (± 1.488) | 0.402* (± 0.441) |
| Bayesian U‐Net using 3D spatial dropout ( | 2.382*+ (± 2.430) | 8.118+ (± 7.560) | 9.857*+ (± 11.123) | 7.858* (± 9.294) | 1.057 (± 1.456) | 0.342 (± 0.331) |
| Bayesian U‐Net using 3D spatial dropout ( | 2.490*+ (± 2.927) | 8.826+ (± 8.002) | 9.425* (± 11.384) | 7.748* (± 9.213) | 1.291*+ (± 1.516) | 0.328 (± 0.333) |
| Relative volume difference (ratio) | | | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout | 0.038 (± 0.047) | | 0.068 (± 0.062) | 0.071* (± 0.057) | | 0.090* (± 0.077) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.046 (± 0.065) | 0.130 (± 0.121) | | | 0.098+ (± 0.105) | |
| Bayesian U‐Net using 3D spatial dropout ( | | 0.135 (± 0.137) | 0.080*+ (± 0.087) | 0.081* (± 0.085) | 0.078 (± 0.083) | 0.070* (± 0.060) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.176 (± 0.496) | 0.121 (± 0.131) | 0.077* (± 0.073) | 0.074* (± 0.059) | 0.114+ (± 0.123) | 0.082* (± 0.080) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.066*+ (± 0.091) | 0.128+ (± 0.135) | 0.086*+ (± 0.091) | 0.083* (± 0.089) | 0.089 (± 0.112) | 0.070 (± 0.058) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.157* (± 0.313) | 0.147+ (± 0.163) | 0.083* (± 0.098) | 0.080* (± 0.085) | 0.115+ (± 0.140) | 0.068 (± 0.064) |
Note: Values of the segmentation endpoints are presented as mean ± standard deviation (over the entire cohort). Highest Dice score values and lowest 95th Hausdorff distance, average symmetric surface distance, absolute and relative volume difference values are shown in bold. p represents the value of the dropout parameter used in the Bayesian U‐Net methods. Wilcoxon tests were used to compare the segmentation endpoint values of the Bayesian U‐Net using 3D spatial dropout with p = 0.1 to those of the other Bayesian U‐Net methods (alternative hypothesis was set to “greater” for the Dice score and “smaller” for the other endpoints). Significant differences (p‐values < 0.05) are displayed with (*). Wilcoxon tests were also used to compare the segmentation endpoint values of the Bayesian U‐Net using 3D spatial concrete dropout to those of the other Bayesian U‐Net methods (alternative hypothesis was set to “greater” for the Dice score and “smaller” for the other endpoints). Significant differences (p‐values < 0.05) are displayed with (+).
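As a reminder of what the overlap and volume endpoints measure, here is a minimal sketch that computes the Dice score and the absolute and relative volume differences for one structure. Masks are represented as sets of voxel coordinates, and the function names, the toy masks, and the voxel volume are hypothetical. The relative difference is normalized by the ground-truth volume, which is an assumption; the excerpt does not give the exact formula.

```python
def dice_score(pred, truth):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two sets of voxel coordinates."""
    if not pred and not truth:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))

def volume_differences(pred, truth, voxel_volume_cm3):
    """Return (absolute difference in cm^3, difference relative to the truth volume)."""
    v_pred = len(pred) * voxel_volume_cm3
    v_truth = len(truth) * voxel_volume_cm3
    absolute = abs(v_pred - v_truth)
    relative = absolute / v_truth  # assumes a non-empty ground-truth mask
    return absolute, relative

# Toy 1D "structures": 10 ground-truth voxels, 12 predicted voxels, 8 shared.
truth = {(0, 0, i) for i in range(10)}
pred = {(0, 0, i) for i in range(2, 14)}

dice = dice_score(pred, truth)
avd, rvd = volume_differences(pred, truth, voxel_volume_cm3=0.001)
```

The two families of endpoints are complementary: two masks can have identical volumes and still overlap poorly, so a low volume difference does not imply a high Dice score.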
Uncertainty endpoint values for all Bayesian U‐Net methods
| | Recall | NPV | Accuracy | AUC |
|---|---|---|---|---|
| Bayesian U‐Net using 3D spatial concrete dropout | 0.953 (± 0.037) | 0.998 (± 0.005) | 0.906* (± 0.032) | 0.949 (± 0.031) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.953 (± 0.032) | 0.998 (± 0.004) | | 0.941+ (± 0.026) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.951 (± 0.038) | 0.998 (± 0.005) | 0.906* (± 0.029) | 0.900*+ (± 0.048) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.955 (± 0.034) | | 0.903* (± 0.029) | 0.914*+ (± 0.056) |
| Bayesian U‐Net using 3D spatial dropout ( | | 0.998 (± 0.004) | 0.905* (± 0.025) | |
| Bayesian U‐Net using 3D spatial dropout ( | 0.955 (± 0.031) | 0.998 (± 0.004) | 0.903* (± 0.029) | 0.917*+ (± 0.049) |
Note: Values of the uncertainty endpoints are presented as mean ± SD (over the entire cohort). Highest values are shown in bold. p represents the value of the dropout parameter used in the Bayesian U‐Net methods. Wilcoxon tests were used to compare the uncertainty endpoint values of the Bayesian U‐Net using 3D spatial dropout with p = 0.1 to those of the other Bayesian U‐Net methods (alternative hypothesis was set to “greater” for all endpoints). Significant differences (p‐values < 0.05) are displayed with (*). Wilcoxon tests were also used to compare the uncertainty endpoint values of the Bayesian U‐Net using 3D spatial concrete dropout to those of the other Bayesian U‐Net methods (alternative hypothesis was set to “greater” for all endpoints). Significant differences (p‐values < 0.05) are displayed with (+).
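The sketch below shows how recall, NPV, and accuracy could be computed for an uncertainty map, assuming the thresholded map is scored as a detector of segmentation errors (positive = misclassified voxel, flagged = uncertainty above the threshold). That framing, the function name, and the toy data are assumptions; this excerpt does not define the uncertainty endpoints. AUC would additionally require sweeping the threshold and is omitted here.

```python
def uncertainty_endpoints(uncertain, error):
    """Score per-voxel uncertainty flags against per-voxel segmentation errors.

    `uncertain[i]` is True when the voxel's uncertainty exceeds the threshold;
    `error[i]` is True when the voxel was misclassified.
    """
    pairs = list(zip(uncertain, error))
    tp = sum(1 for u, e in pairs if u and e)          # errors flagged as uncertain
    tn = sum(1 for u, e in pairs if not u and not e)  # correct voxels left unflagged
    fp = sum(1 for u, e in pairs if u and not e)
    fn = sum(1 for u, e in pairs if not u and e)
    nan = float("nan")
    recall = tp / (tp + fn) if (tp + fn) else nan
    npv = tn / (tn + fn) if (tn + fn) else nan
    accuracy = (tp + tn) / len(pairs) if pairs else nan
    return recall, npv, accuracy

# Ten hypothetical voxels: two are misclassified, one of which is flagged.
error = [True, True] + [False] * 8
uncertain = [True, False, True] + [False] * 7

recall, npv, accuracy = uncertainty_endpoints(uncertain, error)
```

Under this reading, a high NPV means that voxels the model reports as certain are almost always correctly segmented.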
FIGURE 2 Example of the segmentations and uncertainty maps of one subject for all Bayesian U‐Net methods. The selected subject was the one with the highest AUC values (uncertainty endpoint) for the Bayesian U‐Net using 3D spatial dropout (p = 0.1) and the Bayesian U‐Net using 3D spatial concrete dropout. The AUC values of this subject for the Bayesian U‐Net using 3D spatial concrete dropout and the Bayesian U‐Net using 3D spatial dropout with p = 0.1, 0.2, 0.3, 0.4, and 0.5 were 0.980, 0.975, 0.964, 0.975, 0.977, and 0.965, respectively. Uncertainty values below the threshold mark voxels where the model is certain of its prediction; values above the threshold mark voxels where it is uncertain.
Values of the spatial dropout parameter p optimized through the Bayesian U‐Net using 3D spatial concrete dropout for each cross‐validation fold
| | Fold 1 | Fold 2 | Fold 3 | Fold 4 |
|---|---|---|---|---|
| Value of | 0.000 | 0.000 | 0.000 | 0.000 |
| Value of | 0.000 | 0.000 | 0.002 | 0.000 |
| Value of | 0.153 | 0.158 | 0.158 | 0.027 |
| Value of | 0.304 | 0.254 | 0.270 | 0.363 |
| Value of | 0.046 | 0.047 | 0.106 | 0.007 |
| Value of | 0.000 | 0.000 | 0.000 | 0.000 |
| Value of | 0.000 | 0.000 | 0.000 | 0.000 |
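Concrete dropout makes the dropout probability learnable by replacing the hard Bernoulli mask with a continuous relaxation, which is how per-layer values such as those in the table can be optimized during training. The sketch below samples that relaxation using the standard formulation, sigmoid((log p − log(1 − p) + log u − log(1 − u)) / t) with u drawn uniformly from (0, 1), and applies it in the spatial fashion, with one factor per feature channel. The function names, the temperature of 0.1, and the channel count are illustrative assumptions, not the authors' settings.

```python
import math
import random

EPS = 1e-7  # keeps the logarithms finite at p = 0 or u = 0

def concrete_drop_sample(p, temperature, rng):
    """Relaxed Bernoulli(p) sample in (0, 1); values near 1 mean 'drop'."""
    u = rng.random()
    logit = (math.log(p + EPS) - math.log(1.0 - p + EPS)
             + math.log(u + EPS) - math.log(1.0 - u + EPS))
    return 1.0 / (1.0 + math.exp(-logit / temperature))

def spatial_keep_factors(n_channels, p, temperature, rng):
    """One keep factor per channel, shared by every voxel of that channel's
    3D feature map, rescaled by 1 / (1 - p) as in standard dropout."""
    return [(1.0 - concrete_drop_sample(p, temperature, rng)) / (1.0 - p)
            for _ in range(n_channels)]

rng = random.Random(0)
p = 0.304  # Fold 1 value from the fourth row of the table above
drops = [concrete_drop_sample(p, 0.1, rng) for _ in range(20000)]
drop_rate = sum(d > 0.5 for d in drops) / len(drops)  # should be close to p
factors = spatial_keep_factors(4, p, 0.1, rng)
```

A learned value of 0.000 therefore means the corresponding layer effectively keeps every channel, while the larger values concentrate the stochasticity in a few layers.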
Stability of the segmentation results of the Bayesian U‐Net using 3D spatial dropout (p = 0.1) and Bayesian U‐Net using 3D spatial concrete dropout
| | Lateral ventricles | External CSF | White matter | Cortical gray matter | Cerebellum | Brainstem |
|---|---|---|---|---|---|---|
| Dice score (ratio) | | | | | | |
| Bayesian U‐Net using 3D spatial dropout 1 ( | 0.945 (± 0.039) | 0.814 (± 0.122) | 0.898 (± 0.056) | 0.838 (± 0.089) | 0.886 (± 0.108) | 0.897 (± 0.050) |
| Bayesian U‐Net using 3D spatial dropout 2 ( | 0.942 (± 0.052) | 0.820 (± 0.112) | 0.901 (± 0.053) | 0.846 (± 0.080) | 0.904 (± 0.061) | 0.906 (± 0.033) |
| Bayesian U‐Net using 3D spatial dropout 3 ( | 0.948 (± 0.034) | 0.821 (± 0.112) | 0.899 (± 0.058) | 0.844 (± 0.087) | 0.905 (± 0.061) | 0.905 (± 0.032) |
| | | 0.818 (± 0.004) | | | 0.898 (± 0.011) | |
| Bayesian U‐Net using 3D spatial concrete dropout 1 | 0.948 (± 0.034) | 0.823 (± 0.114) | 0.900 (± 0.054) | 0.840 (± 0.085) | 0.907 (± 0.061) | 0.900 (± 0.033) |
| Bayesian U‐Net using 3D spatial concrete dropout 2 | 0.946 (± 0.036) | 0.819 (± 0.114) | 0.898 (± 0.059) | 0.836 (± 0.086) | 0.905 (± 0.068) | 0.901 (± 0.031) |
| Bayesian U‐Net using 3D spatial concrete dropout 3 | 0.942 (± 0.035) | 0.819 (± 0.109) | 0.895 (± 0.057) | 0.838 (± 0.084) | 0.908 (± 0.048) | 0.900 (± 0.038) |
| | | | 0.898 (± 0.003) | 0.838 (± 0.002) | | 0.900 (± 0.001) |
| 95th Hausdorff distance (mm) | | | | | | |
| Bayesian U‐Net using 3D spatial dropout 1 ( | 2.620 (± 6.575) | 3.267 (± 6.181) | 1.198 (± 0.737) | 1.606 (± 3.168) | 3.891 (± 8.214) | 2.720 (± 5.435) |
| Bayesian U‐Net using 3D spatial dropout 2 ( | 3.108 (± 8.631) | 2.837 (± 4.839) | 1.193 (± 0.769) | 1.076 (± 0.768) | 1.658 (± 0.836) | 2.001 (± 3.904) |
| Bayesian U‐Net using 3D spatial dropout 3 ( | 1.096 (± 0.808) | 2.027 (± 1.859) | 1.200 (± 0.796) | 1.079 (± 0.778) | 1.698 (± 0.971) | 2.068 (± 3.949) |
| | 2.275* (± 1.050) | 2.710 (± 0.630) | | 1.254* (± 0.305) | | 2.263 (± 0.398) |
| Bayesian U‐Net using 3D spatial concrete dropout 1 | 1.357 (± 2.131) | 1.973 (± 1.849) | 1.226 (± 0.849) | 1.130 (± 0.841) | 2.365 (± 5.064) | 2.087 (± 3.868) |
| Bayesian U‐Net using 3D spatial concrete dropout 2 | 1.197 (± 1.099) | 3.010 (± 5.446) | 1.233 (± 0.729) | 1.115 (± 0.885) | 3.086 (± 7.235) | 2.130 (± 3.879) |
| Bayesian U‐Net using 3D spatial concrete dropout 3 | 2.575 (± 6.150) | 2.927 (± 5.277) | 1.238 (± 0.794) | 1.123 (± 0.822) | 3.183 (± 7.561) | 2.084 (± 4.059) |
| | | | 1.232 (± 0.006) | | 2.878 (± 0.447) | |
| Average symmetric surface distance (mm) | | | | | | |
| Bayesian U‐Net using 3D spatial dropout 1 ( | 0.429 (± 0.519) | 0.595 (± 0.759) | 0.374 (± 0.222) | 0.375 (± 0.346) | 1.092 (± 1.992) | 0.540 (± 0.469) |
| Bayesian U‐Net using 3D spatial dropout 2 ( | 0.336 (± 1.372) | 0.194 (± 0.502) | 0.249 (± 0.201) | 0.203 (± 0.225) | 0.394 (± 0.270) | 0.442 (± 0.334) |
| Bayesian U‐Net using 3D spatial dropout 3 ( | 0.326 (± 0.198) | 0.459 (± 0.365) | 0.341 (± 0.206) | 0.289 (± 0.158) | 0.559 (± 0.304) | 0.447 (± 0.317) |
| | | | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout 1 | 0.371 (± 0.302) | 0.458 (± 0.390) | 0.346 (± 0.194) | 0.293 (± 0.162) | 0.651 (± 0.613) | 0.507 (± 0.395) |
| Bayesian U‐Net using 3D spatial concrete dropout 2 | 0.369 (± 0.309) | 0.545 (± 0.623) | 0.373 (± 0.216) | 0.326 (± 0.216) | 0.728 (± 1.101) | 0.513 (± 0.472) |
| Bayesian U‐Net using 3D spatial concrete dropout 3 | 0.476 (± 0.538) | 0.502 (± 0.508) | 0.354 (± 0.192) | 0.322 (± 0.173) | 0.686 (± 0.786) | 0.486 (± 0.366) |
| | 0.405* (± 0.061) | 0.502 (± 0.044) | 0.358 (± 0.014) | 0.314 (± 0.018) | 0.688 (± 0.039) | 0.502 (± 0.014) |
| Absolute volume difference (cm³) | | | | | | |
| Bayesian U‐Net using 3D spatial dropout 1 ( | 1.594 (± 1.869) | 9.604 (± 9.014) | 8.258 (± 9.174) | 6.559 (± 8.675) | 0.937 (± 0.960) | 0.338 (± 0.290) |
| Bayesian U‐Net using 3D spatial dropout 2 ( | 1.691 (± 2.001) | 8.507 (± 7.952) | 7.211 (± 8.616) | 5.907 (± 6.714) | 1.088 (± 1.191) | 0.262 (± 0.268) |
| Bayesian U‐Net using 3D spatial dropout 3 ( | 1.596 (± 1.714) | 8.612 (± 8.671) | 7.505 (± 10.196) | 5.754 (± 7.874) | 1.167 (± 1.333) | 0.364 (± 0.325) |
| | | 8.908 (± 0.605) | | | 1.064 (± 0.117) | |
| Bayesian U‐Net using 3D spatial concrete dropout 1 | 1.594 (± 1.893) | 6.562 (± 6.473) | 7.917 (± 7.955) | 6.785 (± 6.389) | 0.894 (± 1.158) | 0.410 (± 0.353) |
| Bayesian U‐Net using 3D spatial concrete dropout 2 | 1.592 (± 2.234) | 7.697 (± 7.328) | 9.639 (± 12.120) | 8.399 (± 10.332) | 0.915 (± 0.907) | 0.372 (± 0.325) |
| Bayesian U‐Net using 3D spatial concrete dropout 3 | 2.722 (± 3.390) | 8.331 (± 8.189) | 7.847 (± 10.029) | 6.348 (± 8.180) | 0.951 (± 0.964) | 0.440 (± 0.442) |
| | 1.969* (± 0.652) | | 8.468 (± 1.015) | 7.177 (± 1.080) | | 0.407 (± 0.034) |
| Relative volume difference (ratio) | | | | | | |
| Bayesian U‐Net using 3D spatial dropout 1 ( | 0.038 (± 0.037) | 0.153 (± 0.167) | 0.073 (± 0.072) | 0.069 (± 0.079) | 0.091 (± 0.116) | 0.075 (± 0.078) |
| Bayesian U‐Net using 3D spatial dropout 2 ( | 0.046 (± 0.065) | 0.130 (± 0.121) | 0.062 (± 0.070) | 0.061 (± 0.059) | 0.098 (± 0.105) | 0.056 (± 0.060) |
| Bayesian U‐Net using 3D spatial dropout 3 ( | 0.038 (± 0.035) | 0.132 (± 0.132) | 0.066 (± 0.082) | 0.063 (± 0.074) | 0.103 (± 0.112) | 0.077 (± 0.067) |
| | 0.041 (± 0.005) | 0.138 (± 0.013) | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout 1 | 0.038 (± 0.047) | 0.106 (± 0.115) | 0.068 (± 0.062) | 0.071 (± 0.057) | 0.077 (± 0.090) | 0.090 (± 0.077) |
| Bayesian U‐Net using 3D spatial concrete dropout 2 | 0.039 (± 0.043) | 0.122 (± 0.127) | 0.082 (± 0.096) | 0.084 (± 0.090) | 0.077 (± 0.062) | 0.078 (± 0.059) |
| Bayesian U‐Net using 3D spatial concrete dropout 3 | 0.046 (± 0.038) | 0.134 (± 0.137) | 0.068 (± 0.072) | 0.064 (± 0.073) | 0.077 (± 0.068) | 0.092 (± 0.092) |
| | | | 0.072 (± 0.008) | 0.073 (± 0.010) | 0.308 (± 0.400) | 0.087 (± 0.008) |
Note: Three repeated 4‐fold cross‐validations were conducted on each Bayesian U‐Net method. The obtained segmentation endpoint values are presented as mean ± SD (over the entire cohort) for each cross‐validation. Highest Dice score values and lowest 95th Hausdorff distance, average symmetric surface distance, absolute and relative volume difference values are shown in bold. Friedman tests were used to compare the distributions of the segmentation endpoint values obtained at each cross‐validation (per Bayesian U‐Net method). Significant differences (p‐values < 0.05) are displayed with (*).
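Several of the unlabeled summary rows in the table appear consistent with the mean and sample standard deviation of the three cross-validation means listed above them. The short check below reproduces two Dice-score cells for the spatial dropout method this way. Reading the summary rows as "mean ± SD across the three cross-validations" is an inference from the numbers, not something the excerpt states, and not every summary cell matches it.

```python
from statistics import mean, stdev

def summarize(values):
    """Mean and sample SD of per-run means, rounded to three decimals."""
    return round(mean(values), 3), round(stdev(values), 3)

# Dice scores of the three spatial dropout runs, copied from the table above.
external_csf = [0.814, 0.820, 0.821]
cerebellum = [0.886, 0.904, 0.905]

csf_summary = summarize(external_csf)        # (0.818, 0.004), matches the table
cerebellum_summary = summarize(cerebellum)   # (0.898, 0.011), matches the table
```

A small spread across repeated cross-validations, as in these two cells, indicates that the result does not depend strongly on how the folds were drawn.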
Stability of the uncertainty results of the Bayesian U‐Net using 3D spatial dropout (p = 0.1) and Bayesian U‐Net using 3D spatial concrete dropout
| | Recall | NPV | Accuracy | AUC |
|---|---|---|---|---|
| Bayesian U‐Net using 3D spatial dropout 1 ( | 0.956 (± 0.033) | 0.998 (± 0.004) | 0.906 (± 0.032) | 0.906 (± 0.064) |
| Bayesian U‐Net using 3D spatial dropout 2 ( | 0.953 (± 0.032) | 0.998 (± 0.004) | 0.908 (± 0.030) | 0.941 (± 0.026) |
| Bayesian U‐Net using 3D spatial dropout 3 ( | 0.950 (± 0.034) | 0.998 (± 0.004) | 0.909 (± 0.032) | 0.925 (± 0.033) |
| | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout 1 | 0.953 (± 0.037) | 0.998 (± 0.005) | 0.906 (± 0.032) | 0.949 (± 0.031) |
| Bayesian U‐Net using 3D spatial concrete dropout 2 | 0.956 (± 0.034) | 0.998 (± 0.005) | 0.905 (± 0.024) | 0.893 (± 0.090) |
| Bayesian U‐Net using 3D spatial concrete dropout 3 | 0.950 (± 0.037) | 0.998 (± 0.005) | 0.902 (± 0.043) | 0.912 (± 0.079) |
| | | | 0.902 (± 0.002) | 0.912* (± 0.028) |
Note: Three repeated 4‐fold cross‐validations were conducted on each Bayesian U‐Net method. The obtained uncertainty endpoint values are presented as mean ± SD (over the entire cohort). Highest values are shown in bold. Friedman tests were used to compare the distributions of the uncertainty endpoint values obtained at each cross‐validation (per Bayesian U‐Net method). Significant differences (p‐values <0.05) are displayed with (*).
Segmentation endpoint values of the DenseNet, U‐Net, ensemble learning using several DenseNets and U‐Nets, and Bayesian U‐Net using 3D spatial dropout (p = 0.1) for each volume‐of‐interest
| | Lateral ventricles | External CSF | White matter | Cortical gray matter | Cerebellum | Brainstem |
|---|---|---|---|---|---|---|
| Dice score (ratio) | | | | | | |
| DenseNet | 0.814*+ (± 0.213) | 0.732*+ (± 0.136) | 0.848*+ (± 0.080) | 0.750*+ (± 0.143) | 0.527*+ (± 0.233) | 0.610*+ (± 0.166) |
| U‐Net | 0.944+ (± 0.041) | 0.822 (± 0.110) | 0.898* (± 0.057) | 0.841* (± 0.084) | 0.904 (± 0.062) | |
| Ensemble learning using several DenseNets and U‐Nets | 0.942*+ (± 0.042) | 0.820 (± 0.111) | 0.901 (± 0.054) | 0.843 (± 0.086) | 0.882* (± 0.095) | 0.893 (± 0.060) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.942 (± 0.052) | 0.820 (± 0.112) | | | 0.904+ (± 0.061) | 0.906 (± 0.033) |
| Bayesian U‐Net using 3D spatial concrete dropout | | | 0.900* (± 0.054) | 0.840* (± 0.085) | | 0.900* (± 0.033) |
| 95th Hausdorff distance (mm) | | | | | | |
| DenseNet | 11.452*+ (± 10.594) | 5.234*+ (± 7.668) | 2.264*+ (± 1.555) | 2.087*+ (± 1.793) | 29.618*+ (± 19.223) | 27.258*+ (± 16.143) |
| U‐Net | 2.756 (± 7.703) | 2.541 (± 3.877) | 1.221 (± 0.766) | 1.100 (± 0.838) | 1.681 (± 1.054) | 2.055 (± 3.967) |
| Ensemble learning using several DenseNets and U‐Nets | 2.341*+ (± 3.860) | 3.209+ (± 6.190) | | 1.102 (± 0.820) | 1.928 (± 1.309) | 2.128 (± 4.029) |
| Bayesian U‐Net using 3D spatial dropout ( | 3.108 (± 8.631) | 2.837 (± 4.839) | 1.193 (± 0.769) | | | |
| Bayesian U‐Net using 3D spatial concrete dropout | | | 1.226 (± 0.849) | 1.130* (± 0.841) | 2.365 (± 5.064) | 2.087 (± 3.868) |
| Average symmetric surface distance (mm) | | | | | | |
| DenseNet | 2.376*+ (± 2.390) | 1.095*+ (± 1.116) | 0.676*+ (± 0.413) | 0.612*+ (± 0.487) | 6.540*+ (± 4.646) | 5.456*+ (± 3.784) |
| U‐Net | 0.486 (± 0.818) | 0.480 (± 0.445) | 0.352 (± 0.204) | 0.305* (± 0.174) | 0.577 (± 0.425) | 0.462 (± 0.352) |
| Ensemble learning using several DenseNets and U‐Nets | 0.449*+ (± 0.396) | 0.566 (± 0.691) | 0.342 (± 0.188) | 0.298* (± 0.165) | 0.682* (± 0.561) | 0.507* (± 0.393) |
| Bayesian U‐Net using 3D spatial dropout ( | | | | | | |
| Bayesian U‐Net using 3D spatial concrete dropout | 0.371 (± 0.302) | 0.458 (± 0.390) | 0.346 (± 0.194) | 0.293* (± 0.162) | 0.651 (± 0.613) | 0.507* (± 0.395) |
| Absolute volume difference (cm³) | | | | | | |
| DenseNet | 12.671*+ (± 24.936) | 15.398*+ (± 17.117) | 15.597*+ (± 14.814) | 16.439*+ (± 19.628) | 4.289*+ (± 3.752) | 3.669*+ (± 3.672) |
| U‐Net | 2.051*+ (± 2.666) | 7.248+ (± 6.631) | 8.465* (± 10.191) | 6.977 (± 8.801) | 1.147+ (± 1.361) | 0.310* (± 0.321) |
| Ensemble learning using several DenseNets and U‐Nets | 2.166*+ (± 2.579) | 8.716 (± 8.515) | 7.698 (± 8.625) | 5.922 (± 7.571) | 1.496+ (± 1.865) | 0.489* (± 0.631) |
| Bayesian U‐Net using 3D spatial dropout ( | 1.691 (± 2.001) | 8.507+ (± 7.952) | | | 1.088+ (± 1.191) | |
| Bayesian U‐Net using 3D spatial concrete dropout | | | 7.917 (± 7.955) | 6.785 (± 6.389) | | 0.410* (± 0.353) |
| Relative volume difference (ratio) | | | | | | |
| DenseNet | 0.673*+ (± 1.855) | 0.235*+ (± 0.306) | 0.131*+ (± 0.114) | 0.160*+ (± 0.170) | 0.368*+ (± 0.302) | 0.737*+ (± 0.694) |
| U‐Net | 0.046 (± 0.051) | 0.117 (± 0.115) | 0.073* (± 0.084) | 0.070 (± 0.078) | 0.093+ (± 0.099) | 0.065 (± 0.073) |
| Ensemble learning using several DenseNets and U‐Nets | 0.054*+ (± 0.068) | 0.134+ (± 0.142) | 0.068 (± 0.070) | 0.064 (± 0.071) | 0.128+ (± 0.145) | 0.098* (± 0.105) |
| Bayesian U‐Net using 3D spatial dropout ( | 0.046 (± 0.065) | 0.130+ (± 0.121) | | | 0.098+ (± 0.105) | |
| Bayesian U‐Net using 3D spatial concrete dropout | | | 0.068 (± 0.062) | 0.071* (± 0.057) | | 0.090* (± 0.077) |
Note: Values of the segmentation endpoints are presented as mean ± SD (over the entire cohort). Highest Dice score values and lowest 95th Hausdorff distance, average symmetric surface distance, absolute and relative volume difference values are shown in bold. Wilcoxon tests were used to compare the segmentation endpoint values of the Bayesian U‐Net using 3D spatial dropout (p = 0.1) to those of the other methods (alternative hypothesis is “greater” for the Dice score and “smaller” for the other endpoints). Significant differences (p‐values <0.05) are displayed with (*). Wilcoxon tests were also used to compare the segmentation endpoint values of the Bayesian U‐Net using 3D spatial concrete dropout to those of the other methods (alternative hypothesis was set to “greater” for the Dice score and “smaller” for the other endpoints). Significant differences (p‐values <0.05) are displayed with (+).
FIGURE 3 Example of the MRI, the segmentations, and the uncertainty map of one subject for the DenseNet, U‐Net, ensemble learning using several DenseNets and U‐Nets, Bayesian U‐Net using 3D spatial dropout (p = 0.1), and Bayesian U‐Net using 3D spatial concrete dropout. The Dice scores of the lateral ventricles for this subject were 0.898, 0.977, 0.980, 0.982, and 0.984 for the DenseNet, U‐Net, ensemble learning using several DenseNets and U‐Nets, Bayesian U‐Net using 3D spatial dropout (p = 0.1), and Bayesian U‐Net using 3D spatial concrete dropout, respectively. Uncertainty values below the threshold mark voxels where the model is certain of its prediction; values above the threshold mark voxels where it is uncertain.
FIGURE 4 Cumulative histograms of the Dice score of the external CSF for the DenseNet, U‐Net, ensemble learning using several DenseNets and U‐Nets, Bayesian U‐Net using 3D spatial dropout (p = 0.1), and Bayesian U‐Net using 3D spatial concrete dropout. The dashed lines indicate the number of subjects with a Dice score of the external CSF and lateral ventricles below 0.75.
FIGURE 5 Cumulative histograms of the Dice score of the lateral ventricles for the DenseNet, U‐Net, ensemble learning using several DenseNets and U‐Nets, Bayesian U‐Net using 3D spatial dropout (p = 0.1), and Bayesian U‐Net using 3D spatial concrete dropout. The dashed lines indicate the number of subjects with a Dice score of the external CSF and lateral ventricles below 0.75.
FIGURE 6 Example of a subject with a Dice score of the external CSF below 0.75. The Dice scores of the external CSF for this subject were 0.572, 0.649, 0.652, 0.642, and 0.665 for the DenseNet, U‐Net, ensemble learning using several DenseNets and U‐Nets, Bayesian U‐Net using 3D spatial dropout (p = 0.1), and Bayesian U‐Net using 3D spatial concrete dropout, respectively. Uncertainty values below the threshold mark voxels where the model is certain of its prediction; values above the threshold mark voxels where it is uncertain.