| Literature DB >> 34941650 |
Aymen Meddeb1, Tabea Kossen2, Keno K Bressem1,3, Bernd Hamm1, Sebastian N Nagel1.
Abstract
The aim of this study was to develop a deep learning-based algorithm for fully automated spleen segmentation on CT images and to evaluate its performance in conditions directly or indirectly affecting the spleen (e.g., splenomegaly, ascites). For this, a 3D U-Net was trained on an in-house dataset (n = 61) including diseases with and without splenic involvement (in-house U-Net), and on an open-source dataset from the Medical Segmentation Decathlon (open dataset, n = 61) without splenic abnormalities (open U-Net). Both datasets were split into a training (n = 32, 52%), a validation (n = 9, 15%) and a testing dataset (n = 20, 33%). The segmentation performance of the two models was measured using four established metrics, including the Dice Similarity Coefficient (DSC). On the open test dataset, the in-house and open U-Net achieved a mean DSC of 0.906 and 0.897, respectively (p = 0.526). On the in-house test dataset, the in-house U-Net achieved a mean DSC of 0.941, whereas the open U-Net obtained a mean DSC of 0.648 (p < 0.001), showing very poor segmentation results in patients with abnormalities in or surrounding the spleen. Thus, for reliable, fully automated spleen segmentation in clinical routine, the training dataset of a deep learning-based algorithm should include conditions that directly or indirectly affect the spleen.
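The DSC used throughout the study is simple to compute from two binary masks. A minimal NumPy sketch (not the authors' code; the function name `dice_coefficient` and the toy volumes are our own):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice Similarity Coefficient between two binary masks (1.0 = perfect overlap)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * intersection / total

# Toy 3D example: two partially overlapping cubes in a small volume
a = np.zeros((8, 8, 8), dtype=bool)
b = np.zeros((8, 8, 8), dtype=bool)
a[1:5, 1:5, 1:5] = True  # 64 voxels
b[2:6, 2:6, 2:6] = True  # 64 voxels, overlap is a 3x3x3 cube = 27 voxels
print(dice_coefficient(a, b))  # 2*27 / (64+64) = 0.421875
```

With a mean DSC of 0.941, the in-house U-Net's masks overlap the ground truth almost completely, whereas 0.648 corresponds to substantial disagreement.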
Keywords: automated segmentation; deep learning; diagnosis; diagnostic techniques and procedures; image processing
Year: 2021 PMID: 34941650 PMCID: PMC8704906 DOI: 10.3390/tomography7040078
Source DB: PubMed Journal: Tomography ISSN: 2379-1381
Table 1. Characteristics of the In-house Dataset.
| | Training and Validation Dataset | Testing Dataset |
|---|---|---|
| Number of Patients | 41 | 20 |
| Female | 23 (56%) | 5 (25%) |
| Age * | 62.6 ± 16.2 | 60.4 ± 15.1 |
| Splenomegaly | 21 (51.2%) | 7 (35%) |
| Liver cirrhosis | 11 (26.8%) | 3 (15%) |
| Lymphoma | 6 (14.6%) | 2 (10%) |
| No pathology | 4 (9.8%) | 1 (5%) |
| Other ** | 21 (51.2%) | 14 (70%) |
Unless otherwise indicated, data are expressed as number of participants. * Data are expressed as mean ± standard deviation; ** other pathologies include lung, pancreas, liver, prostate, and colorectal cancers without direct splenic involvement, in patients in whom splenomegaly or ascites could nevertheless be present.
Figure 1. Study Design: The in-house U-Net was trained and validated on the in-house training and validation dataset, then tested on both test sets. The open U-Net was trained and validated on the open training and validation dataset, then tested on both test sets.
Table 2. Model performances on the in-house and open testing datasets.
| Model | Dataset | | DSC | RAVD * (%) | ASSD (mm) | Hausdorff (mm) |
|---|---|---|---|---|---|---|
| In-house | In-house testing dataset | Mean | 0.941 | 4.203 | 0.772 | 7.137 |
| | | 95% CI | 0.932–0.951 | 2.313–6.094 | 0.644–0.900 | 4.591–9.683 |
| | Open testing dataset | Mean | 0.906 | 9.690 | 0.999 | 8.787 |
| | | 95% CI | 0.873–0.939 | 3.877–15.504 | 0.692–1.307 | 5.563–12.011 |
| Open | In-house testing dataset | Mean | 0.648 | 42.255 | 5.158 | 30.085 |
| | | 95% CI | 0.513–0.784 | 26.503–58.008 | 2.406–7.911 | 15.630–44.539 |
| | Open testing dataset | Mean | 0.897 | 11.488 | 0.982 | 7.569 |
| | | 95% CI | 0.859–0.935 | 5.323–17.653 | 0.693–1.272 | 5.115–10.023 |
SD: standard deviation. 95% CI: 95% confidence interval. DSC: Dice Similarity Coefficient (the higher, the better). RAVD: Relative Absolute Volume Difference (the lower, the better); * no standard deviation is reported because this metric is zero-centered and we used the absolute value. ASSD: Average Symmetric Surface Distance (the lower, the better). Hausdorff: Maximum Hausdorff Distance (the lower, the better).
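The volume- and surface-based metrics in Table 2 can be computed from binary masks with standard array tools. A minimal sketch using SciPy distance transforms (not the authors' implementation; the function names and the assumed isotropic 1 mm voxel spacing are ours — real CT volumes carry their spacing in the image header):

```python
import numpy as np
from scipy import ndimage

def ravd_percent(pred: np.ndarray, truth: np.ndarray) -> float:
    """Relative Absolute Volume Difference in percent (the lower, the better)."""
    v_pred = np.count_nonzero(pred)
    v_truth = np.count_nonzero(truth)
    return abs(v_pred - v_truth) / v_truth * 100.0

def surface_voxels(mask: np.ndarray) -> np.ndarray:
    """Boundary voxels of a binary mask: the mask minus its erosion."""
    return mask & ~ndimage.binary_erosion(mask)

def assd_and_hausdorff(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Average Symmetric Surface Distance and maximum Hausdorff distance.

    Distances are in the units of `spacing` (assumed 1 mm isotropic here).
    """
    sp = surface_voxels(pred)
    st = surface_voxels(truth)
    # Euclidean distance from every voxel to the nearest surface voxel of
    # the *other* mask, via a distance transform of that surface's background.
    d_to_truth = ndimage.distance_transform_edt(~st, sampling=spacing)
    d_to_pred = ndimage.distance_transform_edt(~sp, sampling=spacing)
    both = np.concatenate([d_to_truth[sp], d_to_pred[st]])
    return both.mean(), both.max()

# Toy example: ground-truth cube vs. a prediction shifted by one voxel
gt = np.zeros((10, 10, 10), dtype=bool)
seg = np.zeros((10, 10, 10), dtype=bool)
gt[2:7, 2:7, 2:7] = True
seg[3:8, 2:7, 2:7] = True
print(ravd_percent(seg, gt))  # identical volumes -> 0.0
print(assd_and_hausdorff(seg, gt))
</imports>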
Figure 2. Boxplots showing the segmentation performance of the in-house U-Net and the open U-Net applied to the open and in-house test datasets. The models are compared using the Dice Similarity Coefficient (DSC). Mean (red dashed) and median (green) values are depicted. When applied to the in-house test dataset, the performance of the open U-Net drops markedly and becomes less reliable, as reflected by a lower mean and median as well as a larger spread between the best and worst DSC. Table 2 gives an overview of the results.
Figure 3. Sample images showing segmentation results of the in-house and open U-Net on the in-house and open datasets. On the open dataset, the in-house U-Net and the open U-Net show comparable performance, whereas on the in-house dataset, the open U-Net shows poor segmentation results, especially in patients with splenomegaly.
Table 3. Comparison between our in-house U-Net (in bold) and previous works.
| Method | DSC | Modality | Abnormalities |
|---|---|---|---|
| Gauriau et al. | 0.870 ± 0.150 | Abd. CT | No |
| Wood et al. | 0.873 | Abd. CT | Yes |
| Gloger et al. | 0.906 ± 0.037 | MRI | No |
| **In-house U-Net (this study)** | **0.941** | **Abd. CT** | **Yes** |
| Gibson et al. | 0.950 | Abd. CT | No |
| Linguraru et al. | 0.952 ± 0.014 | Abd. CT | Yes |