Yingtao Zhang, Min Xian, Heng-Da Cheng, Bryar Shareef, Jianrui Ding, Fei Xu, Kuan Huang, Boyu Zhang, Chunping Ning, Ying Wang.
Abstract
Breast ultrasound (BUS) image segmentation is challenging and critical for BUS computer-aided diagnosis (CAD) systems. Many BUS segmentation approaches have been studied over the last two decades, but most have been assessed on relatively small private datasets with different quantitative metrics, which makes performance comparisons inconsistent. Therefore, there is a pressing need for a benchmark that compares existing methods objectively on a public dataset, determines the performance of the best breast tumor segmentation algorithms available today, and investigates which segmentation strategies are valuable in clinical practice and theoretical study. In this work, a benchmark for B-mode breast ultrasound image segmentation is presented. In the benchmark, (1) we collected 562 breast ultrasound images and proposed standardized procedures to obtain accurate annotations from four radiologists; (2) we extensively compared the performance of 16 state-of-the-art segmentation methods and demonstrated that most deep learning-based approaches achieved high Dice similarity coefficient values (DSC ≥ 0.90) and outperformed conventional approaches; (3) we proposed a losses-based approach to evaluate the sensitivity of semi-automatic segmentation to user interactions; and (4) we discussed successful segmentation strategies and possible future improvements in detail.
Keywords: benchmark; breast ultrasound (BUS) images; computer-aided diagnosis (CAD); segmentation
Year: 2022 PMID: 35455906 PMCID: PMC9025635 DOI: 10.3390/healthcare10040729
Source DB: PubMed Journal: Healthcare (Basel) ISSN: 2227-9032
Recently published approaches.
| Article | Type | Year | Category | Dataset Size/Availability | Metrics |
|---|---|---|---|---|---|
| Kuo et al. | S | 2014 | Deformable models | 98/private | DSC |
| Liu et al. | S | 2010 | Level set-based | 79/private | TP, FP, SI |
| Xian et al. | F | 2015 | Graph-based | 184/private | TPR, FPR, SI, HD, MD |
| Shao et al. | F | 2015 | Graph-based | 450/private | TPR, FPR, SI |
| Huang et al. | S | 2014 | Graph-based | 20/private | ARE, TPVF, FPVF, FNVF |
| Xian et al. | F | 2014 | Graph-based | 131/private | SI, FPR, AHE |
| Gao et al. | S | 2012 | Normalized cut | 100/private | TP, FP, SI, HD, MD |
| Hao et al. | F | 2012 | CRF + DPM | 480/private | JI |
| Moon et al. | S | 2014 | Fuzzy C-means | 148/private | Sensitivity and FP |
| Shan et al. | F | 2012 | Neutrosophic L-mean | 122/private | TPR, FPR, FNR, SI, HD, and MD |
| Hao et al. | F | 2012 | Hierarchical SVM + CRF | 261/private | JI |
| Jiang et al. | S | 2012 | Adaboost + SVM | 112/private | Mean overlap ratio |
| Shan et al. | F | 2012 | Feedforward neural network | 60/private | TPR, FPR, FNR, HD, MD |
| Pons et al. | S | 2014 | SVM + DPM | 163/private | Sensitivity, ROC area |
| Yang et al. | S | 2012 | Naive Bayes classifier | 33/private | FP |
| Torbati et al. | S | 2014 | Feedforward neural network | 30/private | JI |
| Huang et al. | F | 2020 | Deep CNNs | 325/private + 562/public | TPR, FPR, JI, DSC, AER, AHE, AME |
| Huang et al. | F | 2018 | Deep CNNs + CRF | 325/private | TPR, FPR, IoU |
| Shareef et al. | F | 2020 | Deep CNNs | 725/public | TPR, FPR, JI, DSC, AER, AHE, AME |
| Liu et al. | S | 2012 | Cellular automata | 205/private | TPR, FPR, FNR, SI |
| Gómez et al. | S | 2010 | Watershed | 50/private | Overlap ratio, NRV and PD |
F: fully automatic, S: semi-automatic, SVM: support vector machine, CRF: conditional random field, DPM: deformable part model, CNNs: convolutional neural networks, TP: true positive, FP: false positive, SI: similarity index, HD: Hausdorff distance, MD: mean distance, DSC: Dice similarity coefficient, JI: Jaccard index, ROC: receiver operating characteristic, ARE: average radial error, TPVF: true positive volume fraction, FPVF: false positive volume fraction, FNVF: false negative volume fraction, NRV: normalized residual value, PD: proportional distance, TPR: true positive ratio, FPR: false positive ratio, FNR: false negative ratio, and IoU: intersection over union.
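To make the area error metrics used throughout the tables concrete, the sketch below computes TPR, FPR, JI, and DSC from a pair of binary masks. The function name is illustrative, not from the paper; note that BUS papers commonly normalize the false positive ratio by the ground-truth area rather than by the background area, and that convention is assumed here.

```python
import numpy as np

def area_error_metrics(pred, gt):
    """Area error metrics between a predicted and a ground-truth binary mask.

    pred, gt: 2-D arrays, nonzero = tumor pixel.
    Returns (TPR, FPR, JI, DSC).
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # tumor pixels correctly segmented
    fp = np.logical_and(pred, ~gt).sum()   # background pixels labeled as tumor
    fn = np.logical_and(~pred, gt).sum()   # tumor pixels missed
    tpr = tp / gt.sum()                    # true positive ratio
    fpr = fp / gt.sum()                    # FP normalized by GT area (common BUS convention)
    ji = tp / (tp + fp + fn)               # Jaccard index (IoU)
    dsc = 2 * tp / (pred.sum() + gt.sum()) # Dice similarity coefficient
    return tpr, fpr, ji, dsc
```

Under this convention FPR can exceed 1 when the over-segmented area is larger than the tumor itself, which is why values such as 1.06 appear in the overall results table.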
Figure 1. Breast ultrasound images collected using different devices. BUS images produced using (a) GE VIVID 7, (b) GE LOGIQ E9, (c) Siemens ACUSON S2000, and (d) Hitachi EUB-6500.
Figure 2. Ground truth generation.
Figure 3. Average segmentation results of [22] using ROIs with different looseness ratios (LRs).
Figure 4. Average segmentation results of [4] using ROIs with different looseness ratios (LRs).
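The paper does not include code for generating ROIs at different looseness ratios. One plausible construction, assumed here purely for illustration, expands the tight ground-truth bounding box symmetrically so that each side is scaled by the LR (LR = 1.0 gives the tight box, LR = 2.0 doubles each side, clipped to the image):

```python
import numpy as np

def roi_from_gt(gt_mask, lr):
    """Hypothetical ROI generator: scale the GT bounding box by looseness ratio lr.

    gt_mask: 2-D array, nonzero = tumor pixel.
    Returns (y0, y1, x0, x1) half-open slice bounds, clipped to the image.
    """
    ys, xs = np.nonzero(gt_mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    h, w = y1 - y0, x1 - x0
    dy = int(round((lr - 1) * h / 2))  # grow each side by (lr-1)/2 of its length
    dx = int(round((lr - 1) * w / 2))
    H, W = gt_mask.shape
    return (max(0, y0 - dy), min(H, y1 + dy),
            max(0, x0 - dx), min(W, x1 + dx))
```

Sweeping lr over 1.1, 1.3, ..., 2.9 with such a helper would reproduce the ten ROI settings evaluated in the table below.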
Quantitative results of [4,22] using 10 LRs of ROI.
| Methods | LR | Ave. TPR | Ave. FPR | Ave. JI | Ave. DSC | Ave. AER | Ave. HE | Ave. MAE | Ave. Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| [ ] | 1.1 | 0.73 (0.23) | | 0.67 (0.20) | 0.78 (0.18) | 0.35 (0.22) | 45.4 (31.6) | 12.6 (10.9) | |
| | 1.3 | 0.79 (0.18) | 0.10 (0.12) | 0.72 (0.16) | 0.82 (0.14) | 0.31 (0.19) | | 10.9 (8.9) | 22 |
| | 1.5 | 0.82 (0.15) | 0.13 (0.14) | | | | 44.0 (28.3) | | 27 |
| | 1.7 | 0.83 (0.15) | 0.17 (0.18) | | 0.83 (0.12) | 0.33 (0.20) | 48.3 (32.2) | 10.9 (8.0) | 27 |
| | 1.9 | 0.85 (0.14) | 0.20 (0.21) | 0.72 (0.14) | 0.83 (0.12) | 0.36 (0.23) | 51.3 (35.3) | 11.2 (7.9) | 30 |
| | 2.1 | 0.86 (0.14) | 0.24 (0.25) | 0.71 (0.15) | 0.82 (0.13) | 0.39 (0.27) | 54.9 (38.8) | 11.7 (8.4) | 30 |
| | 2.3 | 0.86 (0.13) | 0.27 (0.28) | 0.70 (0.15) | 0.82 (0.12) | 0.41 (0.29) | 57.0 (41.7) | 12.1 (8.8) | 36 |
| | 2.5 | | 0.32 (0.33) | 0.69 (0.16) | 0.80 (0.13) | 0.46 (0.34) | 61.3 (44.2) | 13.1 (10.5) | 39 |
| | 2.7 | | 0.35 (0.36) | 0.68 (0.17) | 0.79 (0.14) | 0.48 (0.36) | 62.1 (43.3) | 13.4 (9.5) | 40 |
| | 2.9 | 0.86 (0.17) | 0.40 (0.41) | 0.66 (0.19) | 0.77 (0.17) | 0.54 (0.44) | 66.2 (46.1) | 14.6 (10.7) | 44 |
| [ ] | 1.1 | 0.70 (0.10) | | 0.70 (0.09) | 0.82 (0.07) | 0.31 (0.09) | 35.8 (17.0) | 11.1 (5.3) | 487 |
| | 1.3 | 0.76 (0.09) | 0.02 (0.03) | 0.75 (0.08) | 0.85 (0.06) | 0.26 (0.09) | 32.0 (15.6) | 9.1 (4.6) | 467 |
| | 1.5 | 0.79 (0.08) | 0.03 (0.04) | 0.77 (0.08) | 0.87 (0.05) | | 29.9 (15.0) | 8.1 (4.2) | 351 |
| | 1.7 | 0.82 (0.09) | 0.05 (0.06) | | | | 29.5 (16.5) | 7.8 (4.8) | 341 |
| | 1.9 | 0.84 (0.09) | 0.07 (0.07) | | | | | 7.6 (5.3) | 336 |
| | 2.1 | 0.86 (0.08) | 0.10 (0.09) | | | 0.24 (0.13) | 29.5 (18.4) | 7.7 (5.2) | 371 |
| | 2.3 | 0.87 (0.09) | 0.13 (0.12) | 0.78 (0.11) | 0.87 (0.08) | 0.26 (0.16) | 31.3 (21.9) | 8.3 (6.4) | 343 |
| | 2.5 | 0.89 (0.09) | 0.16 (0.14) | 0.77 (0.11) | 0.87 (0.08) | 0.28 (0.17) | 31.9 (20.1) | 8.5 (6.1) | 365 |
| | 2.7 | | 0.20 (0.15) | 0.75 (0.11) | 0.85 (0.08) | 0.31 (0.18) | 34.1 (20.2) | 9.2 (5.9) | 343 |
| | 2.9 | | 0.25 (0.18) | 0.73 (0.12) | 0.84 (0.10) | 0.35 (0.22) | 36.9 (21.8) | 10.2 (6.7) | 388 |
The values in ‘( )’ are the standard deviations, and the best performance in each column is highlighted in bold.
Overall performance of all approaches.
| Methods | Ave. TPR | Ave. FPR | Ave. JI | Ave. DSC | Ave. AER | Ave. HE | Ave. MAE | Ave. Time (s) |
|---|---|---|---|---|---|---|---|---|
| FCN-AlexNet | | 0.34/-- | 0.74/-- | 0.84/-- | 0.39/-- | 25.1/-- | 7.1/-- | 5.8 |
| SegNet | 0.94/-- | 0.16/-- | 0.82/-- | 0.89/-- | 0.22/-- | 21.7/-- | 4.5/-- | 12.1 |
| U-Net | 0.92/-- | 0.14/-- | 0.83/-- | 0.90/-- | 0.22/-- | 26.8/-- | 4.9/-- | 2.15 |
| CE-Net | 0.91/-- | 0.13/-- | 0.83/-- | 0.90/-- | 0.22/-- | 21.6/-- | 4.5/-- | 2.0 |
| MultiResUNet | 0.93/-- | 0.11/-- | 0.84/-- | 0.91/-- | 0.19/-- | | 4.1/-- | 6.5 |
| RDAU NET | 0.91/-- | 0.11/-- | 0.84/-- | 0.91/-- | 0.19/-- | 19.3/-- | 4.1/-- | 3.5 |
| SCAN | 0.91/-- | 0.11/-- | 0.83/-- | 0.90/-- | 0.20/-- | 26.9/-- | 4.9/-- | 4.1 |
| DenseU-Net | 0.91/-- | 0.16/-- | 0.81/-- | 0.88/-- | 0.25/-- | 25.3/-- | 5.5/-- | 3.5 |
| STAN | 0.92/-- | 0.09/-- | 0.85/-- | 0.91/-- | 0.18/-- | 18.9/-- | | 5.8 |
| Xian et al. | 0.81/0.91 | 0.16/0.10 | 0.72/0.84 | 0.83/-- | 0.36/-- | 49.2/24.4 | 12.7/5.8 | 3.5 |
| Shan et al. | 0.81/0.93 | 1.06/0.13 | 0.60/-- | 0.70/-- | 1.25/-- | 107.6/18.9 | 26.6/5.0 | 3.0 |
| Shao et al. | 0.67/0.81 | 0.18/0.12 | 0.61/0.74 | 0.71/-- | 0.51/-- | 69.2/50.2 | 21.3/13.4 | 3.5 |
| Fuzzy FCN | 0.94/-- | 0.08/-- | | 0.92/-- | | 19.8/-- | 4.2/-- | 6.0 |
| Huang et al. | 0.93/0.93 | | | 0.87/0.87 | 0.15/0.15 | 26.0/26.0 | 4.9/4.9 | 6.5 |
| Liu et al. | 0.82/0.94 | 0.13/0.08 | 0.73/0.87 | 0.84/-- | 0.31/-- | 44.0/26.3 | 10.4/-- | 27.0 |
| Liu et al. | 0.84/0.94 | | 0.79/0.88 | 0.88/-- | 0.23/-- | 29.0/25.1 | 7.6/-- | 336.0 |
The values before the slashes are approaches’ performances on the proposed dataset, and after the slashes are their performances reported in the original publications. Notation ‘--’ indicates that the corresponding metric was not reported in the original paper. The best performance in each column is highlighted in bold.
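The boundary error metrics (HE and MAE) in the tables measure how far the predicted tumor boundary deviates from the ground-truth boundary. A minimal sketch of both, assuming boundaries are given as point sets and a symmetric averaging convention for MAE (the exact convention may differ from the paper's):

```python
import numpy as np

def boundary_errors(bA, bB):
    """Hausdorff error (HE) and mean absolute boundary error (MAE).

    bA, bB: (N, 2) and (M, 2) arrays of boundary point coordinates.
    Brute-force O(N*M) pairwise distances; fine for single-image boundaries.
    """
    # Pairwise Euclidean distances between every point of A and every point of B.
    d = np.sqrt(((bA[:, None, :] - bB[None, :, :]) ** 2).sum(-1))
    ab = d.min(axis=1)                 # each A point to its nearest B point
    ba = d.min(axis=0)                 # each B point to its nearest A point
    he = max(ab.max(), ba.max())       # Hausdorff: worst-case deviation
    mae = (ab.mean() + ba.mean()) / 2  # symmetric mean boundary deviation
    return he, mae
```

Because HE is a worst-case measure, a single stray boundary segment can inflate it dramatically, which explains why HE values in the tables vary far more across methods than MAE.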