Yongkai Liu1,2, Qi Miao1,3, Chuthaporn Surawech1,4, Haoxin Zheng1,5, Dan Nguyen6, Guang Yang7, Steven S Raman1, Kyunghyun Sung1,2.
Abstract
Whole-prostate gland (WPG) segmentation plays a significant role in prostate volume measurement, treatment, and biopsy planning. This study evaluated a previously developed automatic WPG segmentation method, the deep attentive neural network (DANN), on a large, continuous patient cohort to test its feasibility in a clinical setting. With IRB approval and HIPAA compliance, the study cohort included 3,698 3T MRI scans acquired between 2016 and 2020. In total, 335 MRI scans were used to train the model, and 3,210 and 100 were used for the qualitative and quantitative evaluation of the model, respectively. In addition, the DANN-enabled prostate volume estimation was evaluated on 50 MRI scans in comparison with manual prostate volume estimation. For qualitative evaluation, two abdominal radiologists visually graded the WPG segmentation, and DANN demonstrated either acceptable or excellent performance in over 96% of the testing cohort on the WPG and on each prostate sub-portion (apex, midgland, or base). The two radiologists reached substantial agreement on WPG and midgland segmentation (κ = 0.75 and 0.63) and moderate agreement on apex and base segmentation (κ = 0.56 and 0.60). For quantitative evaluation, DANN demonstrated a Dice similarity coefficient of 0.93 ± 0.02, significantly higher than other baseline methods, such as DeepLab v3+ and UNet (both p values < 0.05). For the volume measurement, 96% of the evaluation cohort achieved differences between the DANN-enabled and manual volume measurement within the 95% limits of agreement. In conclusion, the study showed that the DANN achieved sufficient and consistent WPG segmentation on a large, continuous study cohort, demonstrating its great potential to serve as a tool to measure prostate volume.
Keywords: deep attentive neural network; large cohort evaluation; prostate segmentation; qualitative evaluation; quantitative evaluation; volume measurement
Year: 2021 PMID: 34993152 PMCID: PMC8724207 DOI: 10.3389/fonc.2021.801876
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
T2-weighted TSE MRI sequence parameters in the study.
| View | Axial | Coronal |
|---|---|---|
| Matrix size | 320 × 320 | 320 × 320 |
| Flip angle | 160° | 147° |
| Resolution (mm3) | 0.625 × 0.625 × 3.6 | 0.625 × 0.625 × 3.6 |
| Field of view (mm2) | 200 × 200 | 200 × 200 |
| Repetition time (ms) | 3,000–7,480 | 2,880–7,200 |
| Echo time (ms) | 97–112 | 97–109 |
| Number of slices | 20 | 20 |
| Scan time (s) | 200 | 200 |
ms, millisecond; s, second; mm, millimeter.
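The voxel size in the table above is what turns a segmentation mask into a volume. The following is a minimal sketch of voxel counting, one plausible way to obtain a segmentation-based volume; the source does not state the exact computation used. The function name `mask_volume_ml` and the toy mask are illustrative assumptions.

```python
# Voxel spacing (mm) taken from the axial T2-weighted protocol in the table.
VOXEL_MM = (0.625, 0.625, 3.6)


def mask_volume_ml(mask, spacing=VOXEL_MM):
    """Volume in mL of a binary 3D mask given as nested lists (slices -> rows -> columns)."""
    voxel_mm3 = spacing[0] * spacing[1] * spacing[2]
    n_voxels = sum(1 for sl in mask for row in sl for v in row if v)
    return n_voxels * voxel_mm3 / 1000.0  # 1 mL = 1000 mm^3


# Consistency check on the table: field of view / matrix size = in-plane resolution.
in_plane_mm = 200 / 320

# Hypothetical toy mask: 10 slices, each with a 64 x 64 block of foreground voxels.
toy_mask = [[[1] * 64 for _ in range(64)] for _ in range(10)]
volume = mask_volume_ml(toy_mask)
print(f"in-plane resolution: {in_plane_mm} mm, toy volume: {volume:.1f} mL")
```

Because the slice thickness (3.6 mm) is much larger than the in-plane resolution, a one-slice error at the apex or base changes the estimate far more than a one-pixel boundary error within a slice.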
Data characteristics of the training, qualitative evaluation, quantitative evaluation, and volume evaluation datasets.
| | | Training dataset | Qualitative evaluation dataset | Quantitative evaluation dataset | Volume evaluation dataset |
|---|---|---|---|---|---|
| Number of MRI scans | | 335 | 3,210 | 100 | 50 |
| Number of patients with endorectal coil | | 3 | 84 | 0 | 0 |
| MRI scans by scanner model | Skyra | 295 | 2,806 | 93 | 45 |
| | Prisma | 10 | 145 | 4 | 3 |
| | Vida | 30 | 259 | 3 | 2 |
Figure 1. The overall workflow of automatic WPG segmentation with DANN. Both axial and coronal T2W images were used as input; the coronal images were used to select the axial images containing the prostate gland. DANNcor was first applied to the two middle coronal images (red borders). Next, the green lines derived from the prostate segmentation on the coronal images determined which axial slices were selected (green borders). Once the axial images were selected, DANNax was applied to them to segment the WPG.
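The slice-selection step in the workflow can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function name `select_axial_slices`, the assumption that coronal row 0 lines up with the top of axial slice 0, and the toy masks are all hypothetical. The spacings come from the sequence parameters table (0.625 mm in-plane, 3.6 mm slice thickness).

```python
def select_axial_slices(coronal_masks, row_spacing_mm=0.625,
                        axial_thickness_mm=3.6, n_axial_slices=20):
    """Return indices of axial slices covered by the prostate on coronal masks.

    coronal_masks: list of 2D binary masks (nested lists), rows ordered
    superior -> inferior. Assumes (hypothetically) that row 0 is aligned
    with the superior edge of axial slice 0.
    """
    # Rows that contain prostate on any of the coronal masks.
    rows = sorted({i for mask in coronal_masks
                   for i, row in enumerate(mask) if any(row)})
    if not rows:
        return []  # no prostate found on the coronal images
    top_mm = rows[0] * row_spacing_mm
    bottom_mm = (rows[-1] + 1) * row_spacing_mm
    first = int(top_mm // axial_thickness_mm)
    # Small epsilon so an extent ending exactly on a slice edge does not spill over.
    last = int((bottom_mm - 1e-9) // axial_thickness_mm)
    first = min(first, n_axial_slices - 1)
    last = min(last, n_axial_slices - 1)
    return list(range(first, last + 1))


def toy_coronal_mask(first_row, last_row, n_rows=115, n_cols=8):
    """Hypothetical coronal mask with prostate on rows first_row..last_row."""
    return [[1 if first_row <= r <= last_row else 0 for _ in range(n_cols)]
            for r in range(n_rows)]


# Two middle coronal images with slightly different prostate extents.
masks = [toy_coronal_mask(42, 79), toy_coronal_mask(40, 77)]
selected = select_axial_slices(masks)
print(selected)
```

Taking the union of the extents from both coronal images is a conservative choice here: it keeps any axial slice that either coronal segmentation suggests contains prostate.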
Description of each visual grade for qualitative segmentation evaluation.
| Score | Visual scoring description |
|---|---|
| 3 | The segmentation is excellent. The vast majority (>90%) of the prostate region is correctly segmented, and the method fails on fewer than 10% of prostate slices. |
| 2 | The segmentation is acceptable. Most of the prostate region (>70%) is correctly segmented, and the method fails on fewer than 30% of prostate slices. |
| 1 | The segmentation is unacceptable. More than 30% of the prostate region is missed or wrongly segmented, or the method fails on more than 30% of prostate slices. |
Figure 2. Typical examples for each visual grade. Rows (A–C) each show two segmentation examples with visual grades 3 (excellent), 2 (acceptable), and 1 (unacceptable), respectively. Slices 1–20 are MRI slices ordered from superior to inferior. The regions enclosed by the contours are the whole prostate gland.
Figure 3. The proportion of segmentations with acceptable or excellent performance as evaluated by radiologists 1 and 2 among all MRI scans (n = 3,210). Kappa statistics between the two readers are also provided in the figure.
Confusion matrices between the visual grades assigned by two readers.
| All scans: Reader 1 visual grade | Reader 2: grade 1 | Reader 2: grade 2 | Reader 2: grade 3 | Kappa (κ) |
|---|---|---|---|---|
| 1 | 47 (1.5) | 1 (0.0) | 0 (0.0) | Substantial agreement |
| 2 | 22 (0.7) | 99 (3.1) | 49 (1.5) | |
| 3 | 0 (0.0) | 63 (2.0) | 2,929 (91.3) | |
Values are numbers of MRI scans, with percentages of all 3,210 scans in parentheses. Kappa coefficient (κ) is used to measure the inter-rater agreement between the two readers.
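As a worked check, Cohen's kappa can be computed directly from the confusion matrix above. The sketch below is a plain-Python illustration with a hypothetical helper `cohen_kappa`; the source does not state whether an unweighted or weighted kappa was used, so both are shown. With these counts the unweighted value is about 0.68, while the linearly weighted value is about 0.75, which matches the κ reported for the WPG in the abstract.

```python
def cohen_kappa(matrix, weights=None):
    """Cohen's kappa for a square confusion matrix (rows: reader 1, columns: reader 2).

    weights=None gives the unweighted kappa; weights="linear" gives the
    linearly weighted kappa for ordinal grades.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    def agreement_weight(i, j):
        if weights == "linear":
            return 1.0 - abs(i - j) / (k - 1)
        return 1.0 if i == j else 0.0

    # Observed and chance-expected (weighted) agreement.
    p_obs = sum(agreement_weight(i, j) * matrix[i][j]
                for i in range(k) for j in range(k)) / n
    p_exp = sum(agreement_weight(i, j) * row_tot[i] * col_tot[j]
                for i in range(k) for j in range(k)) / (n * n)
    return (p_obs - p_exp) / (1.0 - p_exp)


# Counts from the confusion matrix (visual grades 1, 2, 3).
wpg = [
    [47, 1, 0],
    [22, 99, 49],
    [0, 63, 2929],
]
kappa_unweighted = cohen_kappa(wpg)
kappa_linear = cohen_kappa(wpg, weights="linear")
print(f"unweighted: {kappa_unweighted:.2f}, linear-weighted: {kappa_linear:.2f}")
```

The gap between the two values reflects the class imbalance: over 91% of scans received grade 3 from both readers, so chance agreement is high and the few grade 1 versus grade 2 disagreements weigh heavily on the unweighted statistic.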
Figure 4. Confusion matrices of the prostate base, midgland, and apex for the cases without excellent segmentation (n = 281).
Figure 5. Confusion matrices of the visual grades of segmentation on MRI scans with and without endorectal coils. Kappa coefficient (κ) is used to measure the inter-rater variability between the two readers.
Quantitative DSC comparisons with baseline methods.
| Methods | DSC |
|---|---|
| Proposed method | 0.93 ± 0.02 |
| DeepLab v3+ | 0.92 ± 0.02 |
| UNet | 0.91 ± 0.03 |
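For reference, the Dice similarity coefficient (DSC) reported in the table is defined as 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The following is a minimal sketch on flattened binary masks; the helper name `dice_coefficient` and the toy masks are illustrative, and the convention of returning 1.0 for two empty masks is an assumption.

```python
def dice_coefficient(pred, ref):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical toy masks: 3 foreground voxels each, 2 of them shared.
pred_mask = [1, 1, 1, 0, 0, 0]
ref_mask = [0, 1, 1, 1, 0, 0]
dsc = dice_coefficient(pred_mask, ref_mask)
print(f"DSC = {dsc:.3f}")
```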
Figure 6. Bland–Altman plot showing the agreement between manual and DANN-enabled WPG volume measurements.
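The quantities behind a Bland–Altman plot are the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the paired differences. The sketch below uses the standard definition; the volumes are made-up illustrative numbers, not data from the study, and the function name `bland_altman` is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired prostate volumes in mL (NOT from the study).
manual_ml = [40.0, 55.2, 32.1, 78.4, 61.0, 45.5]
dann_ml = [41.2, 54.0, 33.0, 76.9, 62.3, 44.8]

bias, lower, upper = bland_altman(dann_ml, manual_ml)
diffs = [d - m for d, m in zip(dann_ml, manual_ml)]
within = sum(lower <= d <= upper for d in diffs) / len(diffs)
print(f"bias = {bias:.2f} mL, limits = [{lower:.2f}, {upper:.2f}] mL, "
      f"fraction within limits = {within:.2f}")
```

The `within` fraction is the same kind of summary as the 96% figure quoted in the abstract: the share of cases whose difference falls inside the 95% limits of agreement.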
Inference time estimation and DSCs obtained with and without coronal segmentation assistance.
| | Without coronal segmentation assistance | With coronal segmentation assistance |
|---|---|---|
| Overall inference time estimation in the qualitative evaluation | 16.4 min (67,775) | 12.6 min (45,713) |
| DSCs obtained in the quantitative evaluation | 0.93 | 0.93 |
Values in parentheses indicate the total number of MRI slices the method needed to segment.
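The savings implied by the table can be worked out directly from its four numbers. This short calculation uses only the values reported above; the variable names are illustrative. It shows that coronal assistance removed roughly a third of the axial slices and roughly a quarter of the overall inference time.

```python
# Values from the inference time table.
slices_without, slices_with = 67_775, 45_713
minutes_without, minutes_with = 16.4, 12.6

# Relative reductions achieved by coronal segmentation assistance.
slice_reduction = 1 - slices_with / slices_without
time_reduction = 1 - minutes_with / minutes_without

print(f"slices skipped: {slices_without - slices_with} "
      f"({slice_reduction:.1%} fewer)")
print(f"time saved: {minutes_without - minutes_with:.1f} min "
      f"({time_reduction:.1%} less)")
```

The time reduction is smaller than the slice reduction, which is consistent with the coronal pass adding some work of its own, although the source does not break the timing down further.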