Xianghua Ye1, Dazhou Guo2, Chen-Kan Tseng3, Jia Ge1, Tsung-Min Hung3, Ping-Ching Pai3, Yanping Ren4, Lu Zheng5, Xinli Zhu1, Ling Peng6, Ying Chen1, Xiaohua Chen7, Chen-Yu Chou3, Danni Chen1, Jiaze Yu8, Yuzhen Chen9, Feiran Jiao10, Yi Xin11, Lingyun Huang11, Guotong Xie11, Jing Xiao11, Le Lu2, Senxiang Yan1, Dakai Jin2, Tsung-Ying Ho9.
Abstract
BACKGROUND: The current clinical workflow for esophageal gross tumor volume (GTV) contouring relies on manual delineation, which incurs high labor costs and inter-user variability.
Keywords: PET/CT (18)F-FDG; deep learning; delineation; esophageal cancer; radiotherapy; segmentation
Year: 2022 PMID: 35141147 PMCID: PMC8820194 DOI: 10.3389/fonc.2021.785788
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Study flow diagram of esophageal gross tumor volume (GTV) segmentation in a multi-institutional setup. CCRT, concurrent chemoradiation therapy; pCT, treatment planning CT.
Figure 2. The two-stream 3D deep learning model for esophageal gross tumor volume (GTV) segmentation using treatment planning CT (pCT) and FDG-PET/CT scans. The pCT stream takes the pCT as input and produces the GTV segmentation prediction. The pCT+PET stream takes both pCT and PET/CT scans as input. It first aligns the PET to the pCT by registering the diagnostic CT (which accompanies the PET scan) to the pCT and applying the resulting deformation field to map the PET to the pCT. It then uses an early fusion module followed by a late fusion module to segment the esophageal GTV using the complementary information in pCT and PET. This workflow can accommodate the differing availability of PET scans across institutions. Although 3D inputs are used, we depict 2D images for clarity.
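The way the two-stream workflow adapts to PET availability can be sketched as a simple dispatch step. This is a minimal illustration only: the `Case` record, its field names, and `select_stream` are hypothetical and are not taken from the study's implementation; the registration and fusion modules themselves are not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """Hypothetical record for one patient (names are illustrative)."""
    case_id: str
    pct_path: str
    pet_path: Optional[str] = None  # None when the institution has no PET/CT


def select_stream(case: Case) -> str:
    """Pick the model stream based on which scans are available.

    Assumption: a case with a PET/CT scan goes to the pCT+PET stream,
    and every other case falls back to the pCT-only stream.
    """
    return "pCT+PET" if case.pet_path is not None else "pCT"


if __name__ == "__main__":
    with_pet = Case("case-001", "case-001_pct.nii", "case-001_pet.nii")
    without_pet = Case("case-002", "case-002_pct.nii")
    print(select_stream(with_pet), select_stream(without_pet))
```

In this sketch, institutions without PET/CT simply never populate `pet_path`, so all of their cases are routed to the pCT stream.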
Subject and imaging characteristics.
| Characteristics | Entire cohort, Institutions 1–4 (n = 606) | Training-validation, Institution 1 (n = 148) | Internal testing, Institution 1 (n = 104) | External testing, Institutions 2–4 (n = 354) |
|---|---|---|---|---|
| Sex | … | … | … | … |
| Male | 537 (89%) | 135 (91%) | 98 (94%) | 304 (86%) |
| Female | 69 (11%) | 13 (9%) | 6 (6%) | 50 (14%) |
| Diagnostic age | 65 [57–72] | 55 [50–61] | 56 [50–62] | 67 [61–75] |
| Clinical T stage | … | … | … | … |
| cT2 | 116 (19%) | 24 (16%) | 18 (17%) | 74 (21%) |
| cT3 | 306 (51%) | 71 (48%) | 58 (56%) | 177 (50%) |
| cT4 | 184 (30%) | 53 (36%) | 28 (27%) | 103 (29%) |
| Tumor location | … | … | … | … |
| Cervical | 81 (13%) | 11 (7%) | 10 (10%) | 60 (17%) |
| Upper third | 204 (34%) | 26 (18%) | 35 (34%) | 143 (40%) |
| Middle third | 325 (54%) | 84 (57%) | 63 (61%) | 178 (50%) |
| Lower third | 174 (29%) | 69 (47%) | 35 (34%) | 70 (20%) |
| BMI | … | … | … | … |
| <18.5 | 121 (20%) | 22 (15%) | 15 (14%) | 84 (24%) |
| 18.5–23.9 | 393 (65%) | 94 (63%) | 59 (57%) | 240 (68%) |
| >24 | 92 (15%) | 32 (22%) | 30 (29%) | 30 (8%) |
| Imaging available | … | … | … | … |
| pCT | 606 (100%) | 148 (100%) | 104 (100%) | 354 (100%) |
| PET/CT | 252 (42%) | 148 (100%) | 104 (100%) | 0 (0%) |
Patients may have tumors located across multiple esophagus regions; hence, total numbers summed at various tumor locations for the entire and sub-institution cohorts are greater than the corresponding total patient numbers. Age is presented as median and [interquartile range].
cT2, clinical T stage 2; cT3, clinical T stage 3; cT4, clinical T stage 4; BMI, body mass index; pCT, treatment planning CT.
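The note on tumor location can be checked with a few lines of arithmetic. The sketch below uses the counts from the table above; the variable names are illustrative. It shows that the location counts sum to more than the number of patients, and that the reported percentages use the patient total as the denominator.

```python
# Tumor-location counts per cohort, copied from the table above.
location_counts = {
    "entire": {"n": 606, "counts": [81, 204, 325, 174]},
    "training_validation": {"n": 148, "counts": [11, 26, 84, 69]},
    "internal_testing": {"n": 104, "counts": [10, 35, 63, 35]},
    "external_testing": {"n": 354, "counts": [60, 143, 178, 70]},
}

# Sum of location counts in each cohort; each exceeds the patient total
# because one tumor can span several esophageal regions.
location_sums = {name: sum(c["counts"]) for name, c in location_counts.items()}

# Percentages for the entire cohort, using patients (not locations) as denominator.
entire = location_counts["entire"]
entire_percentages = [round(100 * k / entire["n"]) for k in entire["counts"]]

if __name__ == "__main__":
    for name, total in location_sums.items():
        print(name, total, ">", location_counts[name]["n"])
    print(entire_percentages)
```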
Quantitative results of esophageal GTV segmentation by the pCT model in the unseen internal testing data.
| Institution 1 (unseen internal testing), pCT model | Unacceptable, n (%) | DSC, mean (95% CI) | HD95 (mm), mean (95% CI) | ASD (mm), mean (95% CI) |
|---|---|---|---|---|
| Total patients (n = 104) | 8 (8%) | 0.81 (0.79, 0.83) | 11.5 (9.2, 13.7) | 2.7 (2.2, 3.3) |
| Clinical T stage | … | … | … | … |
| cT2 (n = 18) | 4 (22%) | 0.76 (0.67, 0.86) | 12.0 (5.5, 18.4) | 3.0 (1.0, 5.1) |
| cT3 (n = 58) | 3 (5%) | 0.82 (0.80, 0.84) | 10.7 (7.9, 13.5) | 2.5 (1.9, 3.2) |
| cT4 (n = 28) | 1 (4%) | 0.82 (0.79, 0.85) | 12.8 (7.9, 17.7) | 3.0 (2.0, 4.0) |
| Tumor location | … | … | … | … |
| Cervical (n = 10) | 1 (10%) | 0.82 (0.75, 0.89) | 9.2 (6.5, 12.0) | 2.2 (1.5, 2.8) |
| Upper third (n = 35) | 1 (3%) | 0.83 (0.81, 0.85) | 9.6 (7.4, 11.9) | 2.2 (1.8, 2.5) |
| Middle third (n = 63) | 5 (8%) | 0.80 (0.78, 0.83) | 12.0 (8.9, 15.0) | 2.9 (2.2, 3.6) |
| Lower third (n = 35) | 2 (6%) | 0.81 (0.77, 0.85) | 13.3 (8.6, 18.0) | 3.3 (2.1, 4.5) |
GTV, gross tumor volume; CI, confidence interval; DSC, Dice similarity coefficient; HD95, 95% Hausdorff distance; ASD, average surface distance; cT2, clinical T stage 2; cT3, clinical T stage 3; cT4, clinical T stage 4; pCT, treatment planning CT.
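For readers unfamiliar with the three metrics, the following is a minimal pure-Python sketch of how DSC, HD95, and ASD can be computed. It makes several assumptions that are not stated in the source: masks are given as sets of voxel indices, surfaces as lists of points, distances are Euclidean in index units (multiply coordinates by voxel spacing to obtain mm), and HD95 is taken as the nearest-rank 95th percentile of the pooled bidirectional surface distances. Other HD95 conventions exist, and the study's own implementation may differ.

```python
import math


def dice(pred, ref):
    """Dice similarity coefficient between two sets of voxel indices."""
    if not pred and not ref:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * len(pred & ref) / (len(pred) + len(ref))


def _directed_distances(src, dst):
    """Distance from each point in src to its nearest point in dst (brute force)."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def surface_metrics(pred_surface, ref_surface):
    """Return (HD95, ASD) from two non-empty lists of surface points.

    Assumption: both directions are pooled before taking the mean (ASD)
    and the nearest-rank 95th percentile (HD95).
    """
    d = sorted(
        _directed_distances(pred_surface, ref_surface)
        + _directed_distances(ref_surface, pred_surface)
    )
    asd = sum(d) / len(d)
    k = max(0, math.ceil(0.95 * len(d)) - 1)  # nearest-rank percentile index
    return d[k], asd


if __name__ == "__main__":
    # Toy masks and surfaces, not study data.
    pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
    ref = {(0, 0, 0), (0, 0, 1), (1, 0, 0)}
    print(dice(pred, ref))
    print(surface_metrics([(0, 0, 0), (0, 0, 2)], [(0, 0, 0), (0, 0, 3)]))
```

The brute-force nearest-point search is quadratic in the number of surface points, so it is only suitable for small illustrations; practical pipelines typically use distance transforms or spatial indexes.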
Quantitative results of esophageal GTV segmentation by the pCT+PET model in the unseen internal testing data.
| Institution 1 (unseen internal testing), pCT+PET model | Unacceptable, n (%) | DSC, mean (95% CI) | HD95 (mm), mean (95% CI) | ASD (mm), mean (95% CI) |
|---|---|---|---|---|
| Total patients (n = 104) | 4 (4%) | 0.83 (0.81, 0.84) | 9.5 (8.0, 10.9) | 2.2 (1.9, 2.5) |
| Clinical T stage | … | … | … | … |
| cT2 (n = 18) | 3 (17%) | 0.77 (0.69, 0.85) | 11.4 (6.3, 16.6) | 2.7 (1.3, 4.2) |
| cT3 (n = 58) | 0 (0%) | 0.84 (0.82, 0.85) | 9.0 (7.0, 11.0) | 2.0 (1.7, 2.4) |
| cT4 (n = 28) | 1 (4%) | 0.84 (0.82, 0.86) | 9.3 (7.3, 11.4) | 2.3 (1.9, 2.6) |
| Tumor location | … | … | … | … |
| Cervical (n = 10) | 1 (10%) | 0.83 (0.78, 0.89) | 9.4 (6.2, 12.7) | 2.0 (1.5, 2.5) |
| Upper third (n = 35) | 0 (0%) | 0.84 (0.82, 0.86) | 8.1 (6.2, 10.0) | 1.9 (1.6, 2.2) |
| Middle third (n = 63) | 3 (5%) | 0.83 (0.81, 0.84) | 9.5 (8.0, 11.1) | 2.2 (1.9, 2.5) |
| Lower third (n = 35) | 0 (0%) | 0.83 (0.79, 0.86) | 10.8 (7.5, 14.0) | 2.6 (1.9, 3.3) |
GTV, gross tumor volume; CI, confidence interval; DSC, Dice similarity coefficient; HD95, 95% Hausdorff distance; ASD, average surface distance; cT2, clinical T stage 2; cT3, clinical T stage 3; cT4, clinical T stage 4; pCT, treatment planning CT.
Figure 4. (A) Performance comparison of the pCT model and the pCT+PET model in the internal testing set (left to right: cT4, cT3, cT3, multifocal cT2). Red, green, and blue show the contours of the ground truth reference, the pCT+PET model prediction, and the pCT model prediction, respectively. (B) Performance examples of the pCT model in the external testing set according to the degree of manual revision (left to right): no revision, >0%–30%, >30%–60%, and unacceptable. Red and blue show the contours of the ground truth reference and the pCT model prediction, respectively. The green arrow points to the uncommon anatomy in the unacceptable case in the rightmost column. pCT, treatment planning CT; DSC, Dice similarity coefficient.
Quantitative results of esophageal GTV segmentation by the pCT model in the multi-institutional external testing data.
| Institutions 2–4 (external multi-institutional testing), pCT model | Unacceptable, n (%) | DSC, mean (95% CI) | HD95 (mm), mean (95% CI) | ASD (mm), mean (95% CI) |
|---|---|---|---|---|
| Total patients (n = 354) | 33 (9%) | 0.80 (0.78, 0.81) | 11.8 (10.1, 13.4) | 2.8 (2.4, 3.2) |
| Clinical T stage | … | … | … | … |
| cT2 (n = 74) | 23 (31%) | 0.71 (0.66, 0.76) | 13.8 (10.0, 17.5) | 3.6 (2.5, 4.8) |
| cT3 (n = 177) | 5 (3%) | 0.81 (0.80, 0.82) | 11.4 (8.8, 13.9) | 2.6 (2.1, 3.2) |
| cT4 (n = 103) | 5 (5%) | 0.82 (0.80, 0.83) | 11.4 (9.3, 13.6) | 2.7 (2.1, 3.3) |
| Tumor location | … | … | … | … |
| Cervical (n = 60) | 4 (6%) | 0.80 (0.78, 0.82) | 11.7 (8.6, 14.8) | 2.5 (1.7, 3.3) |
| Upper third (n = 143) | 11 (8%) | 0.79 (0.77, 0.81) | 12.6 (10.4, 14.9) | 3.0 (2.4, 3.7) |
| Middle third (n = 178) | 14 (8%) | 0.80 (0.78, 0.81) | 11.5 (9.3, 13.5) | 2.9 (2.4, 3.5) |
| Lower third (n = 70) | 5 (7%) | 0.80 (0.78, 0.82) | 15.4 (9.3, 21.5) | 3.3 (2.1, 4.5) |
GTV, gross tumor volume; CI, confidence interval; DSC, Dice similarity coefficient; HD95, 95% Hausdorff distance; ASD, average surface distance; cT2, clinical T stage 2; cT3, clinical T stage 3; cT4, clinical T stage 4; pCT, treatment planning CT.
Figure 3. Expert assessment of the degree of manual revision of the deep model-predicted contours. The table in the top row summarizes the mean and 95% confidence interval (CI) of the Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95), and average surface distance (ASD), stratified by degree of manual revision. The correlations between the mean DSC, HD95, and ASD and the degree of manual revision are plotted in the bottom row. The Spearman correlation coefficient showed that DSC was negatively correlated with the degree of manual revision (R = -0.58, p < 0.001). HD95 and ASD were positively correlated with it, at similar strength (HD95: R = 0.60, p < 0.001; ASD: R = 0.60, p < 0.001). Degree of manual revision was defined as the percentage of gross tumor volume (GTV) slices that needed modification for clinical acceptance. pCT, treatment planning CT.
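The degree of manual revision and its rank correlation with a metric can be sketched as follows. This is an illustrative pure-Python sketch, not the study's analysis code: the function names are hypothetical, the category bins follow the labels used in Figure 4(B), and treating more than 60% of slices as "unacceptable" is an assumption made here, since the source does not state the threshold.

```python
import math


def revision_degree(modified_slices, total_gtv_slices):
    """Percentage of GTV slices that needed modification."""
    return 100 * modified_slices / total_gtv_slices


def revision_category(degree):
    """Bin a revision degree (in percent) into the categories of Figure 4(B)."""
    if degree == 0:
        return "no revision"
    if degree <= 30:
        return ">0%-30%"
    if degree <= 60:
        return ">30%-60%"
    return "unacceptable"  # assumption: anything above 60%


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Toy values, not study data: higher revision degree paired with lower DSC.
    degrees = [0.0, 10.0, 40.0, 80.0]
    dsc = [0.90, 0.85, 0.78, 0.60]
    print([revision_category(d) for d in degrees])
    print(spearman(degrees, dsc))
```

A negative coefficient for DSC and a positive one for distance metrics both indicate the same thing: contours that agree less with the reference need more manual revision.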
Figure 5. Results of multi-user evaluation. (A) Boxplot of Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95), and average surface distance (ASD) comparing the manual contours of four radiation oncologists with our treatment planning CT (pCT)-based deep model-predicted contours. Dotted lines indicate the median DSC, HD95, and ASD of our pCT model. (B) Comparison of DSC and HD95 between the second-time deep learning-assisted contours and the first-time manual contours. R1 to R4 represent the 4 radiation oncologists involved in the multi-user evaluation. DeepModel is our pCT model.
Figure 6. Two qualitative examples (left and right), in sagittal and axial views, comparing the first-time manual contours (top row) with the second-time deep learning-assisted contours (bottom row). Red is the ground truth contour, while green, blue, yellow, and cyan represent the other four radiation oncologists’ contours. The average Dice similarity coefficient (DSC) of the 4 radiation oncologists for their first-time manual contours is 0.76 and 0.75 for the two examples, respectively. The DSC improved to 0.83 and 0.82 for their second-time contours with assistance from the deep learning predictions.