Lu Zhang, Yicheng Jiang, Zhe Jin, Wenting Jiang, Bin Zhang, Changmiao Wang, Lingeng Wu, Luyan Chen, Qiuying Chen, Shuyi Liu, Jingjing You, Xiaokai Mo, Jing Liu, Zhiyuan Xiong, Tao Huang, Liyang Yang, Xiang Wan, Ge Wen, Xiao Guang Han, Weijun Fan, Shuixing Zhang.
Abstract
BACKGROUND: Transcatheter arterial chemoembolization (TACE) is the mainstay of therapy for intermediate-stage hepatocellular carcinoma (HCC); yet its efficacy varies between patients with the same tumor stage. Accurate prediction of TACE response remains a major concern to avoid overtreatment. Thus, we aimed to develop and validate an artificial intelligence system for real-time automatic prediction of TACE response in HCC patients based on digital subtraction angiography (DSA) videos via a deep learning approach.
Keywords: DSA videos; Deep learning; Hepatocellular carcinoma; Transcatheter arterial chemoembolization
Year: 2022 PMID: 35549776 PMCID: PMC9101835 DOI: 10.1186/s40644-022-00457-3
Source DB: PubMed Journal: Cancer Imaging ISSN: 1470-7330 Impact factor: 5.605
Fig. 1 Flowchart of patient inclusion/exclusion for the two centers
Fig. 2 Workflow of DSA-Net. The DSA-Net procedure comprises image acquisition, key frame selection, and construction of the segmentation network and the prediction network. The segmentation network consists of a temporal difference learning module, a liver region segmentation sub-network, and a final fusion segmentation sub-network. The prediction network includes a ResNet18 for image data and a multi-layer perceptron for tabular data
Baseline characteristics of the training and validation cohorts
| Variables | All patients (n = 605) | Training cohort (n = 360) | Internal validation cohort (n = 124) | External validation cohort (n = 121) |
|---|---|---|---|---|
| Age (years) | 55.0 ± 11.9 | 55.06 ± 12.2 | 55.5 ± 11.6 | 54.5 ± 11.3 |
| Sex (male) | 548 (90.6) | 330 (91.7) | 107 (86.3) | 111 (91.7) |
| HBV + | 530 (87.6) | 306 (85.0) | 107 (86.3) | 112 (92.6) |
| BCLC stage | ||||
| A | 67 (11.1) | 36 (10.0) | 16 (12.9) | 15 (12.4) |
| B | 538 (88.9) | 324 (90.0) | 108 (87.1) | 106 (87.6) |
| Child-Pugh score | ||||
| 5 | 437 (72.2) | 263 (73.1) | 100 (80.6) | 74 (61.2) |
| 6 | 106 (17.5) | 62 (17.2) | 15 (12.1) | 29 (24.0) |
| 7 | 38 (6.3) | 21 (5.8) | 8 (6.5) | 9 (7.4) |
| 8 | 17 (2.8) | 10 (2.8) | 2 (1.6) | 7 (5.8) |
| 9 | 7 (1.2) | 4 (1.1) | 1 (0.8) | 2 (1.7) |
| Ascites | 43 (7.1) | 24 (6.7) | 7 (5.6) | 12 (9.9) |
| PT (s) | 12.8 ± 5.9 | 13.1 ± 7.6 | 12.3 ± 1.3 | 12.5 ± 1.6 |
| TBIL (μmol/L) | 18.6 ± 16.6 | 19.1 ± 18.6 | 16.7 ± 8.7 | 17.5 ± 13.4 |
| ALB (g/L) | 40.7 ± 23.9 | 41.8 ± 30.7 | 40.2 ± 4.7 | 38.0 ± 5.3 |
| AST (≥40, IU/L) | 351 (58.0) | 230 (63.9) | 69 (55.6) | 52 (43.0) |
| ALT (≥40, IU/L) | 311 (51.4) | 198 (55.0) | 69 (55.6) | 44 (36.4) |
| CRP (≥1, mg/L) | 313 (86.9) | 313 (86.9) | 108 (87.1) | 96 (79.3) |
| AFP (≥200, ng/ml) | 267 (44.1) | 172 (47.8) | 51 (41.1) | 44 (36.4) |
| Treatment response | ||||
| Responders | 335 (55.4) | 176 (48.9) | 69 (55.6) | 90 (74.4) |
| Non-responders | 270 (44.6) | 184 (51.1) | 55 (44.4) | 31 (25.6) |
| Combined with other treatment (yes) | 388 (64.1) | 237 (65.8) | 81 (65.3) | 70 (57.9) |
| Rounds of TACE (≥2) | 523 (86.4) | 320 (88.9) | 104 (83.9) | 99 (81.8) |
Qualitative variables are given as n (%) and quantitative variables as mean ± SD, where appropriate. HBV Hepatitis B virus, BCLC Barcelona Clinic Liver Cancer, AFP α-Fetoprotein, PT Prothrombin time, TBIL Total bilirubin, ALB Albumin, AST Aspartate aminotransferase, ALT Alanine aminotransferase, CRP C-reactive protein
Performance of segmentation models in the validation cohorts
| Cohort | Model | Dice | Accuracy (%) | Patient-level sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Lesion-level sensitivity (%) | FPR (%) |
|---|---|---|---|---|---|---|---|---|---|
| Internal validation cohort | Baseline | 0.71 (0.68–0.74) | 96.8 (96.4–97.2) | 80.2 (77.6–82.7) | 98.2 (97.8–98.5) | 75.5 (72.3–78.5) | 98.1 (97.8–98.4) | 84.8 (81.9–87.8) | 34.0 (30.5–37.4) |
| | Baseline + TDL | 0.72 (0.70–0.75) | 96.7 (96.3–97.0) | 83.3 (81.0–85.6) | 97.7 (97.3–98.0) | 73.4 (70.2–76.3) | 98.4 (98.2–98.7) | 81.8 (78.6–85.0) | 32.2 (28.8–35.5) |
| | Baseline + LRS | 0.73 (0.70–0.76) | 97.0 (96.6–97.4) | 80.0 (77.3–82.8) | 98.5 (98.2–98.7) | 75.5 (72.3–78.4) | 98.0 (97.7–98.4) | 83.2 (80.1–86.3) | 22.7 (19.3–26.1) |
| | FFS | 0.75 (0.73–0.78) | 97.1 (96.8–97.5) | 82.3 (79.8–84.8) | 98.4 (98.1–98.6) | 77.9 (75.0–80.6) | 98.3 (97.9–98.6) | 87.2 (84.4–89.9) | 23.8 (20.4–27.3) |
| External validation cohort | Baseline | 0.71 (0.68–0.73) | 96.8 (96.5–97.2) | 73.1 (70.5–75.5) | 99.3 (99.2–99.4) | 83.3 (81.3–85.5) | 97.2 (96.8–97.6) | 90.1 (87.7–92.5) | 34.7 (31.7–37.7) |
| | Baseline + TDL | 0.72 (0.70–0.75) | 96.6 (96.3–97.0) | 86.6 (84.8–88.4) | 97.9 (97.6–98.1) | 71.0 (68.5–73.7) | 98.5 (98.1–98.8) | 92.0 (89.8–94.2) | 44.0 (40.9–47.0) |
| | Baseline + LRS | 0.71 (0.69–0.74) | 96.9 (96.6–97.2) | 78.2 (75.6–80.4) | 98.7 (98.5–98.9) | 77.0 (74.2–79.6) | 97.9 (97.6–98.2) | 92.5 (90.4–94.7) | 34.1 (31.0–37.3) |
| | FFS | 0.73 (0.71–0.75) | 97.1 (96.7–97.4) | 79.0 (76.4–81.1) | 98.7 (98.5–98.9) | 79.6 (76.9–82.0) | 98.1 (97.8–98.4) | 94.3 (92.4–96.2) | 30.8 (27.9–33.7) |
Data in parentheses are 95% confidence intervals
TDL Temporal difference learning, LRS Liver region segmentation, FFS Final fusion segmentation, PPV Positive predictive value, NPV Negative predictive value, FPR False-positives ratio
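The overlap and classification metrics named in the table can be illustrated with a minimal sketch. It assumes pixel-wise definitions over a pair of binary masks; the record does not state how the patient-level and lesion-level figures or the FPR were aggregated, so those are not reproduced here. The function names `confusion_counts` and `segmentation_metrics` are hypothetical.

```python
def confusion_counts(truth, pred):
    """Count TP, FP, TN, FN over two equally sized binary masks (lists of rows of 0/1)."""
    tp = fp = tn = fn = 0
    for t_row, p_row in zip(truth, pred):
        for t, p in zip(t_row, p_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def segmentation_metrics(truth, pred):
    """Pixel-wise Dice, accuracy, sensitivity, specificity, PPV and NPV (assumed definitions)."""
    tp, fp, tn, fn = confusion_counts(truth, pred)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

For example, calling `segmentation_metrics` on a toy ground-truth mask `[[1, 1, 0, 0], [0, 1, 0, 0]]` and a toy prediction `[[1, 0, 0, 0], [0, 1, 1, 0]]` (made-up values, not study data) gives two true positives, one false positive, and one false negative, so Dice is 4/6 and accuracy is 6/8.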
Fig. 3 A comparison of image segmentation algorithms in the validation cohorts. Ground truth and predicted tumor masks are labeled in yellow and cyan-blue, respectively. Compared with the other algorithms, the FFS model produced the fewest false positives and missed segmentations in the following four situations: multiple lesions (patient 1), a small lesion < 3 cm (patient 2), a small lesion < 3 cm with prominent surrounding stomach and intestine images (patient 3), and poor image quality (patient 4). TDL, temporal difference learning; LRS, liver region segmentation; FFS, final fusion segmentation
Performance of predictive models in the validation cohorts
| Cohort | Input | Model | AUC | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|---|---|
| Internal validation cohort | KF | ResNet | 0.681 (0.637–0.725) | 70.4 (65.3–75.0) | 47.3 (39.5–55.2) | 88.9 (84.5–93.0) | 77.2 (69.1–85.4) | 67.9 (61.5–73.5) |
| | Clinical data | MLP | 0.670 (0.623–0.719) | 67.7 (62.9–72.6) | 60.0 (52.5–67.3) | 73.9 (67.6–79.7) | 64.7 (56.7–71.8) | 69.9 (63.9–75.7) |
| | KF+KF*Pred+Pred | ResNet | 0.733 (0.687–0.779) | 73.7 (69.4–78.8) | 69.7 (62.8–76.8) | 76.8 (70.5–82.2) | 70.6 (63.5–77.4) | 76.1 (70.1–81.9) |
| | KF+KF*Pred+Pred+clinical data | ResNet+MLP | 0.782 (0.738–0.826) | 78.2 (74.2–82.3) | 77.6 (70.7–84.0) | 78.7 (72.9–84.1) | 74.4 (67.2–81.4) | 81.5 (75.9–86.8) |
| | KF+KF*GT+GT | ResNet | 0.727 (0.684–0.770) | 74.7 (70.2–78.8) | 55.2 (47.4–63.2) | 90.3 (86.3–94.1) | 82.0 (75.0–89.0) | 71.6 (65.9–77.2) |
| | KF+KF*GT+GT+clinical data | ResNet+MLP | 0.802 (0.759–0.847) | 80.4 (76.3–84.4) | 78.8 (72.2–84.9) | 81.6 (76.5–86.6) | 77.4 (70.6–83.7) | 82.8 (77.4–88.0) |
| External validation cohort | KF | ResNet | 0.628 (0.573–0.684) | 69.4 (64.5–74.6) | 49.5 (38.6–59.4) | 76.2 (71.2–81.5) | 41.4 (32.1–50.4) | 81.6 (77.0–86.0) |
| | Clinical data | MLP | 0.593 (0.529–0.646) | 63.1 (58.5–68.3) | 51.6 (42.1–63.1) | 67.0 (61.6–72.7) | 34.8 (26.8–42.8) | 80.3 (75.0–86.1) |
| | KF+KF*Pred+Pred | ResNet | 0.712 (0.672–0.753) | 73.9 (69.6–78.5) | 46.7 (39.2–54.2) | 95.7 (92.8–98.1) | 89.5 (83.0–96.0) | 69.2 (63.8–74.5) |
| | KF+KF*Pred+Pred+clinical data | ResNet+MLP | 0.670 (0.612–0.726) | 75.1 (70.2–79.5) | 50.5 (40.0–61.5) | 83.5 (79.2–87.7) | 51.1 (40.6–61.4) | 83.2 (78.9–87.5) |
| | KF+KF*GT+GT | ResNet | 0.575 (0.536–0.614) | 76.2 (71.6–80.6) | 19.4 (12.1–28.3) | 95.6 (93.0–97.9) | 60.0 (41.2–78.3) | 77.7 (73.3–81.9) |
| | KF+KF*GT+GT+clinical data | ResNet+MLP | 0.817 (0.777–0.856) | 77.9 (73.8–82.0) | 89.2 (82.2–95.2) | 74.0 (68.1–79.2) | 53.9 (46.2–61.8) | 95.3 (92.4–97.7) |
Data in parentheses are 95% confidence intervals
AUC Area under the curve, PPV Positive predictive value, NPV Negative predictive value, KF Key frame, Pred Segmentation result from Model 1, GT Segmentation result from ground truth, MLP Multi-layer perceptron
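The input labels in the table (for example "KF+KF*Pred+Pred") can be read as a three-part image input. The sketch below shows one plausible reading, and it is an assumption rather than the authors' confirmed preprocessing: "*" is taken as element-wise multiplication of the key frame by a binary mask, and "+" as stacking the results as separate channels. The function name `build_three_channel_input` is hypothetical.

```python
def build_three_channel_input(key_frame, mask):
    """Stack [KF, KF*mask, mask] as three channels (assumed reading of 'KF+KF*Pred+Pred').

    key_frame: 2D list of pixel intensities.
    mask: 2D list of 0/1 values (predicted or ground-truth segmentation).
    """
    same_shape = len(key_frame) == len(mask) and all(
        len(k_row) == len(m_row) for k_row, m_row in zip(key_frame, mask)
    )
    if not same_shape:
        raise ValueError("key_frame and mask must have the same shape")

    frame_channel = [[float(k) for k in row] for row in key_frame]
    # Element-wise product keeps intensities inside the mask and zeroes the rest.
    masked_channel = [
        [float(k) * float(m) for k, m in zip(k_row, m_row)]
        for k_row, m_row in zip(key_frame, mask)
    ]
    mask_channel = [[float(m) for m in row] for row in mask]
    return [frame_channel, masked_channel, mask_channel]
```

Under this reading, swapping the predicted mask for the ground-truth mask yields the "KF+KF*GT+GT" input with no other change to the pipeline.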
Fig. 4 Kaplan-Meier curves of 3-year PFS for responders and non-responders in the validation cohort. The two response groups were defined by models built on (a) clinical data only; (b) key frames of DSA videos and segmentation results; and (c) key frames of DSA videos, segmentation results, and clinical data. PFS, progression-free survival; DSA, digital subtraction angiography
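For readers unfamiliar with how such curves are built, the following is a minimal sketch of the standard Kaplan-Meier product-limit estimator. It is a generic illustration, not the authors' analysis code; the function name `kaplan_meier` and the follow-up values in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up durations (e.g. months).
    events: 1 if progression was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        progressed = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            progressed += data[i][1]
            leaving += 1
            i += 1
        if progressed:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1 - progressed / at_risk
            curve.append((t, survival))
        # Both progressed and censored subjects leave the risk set afterwards.
        at_risk -= leaving
    return curve
```

With made-up follow-up times `[3, 5, 5, 8, 12]` and event flags `[1, 1, 0, 1, 0]`, the estimate steps down to about 0.8, 0.6, and 0.3 at times 3, 5, and 8. Running the function separately on predicted responders and non-responders gives the two curves compared in each panel.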