Xiaoying Lou1,2, Niyun Zhou3, Lili Feng2,4, Zhenhui Li5, Yuqi Fang3,6, Xinjuan Fan1,2, Yihong Ling7, Hailing Liu1,2, Xuan Zou1,2, Jing Wang1,2, Junzhou Huang3, Jingping Yun7, Jianhua Yao3, Yan Huang1,2.
Abstract
Objective: This study aimed to develop an artificial intelligence model for predicting the pathological complete response (pCR) to neoadjuvant chemoradiotherapy (nCRT) of locally advanced rectal cancer (LARC) using digital pathological images. Background: nCRT followed by total mesorectal excision (TME) is a standard treatment strategy for patients with LARC. Predicting the pCR to nCRT of LARC remains difficult.
Keywords: artificial intelligence; deep learning; neoadjuvant chemoradiotherapy; pathological complete response; rectal cancer
Year: 2022 PMID: 35756653 PMCID: PMC9214314 DOI: 10.3389/fonc.2022.807264
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. The flow diagram of patient enrollment. (A) Primary cohort, (B) External validation cohort.
Figure 2. The proposed deep learning framework (DeepPCR) for pCR prediction. (A) WSIs with tumors annotated by expert pathologists. (B) All WSIs were cropped into small patches with a size of 299×299 pixels at a magnification of 20×. (C) An in-house deep learning-based color normalization method was applied to ensure the color consistency of the cropped patches. (D) Illustration of the proposed DeepPCR model for pCR candidate prediction. Three scales of phenotype feature representations (i.e., patchPR, clusPR, and wsiPR) were integrated to derive the final prediction.
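The patch-cropping step in panel (B) can be illustrated with a minimal sketch. It assumes non-overlapping tiles and drops partial tiles at the image edges; the caption states neither choice, and the function name `patch_origins` and the example region size are hypothetical.

```python
def patch_origins(width, height, patch=299):
    """Top-left (x, y) corners of non-overlapping patch x patch tiles.

    Assumption: tiles do not overlap and partial tiles at the right and
    bottom edges are dropped; the source does not specify either choice.
    """
    return [
        (x, y)
        for y in range(0, height - patch + 1, patch)
        for x in range(0, width - patch + 1, patch)
    ]


# Hypothetical 1000 x 700 pixel annotated region at 20x magnification.
origins = patch_origins(1000, 700)
print(len(origins))   # 3 columns x 2 rows = 6 patches
print(origins[:4])
```

In practice each origin would be passed to a whole-slide image reader to extract the pixel data; that step is omitted here because the source does not name the tooling used.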
Clinicopathological characteristics of patients in the training, testing, and external validation cohorts.
| Characteristic | Training set (n=666): pCR (%) (n=171) | Training set: Non-pCR (%) (n=495) | P value | Testing set (n=117): pCR (%) (n=30) | Testing set: Non-pCR (%) (n=87) | P value | External validation set (n=102): pCR (%) (n=24) | External validation set: Non-pCR (%) (n=78) | P value |
|---|---|---|---|---|---|---|---|---|---|
| | 52.77 ± 12.02 | 54.71 ± 11.78 | 0.078 | 53.90 ± 11.71 | 55.38 ± 11.47 | 0.549 | 54.08 ± 11.01 | 57.17 ± 10.37 | 0.182 |
| | | | 0.849 | | | 0.376 | | | 0.081 |
| | 55 (32.2) | 154 (31.1) | | 8 (26.7) | 32 (36.8) | | 12 (50.0) | 22 (28.2) | |
| | 116 (67.8) | 341 (68.9) | | 22 (73.3) | 55 (63.2) | | 12 (50.0) | 56 (71.8) | |
| | | | | | | | | | |
| | 10 (5.9) | 16 (3.2) | 0.124 | 2 (6.7) | 2 (2.3) | 0.271 | 1 (4.2) | 0 (0.0) | 0.235 |
| | 113 (66.1) | 323 (65.3) | 0.926 | 20 (66.7) | 55 (63.2) | 0.827 | 9 (37.5) | 24 (30.8) | 0.612 |
| | 48 (28.0) | 156 (31.5) | 0.442 | 8 (26.6) | 30 (34.5) | 0.503 | 14 (58.3) | 54 (69.2) | 0.333 |
| | | | | | | | | | |
| | 34 (19.9) | 76 (15.4) | 0.189 | 5 (16.7) | 13 (14.9) | 0.777 | 0 (0.0) | 18 (23.1) | 0.006 |
| | 86 (50.3) | 249 (50.3) | 1 | 13 (43.3) | 40 (46.0) | 0.834 | 17 (70.8) | 43 (55.1) | 0.236 |
| | 51 (29.8) | 170 (34.3) | 0.301 | 12 (40.0) | 34 (39.1) | 1 | 7 (29.2) | 17 (21.8) | 0.582 |
| | | | | | | | | | |
| | 35 (20.5) | 76 (15.3) | 0.167 | 5 (16.6) | 13 (14.9) | 0.777 | 0 (0.0) | 18 (23.1) | 0.006 |
| | 136 (79.5) | 419 (84.7) | 0.124 | 25 (83.4) | 74 (85.1) | 0.777 | 24 (100.0) | 60 (76.9) | 0.006 |
| | | | | | | | | | |
| | 22 (12.9) | 55 (11.1) | 0.579 | 3 (10.0) | 16 (18.4) | 0.394 | 1 (4.2) | 0 (0) | 0.235 |
| | 125 (73.1) | 382 (77.2) | 0.299 | 22 (73.3) | 65 (74.7) | 1 | 23 (95.8) | 71 (91.0) | 0.677 |
| | 24 (14.0) | 58 (11.7) | 0.421 | 5 (16.7) | 6 (6.9) | 0.147 | 0 (0) | 7 (9.0) | 0.194 |
| | 102,728 | | | 18,475 | | | 46,599 | | |
Figure 3. (A, B) AUC-ROC of the four comparative methods in the (A) primary and (B) external validation cohorts (top row). (C, D) AUC-PR of the four comparative methods in the (C) primary and (D) external validation cohorts (middle row). (E, F) DeLong test for the four comparative methods in the (E) primary and (F) external validation cohorts (bottom row). In this work, we used a probability threshold of 0.7 (that is, any patient with a pCR prediction probability greater than 0.7 was reported as a pCR candidate). No significant difference (ns): P > 0.05, *P < 0.05, ***P < 0.001.
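For readers unfamiliar with AUC-ROC, the following minimal sketch computes it as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. This is the standard rank-based definition, not necessarily the implementation used in the study; the function name `auc_roc` and the toy data are hypothetical.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores.

    labels: 1 for the positive class, 0 for the negative class.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (not study data): 3 of 4 positive/negative pairs are ordered correctly.
print(auc_roc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # 0.75
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of this size; larger datasets would normally use a sort-based formulation.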
Results of DeepPCR and the comparative models in the (a) primary and (b) external validation cohorts.
| (a) Model/Outcome | AUC-ROC | AUC-PR | Sen (%) | Spe (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| | 0.403 (0.274, 0.534) | 0.698 (0.591, 0.805) | 72.6 (64.1, 80.3) | 27.2 (18.8, 36.2) | 61.7 (60.0, 71.1) | 37.7 (30.8, 51.2) |
| | 0.544 (0.432, 0.653) | 0.805 (0.717, 0.885) | 68.4 (59.8, 76.9) | 25.8 (17.0, 34.7) | 57.2 (45.2, 69.6) | 27.0 (15.4, 46.8) |
| | 0.627 (0.516, 0.733) | 0.842 (0.762, 0.909) | 69.2 (60.7, 77.8) | 30.4 (20.7, 40.7) | 61.6 (50.5, 73.1) | 37.6 (18.0, 59.4) |
| | 0.710 (0.595, 0.808) | 0.875 (0.795, 0.935) | 72.6 (64.1, 80.3) | 46.9 (32.6, 61.0) | 70.4 (61, 79.9) | 54.0 (35.8, 70.9) |

| (b) Model/Outcome | AUC-ROC | AUC-PR | Sen (%) | Spe (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| | 0.420 (0.293, 0.548) | 0.737 (0.623, 0.846) | 70.6 (61.8, 79.4) | 21.7 (14.2, 30.0) | 57.4 (48.5, 67.4) | 17.6 (14.3, 20.4) |
| | 0.527 (0.402, 0.657) | 0.810 (0.712, 0.895) | 73.5 (64.7, 81.4) | 22.6 (15.3, 31.4) | 57.9 (62.6, 72.4) | 17.8 (17.4, 18.1) |
| | 0.599 (0.474, 0.726) | 0.832 (0.732, 0.919) | 69.6 (60.8, 78.4) | 27.2 (16.3, 38) | 62.3 (49.9, 74.5) | 31.7 (14.8, 54.1) |
| | 0.723 (0.591, 0.844) | 0.887 (0.805, 0.949) | 72.5 (63.7, 81.4) | 62.7 (46.3, 77.3) | 75.8 (67.1, 84.7) | 53.6 (36.8, 68.8) |
The CI value is inside the parentheses. Sen, sensitivity; Spe, specificity; PPV, positive predictive value; NPV, negative predictive value. In this work, we used a probability threshold of 0.7 (that is, any patient with a pCR prediction probability greater than 0.7 was reported as a pCR candidate).
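The thresholding rule described above (a prediction probability strictly greater than 0.7 is reported as a pCR candidate) and the four derived metrics can be sketched as follows. The function name `threshold_metrics` and the toy labels and probabilities are hypothetical and are not study data.

```python
def threshold_metrics(labels, probs, threshold=0.7):
    """Sensitivity, specificity, PPV, and NPV at a probability threshold.

    labels: 1 for pCR, 0 for non-pCR.
    A case is called positive only when its probability is strictly
    greater than the threshold, matching the rule stated in the text.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted_positive = p > threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy example: 3 pCR and 5 non-pCR cases.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.5, 0.75, 0.3, 0.2, 0.6, 0.1]
metrics = threshold_metrics(labels, probs)
print(metrics)
```

With these toy inputs there are 2 true positives, 1 false negative, 1 false positive, and 4 true negatives, so sensitivity and PPV are both 2/3 and specificity and NPV are both 0.8.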
Univariate and multivariate logistic regression analyses.
| (a) Univariate logistic regression | Testing set: P value | Testing set: Exp (B) (95% CI) | External validation set: P value | External validation set: Exp (B) (95% CI) |
|---|---|---|---|---|
| Sex | 0.316 | 1.6 (0.638, 4.011) | 0.051 | 0.393 (0.153, 1.006) |
| Age | 0.051 | 2.679 (0.995, 7.212) | 0.042 | 2.768 (1.039, 7.376) |
| TNM stage | 0.822 | 1.138 (0.369, 3.513) | 0.998 | 0 (0, -) |
| CEA | 0.033 | 2.796 (1.087, 7.197) | 0.029 | 3.667 (1.145, 11.74) |
| CA19-9 | 0.087 | 2.128 (0.896, 5.055) | 0.054 | 2.505 (0.985, 6.37) |
| CRP | 0.198 | 2.348 (0.639, 8.621) | – | |
| LDH | 0.999 | 5.80e8 (0, -) | 0.207 | 2.4 (0.617, 9.339) |
| Lymphocytes | 0.24 | 2.186 (0.593, 8.062) | 0.133 | 2.2 (0.788, 6.146) |
| Neutrophils | 0.414 | 1.524 (0.555, 4.186) | 0.097 | 2.508 (0.846, 7.436) |
| NLR | 0.142 | 3.155 (0.681, 14.623) | 0.04 | 3.045 (1.054, 8.804) |
| Patch-indi | 0.06 | 2.248 (0.967, 5.224) | 0.219 | 1.786 (0.709, 4.5) |
| Patch-comb | 0.053 | 2.548 (0.989, 6.564) | 0.023 | 3.143 (1.171, 8.437) |
| DeepPCR | 0.0001 | 6.125 (2.462, 15.239) | 0.0001 | 7 (2.575, 19.028) |
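
| (b) Multivariate logistic regression | Testing set: P value | Testing set: Exp (B) (95% CI) | External validation set: P value | External validation set: Exp (B) (95% CI) |
|---|---|---|---|---|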
| Sex | 0.143 | 2.45 (0.739, 8.124) | 0.011 | 0.122 (0.024, 0.621) |
| Age | 0.489 | 1.576 (0.434, 5.72) | 0.705 | 1.346 (0.289, 6.261) |
| TNM stage | 0.965 | 1.034 (0.233, 4.582) | 0.998 | 0 (0, -) |
| CEA | 0.101 | 2.718 (0.823, 8.973) | 0.189 | 3.211 (0.564, 18.284) |
| CA19-9 | 0.124 | 2.413 (0.785, 7.415) | 0.059 | 4.137 (0.945, 18.108) |
| CRP | 0.104 | 4.607 (0.732, 29.003) | – | |
| LDH | 0.999 | 2.6e8 (0, -) | 0.118 | 7.334 (0.604, 89.051) |
| Lymphocytes | 0.128 | 3.412 (0.704, 16.539) | 0.203 | 3.418 (0.514, 22.723) |
| Neutrophils | 0.979 | 0.981 (0.239, 4.023) | 0.874 | 0.846 (0.107, 6.699) |
| NLR | 0.138 | 4.242 (0.628, 28.678) | 0.05 | 8.854 (0.995, 78.749) |
| Patch-indi | 0.346 | 1.657 (0.58, 4.732) | 0.831 | 0.855 (0.204, 3.591) |
| Patch-comb | 0.8 | 0.819 (0.175, 3.842) | 0.642 | 1.453 (0.301, 7.023) |
| DeepPCR | 0.008 | 6.879 (1.646, 28.743) | 0.004 | 10.461 (2.138, 51.186) |
(a) Univariate logistic regression analysis of the testing set and external validation set. (b) Multivariate logistic regression analysis of the testing set and external validation set. The covariates were sex, age, TNM stage, CEA, CA19-9, CRP, LDH, lymphocytes, neutrophils, neutrophil-to-lymphocyte ratio (NLR), patch-based individual (patch-indi) model, patch-based combined (patch-comb) model, and DeepPCR model.
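The Exp (B) column reports odds ratios, that is, the exponentiated logistic regression coefficients. The sketch below shows how an odds ratio and its 95% CI relate to a coefficient B and its standard error, assuming a Wald interval (exp(B ± 1.96·SE)); the source does not state which interval method was used, and the helper names are hypothetical. As a consistency check, it recovers B and SE from the univariate DeepPCR row for the testing set, 6.125 (2.462, 15.239), and rebuilds the interval.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald CI from a logistic coefficient and its SE."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


def beta_se_from_ci(odds_ratio, lower, upper, z=Z_95):
    """Recover the coefficient and SE implied by a reported OR and Wald CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Values taken from the univariate DeepPCR row (testing set) in the table above.
beta, se = beta_se_from_ci(6.125, 2.462, 15.239)
or_, lo, hi = odds_ratio_ci(beta, se)
print(round(beta, 3), round(se, 3))
print(round(or_, 3), round(lo, 3), round(hi, 3))
```

The rebuilt bounds agree with the reported ones to within rounding, which is consistent with (but does not prove) a Wald-type interval on the log-odds scale.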
Figure 4. Patch-level feature interpretation in the pCR group (A–E) and non-pCR group (F–J). Patches in the correctly predicted pCR group (A) and correctly predicted non-pCR group (F). PatchPRs were categorized into six phenotype clusters based on t-SNE and the Raster Fairy method, and each grid represented an individual patch (B, G). The importance distribution of the patches in the pCR group (C) and non-pCR group (H). Darker colors represent the patches that played a more important role in pCR or non-pCR prediction. Demonstration of patch importance and the number of patches in each cluster; the size of the bubble represents the number of patches in the corresponding cluster (D, I). Representative patches of cluster 1 (E) and cluster 2 (J) and the part of the WSI from which they were selected.