Ching-Wei Wang1,2, Yu-Ching Lee2, Cheng-Chang Chang3,4, Yi-Jia Lin5,6, Yi-An Liou1, Po-Chao Hsu3,4, Chun-Chieh Chang1, Aung-Kyaw-Oo Sai1, Chih-Hung Wang7,8, Tai-Kuang Chao5,6.
Abstract
Ovarian cancer is a common malignant gynecological disease. Molecular targeted therapy, i.e., antiangiogenesis with bevacizumab, has been found to be effective in some patients with epithelial ovarian cancer (EOC). Although careful patient selection is essential, there are currently no biomarkers available for routine therapeutic use. To the authors' best knowledge, this is the first automated precision oncology framework to effectively identify and select EOC and peritoneal serous papillary carcinoma (PSPC) patients with positive therapeutic effect. The database, collected from March 2013 to January 2021 in a hospital-based retrospective study, contains four kinds of immunohistochemical tissue samples (AIM2, C3, C5 and NLRP3) from patients diagnosed with EOC and PSPC and treated with bevacizumab. We developed a hybrid deep learning framework and weakly supervised deep learning models for each potential biomarker. The experimental results show that the proposed model in combination with AIM2 achieves accuracy 0.92, recall 0.97, F-measure 0.93 and AUC 0.97 in the first experiment (66% training and 34% testing), and accuracy 0.86 ± 0.07, precision 0.9 ± 0.07, recall 0.85 ± 0.06, F-measure 0.87 ± 0.06 and AUC 0.91 ± 0.05 in the second experiment using five-fold cross-validation. Both Kaplan-Meier PFS analysis and Cox proportional hazards model analysis further confirmed that the proposed AIM2-DL model is able to distinguish patients gaining positive therapeutic effects with low cancer recurrence from patients with disease progression after treatment (p < 0.005).
Keywords: deep learning; ovarian cancer; precision oncology; weakly supervised learning
Year: 2022 PMID: 35406422 PMCID: PMC8996991 DOI: 10.3390/cancers14071651
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Baseline characteristics of data.
| Characteristics | Value |
|---|---|
| Tissue Core | 720 |
| Patient age (mean, range) | (59.1, 23–79) |
| Diagnosis (%) | |
| Papillary serous carcinoma | 444 (61.6) |
| Peritoneal serous papillary carcinoma | 89 (12.3) |
| Clear cell carcinoma | 69 (9.6) |
| Unclassified carcinoma | 69 (9.6) |
| Endometrioid carcinoma | 39 (5.5) |
| MC | 10 (1.4) |
| FIGO stage (%) | |
| I | 69 (9.6) |
| II | 39 (5.4) |
| III | 454 (63) |
| IV | 158 (22) |
| Surgery (%) | |
| Optimal debulking | 306 (42.5) |
| CRS+HIPEC | 217 (30.1) |
| Suboptimal debulking | 197 (27.4) |
| Treatment effectiveness (%) | |
| Effective | 412 (57.2) |
| Invalid | 308 (42.8) |
Data distribution of the collected four datasets w.r.t. immune-related proteins for training and testing for the first experiment.
| | Treatment Outcome | AIM2 | C3 | C5 | NLRP3 |
|---|---|---|---|---|---|
| Training (66%) | Effective | 68 | 68 | 68 | 68 |
| | Invalid | 50 | 50 | 50 | 50 |
| Testing (34%) | Effective | 35 | 35 | 35 | 35 |
| | Invalid | 27 | 27 | 27 | 27 |
| Total | | 180 | 180 | 180 | 180 |
Figure 1. Sample tumor-like tissue selection results with high magnification views at 100 μm of the four kinds of IHC stained data by the proposed weakly supervised tissue selection model.
Figure 2. (a) System workflow: (i) multiresolution pyramid data structure of WSIs; (ii) a TMA core detection model conducts fast localization of tissue cores at a low-resolution level; (iii) a forward-mapping function is applied to fetch the high-resolution core data to be processed by (iv) a robust tumorlike tissue selection model that locates the tumorlike tissues of each core; (v) a backward-mapping function is applied to fetch the medium-resolution tumorlike tissue data of each core to be processed by (vi) a treatment effectiveness classification model that predicts (vii) treatment outcomes. (b) Weakly supervised learning with focusing sampling, boosting learning and boosted data augmentation.
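The forward and backward mapping in steps (iii) and (v) can be read as rescaling coordinates between pyramid levels. The following is a minimal sketch under assumptions that the source does not state: level 0 is taken as full resolution and each higher level is taken to be downsampled by a factor of 2 per axis. The function name `map_box` and the level indices are hypothetical.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def map_box(box: Box, src_level: int, dst_level: int, downsample: float = 2.0) -> Box:
    """Rescale a box from one pyramid level to another.

    Assumption (not from the paper): level 0 is full resolution and each
    higher level is `downsample` times smaller along each axis.
    """
    scale = downsample ** (src_level - dst_level)
    x, y, w, h = box
    return (round(x * scale), round(y * scale), round(w * scale), round(h * scale))


# Forward mapping: a core located at low-resolution level 5, fetched at level 0.
core_low = (10, 20, 30, 30)
core_high = map_box(core_low, src_level=5, dst_level=0)

# Backward mapping: the same region expressed at a medium-resolution level 2.
core_mid = map_box(core_high, src_level=0, dst_level=2)
```

Keeping the mapping as a pure coordinate transform means the detection, selection and classification models can each run at the resolution that suits them without re-reading the whole slide.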
Figure 3. The proposed hybrid deep learning precision oncology framework contains three deep learning networks. (a) The tissue core detection model uses a ResNet-101 backbone, where the RPN proposes a set of low-quality candidate bounding boxes (B0) from an image to determine the occurrence of an object; the subsequent detectors are therefore developed to be more selective for lower-quality candidates. The modules (H) produce samples for training the z-th classifier and detector. Ref. [40] is used as the RPN. (b) The proposed weakly supervised tumorlike tissue selection model is composed of 16 convolution layers (each followed by a ReLU layer), five pooling layers for downsampling, and an upsampling layer. The class map of the core is generated by a deconvolution layer and a SoftMax layer. (c) The proposed treatment effectiveness prediction model uses dimensional reduction and parallel structures of the inception modules and contains three different sizes of convolution and one maximum pooling. For the network output of the previous layer, the channel is aggregated after the convolution operation, and then the nonlinear fusion is performed.
Figure 4. Illustration of the proposed model selection method with three examples: (a) loss values of models through iterations during training; (b) F-measure scores of individual models with different iterations on the training set; (c) F-measure scores of individual models with different iterations on the testing set. The blue lines indicate the measurement values, and the red and yellow lines represent the associated first and second derivatives, respectively. During training, the proposed model selection method computes the loss values and F-measure scores of trained models on the training set and further calculates the associated first and second derivatives. If a stable loss is found in (a), where the first and second derivatives of the training loss converge for a continuous period, the starting point of the stable loss window is used as the starting point of the model selection search. Afterwards, if a stable F-measure is found in (b), where the associated first and second derivatives converge for a continuous period, an early stop mechanism is activated to stop training and end the search window of model selection. Then, a model is selected in the model selection search window by finding the model with the highest F-measure score on the training set. If there are multiple models with the highest score, the one with the largest training iteration time is selected. Green circles highlight the model selection results using the F-measure on the training set in (b) and demonstrate that, overall, the selected models obtain relatively high F-measure scores on the testing set in (c).
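The selection rule described in Figure 4 can be sketched as follows. This is an illustrative reading, not the authors' implementation: "converge" is interpreted here as both finite-difference derivatives staying below a tolerance for a fixed number of consecutive checkpoints, and the names `stable_start`, `select_model`, `tol` and `window`, as well as the example curves, are hypothetical.

```python
from typing import List, Optional, Tuple


def derivatives(values: List[float]) -> Tuple[List[float], List[float]]:
    """Finite-difference first and second derivatives, padded to input length."""
    first = [0.0] + [values[i] - values[i - 1] for i in range(1, len(values))]
    second = [0.0] + [first[i] - first[i - 1] for i in range(1, len(first))]
    return first, second


def stable_start(values: List[float], tol: float, window: int,
                 begin: int = 0) -> Optional[int]:
    """Index where both derivatives first stay below tol for `window` steps."""
    first, second = derivatives(values)
    run = 0
    # Indices 0 and 1 are skipped because their derivatives are padding artifacts.
    for i in range(max(begin, 2), len(values)):
        if abs(first[i]) < tol and abs(second[i]) < tol:
            run += 1
            if run == window:
                return i - window + 1
        else:
            run = 0
    return None


def select_model(losses: List[float], f_scores: List[float],
                 tol: float = 0.01, window: int = 3) -> Optional[int]:
    """Return the checkpoint index chosen by the stable-loss / stable-F rule."""
    start = stable_start(losses, tol, window)
    if start is None:
        return None
    f_start = stable_start(f_scores, tol, window, begin=start)
    if f_start is None:
        return None
    stop = f_start + window - 1  # early stop: end of the search window
    best = max(f_scores[start:stop + 1])
    # Ties are broken in favour of the largest training iteration.
    return max(i for i in range(start, stop + 1) if f_scores[i] == best)


# Hypothetical training curves, one value per saved checkpoint.
losses = [1.0, 0.6, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
f_scores = [0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.9, 0.9, 0.9, 0.9]
chosen = select_model(losses, f_scores)
```

Because only training-set quantities enter the rule, the testing set plays no part in choosing the checkpoint, which matches the description of panels (b) and (c).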
First experiment: Quantitative evaluation in classification of therapeutic outcomes.
| Method in Combination with Potential Biomarker | Accuracy | Precision | Recall | F-Measure | AUC |
|---|---|---|---|---|---|
| Proposed Weakly Supervised DL Method-AIM2 | 0.92 | 0.89 | 0.97 | 0.93 | 0.97 |
| Coudray et al. [44]-AIM2 | 0.90 | | 0.86 | 0.91 | |
| Proposed Weakly Supervised DL Method-C3 | 0.69 | 0.69 | 0.83 | 0.75 | 0.75 |
| Coudray et al. [44]-C3 | 0.90 | 0.91 | 0.91 | 0.91 | 0.94 |
| Proposed Weakly Supervised DL Method-C5 | 0.63 | 0.67 | 0.69 | 0.68 | 0.65 |
| Coudray et al. [44]-C5 | 0.69 | 0.68 | 0.86 | 0.76 | 0.78 |
| Proposed Weakly Supervised DL Method-NLRP3 | 0.52 | 0.56 | 0.71 | 0.63 | 0.50 |
| Coudray et al. [44]-NLRP3 | 0.71 | 0.76 | 0.71 | 0.74 | 0.73 |
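As a consistency check on the table above, the F-measure can be recomputed from precision and recall. The sketch assumes the reported F-measure is the usual F1 score (the harmonic mean of precision and recall), which the source does not spell out; the three rows used are the proposed-method rows for C3, C5 and NLRP3.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1), assumed to be the paper's F-measure."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) taken from the proposed-method rows.
rows = {
    "C3": (0.69, 0.83, 0.75),
    "C5": (0.67, 0.69, 0.68),
    "NLRP3": (0.56, 0.71, 0.63),
}

for name, (p, r, reported) in rows.items():
    print(name, round(f_measure(p, r), 2), "reported:", reported)
```

For each of these rows, the recomputed value rounds to the reported one.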
Figure 5. (a) Receiver operating characteristic (ROC) curves on the testing set for the models of the proposed method and the benchmark approach [44]; (b) Graphs of AUC on the testing set with respect to the iteration times in training among the DL models, showing that the Proposed-AIM2 model generally outperforms other models with the same training time.
Second experiment: 5-fold cross-validation.
| Method in Combination with Potential Biomarker | Accuracy | Precision | Recall | F-Measure | AUC |
|---|---|---|---|---|---|
| Proposed Weakly Supervised DL Method-AIM2 | 0.86 ± 0.07 | 0.9 ± 0.07 | 0.85 ± 0.06 | 0.87 ± 0.06 | 0.91 ± 0.05 |
| Coudray et al. [44]-AIM2 | 0.73 ± 0.17 | 0.76 ± 0.19 | 0.71 ± 0.39 | 0.68 ± 0.33 | 0.9 ± 0.07 |
| Proposed Weakly Supervised DL Method-C3 | 0.75 ± 0.1 | 0.77 ± 0.1 | 0.79 ± 0.1 | 0.78 ± 0.09 | 0.78 ± 0.12 |
| Coudray et al. [44]-C3 | 0.73 ± 0.08 | 0.77 ± 0.06 | 0.74 ± 0.21 | 0.74 ± 0.1 | 0.84 ± 0.09 |
| Proposed Weakly Supervised DL Method-C5 | 0.65 ± 0.03 | 0.66 ± 0.03 | 0.8 ± 0.04 | 0.72 ± 0.02 | 0.66 ± 0.07 |
| Coudray et al. [44]-C5 | 0.56 ± 0.13 | 0.69 ± 0.22 | 0.51 ± 0.3 | 0.54 ± 0.22 | 0.52 ± 0.23 |
| Proposed Weakly Supervised DL Method-NLRP3 | 0.56 ± 0.08 | 0.59 ± 0.05 | 0.77 ± 0.08 | 0.67 ± 0.06 | 0.55 ± 0.08 |
| Coudray et al. [44]-NLRP3 | 0.63 ± 0.18 | 0.68 ± 0.19 | 0.66 ± 0.32 | 0.63 ± 0.28 | 0.73 ± 0.24 |
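The "mean ± spread" entries in the cross-validation table summarize one score per fold. A minimal sketch of that summary follows; the fold scores are hypothetical and are not taken from the paper, and the use of the sample standard deviation is an assumption, since the source does not say which estimator was used.

```python
import statistics
from typing import List


def summarize(fold_scores: List[float]) -> str:
    """Format per-fold scores as 'mean ± sample standard deviation'."""
    mean = statistics.mean(fold_scores)
    spread = statistics.stdev(fold_scores)  # sample SD (n - 1 denominator), assumed
    return f"{mean:.2f} ± {spread:.2f}"


# Hypothetical accuracy values from five folds, for illustration only.
fold_accuracy = [0.70, 0.75, 0.80, 0.85, 0.90]
summary = summarize(fold_accuracy)
```

A large spread relative to the mean, as in several of the benchmark rows, indicates that performance varies considerably from fold to fold.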
Figure 6. (a) Kaplan–Meier PFS and (b) OS analysis for EOC and PSPC patients receiving bevacizumab therapy based on AI prediction outcomes (0: invalid; 1: effective) by the two best models, i.e., proposed method-AIM2 and Coudray et al. [44]-AIM2.
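For readers unfamiliar with the curves in Figure 6, the Kaplan–Meier estimator multiplies, at each observed event time, the fraction of at-risk patients who remain event-free. The sketch below is a plain-Python illustration of the product-limit formula; the follow-up times and event flags are hypothetical and do not come from the study.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event (e.g., progression) was observed at times[i]
    and 0 if the patient was censored at that time.
    """
    curve = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if observed > 0:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up times (months) and event indicators for five patients.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Computing one such curve per predicted group (0: invalid; 1: effective) gives the kind of comparison shown in the figure; the group difference would then be assessed with a separate test, which is not sketched here.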
Multivariate analyses of DL model prediction and clinical factors associated with recurrence.
| | Adjusted HR | p-value |
|---|---|---|
| Age | 1.03 (0.98–1.07) | 0.212 |
| BMI | 1.01 (0.92–1.11) | 0.848 |
| Number of times bevacizumab was used | 0.97 (0.89–1.05) | 0.472 |
| FIGO | 4.23 (0.82–21.86) | 0.085 |
| Histology (others vs. serous) | 0.82 (0.26–2.62) | 0.737 |
| Surgery | | |
| CRS + HIPEC | 1.00 (reference) | reference |
| optimal | 0.92 (0.35–2.42) | 0.869 |
| suboptimal | 1.18 (0.42–3.30) | 0.753 |
| Therapy | | |
| Concurrent therapy | 1.00 (reference) | reference |
| Second-line therapy | 1.71 (0.66–4.41) | 0.265 |
| Maintenance therapy | 0.23 (0.04–1.40) | 0.110 |
| DL model prediction | | |
| Proposed DL Method-AIM2 (effective vs. invalid) | 0.18 (0.06–0.55) | < 0.01 * |
HR = hazard ratio; FIGO = International Federation of Gynecology and Obstetrics; CRS+HIPEC = cytoreductive surgery with hyperthermic intraperitoneal chemotherapy. * The proposed model prediction is useful as an indicator for patient selection with statistical significance (p < 0.01).
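The link between a hazard ratio, its interval and a p-value can be illustrated with a Wald approximation. The sketch assumes that the bracketed ranges in the table are 95% confidence intervals symmetric on the log scale, which the source does not state, so the result is a rough plausibility check rather than a reproduction of the authors' Cox model output. The function name `wald_p_from_ci` is hypothetical.

```python
import math
from typing import Tuple


def wald_p_from_ci(hr: float, lower: float, upper: float,
                   z_crit: float = 1.96) -> Tuple[float, float]:
    """Approximate the Wald z statistic and two-sided p-value from an HR and its CI.

    Assumes a 95% interval (z_crit = 1.96) that is symmetric around log(HR).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# DL model prediction row: adjusted HR 0.18 with interval 0.06 to 0.55.
z_stat, p_value = wald_p_from_ci(0.18, 0.06, 0.55)
```

Under these assumptions the approximate p-value falls below 0.005, in line with the significance level quoted in the abstract and the table footnote.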