Literature DB >> 35406422

A Weakly Supervised Deep Learning Method for Guiding Ovarian Cancer Treatment and Identifying an Effective Biomarker.

Ching-Wei Wang^1,2, Yu-Ching Lee², Cheng-Chang Chang^3,4, Yi-Jia Lin^5,6, Yi-An Liou¹, Po-Chao Hsu^3,4, Chun-Chieh Chang¹, Aung-Kyaw-Oo Sai¹, Chih-Hung Wang^7,8, Tai-Kuang Chao^5,6.

Abstract

Ovarian cancer is a common malignant gynecological disease. Molecular target therapy, i.e., antiangiogenesis with bevacizumab, was found to be effective in some patients of epithelial ovarian cancer (EOC). Although careful patient selection is essential, there are currently no biomarkers available for routine therapeutic usage. To the authors' best knowledge, this is the first automated precision oncology framework to effectively identify and select EOC and peritoneal serous papillary carcinoma (PSPC) patients with positive therapeutic effect. From March 2013 to January 2021, we have a database, containing four kinds of immunohistochemical tissue samples, including AIM2, c3, C5 and NLRP3, from patients diagnosed with EOC and PSPC and treated with bevacizumab in a hospital-based retrospective study. We developed a hybrid deep learning framework and weakly supervised deep learning models for each potential biomarker, and the experimental results show that the proposed model in combination with AIM2 achieves high accuracy 0.92, recall 0.97, F-measure 0.93 and AUC 0.97 for the first experiment (66% training and 34%testing) and high accuracy 0.86 ± 0.07, precision 0.9 ± 0.07, recall 0.85 ± 0.06, F-measure 0.87 ± 0.06 and AUC 0.91 ± 0.05 for the second experiment using five-fold cross validation, respectively. Both Kaplan-Meier PFS analysis and Cox proportional hazards model analysis further confirmed that the proposed AIM2-DL model is able to distinguish patients gaining positive therapeutic effects with low cancer recurrence from patients with disease progression after treatment (p < 0.005).

Entities: Chemical

Keywords: deep learning; ovarian cancer; precision oncology; weakly supervised learning

Year: 2022 PMID： 35406422 PMCID： PMC8996991 DOI： 10.3390/cancers14071651

Source DB: PubMed Journal: Cancers (Basel) ISSN： 2072-6694 Impact factor: 6.639

1. Introduction

Globally, ovarian cancer is the most common cancer-related cause of death from gynecological tumors in women [1] but lacks methods recommended for screening and early diagnostics of this disease [2]. Around 90% of primary ovarian cancers are of epithelial origin. Epithelial ovarian cancer (EOC) is classified into serous, mucinous, endometrioid, clear cell, transitional cell, mixed epithelial, undifferentiated and unclassified subtype [3]. Peritoneal serous papillary carcinoma (PSPC), though managed according to EOC therapeutic principles, has been variably considered as an EOC counterpart [4]. The current standardized treatment for EOC is optimal cytoreductive surgery plus platinum-based chemotherapy. However, with the development of chemotherapy-resistant and refractory diseases, the sensitivity of chemotherapy has decreased [5]. There are many new drugs under development, and they are undergoing clinical trials aimed to evaluate their efficacy in the treatment of EOC, such as antiangiogenesis, inhibitors of growth factor signaling, poly-ADP-ribose polymerase inhibitors (PARP) inhibitors, or folate receptor inhibitors [2]. Tumor cells and tumor stroma both produce a variety of proangiogenic factors, such as vascular endothelial growth factor (VEGF), bFGF, interleukin 8 (IL-8), G-CSF and GM-CSF, which are designed to promote angiogenesis [6]. Angiogenesis promotes several pathophysiological conditions which are important for tumor cell growth, metastasis and also associated with chronic inflammation [7]. The key factor in the development of a tumor pathological vascular network is VEGF and its signal transduction pathway [8]. VEGF is also related to the formation of ascites in patients with EOC. Therefore, inhibiting pathological angiogenesis has become one of the new treatment options that has been widely tested in the treatment of EOC with promising therapeutic effects [2]. Bevacizumab is an anti-VEGF antibody, and its use in the first and second-line therapy of EOC is well established [9]. Considering the cost, potential side effects such as hypertension, proteinuria, bleeding, thromboembolic events, poor wound healing, and gastrointestinal perforation [10] and no confirmed and accessible biomarkers for routine clinical use to direct patient selection for bevacizumab treatment, the identification of new predictive method remains an urgent unmet medical need. Artificial intelligence (AI) has been demonstrating remarkable success in medical image analysis owing to the rapid progress of “deep learning (DL)” algorithms [11], which have shown increasing ability in solving complex and real-world problems in computer vision and image analysis. The possibility of digitizing gigapixel whole-slide images (WSIs) of tissues has led to AI and machine learning tools in digital pathology, which enable mining of subvisual morphometric phenotypes and may ultimately improve the patient therapeutic effect [12]. The combination of AI and the WSIs from tissue microarrays (TMAs) enables high throughput screening of a large number of patients. In this study, we have built a new DL-based precision oncology frameworkfrom immunostained TMA WSIs to accurately predict bevacizumab therapeutic effect in patients with EOC and PSPC. Importantly, the results show that therapeutic prediction is achievable without guidance or the use of prior knowledge of EOC or PSPC pathology in training AI models. Instead of directing the focus toward traditional pathological evaluation for tumor subclassification (e.g., papillary or clear cells formation, presence of mucin) or immunostaining features (e.g., percentage, intensity, score), the AI learning process is simply guided with the patient therapeutic effect data. In evaluation, the proposed model in combination with AIM2 achieves high accuracy 0.92, recall 0.97, F-measure 0.93 and AUC 0.97 for the first experiment (66% training and 34%testing) and high accuracy 0.86 ± 0.07, precision 0.9 ± 0.07, recall 0.85 ± 0.06, F-measure 0.87 ± 0.06 and AUC 0.91 ± 0.05 for the second experiment using five-fold cross validation, respectively. Both Kaplan–Meier PFS analysis and Cox proportional hazards model analysis further confirm that the proposed AIM2-DL model is able to distinguish patients gaining positive therapeutic effects with low cancer recurrence from patients with disease progression after treatment ().

2. Related Works

2.1. Selection of Antibodies

Rather than directly targeting cancer cells, bevacizumab targets the tumor microenvironment, the effects of VEGF-inhibition are likely tumor-type and microenvironment-specific including the modulation of cancer immunity [13]. Cancer cells affect their microenvironment by releasing extracellular signals, thereby inducing tumor angiogenesis, and improving immune tolerance, thereby avoiding being recognized by the immune system. VEGF signaling supports immune suppression, and targeting VEGF/VEGFR has been recognized as an approach to enhance antitumor immunity in cancer patients [13]. Chronic inflammation perpetuated by inflammasome activation may play a central role in immunosuppression, angiogenesis, tumor proliferation, and metastasis. Conversely, inflammasome signaling also contributes to tumor suppression, which indicates the diverse roles of inflammasomes in tumorigenesis [14]. Inflammasomes are activated upon cellular infection that triggers the maturation of proinflammatory cytokines to engage innate immune defenses [15]. Once innate immune system related NOD-like receptors (NLRs) or AIM2-like receptors (ALRs) are activated, inflammation via the recruitment of immune cells such as macrophages promotes the proteolytic cleavage and secretion of proinflammatory cytokines (IL-1 and IL-18) through the activation of caspase-1, leading to cell senescence, apoptosis and the prevention of cancer progression [16,17,18]. In a previous study, the high expression level of AIM2 and NLRP3 was significantly correlated with poor PFS and disease progression of EOC which demonstrated a key role of the dysregulated inflammasome in modulating the malignant transformation of endometriosis-associated ovarian cancer [19]. In ascites of EOC patients, local complement activation has been observed to induce high complement anaphylatoxins level [20]. The immune genes involved in the complement system have dual influences on the survival of patients. Immunohistochemical analysis showed that the expression of the C3a receptor (C3aR) and the C5a receptor (C5aR) is higher in ovarian clear cell carcinoma [21]. Complement-activated factors have been related, either directly or indirectly, to neovascularization in several diseases [22]. The antiangiogenic factor was upregulated in monocytes by complement activation [23]. In contrast, a role for complement in the activation of angiogenesis has been demonstrated in age-related macular degeneration [24]. As the resistance to anti-VEGF treatment is related to immunity, in this study we explore the utility of four antibodies, including AIM2, NLPR3, C3 and C5, to differentiate patients with good treatment responses from patients with disease progression on EOC and PSPC.

2.2. Deep Learning in Application to Gynecologic Oncology

With an increase in computing power and advances in imaging technologies, DL is being implemented for the diagnosis and classification of medical images. Wang et al. [25] proposed a DL-based noninvasive recurrence prediction model in high-grade serous ovarian cancer (HGSOC) that extracts prognostic biomarkers from preoperative computed tomography (CT) images. Sato et al. [26] successfully applied DL to the classification of images from colposcopy. Matsuo et al. [27] compared the performance of DL models in survival analysis for women with newly diagnosed cervical cancer with conventional Cox proportional hazard regression (CPH) models. Ke et al. [28] proposed a DL diagnostic system that can distinguish high grade squamous intraepithelial lesion (HSIL), squamous cell carcinoma, atypical squamous cells of undetermined significance (ASCUS) and low grade squamous intraepithelial lesion. Wu et al. [29] introduced automatic classification of ovarian cancer types from cytological images using deep convolutional neural networks. Ghoniem et al. [30] built a multimodal evolutionary DL model for ovarian cancer diagnosis. Hong et al. [31] built multiresolution deep learning models for predicting endometrial cancer subtypes and molecular features from histopathology images. These studies demonstrate that gynecologists are able to utilize DL in clinical practice, increasingly.

2.3. Weakly Supervised Learning

The development of decision support systems for medical applications with deployment in clinical practice has been hindered by the need for large manually annotated datasets. To overcome problems of limited amount of data supervision, recent studies have investigated weakly supervised learning technologies. Campanella et al. [32] built a weakly supervised multiple instance deep learning system that uses only the reported diagnoses as labels for training accurate classification models in pathology and avoids expensive and time-consuming pixelwise manual annotations. Li et al. [33] presented an ensemble learning scheme to derive a safe prediction by integrating multiple weakly supervised learners to deal with inaccurate supervision, such as label noise learning, where the given labels are not always ground-truth. Kim et al. [34] develoep a weakly-supervised DL algorithm that diagnoses breast cancer at ultrasound without image annotation. Liu et al. [35] evaluated a weakly supervised deep learning approach to breast magnetic resonance imaging (MRI) assessment and showed that it is feasible to assess breast MRI images without the need for pixel-by-pixel segmentation using the weakly supervised learning method to yield a high degree of specificity in lesion classification. Lu et al. [36] built a weakly supervised clustering-constrained-attention multiple-instance learning (CLAM) model for data-efficient WSI processing and learning that only requires slide-level labels. These studies demonstrate that weakly supervised learning assists development and deployment of decision support systems for medical applications.

3. Materials and Methods

3.1. Study Population and Experimental Setup

A total of 720 sample tissue cores of 12 TMAs with four immune-related proteins, including inflammasome absent in melanoma (AIM2), nucleotide-binding domain leucine-rich repeat and pyrin domain containing receptor 3 (NLRP3), the complement of C3 and C5, were constructed with clinical information collected from March 2013 to January 2021 from the tissue bank of the department of pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan,. Ethical approvals were obtained from the research ethics committee of the Tri-Service General Hospital (TSGHIRB No.1-107-05-171 and No.B202005070). The medical data were de-identified and used for a retrospective study without impacting patient care. Patients in this study were divided into the bevacizumab-resistant or the bevacizumab sensitive group. Patients with persistently high levels of CA-125 during bevacizumab therapy or who experienced tumor progression or recurrence (assessed by CT/PET imaging) within six months posttreatment were classified as the bevacizumab-resistant group. Patients with low levels of CA-125 and no tumor progression or recurrence (based on imaging) during or within six months of bevacizumab treatment were classified as the bevacizumab sensitive group. In data preparation, tissues from bevacizumab-treated EOC and PSPC patients were embedded in Paraffin wax. Two pathologists screened histological sections and selected areas of representative tumor cells, and one tissue core (2 mm in diameter) was then taken from each of the representative tumor samples and placed in a new recipient paraffin block for immunohistochemistry staining. The TMA sections were dewaxed in xylene, rehydrated in alcohol, and immersed in 3% hydrogen peroxide for 10 min to suppress the activity of endogenous peroxidase. Antigen retrieval was carried out by heating each section to 100 °C for 30 min in 0.01 M sodium citrate buffer (pH 6.0). The sections were rinsed three times (5 min each wash) in phosphate-buffered saline (PBS) and then incubated for one hour at room temperature with antibodies of AIM2 (1:500) (Abcam, cat#ab93015, Cambridge, UK), NLRP3 (1:300) (Millipore, cat#ABF23, Burlington, VT, USA), C3 (1:1000) (Abcam, cat#ab200999, Cambridge, UK) and C5 (1:300) (Abcam, cat#ab217027, UK) in PBS. The sections were washed three times (5 min each wash) in PBS, followed by incubation with horseradish peroxidase-labeled immunoglobulin (Dako, Carpinteria, CA, USA) for 1 h at room temperature. The sections were washed three times again, and the peroxidase activity was visualized using a solution of diaminobenzidine (DAB) at room temperature. Slides were stained with AIM2, NLRP3, C3 and C5 antibody. The WSIs were then acquired with a digital slide scanner (Leica AT Turbo) with a 20× objective lens. In total, four datasets were built with four immune-related proteins. With regard to the class distribution, well-balanced data sets were ensured, and 57.2% of tissue cores were associated with effective Bevacizumab treatment outcomes, whereas 42.8% of tissue cores were associated with invalid treatment outcomes. The characteristics of the data are shown in Table 1. In evaluation, two experiments were conducted. For the first experiment, each dataset was split into two separate subsets for training and testing; 66% for training and 34% for testing, which ensured models were never trained and tested on the same sample (see Table 2). For the second experiment, a five fold cross validation was performed. In evaluation, models were evaluated independently on the testing set. In training, the batch size was set as six cores per batch with the learning rate as 0.003, and AI models were built independently for each stains. In evaluation, all models were independently evaluated on the test set.

Table 1

Baseline characteristics of data.

Characteristics	N
Tissue Core	720
Patient age (mean, range)	(59.1, 23–79)
Diagnosis (%)
Papillary serous carcinoma	444 (61.6)
Peritoneal serous papillary carcinoma	89 (12.3)
Clear cell carcinoma	69 (9.6)
Unclassified carcinoma	69 (9.6)
Endometrioid carcinoma	39 (5.5)
MC	10 (1.4)
FIGO stage (%)
I	69 (9.6)
II	39 (5.4)
III	454 (63)
IV	158 (22)
Surgery (%)
Optimal debulking	306 (42.5)
CRS+HIPEC	217 (30.1)
Suboptimal debulking	197 (27.4)
Treatment effectiveness (%)
Effective	412 (57.2)
Invalid	308 (42.8)

Table 2

Data distribution of the collected four datasets w.r.t. immune-related proteins for training and testing for the first experiment.

	Treatment Outcome	AIM2	C3	C5	NLRP3
Training (66%)	Effective	68	68	68	68
	Invalid	50	50	50	50
Testing (34%)	Effective	35	35	35	35
	Invalid	27	27	27	27
Total		180	180	180	180

3.2. Weakly Supervised Learning Framework

One of the obstacles for developing a deep neural network for medical image analysis application are data insufficiency. Small data may lead to under training, reducing the performance of the network, but large data cost enormous efforts in data collection and labeling. One of the contributions of this study is to develop an effective and efficient precision oncology system for prediction of therapeutic effect on ovarian cancer patients while utilizing as few manual annotations as possible. In this study, only the patient-based labels w.r.t. therapeutic effect on ovarian cancer patients and annotations on eight tissue cores, which account for less than of the training image data, were utilized for building the system. We developed an efficient learning framework to produce AI models based on limited data with boosting learning, soft focusing sampling, boosted data augmentation and transfer learning. We built a single and robust weakly supervised model in selecting tumorlike tissues for all four kinds of immunohistochemical (IHC) stained data, which appear quite differently due to large variations on tissue morphology, staining magnitude, individual responses to antibodies and highlighting structures as shown in the high magnification views of Figure 1. The proposed weakly supervised tumorlike tissue selection model was built as follows.

Figure 1

Sample tumor-like tissue selection results with high magnification views at 100 m of the four kinds of IHC stained data by the proposed weakly supervised tissue selection model.

Boosting Learning, Focusing Sampling, Boosted Data Augmentation and Transfer Learning

Transfer learning has been successfully applied to eliminate the problem of data scarcity in different biomedical image analysis applications. In transfer learning, a network captures knowledge from one problem and applies it to a different problem that contains a relatively small number of data samples for training the network properly. For producing the backbone network, only five H&E stained WSIs from IEEE automatic cancer detection and classification in whole slide lung histopathology challenge [37] were used. We hypothesized that this pretrain network may have the knowledge to identify the tissue morphology of tumors. Hence, this pretrained model was used as the backbone architecture for transferring knowledge to locate ovarian tumor tissues, and the model was retrained with our IHC images of the eight annotated tissue cores, which accounted for less than of the training image data. However, to train a model with such a small dataset, a boosting learning approach was devised as follows. Given a training set S: where represents the instance data, and : represents the label, a learner and the number of base models to build U, the proposed boosting learning produces the final AI model by the following steps. First, create a new set : with instance weight where : represents a tile; . Each instance weight is initialized with an IoU based attention weighting function for further training. where ; . Then, iteratively for , build a base model . The sample weights are continuously modified and formulated by increasing the attention weights of false positives and false negatives of . Next, we devised a boosted data augmentation based on the sample attention weights and produced new data . Data augmentation was applied to enlarge the training set with additional synthetically modified databy manipulating the rotation per 5° and 5 times and increment of 90°, the mirror-flipped along the horizontal and vertical axes, the contrast adjusted (random contrast, range 0% ± 20%), the saturation adjusted(random saturation, range 0% ± 20%), and the brightness adjusted(random brightness, range 0% ± 12.5%). When the training data is partially labeled, many unlabeled tissues of interest are wrongly defined as background or contents of no interest. This severely confuses AI learners during supervised learning and deteriorates the performance output of AI models. To deal with this issue, we added an IoU-based focusing sampling mechanism for computing the gradients effectively. A number of unlabeled cells will now not be used as negative samples for training to confuse learning but arranged as ignored samples. This not only helps learning be more focused but also speeds up learning time. Moreover, we increased learning efforts on false positive and false negative predictions and further added variations of the FPs and FNs to assist AI learn better, deal with its weakness and produce improved AI models. An illustration of the proposed weakly supervised learning framework is given in Figure 2b.

Figure 2

(a) System workflow: (i) multiresolution pyramid data structure of WSIs; (ii) a TMA core detection model conducts fast localization of tissue cores in low-resolution level; (iii) a forward-mapping function is applied to fetch the high resolution core data to be processed by (iv) a robust tumorlike tissue selection model to locate tumorlike tissues of each core; (v) a backward-mapping function is applied to fetch the medium resolution tumorlike tissue data of each core to be processed by; (vi) a treatment effectiveness classification model to predict; (vii) treatment outcomes; (b) Weakly supervised learning with focusing sampling, boosting learning and boosted data augmentation.

For the learner , the segmentation network is an extended version of our previous efforts in fast screening of cervical cancer [38] and thyroid cancer diagnosis [39]. The modified model is improved from its previous edition after including focusing sampling, boosted data augmentation and SoftMax, which is added to improve the prediction score for each class.

3.3. A Hybrid Deep Learning Model

In this study, a hybrid DL precision oncology framework was built, consisting of three DL models and two mapping models to rapidly process a gigapixel WSI of a TMA in seconds. Figure 2a presents the system workflow. First, a TMA core detector located tissue cores on a single patch in low resolution level (Section 3.3.1). Second, a forward-mapping function was applied to fetch the high resolution core data to be processed by a robust weakly supervised tumo-like tissue selection model (Section 3.3.2). Third, a backward-mapping function was applied to fetch the medium resolution tumorlike data of each core to be processed by a treatment effectiveness classification model (Section 3.3.3). The network architectures of the proposed hybrid DL framework are shown in Figure 3.

Figure 3

The proposed hybrid deep learning precision oncology framework contains three deep learning networks. (a) In the tissue core detection model, ResNet-101 backbone where RPN proposes a set of low quality candidate bounding boxes (B0) from an image to determine the occurrence of an object, and therefore the subsequent detectors are developed to be more selective for lower quality candidates. The modules (H) produced samples for training the z-th classifier and detector. Ref. [40] is used as the RPN (b) The proposed weakly supervised tumorlike tissue selection model composed of 16 convolution layers (each convolution layer is followed by a RELU layer), five pooling layers for downsampling, and an upsampling layer. The class map of the core is generated by a deconvolution layer and a SoftMax layer. (c) The proposed treatment effectiveness prediction model uses dimensional reduction and parallel structures of the inception modules and contains three different sizes of convolution and one maximum pooling. For the network output of the previous layer, the channel is aggregated after the convolution operation, and then the nonlinear fusion is performed.

3.3.1. Tissue Core Detection Model

The tissue core detection model rapidly locates tissue cores in low resolution. In our preliminary study, we compared the performance of cascade region-based convolutional neural networks (Cascade R-CNN) [41] and Faster R-CNN [42] for object detection, and it was found that Faster R-CNN often fails detecting objects that are close to each other, while Cascade R-CNN performs well in detection of those objects. Therefore, the TMA cores detection model was built with Cascade R-CNN. We formulated the two dimensional input data into a multiresolution pyramid data structure with multiple levels M from high to low magnification . The detector model aims to rapidly locate TMA cores on the low-magnification level , generating a set of cores using Equation (3). where N denotes the number of detected cores, and represents the bounding box of a detected cores in level . Next, forward mapping was conducted to acquire each core data in the high magnification level for further tumor segmentation, generating a set of cores . where represents the bounding box of a detected cores in level 0 after forward mapping. The tissue core detection model is a multistage object detection DL architecture [41] that consists of one region proposal network (RPN) and three Faster R-CNN [42] detectors. RPN proposes a set of low quality candidate bounding boxes from an image to determine the occurrence of an object, and therefore the subsequent detectors were developed to be more selective for lower quality candidates. The detector at stage consists of a bounding box-regressor and a classifier . The bounding box-regressor aims to regress a candidate bounding box of an object proposal image into a referenced bounding box and is learned from a training set by minimizing the risk formulated as follows: The TMA cores detection model was developed based on a cascaded bounding box-regressor formulated as follows: where indicates the total number of stages. For multitask learning, each detector at stage is learned with the loss formulated as follows: where , is the trade-off coefficient, the label of under the intersection-over-union (IoU) threshold with and the Iversion bracket indicator function. In this study, three detectors with the IoU threshold and .

3.3.2. Weakly Supervised Tumorlike Tissue Selection Model

The tissue selection model aims to sample information of critical tissues for further analysis; information about how the weakly supervised tumorlike tissue selection model is built is described in Section 3.2. The tumorlike tissue selection model was performed on each core in the patch-based data structure, producing tumor selection results for each tile . where denotes the number of types of tissue to be identified, and 0, 1 and 2 represent the background, others and tumorlike tissue, respectively. A two dimensional pixel-based class map for each tile was produced as the index of the tissue type that has the maximum probability of the pixel using Equation (9). Next, the non-tumorlike tissues were suppressed using Equation (10), and tiles with the qualified number of tumorlike tissues were selected for further training or testing the treatment effectiveness prediction model described in the subsequent section using Equation (11).

3.3.3. Treatment Effectiveness Prediction Model

The treatment effectiveness prediction model of the proposed hybrid DL framework utilizes information of selected tumorlike tissues instead of nonmalignant cells such as stroma or background, and generates treatment effectiveness prediction (Figure 2(vii)). The treatment effectiveness prediction model was built based on Inception V3 [43], and modifications were made in the Inception Module 2 and 3. The network architecture is illustrated in Figure 3c where the blue highlighted convolution layers are the modified layers, for which the bias filter and scale filter are removed to avoid data distortion. Obtaining the selected tiles and tissues at level 0, backward mapping was performed to fetch the tumor features of each core from the middle-magnification level l, which is defined in Equation (12). where was set to 3 in the study. The treatment effective decision model applied onto the probability using the Equation (13). The task is to predict the treatment effectiveness outcomes for individual cores, where d is the number of the cores; is the prediction of treatment effectiveness on the d-th core using Equation (14). In this study, was set to 0.5.

3.3.4. Model Selection Method with Early Stop Mechanism

A model selection method was developed in this study, and an illustration of the proposed model selection approach is given in Figure 4. Through training iteration k, the proposed model selection method computed the loss values and F-measure scores of trained models on the training set and further calculated associated first and second derivatives, including and .

Figure 4

Illustration of the proposed model selection method with three examples: (a) loss values of models through iterations during training; (b) F-measure scores of individual models with different iterations on the training set; (c) F-measure scores of individual models with different iterations on the testing set. The blue lines indicate the measurement values, and the red and yellow lines represent the associated first and second derivatives, respectively. During training, the proposed model selection method computes the loss values and F-measure scores of trained models on the training set and further calculates associated first and second derivatives. If a stable loss is found in (a) where the first and second derivatives of the training loss converge for a continuous period, the starting point of the stable loss window is used as the starting point in model selection searching. Afterwards, if a stable F-measure is found in (b) where the associated first and second derivatives converge for a continuous period, an early stop mechanism is activated to stop training and end the search window of model selection. Then, a model is selected in the model selection search window by finding a model with the highest F-measure score on the training set. If there are multiple models with the highest score, the one with the largest training iteration time is selected. Green circles highlight model selection results using the F-measure on the training set in (b) and demonstrates that overall the selected models obtain relatively high F-measure scores on the testing set in (c).

Next, are continuously evaluated for every iterations to find models with stable loss where converges to for a continuous period, the starting time of the stable loss window is used as the starting point in the model selection search window as shown in Figure 4a,b. Afterwards, the model searches for a stable F-measure where converges to for a continuous period, an early stop mechanism is activated to stop training and set the end of the search window of model selection as ; otherwise, is set as the maximum number of training times given as an input parameter. Then, a model is selected in the model selection search window by finding a model with the highest F-measure score on the training set. If there are multiple models with the highest score, the one with the largest training iteration time is selected. ( are input parameters and empirically defined as 1000, 0.003 and 0.1, respectively).

4. Results

4.1. Quantitative Evaluation

For the first experiment (see Table 3), the Proposed-AIM2 model achieved the highest recall 0.97, AUC 0.97, F-measure 0.93 and accuracy 0.92, respectively, and Coudray-AIM2 obtained the highest precision 0.97 and AUC 0.97, respectively. It was also found that Coudray-AIM2 and Coudray-C3 obtained good performance with F-measure equal to 0.91. Figure 5a further compares the ROC curves on the testing set for all models, showing that the Proposed-AIM2 model consistently performed well, regardless of the value of the selected threshold. Moreover, Figure 5b compares the performance of AUC through iterations of AI models without model selection on the testing set, showing that the Proposed-AIM2 model generally outperformed other models with the same training time.

Table 3

First experiment:Quantitative evaluation in classification of therapeutic outcomes.

Method in Combination with Potential Biomarker	Accuracy	Precision	Recall	F-Measure	AUC
Proposed Weakly Supervised DL Method—AIM2	0.92	0.89	0.97	0.93	0.97
Coudray et al. [44]—AIM2	0.90	0.97	0.86	0.91	0.97
Proposed Weakly Supervised DL Method—C3	0.69	0.69	0.83	0.75	0.75
Coudray et al. [44]—C3	0.90	0.91	0.91	0.91	0.94
Proposed Weakly Supervised DL Method—C5	0.63	0.67	0.69	0.68	0.65
Coudray et al. [44]—C5	0.69	0.68	0.86	0.76	0.78
Proposed Weakly Supervised DL Method—NLRP3	0.52	0.56	0.71	0.63	0.50
Coudray et al. [44]—NLRP3	0.71	0.76	0.71	0.74	0.73

Figure 5

(a) Receiver operating characteristic (ROC) curves on the testing set for the models of the proposed method and the benchmark approach [44]; (b) Graphs of AUC on the testing set with respect to the iteration times in training among the DL models, showing that the Proposed-AIM2 model generally outperforms other models with the same training time.

For the second experiment using five-fold cross validation (see Table 4), the proposed-AIM2 model was demonstrated to achieve the highest accuracy 0.86 ± 0.07, precision 0.9 ± 0.07, recall 0.85 ± 0.06, F-measure 0.87 ± 0.06 and AUC 0.91 ± 0.05. Overall, the proposed method obtained better results using AIM2 than C3, C5 and NLRP3. The results of the two experiments indicate that AIM2 could be an effective biomarker for guiding ovarian cancer treatment.

Table 4

Second experiment: 5-fold cross-validation.

Method in Combination with Potential Biomarker	Accuracy	Precision	Recall	F-Measure	AUC
ProposedWeakly Supervised DL Method—AIM2	0.86 ± 0.07	0.9± 0.07	0.85 ± 0.06	0.87 ± 0.06	0.91 ± 0.05
Coudray et al. [44]—AIM2	0.73 ± 0.17	0.76 ± 0.19	0.71 ± 0.39	0.68 ± 0.33	0.9 ± 0.07
Proposed Weakly Supervised DL Method—C3	0.75 ± 0.1	0.77 ± 0.1	0.79 ± 0.1	0.78 ± 0.09	0.78 ± 0.12
Coudray et al. [44]—C3	0.73 ± 0.08	0.77 ± 0.06	0.74 ± 0.21	0.74 ± 0.1	0.84 ± 0.09
Proposed Weakly Supervised DL Method—C5	0.65 ± 0.03	0.66 ± 0.03	0.8 ± 0.04	0.72 ± 0.02	0.66 ± 0.07
Coudray et al. [44]—C5	0.56 ± 0.13	0.69 ± 0.22	0.51 ± 0.3	0.54 ± 0.22	0.52 ± 0.23
Proposed Weakly Supervised DL Method—NLRP3	0.56 ± 0.08	0.59 ± 0.05	0.77 ± 0.08	0.67 ± 0.06	0.55 ± 0.08
Coudray et al. [44]—NLRP3	0.63 ± 0.18	0.68 ± 0.19	0.66 ± 0.32	0.63 ± 0.28	0.73 ± 0.24

4.2. Statistical Analysis

Studies of how patients respond to treatment over time are critical to investigating how therapies influence disease progression during survivorship, and generally two closely related statistical analyses, i.e., Kaplan–Meier (K-M) analysis and Cox proportional hazards model analysis, are performed [45] where K-M is a univariate approach, while Cox analysis is a multivariable approach. In this study, the results of the K-M analysis and Cox proportional hazards model analysis are provided as follows.

4.2.1. Kaplan–Meier Progression Free Survival and Overall Survival Analysis

To further examine the performance of the proposed precision oncology models, we conducted Kaplan–Meier analysis in PFS and overall survival (OS) time using the log-rank test. The binary prediction outcomes (0: invalid, 1: effective) of each model were used to categorize patients into two groups, and then a Kaplan–Meier curve was generated to visualize the probability of a patient having disease progression as time increases. Figure 6a compares the Kaplan–Meier PFS curves of the top two models in Table 3, showing the proposed-AIM2 as highly effective () and Coudray-AIM2 as effective() to distinguish patients gaining positive therapeutic effects with low cancer recurrence from patients with disease progression after treatment with statistical significance. On the other hand, Figure 6b compares the Kaplan–Meier OS curves of the two top models, showing that both models are not effective to distinguish treatment effectiveness with respect to the overall survival time.

Figure 6

(a) Kaplan–Meier PFS and; (b) OS analysis for EOC and PSPC patients receiving bevacizumab therapy based on AI prediction outcomes (0: invalid; 1: effective) by the two best models, i.e., proposed method-AIM2 and Coudray et al. [44]-AIM2.

4.2.2. Cox Proportional Hazards Model Analysis

We further investigated DL model prediction and clinical factors, including age, BMI, number of treatments, cancer stage (FIGO), histology and surgery type, in association with disease progression using multivariate analysis. However, as shown in Table 5, only the proposed DL model prediction is useful as an indicator for patient selection with statistical significance (); patients who were predicted to be effective by the proposed-AIM2 DL model had a lower risk of recurrence than patients predicted to be invalid ().

Table 5

Multivariate analyses of DL model prediction and clinical factors associated with recurrence.

	Adjusted HR 1 (95% C.I.)	p Value
Age	1.03 (0.98–1.07)	0.212
BMI	1.01 (0.92–1.11)	0.848
Number of bevacizumab used times	0.97 (0.89–1.05)	0.472
FIGO 2 (III + IV vs. I + II)	4.23 (0.82–21.86)	0.085
Histology (others vs. serous)	0.82 (0.26–2.62)	0.737
Surgery
CRS + HIPEC 3	1.00 (reference)	reference
optimal	0.92 (0.35–2.42)	0.869
suboptimal	1.18 (0.42–3.30)	0.753
Theraphy
Concurrent therapy	1.00 (reference)	reference
Second-line therapy	1.71 (0.66–4.41)	0.265
Maintenance therapy	0.23 (0.04–1.40)	0.110
DL model prediction
Proposed DL Method-AIM2 (effective v.s. invalid)	0.18 (0.06–0.55)	0.003 *

1 HR = Hazard ratio; 2 FIGO = International Federation of Gynecology and Obstetrics. 3 CRS+HIPEC = Cytoreductive surgery with hyperthermic intraperitoneal chemotherapy. * The proposed model prediction is useful as an indicator for patient selection with statistical significance (p < 0.01).

5. Conclusions and Discussion

This study presents a new precision oncology approach with a hybrid DL framework to guide cancer treatment on patients with EOC and PSPC using immunostained TMA WSIs. Bevacizumab is the first antiangiogenic agent to have demonstrated benefit as first-line and maintenance therapy in EOC and was explored in two large prospective randomized trials. The Gynecologic Oncology Group 218 and ICON 7 phase III trials revealed significantly prolonged PFS for carboplatin/paclitaxel plus bevacizumab followed by bevacizumab maintenance versus carboplatin/paclitaxel alone [46]. Other antiangiogenics have been assessed in EOC, such as pazopanib, sorafenib, sunitinib, cediranib, VEGF trap (aflibercept), and AMG386. However, EOC response rates to these inhibitors were relatively low, ranging from 6–20% [47]. In addition, none have been adopted in routine clinical practical use due to toxicity and the cost of licensing [9,48]. Bevacizumab is the only approved antiangiogenic agent in routine clinical practice. Unlike most other targeted therapies, bevacizumab is used in the general patient populations instead of targeting patients preselected by a biomarker. Despite intense efforts, no predictive biomarker has been identified for personalized use of bevacizumab [49]. Given the high cost, the potential for toxicity, and finding that only a subset of patients will benefit from these drugs, patients who receive bevacizumab treatment should be carefully selected [50]. Although several potential predictive biomarkers such as circulating combination of Ang1 and Tie2 proteins, VEGFR-1 and neuropilin-1 in the plasma or cancer tissue as well as plasma cell-free DNA [51,52,53], and Ang-2, FGF, HGF activation of c-Met, deltalike ligand 4 (Dll4) -induced Notch signaling, hedgehog signaling or inhibition of Zeste homolog 2 (EZH2) [54,55,56,57], are found, these biomarkers have not consistently been predictive of response. In the past decade, advances in precision oncology have resulted in an increased demand for predictive methods that enable the selection of patients for treatment. In clinical practice, anatomical pathologists base their histological diagnosis on the visual recognition, semiquantification and integration of multiple morphological features of the analyzed sample. However, the traditional histology-based classification of EOC has no benefit for the prediction of therapeutic outcome. Immunohistochemistry (IHC) with antibodies specific for EOC antigens help identify those that are overexpressed or aberrantly expressed in tumor tissues and may be a potential biomarker for therapeutic prediction. In our previous study [58], the expression of AIM2, C3 and C5 was analyzed using immunohistochemical staining according to the percentage and intensity of the color reaction by visual observation to predict the efficacy of bevacizumab in epithelial ovarian cancer (EOC) patients, and the AIM2 immunostaining scores were significantly higher in the bevacizumab-resistant group than in the bevacizumab-sensitive group (p < 0.001), but there were no significant differences in C3 (p = 0.077) or C5 (p = 0.326) regarding bevacizumab. Based on the results of the previous study and this study, we propose a possible link between the AIM2 inflammasome and antiangiogenic therapy in EOC. In the microenvironment favorable for EOC, the inflammasome promotes transcription of certain NLRs. NLRs and AIM2 assemble into the inflammasome complex which via the caspase recruitment domain, recruit platelets and circulating leukocytes, all of which can secrete proangiogenic factors and promote angiogenesis [19]. In cancer, the complexity of genomic alterations that affect cell signaling and cellular interactions with their environment can influence the biological course of the disease and affect responses to therapeutic interventions. The assessment of such alterations requires simultaneous interrogation of multiple features with highly sensitive and precise approaches [59]. The development of new AI-based image analysis approaches in pathology and oncology is being led by computer engineers and data scientists, who are developing and applying AI tools for a variety of tasks such as helping improve diagnostic accuracy and identify novel biomarker approaches for precision oncology [12]. Over the past few years, interest in the use of machine learning-based approaches for drug discovery and development has increased [60]. The fact that most patients who receive cytotoxic agents or immune checkpoint inhibitors do not respond to treatment has led to increasing interest in combining AI with digital pathology to identify the patients who are most likely to derive therapeutic benefit [12,60]. The previous report has also developed an approach whereby the spatial arrangement of nuclei or tumor-infiltrating lymphocytes enables prediction of the responsiveness of patients with late-stage non-small-cell lung cancer to the antiprogrammed cell death 1 (PD-1) antibody nivolumab [12]. Machine learning can not only be used to analyze the tumor cells of epithelial origin, increasing interest exists in trying to identify prognostic patterns within the tumor environmental stroma cells [61]. To the authors’ best knowledge, this is the first precision oncology framework created to effectively predict the bevacizumab therapeutic effect of EOC or PSPC patients. The proposed-AIM2 model is demonstrated to achieve excellent performance on accuracy 0.92, recall 0.97, F-measure 0.93 and AUC 0.97. Furthermore, results of both Kaplan–Meier PFS analysis and Cox proportional hazards model analysis show that the proposed model can distinguish patients gaining positive therapeutic effects with low cancer recurrence from patients with disease progression after treatment (). This study indicates new potential precision oncology systems to predict antiangiogenic therapy benefits in EOC and PSPC patients for assisting personalized medicine are possible. Furthermore, the proposed method could also be applied to other types of cancers for angiogenesis inhibitors, such as metastatic breast cancer, non-small-cell lung cancer, glioblastoma, renal cell carcinoma, and cervical cancers. We hope that DL could play a role in immunohistochemical stains for other potential candidate proteins such as Ang1, Ang-2, Tie2, VEGFR-1, neuropilin-1, FGF and EZH2 to develop more effective predictive biomarkers for personalized antiangiogenic target therapies. In conclusion, we demonstrate that DL could be a very useful tool for assisting AIM2 immunostained slides in applying the appropriate and tailored targeted therapy, increasing the scope and performance of precision medicine, which aims to develop a treatment method tailored to the patients with EOC and PSPC. Therefore, AI approaches will become key in analyzing these large amounts of data, helping pathologists and oncologists in the process. These results highlight the emerging role of DL in precision medicine for therapeutic prediction and suggest an expanding utility for computational analysis of histology in the future practice of pathology.

58 in total

Review 1. Diverging inflammasome signals in tumorigenesis and potential targeting.

Authors: Rajendra Karki; Thirumala-Devi Kanneganti
Journal: Nat Rev Cancer Date: 2019-04 Impact factor: 60.716

2. Cascade R-CNN: High Quality Object Detection and Instance Segmentation.

Authors: Zhaowei Cai; Nuno Vasconcelos
Journal: IEEE Trans Pattern Anal Mach Intell Date: 2021-04-01 Impact factor: 6.226

Review 3. Leukocytes, inflammation, and angiogenesis in cancer: fatal attractions.

Authors: Curzio Rüegg
Journal: J Leukoc Biol Date: 2006-07-18 Impact factor: 4.962

Review 4. Serous papillary peritoneal carcinoma: unknown primary tumour, ovarian cancer counterpart or a distinct entity? A systematic review.

Authors: George Pentheroudakis; Nicholas Pavlidis
Journal: Crit Rev Oncol Hematol Date: 2009-11-07 Impact factor: 6.312