
Artificial Intelligence Based on Blood Biomarkers Including CTCs Predicts Outcomes in Epithelial Ovarian Cancer: A Prospective Study.

Jun Ma1, Jiani Yang1, Yue Jin1, Shanshan Cheng1, Shan Huang1, Nan Zhang1, Yu Wang1.   

Abstract

OBJECTIVE: We aimed to develop an ovarian cancer-specific predictive framework for clinical stage, platinum-sensitivity, and prognosis using machine learning methods based on multiple biomarkers, including circulating tumor cells (CTCs). PATIENTS AND METHODS: We enrolled 156 epithelial ovarian cancer (EOC) patients, who were randomly assigned to training and validation cohorts. Eight machine learning classifiers, including Random Forest (RF), Support Vector Machine, Gradient Boosting Machine, Conditional RF, Neural Network, Naive Bayes, Elastic Net, and Logistic Regression, were used to derive predictive information from 11 peripheral blood parameters, including CTCs. Using the CanPatrol CTC-enrichment technique, we detected CTCs and classified them into three subpopulations: epithelial, mesenchymal, and hybrid. Survival curves were generated by the Kaplan-Meier method and compared using the log-rank test.
RESULTS: Machine learning techniques, especially the Random Forest classifier, were superior to conventional regression-based analyses in predicting multiple clinical parameters related to EOC. The areas under the receiver operating characteristic (ROC) curve for identifying advanced clinical stage and platinum-sensitivity were 0.796 (95% CI, 0.727-0.866) and 0.809 (95% CI, 0.742-0.876), respectively. We then used unsupervised clustering analysis to identify EOC subgroups with significantly worse survival, especially in the advanced-stage group, with p-values of 0.0018 (HR, 2.716; 95% CI, 1.602-4.605) for progression-free survival (PFS) and 0.0037 (HR, 2.359; 95% CI, 1.752-6.390) for overall survival (OS).
CONCLUSION: Machine learning systems could provide risk stratification for EOC patients before initial intervention through blood variables, including circulating tumor cells. The predictive algorithms could facilitate personalized treatment options through promising pre-treatment stratification of EOC patients. TRIAL REGISTRATION: ChiCTR-DDD-16009601 Registered 25 October 2016.
© 2021 Ma et al.


Keywords:  artificial intelligence; blood biomarkers; circulating tumor cell; epithelial ovarian cancer

Year:  2021        PMID: 34040391      PMCID: PMC8140950          DOI: 10.2147/OTT.S307546

Source DB:  PubMed          Journal:  Onco Targets Ther        ISSN: 1178-6930            Impact factor:   4.147


Introduction

About 21,750 cases of ovarian cancer were expected to be newly diagnosed in the United States in 2020.1 In China, the incidence and mortality of epithelial ovarian cancer (EOC) have increased by 30% and 18%, respectively, with an average of 15,000 deaths yearly over the past 10 years.2 Because early-stage EOC lacks clinical symptoms and effective screening tests are not available, approximately 70% of patients with EOC are diagnosed at advanced stages (stage III and IV).3 Primary disease is treated with primary debulking surgery, followed by standard adjuvant chemotherapy combining platinum and taxane agents.4,5 However, 75% of patients at an advanced stage will eventually experience recurrence, resulting in poor survival.6 To improve the long-term outcomes of EOC patients, it is crucial to identify stratification indicators that accurately define disease characteristics and predict outcomes before initial intervention.7 Traditionally, clinical factors such as age and tumor grade have been used to assess prognosis, but their predictive value is limited.8,9 Emerging evidence indicates that circulating tumor cells (CTCs) in peripheral blood have great potential as an indicator of poor overall survival in various malignancies.10 Our research team has carried out a clinical trial (ChiCTR-DDD-16009601) and developed a prognostic nomogram model for 152 EOC patients, with an area under the curve (AUC) of 0.8705.40 Several other studies have also reported on the prognostic role of CTCs in ovarian cancer, but no consistent results have been obtained.11 In the era of precision medicine, there is an urgent need for an ovarian cancer-specific predictive framework that provides reliable risk stratification.
Recently, machine learning has been widely used by oncologists to generate prediction models with improved performance in support of clinical decisions.12 This technology allows computers to “learn” patterns from existing data.13 Several studies have indicated that machine learning algorithms, such as decision trees and neural networks, play an essential role in risk stratification for carcinomas.14 Random Forest, an ensemble learning algorithm whose basic unit is the decision tree, independently trains many relatively weak learners and integrates their results into an overall prediction.15 Therefore, we used various machine learning algorithms to combine a large number of simple predictors into complex combinations of multiple biomarkers for prognostic model construction. In this study, we aimed to develop an ovarian cancer-specific predictive framework for clinical stage, platinum-sensitivity, and prognosis using machine learning methods based on multiple biomarkers, including circulating tumor cells and clinical variables of patients with EOC.

Materials and Methods

Patients Selection

First, we enrolled ovarian cancer patients (n=185) who underwent treatment at our institution between June 2017 and November 2019 and met the following inclusion criteria: 1) histologically confirmed EOC; 2) no co-existing or prior cancers within 5 years; 3) available demographic information and clinical data. Patients were then excluded if they: 1) had received other treatments, such as radiotherapy, neoadjuvant therapy, or immunotherapy (n=5); 2) had not consented to the use of their medical information for research purposes (n=4); 3) had clinical evidence of sepsis, autoimmune diseases, or hematological disorders (n=2); 4) were lost to follow-up (n=10); or 5) lacked detailed clinical, imaging, and treatment data (n=8). Finally, 156 patients were included in the analysis (Figure 1). Patients were split into a training cohort (n = 106) and a validation cohort (n = 50) for stepwise analysis.
Figure 1

The flowchart of the study. (A) We detected the circulating tumor cells (CTCs) with the CanPatrol™ technique. After collecting 5 mL of peripheral blood, we used a nanofiltration system for CTC isolation. CTCs were then detected by RNA in situ hybridization (RNA-ISH). (B) We enrolled 156 epithelial ovarian cancer (EOC) patients according to the inclusion and exclusion criteria. Patients were then randomly assigned to a training group (n=106) and a validation group (n=50) for machine learning model development.

To achieve optimal tumor debulking, surgery in all patients aimed at maximal tumor resection with no visible residual tumor (R0). Surgery was followed by standardized paclitaxel and platinum-based chemotherapy.16 Follow-up visits were performed every 3 months and included both clinical and radiological evaluation. Overall survival (OS) was measured from the date of surgery to death or the last follow-up visit. Progression-free survival (PFS) was measured from the date of surgery to ovarian cancer progression, defined by radiographic and clinical evidence, or the last follow-up visit. Based on the Gynecologic Cancer InterGroup (GCIG) consensus statement, platinum-resistance was defined as a progression-free interval of less than 6 months since the last line of platinum treatment.17 This research was approved by the Ethics Committee of Renji Hospital Affiliated to Shanghai Jiao Tong University School of Medicine. We conducted the research in accordance with the Declaration of Helsinki. All participants were informed about the purpose of the trial and signed consent forms for the use of their information.

Data Collection and Management

Clinicopathologic characteristics, including age, body mass index (BMI), tumor size, histological type, and tumor grade, were collected retrospectively from medical records. The clinical stage was evaluated based on the International Federation of Gynecology and Obstetrics (FIGO) staging system. Routine blood tests, including hemoglobin (HB), neutrophil, lymphocyte, and platelet counts, were conducted 1 day before surgery. Tumor biomarkers, including CA-125, CA-199, human epididymis protein 4 (HE4), carcinoembryonic antigen (CEA), and alpha-fetoprotein (AFP), were collected for analysis. Other blood indexes, including C-reactive protein (CRP), albumin, fibrinogen, lactate dehydrogenase (LDH), alanine aminotransferase (ALT), aspartate aminotransferase (AST), and total bile acid (TBA), were also collected 1 day before surgery.

Characterization of CTCs by CanPatrol System

Peripheral blood samples (5 mL) anticoagulated with ethylenediaminetetraacetic acid (EDTA) were collected 1 day before surgery. The first 2 mL of blood was discarded to avoid potential skin cell contamination from venipuncture. After sampling, the blood was stored at 2 to 8°C and processed within 4 hours.18 To isolate and characterize CTCs, we used the CanPatrol™ technique. The blood samples preserved in the cell preservation solution were centrifuged at 1850 rpm for 5 minutes, and the supernatant was removed.19 Then, we mixed the samples with 4% formaldehyde and phosphate-buffered saline (PBS) for 8 minutes.20 Next, the samples were run through a vacuum filtration system consisting of a filtration tube containing a membrane with a pore diameter of 8 μm, a vacuum pump, and a manifold vacuum plate with valves set at a pressure of 0.08 MPa.21 We then detected CTCs and classified them into three subpopulations (mesenchymal, mesenchymal/epithelial hybrid, and epithelial) by RNA in situ hybridization (RNA-ISH). The samples were treated with protease K, and the cells were hybridized with fluorescent probes specific for the following target sequences: green probes for mesenchymal markers (Vimentin and Twist) and red probes for epithelial markers (CK8/18/19 and EpCAM). Finally, we stained nuclei with 4′,6-diamidino-2-phenylindole (DAPI) and analyzed the cells under a fluorescence microscope21 (Figure 1). Based on these markers, CTCs fell into three subgroups: epithelial CTCs (EpCAM and CK8/18/19 positive, Vimentin and Twist negative), hybrid CTCs (EpCAM and CK8/18/19 positive, Vimentin and Twist positive), and mesenchymal CTCs (EpCAM and CK8/18/19 negative, Vimentin and Twist positive). M-CTC was defined as the percentage of mesenchymal CTCs among all CTCs.
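The marker-based subgrouping and the M-CTC definition can be illustrated with a minimal Python sketch. It is not the CanPatrol software: the function names, the boolean encoding of probe signals, the handling of unstained cells, and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def classify_ctc(epithelial_positive: bool, mesenchymal_positive: bool) -> str:
    """Assign a CTC subtype from epithelial (EpCAM, CK8/18/19) and
    mesenchymal (Vimentin, Twist) probe signals."""
    if epithelial_positive and mesenchymal_positive:
        return "hybrid"
    if epithelial_positive:
        return "epithelial"
    if mesenchymal_positive:
        return "mesenchymal"
    # Assumption: cells showing neither signal are not counted as CTCs.
    return "not_ctc"


def summarize_ctcs(cells):
    """Return (total CTC count, M-CTC fraction) for one blood sample.

    `cells` is an iterable of (epithelial_positive, mesenchymal_positive) pairs.
    """
    counts = Counter(classify_ctc(e, m) for e, m in cells)
    total = counts["epithelial"] + counts["hybrid"] + counts["mesenchymal"]
    m_ctc = counts["mesenchymal"] / total if total else 0.0
    return total, m_ctc


# Hypothetical sample: 3 epithelial, 2 hybrid, 2 mesenchymal, 1 unstained cell.
sample = ([(True, False)] * 3 + [(True, True)] * 2
          + [(False, True)] * 2 + [(False, False)])
total_ctcs, m_ctc_fraction = summarize_ctcs(sample)
```

With this hypothetical sample, the count is 7 and the M-CTC fraction is 2/7, which falls below the 0.3 cut-off reported later in the paper.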

Supervised Machine Learning Classifiers

The dataset was repeatedly and randomly split until the training and validation cohorts showed no significant differences (P > 0.05, Table 1). Differences in clinicopathologic characteristics between the two cohorts were analyzed with the Chi-square test for categorical variables and the t-test for continuous variables. The model inputs included clinicopathologic characteristics (such as age and BMI) and blood biomarkers (CTCs, M-CTC, HB, neutrophil, lymphocyte, platelet, CA-125, CA-199, HE4, CEA, AFP, CRP, albumin, fibrinogen, LDH, ALT, AST, and TBA). Prognostic factors were determined using univariate and multivariate analyses with the Cox proportional hazards regression model. We then evaluated 7 types of machine learning models: Support Vector Machine (SVM), Gradient Boosting Machine (GBM), Random Forest (RF), Naive Bayes (NB), Conditional Random Forest (CRF), Elastic Net (EN), and Neural Network (NN). All classifiers were fitted in R using the following methods: “svmRadial” for SVM, “gbm” for GBM, “rf” for RF, “nb” for NB, “cforest” for CRF, “glmnet” for EN, and “nnet” for NN. The RF classifier combines two machine learning techniques: random feature selection and bagging. Based on multiple variables, we used unsupervised RF clustering to evaluate similarity among patients.
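The two ingredients of the RF classifier named above, bagging and random feature selection, can be sketched in a few lines of Python. This is an illustration of the sampling steps only, not the R implementation used in the study; the feature names, the seed, and the square-root rule for the subset size are assumptions (the square-root rule is a common default for classification forests). In a full Random Forest, the feature subset is redrawn at every split of every tree.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Bagging step: draw row indices with replacement for one tree."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def random_feature_subset(feature_names, rng):
    """Random feature selection: pick about sqrt(p) candidate features."""
    k = max(1, round(len(feature_names) ** 0.5))
    return rng.sample(feature_names, k)


# Hypothetical feature names standing in for the blood biomarkers.
features = ["CTC_count", "M_CTC", "CA125", "HE4", "CRP",
            "albumin", "fibrinogen", "neutrophil", "lymphocyte"]

rng = random.Random(0)                 # fixed seed for reproducibility
rows = bootstrap_indices(106, rng)     # 106 = size of the training cohort
out_of_bag = sorted(set(range(106)) - set(rows))  # rows this tree never sees
candidates = random_feature_subset(features, rng)
```

Rows left out of a bootstrap sample (the out-of-bag rows) are what a Random Forest can use for an internal error estimate without a separate hold-out set.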
Table 1

Characteristics Between Patients in the Training Cohort and the Validated Cohort

| Variable | Total Patients (n = 156) | Training Cohort (n = 106) | Validated Cohort (n = 50) | p-value |
| --- | --- | --- | --- | --- |
| Age (years) | 57.89 ± 9.01 | 57.78 ± 8.46 | 58.10 ± 9.32 | 0.831 |
| BMI (kg/m²) | 22.97 ± 0.89 | 23.19 ± 1.35 | 22.87 ± 0.95 | 0.134 |
| Tumor size (cm) | 6.58 ± 3.98 | 6.73 ± 3.48 | 6.45 ± 4.21 | 0.662 |
| Pathological grade, n (%) | | | | 0.494 |
|  G1-2 | 43 (27.56%) | 31 (19.87%) | 12 (7.69%) | - |
|  G3 | 113 (72.44%) | 75 (48.08%) | 38 (24.36%) | - |
| Clinical stage, n (%) | | | | 0.396 |
|  I–II | 52 (33.33%) | 33 (21.15%) | 19 (12.18%) | - |
|  III–IV | 104 (66.67%) | 73 (46.79%) | 31 (19.87%) | - |
| Histological type, n (%) | | | | 0.906 |
|  Serous | 98 (62.8%) | 67 (42.9%) | 31 (19.9%) | - |
|  Mucinous | 25 (16.0%) | 16 (10.3%) | 9 (5.8%) | - |
|  Endometrioid | 14 (9.0%) | 9 (5.8%) | 5 (3.2%) | - |
|  Others | 19 (12.2%) | 14 (9.0%) | 5 (3.2%) | - |
| CTCs (count) | 8.70 ± 3.85 | 8.73 ± 4.58 | 8.65 ± 3.49 | 0.913 |
| M-CTC | 0.26 ± 0.18 | 0.26 ± 0.14 | 0.25 ± 0.20 | 0.719 |
| Neutrophil (10^9/L) | 5.13 ± 1.82 | 5.28 ± 1.69 | 5.10 ± 1.93 | 0.554 |
| Lymphocyte (10^9/L) | 1.32 ± 0.79 | 1.31 ± 0.84 | 1.40 ± 0.66 | 0.506 |
| Platelet (10^9/L) | 342.16 ± 90.41 | 351.73 ± 77.38 | 339.94 ± 92.65 | 0.406 |
| Albumin (g/L) | 41.97 ± 9.35 | 40.94 ± 8.37 | 42.28 ± 9.83 | 0.379 |
| CA-125 (U/mL) | 996.57 ± 392.04 | 1003.24 ± 412.43 | 994.39 ± 379.56 | 0.898 |
| CA-199 (U/mL) | 130.29 ± 52.30 | 123.73 ± 59.04 | 135.28 ± 47.57 | 0.228 |
| AFP (ng/mL) | 5.82 ± 3.94 | 6.04 ± 3.32 | 5.36 ± 4.25 | 0.278 |
| CEA (ng/mL) | 3.29 ± 2.63 | 3.09 ± 2.74 | 3.32 ± 2.48 | 0.615 |
| HE4 (pmol/L) | 527.39 ± 73.01 | 535.39 ± 70.38 | 524.28 ± 80.39 | 0.381 |
| CRP (mg/L) | 8.39 ± 2.10 | 8.20 ± 1.93 | 8.75 ± 2.13 | 0.110 |
| HB (g/L) | 118.27 ± 18.39 | 120.38 ± 20.31 | 116.38 ± 16.47 | 0.226 |
| Fibrinogen (g/L) | 4.27 ± 1.29 | 4.11 ± 0.83 | 4.32 ± 1.07 | 0.182 |
| LDH (U/L) | 187.17 ± 19.83 | 190.38 ± 20.19 | 185.87 ± 15.48 | 0.165 |
| ALT (U/L) | 28.38 ± 3.93 | 29.18 ± 5.20 | 27.91 ± 4.79 | 0.147 |
| AST (U/L) | 30.37 ± 2.04 | 29.49 ± 4.47 | 30.74 ± 3.93 | 0.093 |
| TBA (μmol/L) | 9.82 ± 2.09 | 10.18 ± 2.74 | 9.76 ± 3.38 | 0.409 |

Abbreviations: BMI, body mass index; CTCs, circulating tumor cells; M-CTC, mesenchymal–CTC percentage; HB, hemoglobin; HE4, human epididymis protein 4; CEA, carcinoembryonic antigen; AFP, alpha-fetoprotein; CRP, C-reactive protein; LDH, lactate dehydrogenase; ALT, alanine aminotransferase; AST, aspartate aminotransferase; TBA, total bile acid.

A receiver operating characteristic (ROC) curve analysis was used to assess the prognostic value of each machine learning classifier according to the area under the curve (AUC) and the Youden index. Kaplan–Meier survival curves were then generated, and prognostic differences were evaluated with the log-rank test. All statistical analyses were conducted in R Version 4.0.2 (GUI 1.72 Catalina build), and graphs were produced with GraphPad Prism Version 7.0a (GraphPad Software, San Diego, CA, USA). P < 0.05 was defined as statistically significant.
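As an illustration of how a Youden-index cut-off is derived from ROC analysis, the following minimal Python sketch scans candidate thresholds and keeps the one that maximizes J = sensitivity + specificity - 1. The function name and the data are hypothetical and are not taken from the study; the sketch assumes binary outcomes with both classes present.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A sample is called positive when value >= cutoff.
    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # keep the first cut-off reaching the maximum
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical CTC counts and binary outcomes (1 = event observed).
ctc_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(ctc_counts, events)
```

For this toy data the selected cut-off is 5, with sensitivity 1.0 and specificity 0.8.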

Results

Patient Characteristics

Baseline demographic and clinical characteristics of the training cohort (n = 106) and the validation cohort (n = 50) are shown in Table 1. Patients with high pathological grade (G3) and those at an advanced clinical stage (FIGO stage III–IV) numbered 113 (72.44%) and 104 (66.67%), respectively. There were 98 patients (62.8%) diagnosed with serous ovarian cancer. The mean CTC count and M-CTC percentage were 8.70 ± 3.85 and 0.26 ± 0.18, respectively. All characteristics were similar between the two cohorts, with no significant differences (P > 0.05, Table 1). The median follow-up time was 33 months (range, 26–38 months). According to the Youden index of the ROC curve, we divided patients into two CTC groups using a cut-off value of 5 counts (Figure 2A). Table 2 shows the baseline features of EOC patients grouped by CTC count. CTC count correlated significantly with FIGO stage (P = 0.007), tumor size (P = 0.016), and CA-125 (P = 0.037) (Table 2). Patients in the high-CTC group (≥ 5 counts) had a higher FIGO stage. The mean (± SD) CA-125 value and tumor size in high-CTC patients were 1013.01 ± 385.24 U/mL and 6.22 ± 1.09 cm, significantly higher than the 897.92 ± 293.59 U/mL and 5.72 ± 1.43 cm in low-CTC patients. No significant differences between the two CTC groups were found for age, BMI, pathological grade, histological type, neutrophil, lymphocyte, platelet, albumin, CA-199, AFP, CEA, and HE4 (P ≥ 0.05).
Figure 2

Differentiation of epithelial ovarian cancer (EOC) prognosis based on multiple preoperative blood biomarkers. (A) Receiver operating characteristic (ROC) curves derived from logistic regression for single blood biomarkers. (B) ROC curves derived from 8 supervised machine learning methods. Progression-free survival (PFS) analysis among (C) all patients and patients stratified by (D) circulating tumor cell (CTC) count and (E) mesenchymal–CTC (M-CTC) percentage. Overall survival (OS) analysis among (F) all patients and patients stratified by (G) CTC count and (H) M-CTC percentage.

Table 2

Correlation Between Preoperative Circulating Tumor Cell (CTC) Count and Clinicopathological Features of Epithelial Ovarian Cancer Patients

| Variable | CTC Count < 5 (n = 81) | CTC Count ≥ 5 (n = 75) | p-value |
| --- | --- | --- | --- |
| Age (years) | 57.92 ± 7.31 | 58.24 ± 8.02 | 0.795 |
| BMI (kg/m²) | 22.97 ± 0.98 | 23.06 ± 1.25 | 0.616 |
| Tumor size (cm) | 5.72 ± 1.43 | 6.22 ± 1.09 | 0.016 |
| Pathological grade, n (%) | | | 0.188 |
|  G1-2 | 26 (16.67%) | 17 (10.90%) | - |
|  G3 | 55 (35.26%) | 58 (37.18%) | - |
| Clinical stage, n (%) | | | 0.007 |
|  I–II | 35 (22.44%) | 17 (10.90%) | - |
|  III–IV | 46 (29.49%) | 58 (37.18%) | - |
| Histological type, n (%) | | | 0.849 |
|  Serous | 53 (33.97%) | 45 (28.85%) | - |
|  Mucinous | 12 (7.69%) | 13 (8.33%) | - |
|  Endometrioid | 6 (3.85%) | 8 (5.13%) | - |
|  Others | 10 (6.41%) | 9 (5.77%) | - |
| Neutrophil (10^9/L) | 5.18 ± 2.92 | 5.27 ± 2.41 | 0.298 |
| Lymphocyte (10^9/L) | 1.37 ± 0.79 | 1.53 ± 1.02 | 0.273 |
| Platelet (10^9/L) | 371.28 ± 86.02 | 359.43 ± 79.20 | 0.373 |
| Albumin (g/L) | 40.82 ± 8.02 | 41.39 ± 9.38 | 0.683 |
| CA-125 (U/mL) | 897.92 ± 293.59 | 1013.01 ± 385.24 | 0.037 |
| CA-199 (U/mL) | 129.40 ± 49.31 | 136.38 ± 40.48 | 0.338 |
| AFP (ng/mL) | 5.83 ± 3.35 | 6.02 ± 4.72 | 0.771 |
| CEA (ng/mL) | 3.28 ± 2.18 | 3.62 ± 1.97 | 0.310 |
| HE4 (pmol/L) | 539.48 ± 74.20 | 529.40 ± 80.38 | 0.417 |
| CRP (mg/L) | 7.89 ± 1.83 | 8.43 ± 2.18 | 0.095 |
| HB (g/L) | 120.26 ± 20.18 | 117.39 ± 17.72 | 0.348 |
| Fibrinogen (g/L) | 4.10 ± 1.53 | 4.31 ± 1.07 | 0.326 |
| LDH (U/L) | 184.27 ± 16.38 | 189.54 ± 20.30 | 0.075 |
| ALT (U/L) | 28.04 ± 3.18 | 29.18 ± 4.26 | 0.059 |
| AST (U/L) | 30.72 ± 1.98 | 30.29 ± 2.13 | 0.193 |
| TBA (μmol/L) | 9.27 ± 2.19 | 9.94 ± 3.72 | 0.169 |

Abbreviations: BMI, body mass index; CTC, circulating tumor cell; HB, hemoglobin; HE4, human epididymis protein 4; CEA, carcinoembryonic antigen; AFP, alpha-fetoprotein; CRP, C-reactive protein; LDH, lactate dehydrogenase; ALT, alanine aminotransferase; AST, aspartate aminotransferase; TBA, total bile acid.


CTCs and M-CTC as Prognosis Biomarkers

Univariable regression analysis of clinicopathologic parameters showed that age (HR, 1.28; 95% CI, 1.09–1.47; P = 0.033), tumor size (HR, 1.32; 95% CI, 1.10–1.79; P = 0.042), pathological grade (HR, 1.47; 95% CI, 1.23–1.64; P = 0.038), FIGO stage (HR, 2.11; 95% CI, 1.28–3.73; P = 0.009), CTC count (HR, 2.03; 95% CI, 1.64–4.04; P = 0.002), M-CTC percentage (HR, 1.74; 95% CI, 1.54–2.57; P = 0.005), albumin (HR, 0.84; 95% CI, 0.54–0.93; P = 0.016), CA-125 (HR, 1.43; 95% CI, 1.04–1.74; P = 0.029), CRP (HR, 1.47; 95% CI, 1.04–2.92; P = 0.037), and fibrinogen (HR, 1.58; 95% CI, 1.18–2.10; P = 0.041) were significant prognostic factors for survival (Table 3). These indicators were then entered into the multivariable regression model, which demonstrated that pathological grade (HR, 1.38; 95% CI, 1.23–1.94; P = 0.042), FIGO stage (HR, 1.94; 95% CI, 1.26–3.73; P = 0.015), CTC count (HR, 1.95; 95% CI, 1.55–3.96; P = 0.007), M-CTC percentage (HR, 1.84; 95% CI, 1.48–2.64; P = 0.009), CA-125 (HR, 1.34; 95% CI, 1.03–1.84; P = 0.038), and CRP (HR, 1.36; 95% CI, 1.29–2.80; P = 0.041) were independent factors for EOC prognosis (Table 3).
Table 3

Univariate and Multivariate Regression Analyses with Clinicopathologic Parameters for Epithelial Ovarian Cancer (EOC) Patient’s Prognosis

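| Variables | Univariate HR (95% CI) | Univariate P-value | Multivariate HR (95% CI) | Multivariate P-value |
| --- | --- | --- | --- | --- |
| Age | 1.28 (1.09–1.47) | 0.033 | 1.19 (1.04–1.49) | 0.112 |
| BMI | 1.10 (0.85–2.19) | 0.341 | - | - |
| Tumor size | 1.32 (1.10–1.79) | 0.042 | 1.27 (0.95–1.86) | 0.113 |
| Pathological grade | | | | |
|  G1-2 vs G3 | 1.47 (1.23–1.64) | 0.038 | 1.38 (1.23–1.94) | 0.042 |
| Clinical stage | | | | |
|  I–II vs III–IV | 2.11 (1.28–3.73) | 0.009 | 1.94 (1.26–3.73) | 0.015 |
| Histological type | | | | |
|  Serous vs others | 1.19 (0.90–2.20) | 0.235 | - | - |
| CTC count | | | | |
|  <5 vs ≥5 | 2.03 (1.64–4.04) | 0.002 | 1.95 (1.55–3.96) | 0.007 |
| M-CTC percentage | | | | |
|  <0.3 vs ≥0.3 | 1.74 (1.54–2.57) | 0.005 | 1.84 (1.48–2.64) | 0.009 |
| Neutrophil | 1.22 (0.84–1.92) | 0.328 | - | - |
| Lymphocyte | 0.94 (0.54–2.48) | 0.281 | - | - |
| Platelet | 1.43 (0.89–1.74) | 0.136 | - | - |
| Albumin | 0.84 (0.54–0.93) | 0.016 | 0.89 (0.64–1.02) | 0.083 |
| CA-125 | 1.43 (1.04–1.74) | 0.029 | 1.34 (1.03–1.84) | 0.038 |
| CA-199 | 0.98 (0.85–1.35) | 0.348 | - | - |
| AFP | 1.19 (0.85–1.43) | 0.193 | - | - |
| CEA | 1.25 (0.94–1.86) | 0.379 | - | - |
| HE4 | 1.34 (0.84–1.63) | 0.283 | - | - |
| CRP | 1.47 (1.04–2.92) | 0.037 | 1.36 (1.29–2.80) | 0.041 |
| HB | 1.28 (0.89–1.73) | 0.326 | - | - |
| Fibrinogen | 1.58 (1.18–2.10) | 0.041 | 1.39 (0.99–2.39) | 0.126 |
| LDH | 1.03 (0.93–2.38) | 0.275 | - | - |
| ALT | 0.94 (0.75–2.02) | 0.362 | - | - |
| AST | 0.99 (0.85–1.95) | 0.286 | - | - |
| TBA | 1.25 (0.85–2.05) | 0.321 | - | - |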

Abbreviations: HR, hazard ratio; 95% CI, 95% confidence interval; BMI, body mass index; CTC, circulating tumor cell; M-CTC, mesenchymal–CTC; HB, hemoglobin; HE4, human epididymis protein 4; CEA, carcinoembryonic antigen; AFP, alpha-fetoprotein; CRP, C-reactive protein; LDH, lactate dehydrogenase; ALT, alanine aminotransferase; AST, aspartate aminotransferase; TBA, total bile acid.

In Figure 2A, we compared univariable logistic regression analyses of each peripheral blood biomarker (dashed lines). The univariable regression analysis indicated that CTC count (area under the ROC curve (AUC) = 0.841, p-value < 0.001) and M-CTC percentage (AUC = 0.859, p-value < 0.001) had better predictive value than other biomarkers, including CA-125 (AUC = 0.809, p-value = 0.003) (Figure 2A). Based on the Youden index, the cut-off values were 5 for CTC count and 0.3 for M-CTC percentage, which were taken as thresholds for a positive test. Survival curves for all EOC patients are shown in Figure 2. The PFS curves differed significantly when stratified by CTC count (P = 0.0169, Figure 2D) and M-CTC percentage (P = 0.0098, Figure 2E). The OS curves also differed significantly when stratified by CTC count (P = 0.0136, Figure 2G) and M-CTC percentage (P = 0.0033, Figure 2H). The training dataset was then used to predict EOC prognosis with the machine learning methods (Figure 2B). For multiple logistic regression, the AUC and the highest prediction accuracy were 0.892 and 85.9%. For the other machine learning models, the AUCs were 0.961 for RF, 0.948 for GBM, 0.933 for CRF, 0.930 for NN, 0.926 for NB, 0.899 for SVM, and 0.869 for EN.
The results reveal that supervised machine learning classifiers, especially RF (AUC 0.961, 95% CI 0.928–0.994), predicted more accurately than conventional logistic regression, which had an AUC of 0.892 (95% CI 0.869–0.941). The RF algorithm was therefore used in the subsequent analyses in place of the conventional logistic regression model.
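The AUC values compared here have a simple interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal Python sketch of that rank-based computation follows; the two score vectors are hypothetical and do not reproduce any model from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count one half).
    Assumes labels are 0/1 and that both classes are present."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks from two models for the same 8 patients.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
model_a = [0.10, 0.20, 0.30, 0.40, 0.45, 0.70, 0.80, 0.90]
model_b = [0.10, 0.50, 0.30, 0.40, 0.60, 0.70, 0.20, 0.90]
auc_a = roc_auc(model_a, labels)
auc_b = roc_auc(model_b, labels)
```

In this toy comparison, model A ranks 15 of the 16 positive-negative pairs correctly and model B ranks 11, so model A has the higher AUC. The pairwise loop is quadratic in sample size, which is acceptable for cohorts of this scale.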

Prediction of FIGO Clinical Stages of EOC Patients with the RF Classifier

Using the RF classifier based on circulating biomarkers, we attempted to predict the clinical stage of EOC preoperatively. For prediction of FIGO stage, the AUC values of the RF model were 0.796 (95% CI, 0.727–0.866) and 0.743 (95% CI, 0.688–0.798) based on biomarkers with and without CTCs, respectively (Figure 3A). According to the variable importance measured by mean decrease in the Gini index, CTC count, CRP, and M-CTC percentage were more important for predicting the clinical stage of EOC than traditional tumor markers such as CA-125, HE4, and CA-199 (Figure 3B). As shown in Figure 3C, as the clinical stage progressed, CTC count, CRP, M-CTC, CA-125, HE4, and neutrophil increased, whereas others, including albumin and lymphocyte, decreased.
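Variable importance by mean decrease in the Gini index rests on one quantity: how much a split on a variable reduces the Gini impurity of the class labels. The Python sketch below computes that reduction for a single split. A Random Forest accumulates such reductions over every split made on a variable and averages across trees, which this sketch does not do. The data and thresholds are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting samples at value < threshold."""
    left = [y for v, y in zip(values, labels) if v < threshold]
    right = [y for v, y in zip(values, labels) if v >= threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical data: 1 = advanced stage (III-IV), 0 = early stage (I-II).
stage_advanced = [0, 0, 0, 1, 1, 1, 1, 0]
ctc_count = [2, 3, 4, 6, 7, 8, 9, 5]
ca125 = [900, 1100, 950, 1000, 850, 1200, 1050, 980]

dec_ctc = gini_decrease(ctc_count, stage_advanced, 6)
dec_ca125 = gini_decrease(ca125, stage_advanced, 1000)
```

In this toy example the CTC split separates the two stage groups completely (decrease 0.5), while the CA-125 split leaves mixed groups (decrease 0.125), so CTC count would rank as the more important variable.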
Figure 3

Prediction of clinical stages of epithelial ovarian cancer (EOC) with the Random Forest (RF) classifier. (A) Receiver operating characteristic (ROC) curve for RF prediction of clinical stage based on circulating biomarkers with/without CTCs. (B) Variable importance for RF prediction of clinical stages measured by mean decrease in the Gini index. (C) Box plots showing the distribution of the top eight blood markers for RF prediction of clinical stages.


Prediction of Platinum-Resistance of EOC with the RF Classifier

Based on the biomarkers, we then attempted to predict platinum-resistance preoperatively. For prediction of platinum-resistance, the AUC values of the RF model were 0.809 (95% CI, 0.742–0.876) and 0.759 (95% CI, 0.705–0.813) based on biomarkers with and without CTCs, respectively (Figure 4A). The relative variable importance for separating platinum-resistant patients from others was calculated by the RF classifier (Figure 4B). The top eight predictors of platinum-resistance identified by the RF algorithm were M-CTC percentage, fibrinogen, carbohydrate antigen-125 (CA-125), CTC count, albumin, lymphocyte, C-reactive protein (CRP), and neutrophils. Box plots of the distribution of each selected variable in platinum-resistant and platinum-sensitive patients are shown in Figure 4C. Platinum-resistant patients tended to have higher M-CTC, fibrinogen, CA-125, CTCs, CRP, and neutrophils, but lower albumin and lymphocyte.
Figure 4

Prediction of platinum-resistance of epithelial ovarian cancer (EOC) with the Random Forest (RF) classifier. (A) The receiver operating characteristic (ROC) curve for RF prediction of platinum-resistance based on circulating biomarkers with/without CTCs. (B) Variable importance for RF prediction of platinum-resistance measured by mean decrease in the Gini index. (C) The box plot shows the distribution of the top eight important blood markers for RF prediction of platinum-resistance.


Unsupervised Clustering Analysis for EOC Prognosis with the RF Classifier

In addition, we performed unsupervised clustering analysis with the RF algorithm to classify patients into two clusters based on preoperative blood markers for EOC prognosis. For progression-free survival (PFS), the two clusters differed significantly among all patients (Figure 5A, P = 0.0007). When patients were separated by clinical stage, the log-rank p-value was 0.1608 (Figure 5B, HR, 2.465; 95% CI, 0.540–11.260) for the early-stage group and 0.0018 (Figure 5C, HR, 2.716; 95% CI, 1.602–4.605) for the advanced-stage group. Moreover, overall survival (OS) differed significantly between the two clusters in all patients (Figure 5D, P = 0.0021) and in those at an advanced stage (Figure 5E, P = 0.0037). In contrast, no significant difference was found in early-stage patients (Figure 5F, P = 0.0869). Multiple blood markers, including M-CTC, CTC count, CRP, fibrinogen, CA-125, albumin, lymphocyte, and neutrophils, differed significantly between the two clusters among advanced-stage cases (Figure 5G).
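The per-cluster survival curves compared above are Kaplan–Meier estimates. As a minimal sketch of how such a curve is computed, the Python function below applies the product-limit formula to follow-up times with censoring. The follow-up times (in months) and event indicators for the two clusters are hypothetical, and the log-rank comparison itself is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events[i] is 1 if the event (progression or death) was observed at
    times[i], and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients leave the risk set
    return curve


# Hypothetical follow-up (months) for two patient clusters.
cluster_1 = kaplan_meier([6, 10, 14, 20, 26, 30], [1, 1, 0, 1, 0, 0])
cluster_2 = kaplan_meier([12, 18, 24, 28, 33, 36], [0, 1, 0, 0, 1, 0])
```

Censored patients contribute to the risk set until their last follow-up and then drop out without lowering the curve, which is why the estimate only steps down at observed event times.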
Figure 5

Unsupervised machine learning clustering associated with EOC prognosis. EOC patients were clustered into two groups by unsupervised clustering analysis with the RF classifier. Kaplan–Meier curves indicating progression-free survival (PFS) of each cluster in (A) all EOC patients, (B) early clinical stage group, and (C) advanced clinical stage group. Kaplan–Meier curves indicating overall survival (OS) of each cluster in (D) all EOC patients, (E) early clinical stage group, and (F) advanced clinical stage group. (G) Box plots showing the distribution of the top eight peripheral blood biomarkers between the two clusters.


Discussion

In the present study, we developed and validated a prognostic model for EOC based on blood biomarkers, including CTCs. To the best of our knowledge, this is the first study to combine the CanPatrol CTC technique with machine learning for risk stratification among ovarian cancer patients. Our results showed that CTC count and M-CTC percentage, together with other blood biomarkers, had strong predictive value for clinical stage, platinum resistance, and survival when analyzed by machine learning approaches, especially the RF classifier. The machine learning model could facilitate the selection of treatment strategies in precision medicine. A previous study by Enshaei et al constructed a risk stratification model based on clinical variables including age, clinical stage, histopathology grade, and CA-125 level. They demonstrated that the neural network (NN) algorithm was capable of predicting OC survival with high accuracy (93%) and an AUC of 0.74, outperforming traditional logistic regression.22 In our study, the multivariable regression model showed that, besides conventional factors including pathological grade (HR 1.38; P = 0.042), FIGO stage (HR 1.94; P = 0.015), CA-125 (HR 1.34; P = 0.038), and CRP (HR 1.36; P = 0.041), CTC count (HR 1.95; P = 0.007) and M-CTC percentage (HR 1.84; P = 0.009) were also independent prognostic factors for EOC. We further revealed the association of preoperative biomarkers with important EOC features, which may facilitate risk stratification of patients through supervised machine learning models.
Machine learning techniques have been widely adopted in cancer studies for both diagnostic and prognostic assessment.22,23 This approach can reveal patterns embedded in data and help uncover the relationships between biomarkers and cancer progression.24 However, which machine learning algorithm offers the best preoperative predictive potential for blood biomarkers, including CTCs, remains poorly understood in EOC. We compared various supervised algorithms and identified the RF classifier as the best approach, with good predictive performance (AUC 0.961, 95% CI 0.928–0.994), which is consistent with the results of a recent study.25 The RF classifier consists of decision trees built with bagging and random feature selection. By considering interactions among variables, the RF classifier can stratify samples and reduce overfitting.26 Ovarian cancer is heterogeneous, with differing clinical stages, histological types, and grades. We therefore investigated unsupervised RF clustering and found that it was able to segregate EOC clusters that were associated with clinical stage and survival. We also found that the RF classifier could predict several clinical characteristics from preoperative blood biomarkers, with an AUC of 0.796 (95% CI, 0.727–0.866) for clinical stage and 0.809 (95% CI, 0.742–0.876) for platinum resistance, which represents only moderate discrimination. However, a recent study by Kawakami et al25 also developed an ovarian cancer-specific predictive framework for clinical stage using machine learning methods based on multiple biomarkers, though without CTCs. They reported an AUC of 0.760 for predicting clinical stage with the RF model, which is lower than our findings.
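For readers less familiar with the AUC values quoted above, the sketch below shows one way to compute an AUC and an approximate 95% CI from predicted scores. It uses the rank-based (Mann–Whitney) definition of AUC and the Hanley–McNeil standard error. The paper does not say how its intervals were derived, so the choice of Hanley–McNeil is an assumption, and the function names and toy scores are hypothetical.

```python
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC using the Hanley-McNeil standard error."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = sqrt(var)
    # Clip to the valid range of an AUC.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Toy example only: 1 = platinum-resistant, 0 = platinum-sensitive (hypothetical).
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(auc, hanley_mcneil_ci(auc, n_pos=2, n_neg=2))
```

With so few samples the interval is very wide, which illustrates why the width of a reported CI depends heavily on cohort size.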
This modest performance was partly due to the limited sample size of 156 patients; studies with larger datasets are needed to develop more robust models. Moreover, the unsupervised machine learning approach applied to subgroups revealed that the two clusters in advanced-stage EOC were significantly associated with PFS (P = 0.002) and OS (P = 0.004). In previous studies, blood biomarkers, including indicators of systemic inflammatory response, had prognostic relevance in patients with EOC. A recent meta-analysis involving 2919 patients showed that an elevated neutrophil-to-lymphocyte ratio is significantly associated with disease progression and survival in EOC patients.27 Inflammatory indicators may promote tumor progression by producing cytokines (including VEGF, interleukins, and tumor necrosis factor-α), which play a vital role in the tumor microenvironment.28 In addition, coagulation factors could stimulate cancer proliferation and angiogenesis by interacting with VEGF and fibroblast growth factor-2 (FGF-2).29 Studies have reported that preoperative plasma fibrinogen, CRP, and albumin levels were useful in predicting unfavorable EOC prognosis,30,31 which is consistent with our results. Apart from the inflammatory and coagulation-related biomarkers, we found that CTC count was also an independent prognostic factor for ovarian cancer, with an AUC of 0.841 (95% CI, 0.802–0.880). Among the “liquid biopsy” alternatives for the prognosis of solid carcinomas, CTCs have shown great potential in prostate cancer, breast cancer, and hepatocellular cancer.32–34 However, whether CTC characteristics are associated with prognosis remains controversial in EOC.35 Poveda et al36 concluded that elevated CTCs detected through the CellSearch system were an independent risk factor for ovarian cancer prognosis, which supports our findings.
In this study, we used the updated CanPatrol CTC-enrichment technique, which has high sensitivity and uses a filter-based separation method to reduce the CTC loss caused by centrifugation.37 Moreover, recent research indicated that CTCs could disseminate to distant sites through epithelial-mesenchymal transition (EMT), which helps them change phenotype and penetrate blood vessels.38 Therefore, we classified CTCs into three subtypes (epithelial, epithelial/mesenchymal hybrid, and mesenchymal) using the CanPatrol CTC-enrichment technique. We demonstrated that M-CTC percentage had strong predictive value for ovarian cancer prognosis, with an AUC of 0.859 (95% CI, 0.818–0.903). Consistent with our findings, a previous study also indicated the prognostic value of both M-CTC percentage (AUC 0.74; 95% CI 0.64–0.84) and CTCs (AUC 0.75; 95% CI 0.66–0.84) in hepatocellular carcinoma.18 In ovarian cancer, researchers indicated that tumor cells that had undergone EMT showed cancer stem cell (CSC) features and could drive tumor growth in vivo,39 which might partly explain the significant association between high M-CTC percentage and poor prognosis.
However, this study had some limitations. Firstly, this prospective study involved a relatively small sample of 156 patients from a single institution, which might introduce selection bias and limit the accuracy of our results. To address this, future multi-center studies with larger sample sizes and more input variables are needed. Secondly, detection efficiency might be biased because the CanPatrol system is filtration-based, which may allow small CTCs to pass through the filter. Thus, other CTC collection techniques might also be used to improve detection efficiency in future studies.
Finally, in this research, we aimed to develop a preoperative machine learning model based on multiple blood biomarkers, so as to facilitate personalized treatment options before the primary therapeutic approach, in the realm of precision medicine. However, to realize dynamic tumor monitoring, future studies are still needed to construct prediction models based on biomarkers collected periodically, including before chemotherapy.
In conclusion, we developed a serum-based CTC model through machine learning techniques for the prognosis of ovarian cancer that could address the mentioned concerns and demonstrate the clinical significance of this diagnostic technique. Through the newly developed machine learning model, we may facilitate personalized treatment before the primary therapeutic approach in the near future.
  40 in total

1.  Evaluating the prognostic significance of preoperative thrombocytosis in epithelial ovarian cancer.

Authors:  S K Allensworth; C L Langstraat; J R Martin; M A Lemens; M E McGree; A L Weaver; S C Dowdy; K C Podratz; J N Bakkum-Gamez
Journal:  Gynecol Oncol       Date:  2013-06-05       Impact factor: 5.482

Review 2.  Machine Learning in Medicine.

Authors:  Rahul C Deo
Journal:  Circulation       Date:  2015-11-17       Impact factor: 29.690

3.  Cancer statistics in China, 2015.

Authors:  Wanqing Chen; Rongshou Zheng; Peter D Baade; Siwei Zhang; Hongmei Zeng; Freddie Bray; Ahmedin Jemal; Xue Qin Yu; Jie He
Journal:  CA Cancer J Clin       Date:  2016-01-25       Impact factor: 508.702

4.  Cytoreductive surgery and intraperitoneal chemo-hyperthermia for chemo-resistant and recurrent advanced epithelial ovarian cancer: prospective study of 81 patients.

Authors:  Eddy Cotte; Olivier Glehen; Faheez Mohamed; Franck Lamy; Claire Falandry; François Golfier; Francois Noel Gilly
Journal:  World J Surg       Date:  2007-09       Impact factor: 3.352

5.  Natural language processing with machine learning to predict outcomes after ovarian cancer surgery.

Authors:  Emma L Barber; Ravi Garg; Christianne Persenaire; Melissa Simon
Journal:  Gynecol Oncol       Date:  2020-10-14       Impact factor: 5.482

6.  Machine learning techniques for personalized breast cancer risk prediction: comparison with the BCRAT and BOADICEA models.

Authors:  Chang Ming; Valeria Viassolo; Nicole Probst-Hensch; Pierre O Chappuis; Ivo D Dinov; Maria C Katapodi
Journal:  Breast Cancer Res       Date:  2019-06-20       Impact factor: 6.466

Review 7.  The Role of Epithelial-to-Mesenchymal Plasticity in Ovarian Cancer Progression and Therapy Resistance.

Authors:  Nele Loret; Hannelore Denys; Philippe Tummers; Geert Berx
Journal:  Cancers (Basel)       Date:  2019-06-17       Impact factor: 6.639

8.  Development and validation for prognostic nomogram of epithelial ovarian cancer recurrence based on circulating tumor cells and epithelial-mesenchymal transition.

Authors:  Jiani Yang; Jun Ma; Yue Jin; Shanshan Cheng; Shan Huang; Nan Zhang; Yu Wang
Journal:  Sci Rep       Date:  2021-03-22       Impact factor: 4.379

Review 9.  New insights into the mechanisms of epithelial-mesenchymal transition and implications for cancer.

Authors:  Anushka Dongre; Robert A Weinberg
Journal:  Nat Rev Mol Cell Biol       Date:  2019-02       Impact factor: 94.444

Review 10.  Machine learning applications in cancer prognosis and prediction.

Authors:  Konstantina Kourou; Themis P Exarchos; Konstantinos P Exarchos; Michalis V Karamouzis; Dimitrios I Fotiadis
Journal:  Comput Struct Biotechnol J       Date:  2014-11-15       Impact factor: 7.271

