Wei Lu1,2, Dongliang Fu1,2, Xiangxing Kong1,2, Zhiheng Huang1,2, Maxwell Hwang1,2, Yingshuang Zhu1,2, Liubo Chen1,2, Kai Jiang1,2, Xinlin Li1,2, Yihua Wu3, Jun Li1,2, Ying Yuan2,4, Kefeng Ding1,2.
Abstract
Early identification of metastatic or recurrent colorectal cancer (CRC) patients who will respond to FOLFOX (5-FU, leucovorin, and oxaliplatin) therapy is clinically important. We performed a microarray meta-analysis to identify differentially expressed genes (DEGs) between FOLFOX responders and nonresponders among metastatic or recurrent CRC patients. In responders compared with nonresponders, WASHC4, HELZ, ERN1, RPS6KB1, and APPBP2 were downregulated, whereas IRF7, EML3, LYPLA2, DRAP1, RNH1, PKP3, TSPAN17, LSS, MLKL, PPP1R7, GCDH, C19ORF24, and CCDC124 were upregulated. Functional annotation showed that the DEGs were significantly enriched in the autophagy, ErbB signaling, mitophagy, endocytosis, FoxO signaling, apoptosis, and antifolate resistance pathways. Using these candidate genes, several machine learning algorithms were applied to the training set, and model performance was assessed by cross-validation. Candidate models with the best tuning parameters were then applied to the test set, where the final model showed satisfactory performance. In addition, MLKL and CCDC124 expression were independent prognostic factors for metastatic CRC patients undergoing FOLFOX therapy.
Keywords: FOLFOX; colorectal cancer; machine learning algorithm; microarray meta-analysis
Year: 2020 PMID: 31893575 PMCID: PMC7013065 DOI: 10.1002/cam4.2786
Source DB: PubMed Journal: Cancer Med ISSN: 2045-7634 Impact factor: 4.452
Figure 1. A, Flow diagram of dataset screening and selection. B, Flow diagram of identifying differentially expressed genes and building models via machine learning algorithms
Characteristics of the included datasets
| Series accession | Platform | Year of submission | Specimen source | Number of specimens | Male/Female | Tissue type | Regimen | Response evaluation | Response rate |
|---|---|---|---|---|---|---|---|---|---|
| GSE19860 | Affymetrix Human Genome U133 Plus 2.0 Array | 2010 | Department of Surgical Oncology, University of Tokyo | 29 | NA | Primary colorectal cancers | mFOLFOX6 | The best observed response at the end of the first‐line treatment | 31.03% |
| GSE28702 | Affymetrix Human Genome U133 Plus 2.0 Array | 2011 | Teikyo University Hospital and Gifu University Hospital | 83 | 54/29 | 56 primary colorectal cancers, 23 metastatic lesions to the liver, 3 metastatic lesions to the peritoneum, and 1 metastatic lesion to the lung | mFOLFOX6 | Assessment by computed tomography after four cycles of mFOLFOX6 therapy | 50.60% |
| GSE72970 | Affymetrix Human Genome U133 Plus 2.0 Array | 2015 | REGP, COSIVAL and BIOCOLON cohorts | 32 | 19/13 | Primary colorectal cancers | FOLFOX | The best observed response of first‐line treatment | 60.60% |
mFOLFOX6: modified FOLFOX6
Figure 2. A, Bar plot of enriched KEGG terms based on differentially expressed genes. Darker colors indicate smaller P values. B, Network of enriched KEGG terms, colored by cluster ID. C, Dot plot of GO enrichment of differentially expressed genes in the cellular component ontology. D, Dot plot of GO enrichment of differentially expressed genes in the biological process ontology
Figure 3. A, Heat map of the 18 differentially expressed genes used in the prediction models. Blue indicates gene downregulation and yellow indicates gene upregulation. NR indicates nonresponders and R indicates responders. Dataset 1, dataset 2, and dataset 3 are http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE19860, http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28702, and http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE72970, respectively. B, Correlation coefficient plot of the 18 differentially expressed genes in http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28702. Blue indicates positive correlation and red indicates negative correlation
Figure 4. Cross validation results using KNN, SVM, GBM, tree, random forest, and neural network algorithms. Model performance was determined according to prediction (A) accuracy, (B) sensitivity, (C) specificity, and (D) Youden index. Each dot represents the result of one fold of cross validation. Results were compared by Kruskal‐Wallis ANOVA, and multiple comparisons were performed with Dunn's test. Significance is labeled between the random forest algorithm and the other algorithms. *P < .05, **P < .01, ***P < .005
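The four per-fold metrics named in the Figure 4 caption can all be derived from a fold's confusion counts. The following is a minimal sketch, not the authors' code: the `FoldCounts` class, the `fold_metrics` function, and the example counts are hypothetical, and responders are assumed to be the positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoldCounts:
    """Confusion counts for one cross-validation fold (hypothetical container).

    Responders are treated as the positive class (an assumption).
    """
    tp: int  # responders predicted as responders
    fn: int  # responders predicted as nonresponders
    tn: int  # nonresponders predicted as nonresponders
    fp: int  # nonresponders predicted as responders


def fold_metrics(c: FoldCounts) -> dict:
    """Return accuracy, sensitivity, specificity, and Youden index for one fold."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)
    # Youden index J = sensitivity + specificity - 1, ranging from -1 to 1
    youden = sensitivity + specificity - 1
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": youden,
    }


# Illustrative counts only; these are not values from the study.
example = fold_metrics(FoldCounts(tp=8, fn=2, tn=6, fp=4))
```

Each dot in a panel of Figure 4 would correspond to one such metric computed on one held-out fold.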
Model prediction results in the test set
| | SVM | Random forest | Neural network |
|---|---|---|---|
| Sensitivity | 0.900 | 0.850 | 0.800 |
| 95% CI | (0.669‐0.982) | (0.611‐0.960) | (0.557‐0.934) |
| Specificity | 0.692 | 0.692 | 0.538 |
| 95% CI | (0.389‐0.896) | (0.389‐0.896) | (0.261‐0.796) |
| PLR | 2.925 | 2.762 | 1.733 |
| 95% CI | (1.278‐6.697) | (1.197‐6.373) | (0.926‐3.244) |
| NLR | 0.144 | 0.217 | 0.371 |
| 95% CI | (0.036‐0.575) | (0.071‐0.658) | (0.136‐1.015) |
Abbreviations: PLR, positive likelihood ratio; NLR, negative likelihood ratio; SVM, support vector machine.
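The likelihood ratios in the table follow from sensitivity and specificity via the standard definitions PLR = sensitivity / (1 − specificity) and NLR = (1 − sensitivity) / specificity. The sketch below illustrates this. The function name is hypothetical, and the confusion counts are an assumption (18 of 20 responders and 9 of 13 nonresponders correctly classified), chosen only because they reproduce the SVM column; the source does not report the test-set counts.

```python
def likelihood_ratios(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, PLR, NLR) from confusion counts.

    PLR = sensitivity / (1 - specificity)
    NLR = (1 - sensitivity) / specificity
    A ratio with a zero denominator is reported as infinity.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    nlr = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return sensitivity, specificity, plr, nlr


# Assumed counts, consistent with the SVM column but not reported in the source.
sens, spec, plr, nlr = likelihood_ratios(tp=18, fn=2, tn=9, fp=4)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} PLR={plr:.3f} NLR={nlr:.3f}")
```

A PLR above 1 and an NLR below 1 indicate that a positive or negative prediction shifts the odds of response in the expected direction. The neural network's confidence intervals for both ratios include 1, so its predictions are less informative than those of the other two models.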