Changlong Dong, Nini Rao, Wenju Du, Fenglin Gao, Xiaoqin Lv, Guangbin Wang, Junpeng Zhang.
Abstract
PURPOSE: In this work, an algorithm named mRBioM was developed for the identification of potential mRNA biomarkers (PmBs) from complete transcriptomic RNA profiles of gastric adenocarcinoma (GA).
Keywords: biomarkers; complete transcriptomic profiles; generalization ability; prognosis; sample classification
Year: 2021 PMID: 34386038 PMCID: PMC8354214 DOI: 10.3389/fgene.2021.679612
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Clinical information for the 279 included GA patients.
| Characteristic | Category | Patients (n) | Percentage (%) |
|---|---|---|---|
| Gender | Male | 171 | 61.3 |
| | Female | 108 | 38.7 |
| Age | <40 | 2 | 0.7 |
| | 40–60 | 86 | 30.8 |
| | 60–80 | 174 | 62.4 |
| | ≥80 | 17 | 6.1 |
| Oncology classification | Adenocarcinoma, intestinal type | 45 | 16.1 |
| | Adenocarcinoma, NOS | 119 | 42.7 |
| | Adenocarcinoma, diffuse type | 55 | 19.7 |
| | Papillary adenocarcinoma, NOS | 5 | 1.8 |
| | Tubular adenocarcinoma | 55 | 19.7 |
| Pathological staging | Stage I | 33 | 11.8 |
| | Stage II | 94 | 33.7 |
| | Stage III | 118 | 42.3 |
| | Others | 34 | 12.2 |
FIGURE 1. Data organization and utilization. (A) Five subsets from the TCGA-STAD dataset. (B) TCGA-COAD, TCGA-LUAD, and TCGA-LIHC. C, cancer sample; N, adjacent normal sample; CF, cancer-related factor; CFth, threshold of CF; ML, machine learning.
FIGURE 2. DE RNAs and the screening results of PmBs. (A) Volcano plot of DE RNAs (circles: DE mRNA, squares: DE miRNA, triangles: DE lncRNA); red markers represent up-regulated DE RNAs, and blue markers represent down-regulated DE RNAs. (B) Total information amount of the DE mRNAs; the abscissa shows the mRNA symbols (only some are displayed), and the ordinate is the total information amount of each DE mRNA. (C) Heatmap of the PmBs in the adjacent normal group vs. the GA group. DE, differentially expressed; PmBs, potential mRNA biomarkers; N, adjacent normal sample; C, cancer sample; TIA, total information amount.
The identified PmBs and their total information amount (TIA).
| No. | PmB | TIA | Reported in | No. | PmB | TIA | Reported in |
|---|---|---|---|---|---|---|---|
| 1 | MET | 1.485 | GA | 29 | PMEPA1 | 1.290 | PCa |
| 2 | KLF4 | 1.808 | GA | 30 | DNMT1 | 1.277 | BRCA |
| 3 | LPCAT1 | 1.885 | GA | 31 | MFHAS1 | 1.264 | CRC |
| 4 | SOX4 | 1.356 | GA | 32 | IRAK1 | 1.264 | BRCA |
| 5 | KPNA2 | 1.506 | GA | 33 | TIMP1 | 1.418 | PCa |
| 6 | GPX3 | 1.393 | GA | 34 | RCC2 | 1.255 | BRCA |
| 7 | TYMP | 1.374 | GA | 35 | SLC12A7 | 1.254 | AC |
| 8 | FKBP10 | 1.430 | GA | 36 | IFI6 | 1.239 | MM |
| 9 | CDC25B | 1.347 | GA | 37 | BGN | 1.231 | CRC |
| 10 | SOX9 | 1.299 | GA | 38 | GTPBP4 | 1.229 | LUC |
| 11 | GPRC5A | 1.260 | GA | 39 | RUNX1 | 1.261 | CRC |
| 12 | CITED2 | 1.250 | GA | 40 | MXI1 | 1.214 | LUC |
| 13 | FHL1 | 1.214 | GA | 41 | TMEM63A | 1.751 | Not reported |
| 14 | DKC1 | 1.996 | CRC | 42 | PDCD11 | 1.598 | Not reported |
| 15 | PLOD3 | 1.464 | LUC | 43 | METTL7A | 1.467 | Not reported |
| 16 | KAT2B | 1.543 | BRCA | 44 | ATP5PF | 1.852 | Not reported |
| 17 | PARP14 | 1.433 | MM | 45 | UBL3 | 1.433 | Not reported |
| 18 | VAV2 | 1.352 | BRCA | 46 | HELZ2 | 1.405 | Not reported |
| 19 | MTHFD2 | 1.421 | RCC | 47 | SLC25A4 | 1.321 | Not reported |
| 20 | RAP1A | 1.372 | LUC | 48 | ARFGEF3 | 1.328 | Not reported |
| 21 | LMNB2 | 1.437 | HCC | 49 | NCAPD2 | 1.298 | Not reported |
| 22 | PER1 | 1.339 | LUC | 50 | ENTPD6 | 1.604 | Not reported |
| 23 | GSN | 1.248 | CRC | 51 | CAD | 1.253 | Not reported |
| 24 | CHD7 | 1.310 | EC | 52 | THEM6 | 1.333 | Not reported |
| 25 | SLC1A5 | 1.321 | CRC | 53 | MKI67 | 1.247 | Not reported |
| 26 | PLXNA3 | 1.306 | BRCA | 54 | PINK1 | 1.232 | Not reported |
| 27 | BOP1 | 1.292 | CRC | 55 | SH3KBP1 | 1.232 | Not reported |
| 28 | MFSD12 | 1.292 | MM | | | | |
FIGURE 3. GO and KEGG enrichment analysis of PmBs. (A) GO enrichment analysis. (B) KEGG enrichment analysis. GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; GA, gastric adenocarcinoma.
Performance of the cancer-related factor (CF).
| | | | |
|---|---|---|---|
| True | 240 | 0 | 240 |
| False | 27 | 12 | 39 |
| Total | 267 | 12 | 279 |
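The counts in this table can be turned into the usual classification metrics. The sketch below is illustrative only: the column headers did not survive, so it assumes the rows are the CF call ("True" meaning called cancer, "False" meaning called normal) and the first two columns are the actual classes (cancer, then normal). That orientation is an assumption, chosen because it matches the column totals of 267 and 12.

```python
# Counts copied from the table above.
# ASSUMPTION: rows = CF call, first two columns = actual class (cancer, normal).
# The original headers are missing, so this orientation is a guess.
tp, fp = 240, 0    # called cancer:  actually cancer / actually normal
fn, tn = 27, 12    # called normal:  actually cancer / actually normal

total = tp + fp + fn + tn
accuracy = (tp + tn) / total        # share of all samples called correctly
sensitivity = tp / (tp + fn)        # share of cancer samples called cancer
specificity = tn / (tn + fp)        # share of normal samples called normal

print(f"n={total} ACC={accuracy:.4f} SE={sensitivity:.4f} SP={specificity:.4f}")
```

Under the opposite orientation the same four counts would give different sensitivity and specificity values, so the result should be read with that caveat in mind.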
FIGURE 4. ROC curve analysis for the four classifiers. (A) CF. (B) RF-based, SVM-based, and NB-based classifiers. ROC, receiver operating characteristic; CF, cancer-related factor; RF, random forest; SVM, support vector machine; NB, naive Bayes.
Fivefold cross-validation results for the three machine learning classifiers.
| Metric (%) | Classifier | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Mean |
|---|---|---|---|---|---|---|---|
| Accuracy | RF | 90.910 | 92.308 | 100 | 88.889 | 100 | 94.4211 |
| | SVM | 90.909 | 100 | 100 | 100 | 100 | 98.1818 |
| | NB | 90.909 | 100 | 100 | 88.889 | 100 | 95.9596 |
| Sensitivity | RF | 88.889 | 100 | 100 | 83.333 | 100 | 94.4444 |
| | SVM | 88.889 | 100 | 100 | 100 | 100 | 97.7778 |
| | NB | 88.889 | 100 | 100 | 83.333 | 100 | 94.4444 |
| Specificity | RF | 100 | 85.714 | 100 | 100 | 100 | 97.1429 |
| | SVM | 100 | 100 | 100 | 100 | 100 | 100 |
| | NB | 100 | 100 | 100 | 100 | 100 | 100 |
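The final value in each row of this table appears to be the arithmetic mean of the five preceding fold values. A minimal check for the accuracy rows, using only the numbers printed in the table (the variable names are illustrative), could look like this:

```python
from statistics import mean

# Per-fold accuracy (%) copied from the table above.
fold_accuracy = {
    "RF":  [90.910, 92.308, 100, 88.889, 100],
    "SVM": [90.909, 100, 100, 100, 100],
    "NB":  [90.909, 100, 100, 88.889, 100],
}
# Final-column values as printed in the table.
reported = {"RF": 94.4211, "SVM": 98.1818, "NB": 95.9596}

fold_means = {name: mean(values) for name, values in fold_accuracy.items()}
for name, value in fold_means.items():
    gap = abs(value - reported[name])
    print(f"{name}: mean={value:.4f} reported={reported[name]} gap={gap:.4f}")
```

The recomputed means agree with the printed ones to within the rounding of the per-fold values, which are given to three decimals.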
FIGURE 5. Survival analysis based on four prognostic PmBs. (A) Kaplan–Meier analysis of overall survival in the high- and low-risk groups; the upper panel shows the Kaplan–Meier curves for the two groups, and the lower panel shows the cumulative number of deaths. (B) ROC analysis of the prognostic risk model built on the four PmBs.
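For readers unfamiliar with the curves in panel (A), the following is a minimal sketch of the Kaplan–Meier product-limit estimator. The function name and the follow-up times are hypothetical toy values, not data from the study, and the sketch makes no claim about the software the authors used.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical toy follow-up data in months (NOT from the study).
high_risk = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
low_risk = kaplan_meier([4, 6, 9, 10, 12], [0, 1, 0, 1, 0])
print("high risk:", high_risk)
print("low risk: ", low_risk)
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the curve only steps down at observed deaths.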
Generalization ability verification results.
| Disease type | | Colon adenocarcinoma | Lung adenocarcinoma | Liver hepatocellular carcinoma |
|---|---|---|---|---|
| PmBs number | | 289 | 200 | 300 |
| CF | ACC | 0.9869 | 0.9709 | 0.9384 |
| | SP | 1 | 1 | 1 |
| | SE | 0.9866 | 0.9688 | 0.9366 |
| | CFth | 0.9768 | 1.001 | 0.9384 |
| | CFC | 0.8578–1.1323–1.3027 | 0.8595–1.2112–1.4593 | 0.7024–0.9974–1.6711 |
| | CFN | 0.7603–0.8752–0.9444 | 0.8159–0.8429–0.9478 | 0.6199–0.6904–0.8104 |
| RF | ACC | 0.9716 | 0.9846 | 0.9826 |
| | SP | 0.9667 | 0.9778 | 0.975 |
| | SE | 0.9833 | 0.975 | 0.9867 |
| NB | ACC | 0.975 | 0.9833 | 0.9735 |
| | SP | 0.95 | 1 | 0.9568 |
| | SE | 1 | 0.9652 | 0.9833 |
| SVM | ACC | 0.9833 | 0.975 | 0.9913 |
| | SP | 0.9667 | 1 | 1 |
| | SE | 1 | 0.9485 | 0.9833 |
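The CF rows of this table suggest a simple threshold rule. The sketch below rests on two assumptions that the table itself does not state: that a sample is called cancer when its CF value reaches CFth, and that the three-number CFC and CFN entries describe CF values in cancer and adjacent normal samples respectively. The function name is hypothetical.

```python
# CFth for colon adenocarcinoma, copied from the table above.
CF_TH_COLON = 0.9768

def classify_by_cf(cf_value, threshold):
    """ASSUMED rule: call a sample cancer when its CF reaches the threshold."""
    return "cancer" if cf_value >= threshold else "normal"

# The three CFC and CFN values printed for colon adenocarcinoma, used here
# only as probe values (their exact meaning is an assumption).
cfc_values = [0.8578, 1.1323, 1.3027]
cfn_values = [0.7603, 0.8752, 0.9444]

cancer_calls = [classify_by_cf(v, CF_TH_COLON) for v in cfc_values]
normal_calls = [classify_by_cf(v, CF_TH_COLON) for v in cfn_values]
print("CFC probes:", cancer_calls)
print("CFN probes:", normal_calls)
```

Under this assumed rule every CFN probe falls below the threshold while the lowest CFC probe also does, which is consistent with the reported SP of 1 and SE slightly below 1 for CF.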