Zhengyi Tang, Ganguan Wei, Longcheng Zhang, Zhiwen Xu.
Abstract
The aim of the present study was to identify novel microRNA (miRNA) or long noncoding RNA (lncRNA) signatures of laryngeal cancer recurrence and to investigate the regulatory mechanisms associated with this malignancy. Datasets of recurrent and nonrecurrent laryngeal cancer samples were downloaded from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus database (GSE27020 and GSE25727) to examine differentially expressed miRNAs (DE‑miRs), lncRNAs (DE‑lncRs) and mRNAs (DEGs). miRNA‑mRNA and lncRNA‑miRNA networks were constructed by investigating the associations among these RNAs in various databases. Subsequently, the interactions identified were combined into a competing endogenous RNA (ceRNA) regulatory network. Feature genes in the miRNA‑mRNA network were identified via topological analysis and a recursive feature elimination algorithm. A support vector machine (SVM) classifier was established using the betweenness centrality values in the miRNA‑mRNA network, consisting of 32 optimal feature‑coding genes. The classification effect was tested using two validation datasets. Furthermore, coding genes in the ceRNA network were examined via pathway enrichment analyses. In total, 21 DE‑lncRs, 507 DEGs and 55 DE‑miRs were selected. The SVM classifier exhibited an accuracy of 94.05% (79/84) for sample classification prediction in the TCGA dataset, and 92.66 and 91.07% in the two validation datasets. The ceRNA regulatory network comprised 203 nodes, corresponding to mRNAs, miRNAs and lncRNAs, and 346 lines, corresponding to the interactions among RNAs. In particular, the interactions with the highest scores were HLA complex group 4 (HCG4)‑miR‑33b, HOX transcript antisense RNA (HOTAIR)‑miR‑1‑MAGE family member A2 (MAGEA2), EMX2 opposite strand/antisense RNA (EMX2OS)‑miR‑124‑calcitonin related polypeptide α (CALCA) and EMX2OS‑miR‑124‑γ‑aminobutyric acid type A receptor γ2 subunit (GABRG2). 
Enrichment analysis of the genes in the ceRNA network identified 11 significantly enriched pathway terms and 16 significantly enriched molecular function terms. The SVM classifier based on 32 feature coding genes exhibited high accuracy in the classification of laryngeal cancer samples. miR‑1, miR‑33b, miR‑124, HOTAIR, HCG4 and EMX2OS may be novel biomarkers of recurrent laryngeal cancer, and HCG4‑miR‑33b, HOTAIR‑miR‑1‑MAGEA2 and EMX2OS‑miR‑124‑CALCA/GABRG2 may be associated with the molecular mechanisms regulating recurrent laryngeal cancer.
Year: 2019 PMID: 31059106 PMCID: PMC6522811 DOI: 10.3892/mmr.2019.10143
Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952
Figure 1. Heat map of DE-lncRNAs, DEGs and DE-miRNAs in recurrent and nonrecurrent laryngeal cancer samples. Heat map corresponding to (A) DE-lncRNAs, (B) DEGs and (C) DE-miRNAs. Yellow represents recurrent samples and blue represents nonrecurrent samples. Red indicates upregulated genes and green indicates downregulated genes. DE, differentially expressed; lncRNAs, long noncoding RNAs; DEGs, differentially expressed coding genes; miRNAs, microRNAs.
Figure 2. KM survival curves for five differentially expressed lncRNAs associated with laryngeal cancer recurrence. Blue curves represent samples with downregulated lncRNAs, and red curves represent samples with upregulated lncRNAs. KM curves corresponding to (A) HCG4, (B) EMX2OS, (C) FAM138F, (D) HOTAIR and (E) TTTY15. KM, Kaplan-Meier; HCG4, HLA complex group 4; EMX2OS, EMX2 opposite strand/antisense RNA; FAM138F, family with sequence similarity 138 member F; HOTAIR, HOX transcript antisense RNA; TTTY15, testis-specific transcript, Y-linked 15; lncRNA, long noncoding RNA.
Figure 3. Regulatory network of differentially expressed miR-lncRNA. Diamonds represent miRs and squares represent lncRNAs. Green represents downregulated genes and pink represents upregulated genes. HCG4, HLA complex group 4; HOTAIR, HOX transcript antisense RNA; TTTY15, testis-specific transcript, Y-linked 15; lncRNA, long noncoding RNA; miR, microRNA.
Figure 4. miRNA-mRNA regulatory network corresponding to genes differentially expressed between recurrent and nonrecurrent samples. Circles represent mRNAs and diamonds represent miRNAs. Green represents downregulated genes, pink represents upregulated genes, and white circles represent genes associated with at least five miRNA-interacting genes.
Figure 5. Heat map of 86 feature coding genes identified by the betweenness centrality values in the microRNA-mRNA network. Yellow represents recurrent samples and blue represents nonrecurrent samples. Red indicates upregulated genes and green indicates downregulated genes.
Basic information of 32 optimal feature-coding genes, among the genes significantly differentially expressed between recurrent and nonrecurrent cancer samples.
| Gene symbol | Betweenness centrality | Number of interactions | P-value | Log2 fold-change |
|---|---|---|---|---|
| SHISA6 | 0.240 | 4 | 0.028 | 0.544 |
| MAGEA2 | 0.215 | 5 | 0.047 | −0.394 |
| TRDN | 0.156 | 6 | 0.002 | 0.601 |
| RGS7 | 0.149 | 4 | 0.001 | 1.331 |
| CPNE4 | 0.142 | 4 | 0.020 | 0.657 |
| CALCA | 0.111 | 3 | 0.004 | 1.214 |
| GABRG2 | 0.101 | 2 | 0.035 | 0.599 |
| TECRL | 0.083 | 3 | 0.043 | 0.846 |
| MYLK2 | 0.073 | 3 | 0.005 | 0.543 |
| MYO3A | 0.058 | 3 | 0.048 | 0.365 |
| ASB10 | 0.057 | 3 | 0.001 | 1.261 |
| WT1 | 0.039 | 4 | 0.015 | 0.712 |
| CALCB | 0.038 | 2 | 0.006 | 0.986 |
| OPCML | 0.038 | 2 | 0.011 | 0.693 |
| KY | 0.038 | 2 | 0.011 | 0.502 |
| CYP3A43 | 0.038 | 2 | 0.027 | −2.312 |
| EPHA6 | 0.038 | 2 | 0.001 | 1.216 |
| SLC6A20 | 0.038 | 2 | 0.001 | 0.583 |
| NCAM2 | 0.036 | 2 | 0.026 | 0.451 |
| FLRT3 | 0.033 | 2 | 0.022 | 0.312 |
| TMC5 | 0.033 | 2 | 0.034 | −0.334 |
| AADACL3 | 0.033 | 2 | 0.011 | 0.983 |
| ANKRD1 | 0.026 | 6 | 0.005 | 0.520 |
| CLDN22 | 0.020 | 2 | 0.005 | −1.662 |
| FGF5 | 0.018 | 2 | 0.015 | 0.593 |
| SCN5A | 0.016 | 2 | 0.009 | 0.475 |
| PTPRR | 0.016 | 2 | 0.005 | 0.561 |
| LDHC | 0.015 | 2 | <0.001 | −1.366 |
| OLFM3 | 0.012 | 2 | 0.036 | 1.283 |
| HRASLS5 | 0.012 | 2 | 0.042 | −0.560 |
| SPINK6 | 0.010 | 2 | 0.011 | −0.533 |
| SPESP1 | 0.008 | 2 | 0.004 | 0.499 |
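The betweenness centrality column above measures how often a gene lies on the shortest paths between other nodes of the miRNA-mRNA network. As a rough illustration, the sketch below computes normalized betweenness centrality with Brandes' algorithm on a toy graph built only from the three miRNA-mRNA pairs named in the abstract (miR-1/MAGEA2, miR-124/CALCA and miR-124/GABRG2). The function name, the toy graph and the normalization are assumptions made for this example; the record does not state which tool or normalization was used, so the output is not expected to reproduce the values in the table.

```python
from collections import deque


def betweenness_centrality(adj):
    """Normalized betweenness centrality for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (Brandes' algorithm).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Each unordered pair is visited from both ends; dividing by (n-1)(n-2)
    # removes that double count and rescales the scores to [0, 1].
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: score * scale for v, score in bc.items()}


# Toy graph: only the miRNA-mRNA pairs named in the abstract (illustrative).
edges = [("miR-1", "MAGEA2"), ("miR-124", "CALCA"), ("miR-124", "GABRG2")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

scores = betweenness_centrality(adj)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.3f}")
```

In this toy graph only miR-124 sits between two other nodes, so it is the only node with a nonzero score.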
Figure 6. Classification effect of the support vector machine classifier. Classification effect was investigated on the two validation datasets, (A) GSE27020 and (B) GSE25727. Yellow represents recurrent samples and blue represents nonrecurrent samples. Red indicates upregulated genes and green indicates downregulated genes.
Figure 7. Scatter plot of sample classification using the SVM classifier. (A) Analysis of The Cancer Genome Atlas dataset. (B) GSE27020 dataset. (C) GSE25727 dataset. Blue represents nonrecurrent samples and red represents recurrent samples. SVM, support vector machine.
Classifying parameters of the support vector machine classifier for the three datasets analyzed.
| Dataset | Number of samples | Accuracy | Sensitivity | Specificity | Positive predictive value | Negative predictive value | Area under receiver operating characteristic curve |
|---|---|---|---|---|---|---|---|
| TCGA | 84 | 0.9405 | 0.947 | 0.938 | 0.818 | 0.984 | 0.986 |
| GSE27020 | 109 | 0.9266 | 0.882 | 0.947 | 0.882 | 0.947 | 0.946 |
| GSE25727 | 56 | 0.9107 | 0.941 | 0.897 | 0.800 | 0.972 | 0.921 |
TCGA, The Cancer Genome Atlas.
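The parameters in this table all derive from the four cells of a confusion matrix. The sketch below shows the arithmetic, treating recurrent samples as the positive class. The per-dataset counts are assumptions: they were back-calculated so that the sample totals and the rounded sensitivity, specificity and predictive values match the table, and they are not reported in the source. The function and variable names are likewise illustrative.

```python
def classifier_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # recurrent samples correctly flagged
        "specificity": tn / (tn + fp),   # nonrecurrent samples correctly cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# ASSUMED counts (tp, fn, tn, fp), back-calculated from the rounded values
# in the table; the source does not report the confusion matrices.
assumed_counts = {
    "TCGA": (18, 1, 61, 4),
    "GSE27020": (30, 4, 71, 4),
    "GSE25727": (16, 1, 35, 4),
}

results = {name: classifier_metrics(*c) for name, c in assumed_counts.items()}
for name, m in results.items():
    print(name, {key: round(value, 4) for key, value in m.items()})
```

Under these assumed counts the TCGA accuracy is 79/84, which agrees with the 94.05% (79/84) stated in the abstract.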
Figure 8. AUC of the support vector machine classifier on sample classification. (A) Sample grouping of The Cancer Genome Atlas dataset. (B) Sample grouping of the GSE27020 dataset. (C) Sample grouping of the GSE25727 dataset. AUC, area under the receiver operating characteristic curve.
Figure 9. Competing endogenous RNA regulatory network of all DE-miRNAs, DE-lncRNAs and DEGs. Circles represent mRNAs, diamonds represent miRNAs and squares represent lncRNAs. Green represents downregulated genes, pink represents upregulated genes and white circles represent genes associated with at least five miRNA-interacting genes. miRNA, microRNA; DE, differentially expressed; lncRNA, long noncoding RNA.
Significantly enriched pathways in the competing endogenous RNA regulatory network.
| KEGG pathway ID | KEGG pathway name | Gene count | P-value | Differentially expressed genes |
|---|---|---|---|---|
| hsa04530 | Tight junction | 5 | 0.001 | MYH1, CLDN6, MYH4, CTNNA2, CLDN10 |
| hsa04970 | Salivary secretion | 4 | 0.001 | HTN3, ATP1B4, ATP1A2, STATH |
| hsa04080 | Neuroactive ligand-receptor interaction | 6 | 0.004 | GABRA2, GABRG2, PTH2R, OPRK1, CHRNA1, GRIK2 |
| hsa04918 | Thyroid hormone synthesis | 3 | 0.007 | SLC5A5 (Entrez ID 6528), ATP1B4, ATP1A2 |
| hsa04964 | Proximal tubule bicarbonate reclamation | 2 | 0.008 | ATP1B4, ATP1A2 |
| hsa04960 | Aldosterone-regulated sodium reabsorption | 2 | 0.020 | ATP1B4, ATP1A2 |
| hsa05033 | Nicotine addiction | 2 | 0.020 | GABRA2, GABRG2 |
| hsa04973 | Carbohydrate digestion and absorption | 2 | 0.025 | ATP1B4, ATP1A2 |
| hsa04670 | Leukocyte transendothelial migration | 3 | 0.026 | CLDN6, CTNNA2, CLDN10 |
| hsa04978 | Mineral absorption | 2 | 0.032 | ATP1B4, ATP1A2 |
| hsa04514 | Cell adhesion molecules (CAMs) | 3 | 0.041 | CLDN6, NCAM2, CLDN10 |
KEGG, Kyoto Encyclopedia of Genes and Genomes.
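The P-values in this table come from a pathway over-representation analysis. The record does not state which tool or statistical test was used; a common choice for this kind of analysis is the one-sided hypergeometric test, sketched below. The function name is illustrative, and the background size, pathway size and network gene count in the example call are hypothetical placeholders. Only the overlap of 5 genes is taken from the tight junction row.

```python
from math import comb


def hypergeom_enrichment_p(background, in_pathway, selected, overlap):
    """One-sided hypergeometric P-value: P(X >= overlap).

    background -- number of annotated genes in the background set
    in_pathway -- background genes annotated to the pathway
    selected   -- genes in the list being tested (e.g. network genes)
    overlap    -- selected genes that fall in the pathway
    """
    upper = min(in_pathway, selected)
    total = comb(background, selected)
    favourable = sum(
        comb(in_pathway, i) * comb(background - in_pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical inputs: 20000 background genes, 170 pathway genes and
# 150 network genes are placeholders, not values from the source.
p_value = hypergeom_enrichment_p(20000, 170, 150, 5)
print(f"P(X >= 5) = {p_value:.4g}")
```

Because the inputs other than the overlap are placeholders, the printed value should not be compared with the P-values reported in the table.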
Significantly enriched molecular functions in the competing endogenous RNA regulatory network.
| GO term ID | GO term name | Gene count | P-value | Genes |
|---|---|---|---|---|
| GO:0006936 | Muscle contraction | 11 | 4.17×10⁻⁷ | TRDN, ACTC1, MYH1, MYL1, MYLK2, MYH4, SMPX, TTN, CHRNA1, SCN5A, CASQ2 |
| GO:0003012 | Muscle system process | 11 | 9.88×10⁻⁷ | TRDN, ACTC1, MYH1, MYL1, MYLK2, MYH4, SMPX, TTN, CHRNA1, SCN5A, CASQ2 |
| GO:0008015 | Blood circulation | 10 | 1.89×10⁻⁵ | CALCA, CALCB, ACTC1, MYL1, MYLK2, CARTPT, ATP1A2, TAC4, SCN5A, EPO |
| GO:0003013 | Circulatory system process | 10 | 1.89×10⁻⁵ | CALCA, CALCB, ACTC1, MYL1, MYLK2, CARTPT, ATP1A2, TAC4, SCN5A, EPO |
| GO:0007267 | Cell-cell signaling | 16 | 9.03×10⁻⁵ | FGF5, GABRG2, GABRA2, GRIK2, OPRK1, MYLK2, ATP1A2, CXCL6, IL22, CTNNA2, CALCA, PNOC, SLC1A6, CARTPT, SLC30A8, CHRNA1 |
| GO:0006811 | Ion transport | 18 | 1.32×10⁻⁴ | GABRG2, SLC5A5, GABRA2, GRIK2, ATP1B4, KCNA6, ATP1A2, TCN1, BEST3, SLC1A6, SLC5A8, LTF, ANO5, ANO4, CHRNA1, SLC30A8, SCN5A, ADD2 |
| GO:0007268 | Synaptic transmission | 11 | 1.44×10⁻⁴ | GABRG2, GABRA2, PNOC, GRIK2, OPRK1, SLC1A6, MYLK2, CARTPT, ATP1A2, CHRNA1, CTNNA2 |
| GO:0030182 | Neuron differentiation | 13 | 2.09×10⁻⁴ | NCAM2, SLITRK4, SOX1, OPCML, FOXA2, PKHD1, EMX2, MDGA2, PTPRR, POU4F1, OLFM3, NEFL, CTNNA2 |
| GO:0019226 | Transmission of nerve impulse | 11 | 5.20×10⁻⁴ | GABRG2, GABRA2, PNOC, GRIK2, OPRK1, SLC1A6, MYLK2, CARTPT, ATP1A2, CHRNA1, CTNNA2 |
| GO:0055082 | Cellular chemical homeostasis | 10 | 3.60×10⁻³ | CALCA, CALCB, XIRP1, GRIK2, SLC1A6, LTF, CARTPT, ATP1A2, SLC30A8, CHRNA1 |
| GO:0019725 | Cellular homeostasis | 10 | 1.29×10⁻² | CALCA, CALCB, XIRP1, GRIK2, SLC1A6, LTF, CARTPT, ATP1A2, SLC30A8, CHRNA1 |
| GO:0048878 | Chemical homeostasis | 10 | 2.24×10⁻² | CALCA, CALCB, XIRP1, GRIK2, SLC1A6, LTF, CARTPT, ATP1A2, SLC30A8, CHRNA1 |
| GO:0007155 | Cell adhesion | 12 | 2.54×10⁻² | IBSP, FLRT3, CALCA, NCAM2, AMBN, OPCML, FAT3, PKHD1, CLDN6, CLDN22, CLDN10, CTNNA2 |
| GO:0022610 | Biological adhesion | 12 | 2.56×10⁻² | IBSP, FLRT3, CALCA, NCAM2, AMBN, OPCML, FAT3, PKHD1, CLDN6, CLDN22, CLDN10, CTNNA2 |
| GO:0050877 | Neurological system process | 17 | 3.28×10⁻² | GABRG2, GABRA2, MYO3A, GRIK2, OPRK1, MYLK2, ATP1A2, TAS1R1, CTNNA2, CALCA, NCAM2, PNOC, SLC1A6, CARTPT, POU4F1, CHRNA1, NEFL |
| GO:0042592 | Homeostatic process | 12 | 3.94×10⁻² | CALCA, CALCB, XIRP1, GRIK2, PKHD1, SLC1A6, LTF, CARTPT, ATP1A2, SLC30A8, CHRNA1, EPO |
GO, Gene Ontology.
Figure 10. Competing endogenous RNA regulatory network of DE-miRNAs, DE-lncRNAs, and 32 optimal feature coding genes. Circles represent mRNAs, diamonds represent miRNAs and squares represent lncRNAs. Green represents downregulated genes, pink represents upregulated genes and white circles represent genes associated with at least five miRNA-interacting genes. DE, differentially expressed; miRNA, microRNA; lncRNA, long noncoding RNA.
Figure 11. KM survival curves of the differentially expressed miRs in the competing endogenous RNA regulatory network constructed using 32 optimal feature coding genes. Blue curves represent samples with downregulated miRs and red curves represent samples with upregulated miRs. (A) hsa-miR-1. (B) hsa-miR-33b. (C) hsa-miR-124. (D) hsa-miR-133a. (E) hsa-miR-184. (F) hsa-miR-208a. KM, Kaplan-Meier; miR, microRNA.
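Kaplan-Meier curves such as those in the survival figures are built with the product-limit estimator, which multiplies the recurrence-free fraction at each observed event time. The sketch below is a minimal version of that estimator. The function name and the follow-up times are hypothetical and serve only to show the calculation; they are not data from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each sample
    events -- 1 if recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all samples that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored samples leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months (illustrative only).
example_times = [5, 8, 8, 12, 20, 24]
example_events = [1, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(example_times, example_events):
    print(f"t={t}: S(t)={s:.3f}")
```

Running the estimator separately on the upregulated and downregulated groups would give the two curves compared in each panel.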