Cheng-Yun Li1, Wen-Wen Zhang1, Ji-Lian Xiang2, Xing-Hua Wang3, Jun-Ling Wang1, Jin Li1.
Abstract
Esophageal squamous cell carcinoma (ESCC) is a prevalent, aggressive malignant tumor with a poor prognosis. Investigations into the molecular changes that occur as a result of the disease, as well as identification of novel biomarkers for its diagnosis and prognosis, are urgently required. Long non‑coding RNAs (lncRNAs) have been reported to play a critical role in tumor progression. The present study performed data mining analyses for ESCC by integrating accumulated datasets from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases and identifying the differentially expressed lncRNAs. The intersection of differentially expressed genes (lncRNAs, miRNAs and mRNAs) in ESCC tissues between the GEO and TCGA datasets was then investigated. Based on the intersected lncRNAs, a competitive endogenous RNA (ceRNA) network of lncRNAs in ESCC was constructed. A total of 81 intersection lncRNAs were identified; 67 of these were included in the ceRNA network. Functional analyses revealed that these 67 key lncRNAs were primarily involved in cellular biological processes. The associations between the expression levels of these 67 key lncRNAs and the clinicopathological characteristics and survival time of patients with ESCC were then analyzed using TCGA data. The results revealed that 31 of these lncRNAs were associated with tumor grade, tumor‑node‑metastasis (TNM) stage and lymphatic metastasis status (P<0.05). In addition, 15 key lncRNAs were associated with survival time (P<0.05). Finally, 5 key lncRNAs were selected for validation of their expression levels in 30 patients newly diagnosed with ESCC via reverse transcription‑quantitative PCR (RT‑qPCR). The direction of up‑ and downregulation was consistent across the GEO, TCGA and RT‑qPCR results.
In addition, some of these 5 key lncRNAs were significantly associated with TNM stage and lymph node metastasis (P<0.05). The results of the clinical validation were consistent with those of the bioinformatics analysis, which supports the credibility of the bioinformatics analysis used in the present study. Overall, the results from the present study may provide further insight into the functional characteristics of lncRNAs in ESCC through bioinformatics integrative analysis of the GEO and TCGA datasets, and reveal potential diagnostic and prognostic biomarkers for ESCC.
Year: 2019 PMID: 31638253 PMCID: PMC6859451 DOI: 10.3892/or.2019.7377
Source DB: PubMed Journal: Oncol Rep ISSN: 1021-335X Impact factor: 3.906
Details of ESCC studies and RNA sequencing microarray datasets from the GEO database.
| GSE | Publication | RNA type | Sample size for each group |
|---|---|---|---|
| GSE23400 | Clinical cancer research | lncRNA, mRNA | Tumor, 104; adjacent normal tissues, 104 |
| GSE26886 | BMC Cancer | lncRNA, mRNA | Tumor, 35; adjacent normal tissues, 34 |
| GSE45670 | Annals of oncology | lncRNA, mRNA | Tumor, 28; adjacent normal tissues, 10 |
| GSE97049 | Unrecorded | miRNA | Tumor, 7; adjacent normal tissues, 7 |
| GSE6188 | Cancer research | miRNA | Tumor, 153; adjacent normal tissues, 104 |
| GSE55856 | Gut | miRNA | Tumor, 108; adjacent normal tissues, 108 |
ESCC, esophageal squamous cell carcinoma; GEO, Gene Expression Omnibus.
Clinical information and sample sizes for TCGA ESCC datasets.
| Variables | Total cases, n=312 (%) | Alive, n=138 (%) | Deceased, n=174 (%) |
|---|---|---|---|
| Sex | |||
| Male | 211 (67.63) | 83 (60.14) | 128 (73.56) |
| Female | 101 (32.37) | 55 (39.86) | 46 (26.44) |
| Race | |||
| White | 162 (51.92) | 55 (39.86) | 107 (61.49) |
| Asian | 127 (40.71) | 68 (49.28) | 59 (33.91) |
| Black | 23 (7.37) | 15 (10.87) | 8 (4.60) |
| Age, years | |||
| ≤50 | 86 (27.56) | 53 (38.41) | 33 (18.97) |
| >50 | 226 (72.44) | 85 (61.59) | 141 (81.03) |
| Tumor grade | |||
| GI | 49 (15.71) | 31 (22.46) | 18 (10.34) |
| GII | 146 (46.79) | 48 (34.78) | 98 (56.32) |
| GIII–IV | 117 (37.50) | 59 (42.75) | 58 (33.33) |
| TNM stage | |||
| I/II | 99 (31.73) | 30 (21.74) | 69 (39.66) |
| III/IV | 213 (68.27) | 108 (78.26) | 105 (60.34) |
| Lymph node status | |||
| No metastasis | 35 (11.22) | 32 (23.19) | 3 (1.72) |
| Metastasis | 277 (88.78) | 106 (76.81) | 171 (98.28) |
TCGA, The Cancer Genome Atlas; ESCC, esophageal squamous cell carcinoma.
Figure 1. Flowchart for integrated bioinformatics analysis of ESCC publicly available RNA sequencing datasets from the GEO and TCGA databases. ESCC, esophageal squamous cell carcinoma; GEO, Gene Expression Omnibus; TCGA, The Cancer Genome Atlas.
Figure 2. Flowchart for lncRNA, miRNA and mRNA ceRNA network construction. lncRNA, long non-coding RNA.
Demographic and clinical characteristics of 30 patients with ESCC.
| Variables | Total cases, n=30 (%) |
|---|---|
| Age, years (mean ± standard deviation) | 59±9.72 |
| Sex | |
| Male | 20 (66.67) |
| Female | 10 (33.33) |
| TNM stage | |
| I/II | 8 (26.67) |
| III/IV | 22 (73.33) |
| Lymph node status | |
| No metastasis | 9 (30.00) |
| Metastasis | 21 (70.00) |
ESCC, esophageal squamous cell carcinoma.
Figure 3. (A) Volcano plots of differentially expressed genes in GEO datasets; (B) Venn diagram showing the intersection of differentially expressed genes (lncRNAs, miRNAs and mRNAs) across GEO datasets. GEO, Gene Expression Omnibus; lncRNA, long non-coding RNA.
Detailed information on differentially expressed genes in the GEO database.
| GSE | RNA type | Differentially expressed genes | Upregulated | Downregulated | Intersection differentially expressed genes |
|---|---|---|---|---|---|
| GSE23400 | lncRNA | 792 | 378 | 414 | 108 |
| GSE26886 | lncRNA | 1340 | 599 | 741 | |
| GSE45670 | lncRNA | 914 | 508 | 406 | |
| GSE23400 | mRNA | 4518 | 2219 | 2299 | 1234 |
| GSE26886 | mRNA | 7407 | 3933 | 3474 | |
| GSE45670 | mRNA | 5422 | 2599 | 2823 | |
| GSE97049 | miRNA | 178 | 68 | 110 | 45 |
| GSE6188 | miRNA | 198 | 48 | 150 | |
| GSE55856 | miRNA | 219 | 109 | 110 | |
GEO, Gene Expression Omnibus.
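The intersection column can be read as a set operation: a gene is kept only if it is differentially expressed in all three datasets of its RNA type. The following is a minimal Python sketch of that idea, not the pipeline used in the study. It assumes each dataset's result is a mapping from gene name to direction, and it additionally assumes the direction must agree across datasets, which the table does not state. The function name `intersect_de` and the gene names are hypothetical placeholders.

```python
def intersect_de(*datasets):
    """Return genes present in every dataset with the same direction.

    Each dataset is a dict mapping gene name -> "up" or "down".
    """
    if not datasets:
        return {}
    # Genes reported as differentially expressed in every dataset.
    common = set(datasets[0])
    for d in datasets[1:]:
        common &= set(d)
    first = datasets[0]
    # Assumption: also require the same direction of change everywhere.
    return {
        g: first[g]
        for g in sorted(common)
        if all(d[g] == first[g] for d in datasets)
    }


# Hypothetical toy inputs, not data from the study.
ds1 = {"geneA": "up", "geneB": "down", "geneC": "up"}
ds2 = {"geneA": "up", "geneB": "down", "geneC": "down", "geneD": "up"}
ds3 = {"geneA": "up", "geneB": "down", "geneC": "up"}

shared = intersect_de(ds1, ds2, ds3)
print(shared)
```

In this toy input, `geneC` is dropped because its direction differs in one dataset, and `geneD` is dropped because it appears in only one dataset.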
Figure 4. Venn diagram demonstrating the intersections of genes between GEO and TCGA data. (A) The intersection of lncRNAs; (B) The intersection of miRNAs; (C) The intersection of mRNAs. GEO, Gene Expression Omnibus; TCGA, The Cancer Genome Atlas; lncRNA, long non-coding RNA.
ESCC-related intersection co-differentially expressed lncRNAs, miRNAs and mRNAs in GEO and TCGA.
| Genes | Co-differentially expressed lncRNAs, miRNAs and mRNAs |
|---|---|
| lncRNAs | ALOX12P2, AOC4P, BTN2A3P, C20orf166-AS1, C21orf62-AS1, CMAHP, CYP2D7, CYP4Z2P, DDX12P, DIRC3, DUSP5P1, ENST00000570167.1, ENST00000584492.5, FAM66A, FAM86HP, FAM86JP, FAR2P1, FIRRE, FOXD2-AS1, GUCY1B2, HAND2-AS1, HAVCR1P1, HCP5, HNRNPA3P1, IPW, LINC00176, LINC00341, LINC00346, LINC00472, LINC00476, LINC00663, LINC00689, LINC00887, LINC00889, LINC00950, LINC00982, LINC01001, LINC01105, LINC01588, LOC100128164, LOC100499484-C9ORF174, LOC101928316, LOC148696, LOC202181, LOC283856, LOC399815, LOC728743, MBL1P, MEG3, MIR4435-2HG, MIR503HG, MIR600HG, NONHSAT068116.2, NONHSAT075748.2, NONHSAT179718.1, NONHSAT198787.1, PART1, PCAT18, PSMG3-AS1, PTGES2-AS1, PVT1, PWAR5, PWARSN, RAMP2-AS1, RPLP0P2, SBF1P1, SLC26A4-AS1, SLC8A1-AS1, SMIM10L2A, SMIM10L2B, SNHG4, TCAM1P, TP73-AS1, UCA1, UG0898H09, XR_253656.2, XR_946740.1, ZFAS1, ZFP91-CNTF, ZNF300P1, ZNF542P |
| miRNAs | let-7c-5p, let-7g-3p, miR-101-3p, miR-101-5p, miR-106b-5p, miR-125a-5p, miR-130b-3p, miR-133a-3p, miR-135b-5p, miR-141-3p, miR-143-3p, miR-145-5p, miR-15b-3p, miR-15b-5p, miR-16-5p, miR-182-5p, miR-183-5p, miR-185-5p, miR-18a-5p, miR-195-5p, miR-200a-3p, miR-200b-3p, miR-200c-3p, miR-200c-5p, miR-205-5p, miR-20b-5p, miR-21-3p, miR-224-5p, miR-28-5p, miR-31-5p, miR-320a, miR-32-5p, miR-328-3p, miR-330-5p, miR-33a-5p, miR-425-5p, miR-484, miR-497-5p, miR-93-5p |
| mRNAs | ABCA8, ABCC8, ACACB, ACADL, ACADSB, ACTG2, ACVR2A, ADAMTSL1, ADCY2, ADCY5, ADCY6, ADGRD1, ADH1B, ADHFE1, AFF3, AGPS, ALAD, ALDH6A1, ALDH7A1, ANGPTL1, ANK2, AOX1, APLP1, AQP4, AR, ARHGDIG, ARHGEF6, ARRB1, ASPA, ASXL3, ATP1A2, ATP4A, ATP4B, AZI2, B3GAT1, B4GALNT2, BID, BIRC5, BMP3, BMP8B, BMPER, BMS1, C16orf89, C2orf40, C6, C7, CA4, CAB39L, CACNA2D2, CADM2, CADM3, CALM1, CASQ2, CCBE1, CCKAR, CCKBR, CD1E, CD44, CDC6, CDH19, CDH2, CDK6, CELF4, CFLAR, CGNL1, CHGA, CHGB, CHMP2B, CHRDL1, CHST11, CKB, CKM, CKMT2, CLCNKA, CLDN1, CLDN16, CNKSR2, CNN1, CNTFR, CNTN2, CNTN3, COL2A1, COL4A3, CPA2, CPEB1, CPEB3, CPLX2, CTNND2, CTSC, CUX2, CYBRD1, CYFIP2, CYP2U1, CYP4B1, DES, DHX36, DIRAS1, DLG2, DLG3, DNER, DPP10, DPP6, DPT, E2F2, E2F3, EDA, EDNRB, EFNA5, EIF4EBP2, ELOVL6, EME1, ENAM, ENPP5, EPHA5, EPHB1, ERBB4, ESPL1, ESRRB, ESRRG, ETNPPL, EXO1, FAM107A, FAR1, FAXDC2, FGA, FGF2, FGFR1, FGG, FNDC5, FRMD1, FXYD1, FZD4, GAB1, GAB2, GALNT2, GALNT6, GATA5, GC, GFRA1, GHRL, GIF, GKN1, GKN2, GNAQ, GPD1L, GPER1, GPM6A, GPR155, GREM2, GRIA1, GRIA3, GRIA4, GRIK3, GRIK5, GRIN2A, GSTM5, H2AFJ, H2AFX, HCFC2, HDC, HIPK2, HMGA2, HMGCS2, HMP19, HOXA10, HPN, HPSE2, HS6ST3, HSPB6, HSPB7, ICOS, ID4, IGF1R, IGF2BP1, IKBKE, IL1RAP, IL6ST, INPP5A, IQSEC3, IRS1, ITGA8, ITGB8, ITPR2, KAT2B, KCNB1, KCNE2, KCNJ10, KCNJ11, KCNJ16, KCNK2, KCNMA1, KCNMB2, KCTD8, KIAA0408, KIAA2022, KIF5A, KLF15, KSR1, LAMC2, LAMTOR3, LDB3, LIFR, LIPF, LMOD1, LONRF2, MAGI1, MAGI3, MAMDC2, MAOA, MAP3K13, MAP4K4, MAPK4, MAPT, MARVELD3, MASP1, MFAP5, MFSD4A, MME, MMP14, MOCS1, MT1M, MYH11, MYLK, MYO18B, MYOC, MYOCD, MYRIP, NBEA, NCAM1, NEGR1, NRXN1, NTN4, OAS2, OGN, OMD, P2RX2, PANX1, PCDH9, PCSK2, PDCD4, PDCD6IP, PDE1A, PDE2A, PDE7B, PDZRN4, PEBP4, PGA3, PGA4, PGA5, PGM5, PGR, PI16, PKHD1L1, PLAU, PLCXD3, PLN, PLP1, PML, PPP1R12B, PPP1R1A, PPP1R9A, PPP2R3A, PRICKLE2, PRIMA1, PRKAA2, PRKACB, PRKAR2B, PRKCB, PRSS1, PSAPL1, PSMB2, PSME4, PTGER3, PTGIS, PTGS1, PTPN2, PTPRN, RAB11A, RAB11FIP2, RAB2A, RAD51, RAG1, RANBP3L, RAP1A, RAPGEF2, RBL1, RBPMS2, RELN, RGN, RIC3, RIMS4, RNF125, RORC, RPRM, RPS6KA2, RPS6KA6, RSPO2, RYR2, S1PR1, SCARA5, SCG3, SCIN, SCN7A, SCUBE2, SEMA3E, SERPINA5, SESN3, SFRP1, SGCA, SH3GL2, SH3GLB1, SIGLEC11, SIX4, SLC1A2, SLC26A7, SLC2A1, SLC2A4, SLC5A7, SLC9A4, SLIT2, SLK, SORCS1, SORT1, SOX10, SOX4, SST, STMN1, STMN2, STUM, SYNPO2, SYT4, TACR1, TCEAL2, TCF3, TFDP2, TGFBR2, THRB, TMEM132C, TNFAIP3, TNFRSF10B, TNXB, TP53INP2, TRA2B, TRIM50, VAMP2, VIP, VIPR2, WASF3, WDR17, WISP2, XKR4, YWHAZ, ZBTB16, ZFP36, ZFP36L2, ZNF385B, ZNF471 |
ESCC, esophageal squamous cell carcinoma; GEO, Gene Expression Omnibus; TCGA, The Cancer Genome Atlas; lncRNA, long non-coding RNA.
Figure 5. The lncRNA-miRNA-mRNA ceRNA network. Red, upregulated genes; green, downregulated genes; triangles, lncRNAs; squares, miRNAs; balls, mRNAs. lncRNA, long non-coding RNA.
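A ceRNA network of this kind links an lncRNA to an mRNA through a miRNA that is predicted to bind both. The sketch below shows one way to assemble such triples in Python, assuming the interactions are already available as (lncRNA, miRNA) and (miRNA, mRNA) pairs. This page does not say which prediction resources supplied the pairs, so the inputs, the function name `build_cerna` and all node names are hypothetical.

```python
from collections import defaultdict


def build_cerna(lnc_mir_pairs, mir_mrna_pairs):
    """Join lncRNA-miRNA and miRNA-mRNA pairs into sorted triples."""
    # Index mRNA targets by miRNA for a simple join.
    targets = defaultdict(set)
    for mir, mrna in mir_mrna_pairs:
        targets[mir].add(mrna)

    triples = set()
    for lnc, mir in lnc_mir_pairs:
        # An lncRNA enters the network only if its miRNA has an mRNA target.
        for mrna in targets.get(mir, ()):
            triples.add((lnc, mir, mrna))
    return sorted(triples)


# Hypothetical toy interactions, not data from the study.
lnc_mir = [("lncX", "miR-a"), ("lncY", "miR-b"), ("lncZ", "miR-c")]
mir_mrna = [("miR-a", "gene1"), ("miR-a", "gene2"), ("miR-b", "gene2")]

triples = build_cerna(lnc_mir, mir_mrna)
lncs_in_network = {lnc for lnc, _, _ in triples}
print(triples)
print(lncs_in_network)
```

Here `lncZ` has no miRNA with an mRNA target and so drops out. The same kind of filtering could explain why only a subset of the intersection lncRNAs appears in a network.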
Figure 6. Top 20 enriched GO terms for mRNAs in the ceRNA network (the bar plot shows the enrichment scores of the significantly enriched GO terms). GO, Gene Ontology.
Figure 7. Top 20 enriched pathways for mRNAs in the ceRNA network (the bar plot shows the enrichment scores of the significantly enriched pathways).
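Enrichment scores for GO terms and pathways are commonly derived from a hypergeometric (one-sided Fisher) test. This page does not state which tool or statistic produced the scores in Figures 6 and 7, so the sketch below is an illustration under that assumption. Reporting the score as -log10(P) is likewise an assumption, and the counts are hypothetical.

```python
from math import comb, log10


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes.

    k: annotated genes in the query list
    n: size of the query list
    K: annotated genes in the background
    N: size of the background
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 4 query genes, 2 of them annotated to a term
# that covers 5 of 20 background genes.
p_value = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
score = -log10(p_value)  # assumed score convention
print(p_value, score)
```

In practice the P-values would also need a multiple-testing correction across all tested terms before ranking a top 20.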
Associations between lncRNA signature and ESCC patients' clinicopathological characteristics.
| Comparisons | Upregulated | Downregulated |
|---|---|---|
| Sex (male vs. female) | C20orf166-AS1, UG0898H09, PART1 | |
| Race (Caucasian vs. Asian) | ALOX12P2, DUSP5P1 | PART1, LINC00341, LINC00982, DIRC3 |
| Tumor grade (GIII–IV vs. GI–II) | PTGES2-AS1, LINC01588, BTN2A3P, LOC728743, RPLP0P2, LINC00346 | PSMG3-AS1, CMAHP, LINC00472, LINC01105, SBF1P1, DIRC3, SMIM10L2A, ZNF300P1, TP73-AS1 |
| TNM stage (T3 + T4 vs. T1 + T2) | RPLP0P2, DDX12P, PVT1, HCP5 | LINC00472, LOC148696, SMIM10L2A, ZNF300P1, SMIM10L2B, HAND2-AS1 |
| Lymphatic metastasis (yes vs. no) | PTGES2-AS1, FOXD2-AS1, TCAM1P, LINC00176 | DIRC3, LINC00982, LOC148696, SBF1P1 |
ESCC, esophageal squamous cell carcinoma; lncRNA, long non-coding RNA.
Figure 8. Kaplan-Meier survival curves for 15 lncRNAs associated with ESCC patients' overall survival time [horizontal axis, overall survival time (days); vertical axis, survival function]. ESCC, esophageal squamous cell carcinoma; lncRNA, long non-coding RNA.
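The survival function on the vertical axis of a Kaplan-Meier plot is the product-limit estimate: at each time with observed deaths, the running survival is multiplied by (1 - deaths / number at risk). The following pure-Python sketch shows that calculation. The follow-up times and event flags are hypothetical, and how patients were split into expression groups is not described on this page, so grouping is left out.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with a death.

    times: follow-up in days; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up data, not taken from TCGA.
times = [100, 200, 200, 300, 400]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Comparing two such curves, for example with a log-rank test, would be the usual next step for judging whether an lncRNA is associated with survival time.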
Randomly selected lncRNAs with absolute FC>2, P<0.05.
| Name (lncRNAs) | Gene ID | Regulation | TCGA (mean FC) | GEO (mean FC) |
|---|---|---|---|---|
| LINC00982 | 440556 | Down | −17.539 | −3.991 |
| TP73-AS1 | 57212 | Down | −3.845 | −5.579 |
| SMIM10L2A | 399668 | Down | −7.850 | −4.852 |
| PVT1 | 5820 | Up | 7.193 | 9.997 |
| FOXD2-AS1 | 84793 | Up | 6.470 | 14.211 |
FC, fold change; TCGA, The Cancer Genome Atlas; GEO, Gene Expression Omnibus; lncRNA, long non-coding RNA.
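The negative values in the table suggest a signed fold-change convention, in which a tumor/normal ratio below 1 is reported as the negative reciprocal. The sketch below illustrates that convention together with the 2^(-ΔΔCt) calculation often used for RT-qPCR. Both are assumptions, since this page does not state how the fold changes were computed, and all function names and input values are hypothetical.

```python
def signed_fold_change(tumor_mean, normal_mean):
    """Ratio >= 1 is returned as-is; ratio < 1 as its negative reciprocal."""
    ratio = tumor_mean / normal_mean
    return ratio if ratio >= 1 else -1 / ratio


def ddct_ratio(ct_target_tumor, ct_ref_tumor, ct_target_normal, ct_ref_normal):
    """Relative expression (tumor vs. normal) by the 2^(-ddCt) method."""
    ddct = (ct_target_tumor - ct_ref_tumor) - (ct_target_normal - ct_ref_normal)
    return 2.0 ** (-ddct)


def passes_filter(fold_change, p_value, fc_cutoff=2.0, alpha=0.05):
    """Apply the stated selection rule: absolute FC > 2 and P < 0.05."""
    return abs(fold_change) > fc_cutoff and p_value < alpha


# Hypothetical values for illustration only.
up_fc = signed_fold_change(8.0, 2.0)
down_fc = signed_fold_change(2.0, 8.0)
qpcr_ratio = ddct_ratio(24.0, 18.0, 22.0, 18.0)
qpcr_fc = signed_fold_change(qpcr_ratio, 1.0)
print(up_fc, down_fc, qpcr_ratio, qpcr_fc)
```

Under this convention, consistency between platforms can be checked simply by comparing the sign of the fold change for each lncRNA.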
Figure 9. Box plot showing the median and quartiles of specific lncRNAs in donor samples. lncRNA, long non-coding RNA.
Figure 10. Box plot showing the association of the fold change in LINC00982, TP73-AS1, SMIM10L2A, PVT1 and FOXD2-AS1 expression with clinicopathological characteristics in 30 ESCC patients. ESCC, esophageal squamous cell carcinoma; y, years. *P<0.05.