Yan Liu1, Teng Hua1, Shuqi Chi1, Hongbo Wang1. 1. Department of Obstetrics and Gynecology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China.
Abstract
Endometrial cancer (EC) is one of the most common gynecological cancer types worldwide. However, to the best of our knowledge, its underlying mechanisms remain unknown. The current study downloaded three mRNA and microRNA (miRNA) datasets of EC and normal tissue samples, GSE17025, GSE63678 and GSE35794, from the Gene Expression Omnibus to identify differentially expressed genes (DEGs) and miRNAs (DEMs) in EC tumor tissues. The DEGs and DEMs were then validated using data from The Cancer Genome Atlas and subjected to gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis. STRING and Cytoscape were used to construct a protein-protein interaction network and the prognostic effects of the hub genes were analyzed. Finally, miRecords was used to predict DEM targets and an miRNA-gene network was constructed. A total of 160 DEGs were identified, of which 51 genes were highly expressed and 100 DEGs were discovered from the PPI network. Three overlapping genes between the DEGs and the DEM targets, BIRC5, CENPF and HJURP, were associated with significantly worse overall survival of patients with EC. A number of DEGs were enriched in cell cycle, human T-lymphotropic virus infection and cancer-associated pathways. A total of 20 DEMs and 29 miRNA gene pairs were identified. In conclusion, the identified DEGs, DEMs and pathways in EC may provide new insights into understanding the underlying molecular mechanisms that facilitate EC tumorigenesis and progression.
Keywords:
Kyoto Encyclopedia of Genes and Genomes enrichment analysis; bioinformatics analyses; differentially expressed genes; endometrial cancer; gene ontology
Endometrial carcinoma (EC) is one of the most common gynecological cancer types, with increasing global incidence in recent years (1). A total of 60,050 cases of EC and 10,470 EC-associated cases of mortality were reported in the USA in 2016 (1), which was markedly higher than the 2012 statistics of 47,130 cases and 8,010 mortalities (2). Although numerous studies have been conducted to investigate the mechanisms of endometrial tumorigenesis and development, to the best of our knowledge, the exact etiology remains unknown. Understanding the potential molecular mechanisms underlying EC initiation and progression is of great clinical significance. Previously, microarray technologies and bioinformatics have widely been used for the differential expression analysis of cancer and healthy cells to identify novel diagnostic and therapeutic biomarkers (3).
MicroRNAs (miRNAs) are small, noncoding RNAs that regulate the expression of critical genes involved in cancer progression and treatment (4). They bind to the 3′-untranslated region (3′-UTR) of target mRNAs (5), resulting in either degradation or inhibition of the expression and function of protein-coding mRNAs. miRNAs regulate several functions in cancer cells, including proliferation, apoptosis, metastasis, immune evasion and differentiation (6). In addition, several miRNAs serve critical roles in EC pathogenesis (7,8) and are associated with clinicopathological features and survival (9). However, the specific mechanisms associated with miRNA-mediated regulation in EC require further investigation.
The current study evaluated the potential molecular mechanisms and biomarkers of EC using a bioinformatics approach. Microarray expression data were downloaded from the Gene Expression Omnibus (GEO) database and The Cancer Genome Atlas (TCGA). Differentially expressed genes (DEGs) and miRNAs (DEMs) in the EC samples compared with normal samples were identified using the GEO2R program and R software. The DEGs were subjected to functional and pathway enrichment analysis, followed by protein-protein interaction (PPI) network and survival analysis. A putative miRNA-mRNA network relevant to EC pathogenesis was then constructed.
Materials and methods
Microarray expression data
The two gene expression datasets, GSE17025 (10) and GSE63678 (11), the miRNA expression dataset, GSE35794, and the DNA methylation profile, GSE40032, were downloaded from the GEO database (www.ncbi.nlm.nih.gov/geo). The GSE17025 dataset included data from 91 EC tissue samples (79 endometrioid and 12 papillary serous) and 12 atrophic endometrium samples from postmenopausal women. The tissue samples were analyzed on the GPL570 Platform Affymetrix Human Genome U133 Plus 2.0 (Affymetrix; Thermo Fisher Scientific, Inc., Waltham, MA, USA) (10). The GSE63678 dataset included data from seven EC tissues and five normal endometrium samples, and was analyzed on the GPL571 Platform Affymetrix Human Genome U133A 2.0 Array (Affymetrix; Thermo Fisher Scientific, Inc.) (11). The GSE35794 dataset included data from 18 EC samples and four normal samples, and was analyzed on the GPL10850 Agilent-021827 Human miRNA Microarray V3 (Agilent Technologies, Palo Alto, CA, USA). The GSE40032 dataset included data from 64 EC tissue samples and 23 normal endometrium samples, which were analyzed using the Illumina HumanMethylation27 BeadChip (HumanMethylation27_270596_v.1.2) on GPL8490 (Illumina, Inc., San Diego, CA, USA).
The RNA-seq HTSeq-count data of mRNA, miRNA-seq and clinical data (project ID, TCGA-UCEC) of patients diagnosed with uterine corpus endometrial carcinoma were downloaded from TCGA (www.cancergenome.nih.gov) using the shengxin.ren download tool (http://www.shengxin.ren). Data of 552 EC samples and 23 normal endometrium samples were included.
Identification of DEGs and DEMs
DEGs, DEMs and differentially methylated genes (DMGs) in the GSE17025, GSE35794 and GSE40032 datasets were identified using the GEO2R program of the GEO (www.ncbi.nlm.nih.gov/geo/geo2r/). The screening threshold of DEGs and DEMs was adjusted to P<0.05 and |log2 fold-change (FC)|>1. DMGs were identified with the thresholds of P<0.05 and |t|>2, where t is the ratio of the difference of the estimated value of a parameter from its hypothesized value to its standard error. For the dataset GSE63678, the original CEL files of the Affymetrix platform were background corrected, normalized and log2 transformed using the Robust Multi-array Average (RMA) (12) method and the affy package in R software (version 3.4.0; www.r-project.org). The Limma package (version 3.34.9) (13) was subsequently used for the calculation of aberrantly expressed mRNAs and the Benjamini-Hochberg (BH) method (14) was used to identify DEGs with the threshold criterion of P<0.05 and absolute log2FC >1. The mRNA expression data of TCGA were calculated using Bioconductor package edgeR (version 3.20.9) (15) and were analyzed using the same strategy as used for the Affymetrix data analysis. The miRNA expression data of TCGA were analyzed using a Student's t-test in GraphPad Prism (version 6; GraphPad Software, Inc., La Jolla, CA, USA).
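The screening step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which used GEO2R, Limma and edgeR in R); it only shows the general idea of applying a Benjamini-Hochberg correction and then filtering on P<0.05 and |log2FC|>1. It assumes that the P-value threshold is applied to the adjusted values, and the function names and example inputs are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the
    # adjusted values never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2fc, p_values, p_cut=0.05, fc_cut=1.0):
    """Keep genes with adjusted P < p_cut and |log2FC| > fc_cut.

    Assumption: the P-value cut-off is applied to BH-adjusted values.
    """
    adjusted = benjamini_hochberg(p_values)
    return [
        (gene, "up" if fc > 0 else "down")
        for gene, fc, q in zip(genes, log2fc, adjusted)
        if q < p_cut and abs(fc) > fc_cut
    ]


if __name__ == "__main__":
    # Hypothetical toy input, not data from the study.
    genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
    log2fc = [2.0, -1.5, 0.5, -3.0]
    pvals = [0.01, 0.04, 0.03, 0.005]
    print(select_degs(genes, log2fc, pvals))
```

Filtering on both significance and effect size, as in the text, avoids retaining genes whose change is statistically detectable but very small.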
Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis
GO and KEGG pathway enrichment analyses were performed to determine the biological significance of the DEGs, using the Database for Annotation, Visualization and Integrated Discovery (DAVID; version 6.8; http://david.ncifcrf.gov/). The BH and Bonferroni methods were used for multiple-testing correction in the GO and pathway enrichment analyses.
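For readers unfamiliar with over-representation analysis, the following Python sketch shows the kind of test that underlies such enrichment results. It uses a plain one-sided hypergeometric test, which is an assumption made for illustration; DAVID itself reports a modified Fisher's exact P-value (the EASE score), so the numbers would not match DAVID output. The function name and the background counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric P-value, P(X >= k).

    k: genes in the query list that carry the annotation
    n: size of the query list (e.g. number of DEGs)
    K: genes in the background that carry the annotation
    N: size of the background gene set
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of observing k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # 17 of 160 DEGs were annotated to "Cell cycle" in the study; the
    # background values (K=124, N=20000) are hypothetical placeholders.
    print(hypergeom_enrichment_p(k=17, n=160, K=124, N=20000))
```

The resulting P-values for all tested terms would then be corrected for multiple testing, for example with the BH or Bonferroni method named in the text.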
PPI network and modular analysis
PPI networks are mathematical representations of physically interacting proteins (16). The STRING database (version 10.5; www.string-db.org) was used to establish the PPI network of DEGs and Cytoscape version 3.5 (17) was used to visualize the results. A confidence score ≥0.7 was set as the cut-off criterion. Molecular Complex Detection (MCODE) was used to filter modules of the PPI network with a node score cut-off value of 0.2, degree cut-off value of 2, k-core of 2 and maximum depth of 100 (18).
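The confidence filtering and degree-based hub selection can be sketched in Python as follows. This is only an illustration of the logic, not a replacement for STRING, Cytoscape or MCODE; it assumes that interaction scores are on a 0 to 1 scale, and the function names and example scores are hypothetical.

```python
from collections import defaultdict


def build_network(edges, min_score=0.7):
    """Keep interactions with score >= min_score and return adjacency sets.

    edges: iterable of (protein_a, protein_b, confidence_score) tuples,
    with scores assumed to lie between 0 and 1.
    """
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def node_degrees(adjacency):
    """Degree of a node = number of distinct interaction partners."""
    return {node: len(partners) for node, partners in adjacency.items()}


def hubs(adjacency, min_degree):
    """Return nodes whose degree is strictly greater than min_degree."""
    degrees = node_degrees(adjacency)
    return sorted(node for node, deg in degrees.items() if deg > min_degree)


if __name__ == "__main__":
    # Gene names are taken from the study; the scores are hypothetical.
    toy_edges = [
        ("CDK1", "CCNB1", 0.99),
        ("CDK1", "CCNB2", 0.95),
        ("CDK1", "TOP2A", 0.60),  # dropped: below the 0.7 cut-off
        ("CCNB1", "CCNB2", 0.90),
    ]
    network = build_network(toy_edges)
    print(node_degrees(network))
    print(hubs(network, min_degree=1))
```

Proteins whose only interactions fall below the cut-off do not appear in the network at all, which mirrors how some DEGs can fail to enter a filtered PPI network.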
Prediction of miRNA targets
The miRecords database (19) was used to predict the target genes of the DEMs. miRecords is a comprehensive database created using 11 established miRNA target prediction programs: MirTarget2, miTarget, MicroInspector, RNA22, PITA, miRanda, DIANA-microT, NBmiRTar, RNAhybrid, PicTar and TargetScan. Genes that were predicted by at least four programs were selected as the candidate targets of miRNAs.
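The "at least four programs" rule amounts to a simple vote count over the prediction tools. A minimal Python sketch of that rule is given below; the input format and function name are assumptions for illustration, and the example predictions are invented rather than taken from miRecords.

```python
from collections import Counter


def consensus_targets(predictions, min_programs=4):
    """Return (miRNA, gene) pairs predicted by at least min_programs tools.

    predictions: dict mapping a program name to an iterable of
    (miRNA, gene) pairs predicted by that program.
    """
    votes = Counter()
    for pairs in predictions.values():
        # Convert to a set so each program votes at most once per pair.
        votes.update(set(pairs))
    return {pair for pair, count in votes.items() if count >= min_programs}


if __name__ == "__main__":
    # Hypothetical predictions; program names are from the text.
    pair_x = ("hsa-miR-141", "GENE_X")
    pair_y = ("hsa-miR-141", "GENE_Y")
    toy = {
        "TargetScan": [pair_x, pair_y],
        "PicTar": [pair_x, pair_y],
        "miRanda": [pair_x, pair_y],
        "PITA": [pair_x],
        "RNA22": [],
    }
    print(consensus_targets(toy))
```

Requiring agreement between several tools trades sensitivity for specificity, since each individual prediction program is known to produce false positives.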
Construction of the miRNA-target gene regulatory network
Overlaps between DEGs and DEM targets were selected and the association between overlapping genes and DEMs was validated using Pearson's correlation analysis in starBase (version 2.0; http://starbase.sysu.edu.cn/). The miRNA-gene regulatory network was constructed based on the overlapping genes and their upstream miRNAs, which were then visualized by Cytoscape software.
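The correlation check can be illustrated with a short Python sketch of Pearson's r, together with a filter that keeps only miRNA-gene pairs whose expression is negatively correlated. The study obtained its coefficients from starBase, so this code is only a conceptual stand-in; the function names and the toy expression vectors are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def inverse_pairs(pairs, expression):
    """Keep (miRNA, gene) pairs with a negative expression correlation.

    pairs: iterable of (miRNA, gene) names
    expression: dict mapping each name to a per-sample expression list
    """
    kept = []
    for mirna, gene in pairs:
        r = pearson_r(expression[mirna], expression[gene])
        if r < 0:
            kept.append((mirna, gene, r))
    return kept


if __name__ == "__main__":
    # Hypothetical per-sample expression values.
    expr = {
        "miR_A": [1.0, 2.0, 3.0, 4.0],
        "GENE_1": [8.0, 6.0, 4.0, 2.0],
        "GENE_2": [2.0, 4.0, 6.0, 8.0],
    }
    print(inverse_pairs([("miR_A", "GENE_1"), ("miR_A", "GENE_2")], expr))
```

In practice a significance threshold on the correlation would also be applied; it is omitted here to keep the sketch short.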
DEG survival analysis
OncoLnc (www.oncolnc.org) is a tool used for studying survival correlations by comparing clinical data with expression profiles of mRNAs, miRNAs and long non-coding RNAs (lncRNAs) (20). The overall survival (OS) rate of patients with EC relative to different DEGs was calculated using Kaplan-Meier analysis in OncoLnc. The associations between gene expression and clinical characteristics were analyzed using one-way ANOVA and a Bonferroni's multiple comparisons test in GraphPad Prism software.
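As a reminder of what a Kaplan-Meier curve computes, the following Python sketch implements the product-limit estimator for a single group of patients. It is a simplified illustration rather than a description of how OncoLnc works internally; the function name and the toy follow-up data are hypothetical, and no log-rank comparison between high- and low-expression groups is included.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one group.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times (months) and event indicators.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

To compare expression groups, the same estimator would be applied separately to patients with high and low expression of a given gene, and the two curves would then be compared with a log-rank test.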
Results
Identifying DEGs and DEMs
Gene expression profiles of EC and normal endometrium tissue datasets GSE17025 and GSE63678 were downloaded from GEO and normalized using the RMA method. The Limma package was used to analyze and compare the transcriptional data between EC samples and normal samples. Using P<0.05 and absolute log2FC >1 as the cut-off criteria, 214 aberrantly expressed mRNAs were identified in EC (Fig. 1). A total of 205 identical DEGs were filtered from the two datasets, consisting of 131 upregulated and 74 downregulated genes that were similarly aberrantly expressed in the two datasets.
Figure 1.
Identification of DEGs from the two datasets. The overlapping area corresponds to the commonly identified DEGs. P<0.05 and |log(fold-change)|>1 indicated a statistically significant DEG. DEG, differentially expressed gene.
The TCGA RNA-seq data from 552 EC samples and 23 normal samples were normalized and corrected using the quantile normalization method and volcano plot analysis was performed using R software. A total of 7,562 aberrantly expressed mRNAs were obtained, of which 2,871 and 4,681 were upregulated and downregulated, respectively (Fig. 2). Finally, 160 aberrantly expressed genes, including 111 upregulated and 49 downregulated genes, were identified in EC samples from both the GEO and TCGA databases. The top ten DEGs identified between EC and normal tissue data from TCGA are presented in Table I.
Figure 2.
Volcano plot of detectable genome-wide mRNA profiles in 552 endometrial cancer tissue samples and 23 normal tissue samples. Red and green plots represent aberrantly expressed mRNAs with P<0.05 and |log(FC)|>1. Red plots indicate upregulated genes, green plots indicate downregulated genes and black plots indicate normally expressed mRNAs. The x-axis is the fold-change value between the expression of circulating mRNAs in normal tissues and endometrial cancer tumors. The y-axis is the -log10 of the FDR value for each mRNA, representing the strength of the association. FDR, false discovery rate; FC, fold change.
Table I.
Top 10 differentially expressed genes in endometrial cancer compared with normal tissue according to data from The Cancer Genome Atlas.
For the dataset GSE40032, a total of 2,151 hypermethylated genes and 1,173 hypomethylated genes were identified using the cut-off criteria P<0.05 and |t|>2. Subsequently, hypomethylation-high expression genes were obtained by overlapping hypomethylated and upregulated DEGs, and hypermethylation-low expression genes were obtained by overlapping hypermethylated and downregulated DEGs. A total of 12 hypomethylation-high expression genes (ESPL1, KIF14, KRT8, TYMS, SFN, TRIP13, S100A11, TK1, ASPM, CDCA3, CDCP1 and FUT2) and 15 hypermethylation-low expression genes (NAALAD2, RUNX1T1, TRPC4, TSPYL5, GPM6A, TCEAL2, ENPEP, ZFP2, PEG3, EFS, ST8SIA1, MAGEH1, CDO1, GSPT2 and FGF2) were obtained.
GO and KEGG pathway enrichment analysis of DEGs in EC
GO and KEGG enrichment analysis of the DEGs were conducted using DAVID. The DEGs were most highly enriched in biological processes associated with cell division, mitotic nuclear division and cell proliferation (Table II). According to KEGG pathway enrichment analysis, the DEGs were predominantly associated with cell cycle, human T-lymphotropic virus (HTLV-I) infection and pathways in cancer (Table III).
Table II.
GO enrichment analysis of differentially expressed genes in endometrial cancer.
| Term | Description | Count | P-value | FDR |
|---|---|---|---|---|
| GO:0051301 | Cell division | 33 | 9.46131×10^−24 | 1.53×10^−20 |
| GO:0007067 | Mitotic nuclear division | 23 | 3.07185×10^−16 | 5.32907×10^−13 |
| GO:0005829 | Cytosol | 69 | 1.14723×10^−13 | 1.46927×10^−10 |
| GO:0030496 | Midbody | 16 | 2.8856×10^−13 | 3.69671×10^−10 |
| GO:0007062 | Sister chromatid cohesion | 15 | 2.92563×10^−13 | 4.71811×10^−10 |
| GO:0005634 | Nucleus | 87 | 1.79532×10^−11 | 2.30007×10^−8 |
| GO:0000070 | Mitotic sister chromatid segregation | 9 | 2.71519×10^−11 | 4.379×10^−8 |
| GO:0000775 | Chromosome, centromeric region | 11 | 4.53211×10^−11 | 5.80631×10^−8 |
| GO:0000777 | Condensed chromosome kinetochore | 12 | 1.89383×10^−10 | 2.42628×10^−7 |
| GO:0000776 | Kinetochore | 11 | 1.65624×10^−9 | 2.12189×10^−6 |
| GO:0000086 | G2/M transition of mitotic cell cycle | 13 | 2.80607×10^−9 | 4.52556×10^−6 |
| GO:0005654 | Nucleoplasm | 54 | 3.75741×10^−9 | 4.81381×10^−6 |
| GO:0005515 | Protein binding | 113 | 9.52955×10^−9 | 1.2931×10^−5 |
| GO:0008283 | Cell proliferation | 18 | 2.22925×10^−8 | 3.59528×10^−5 |
| GO:0000922 | Spindle pole | 11 | 3.06811×10^−8 | 3.9307×10^−5 |
| GO:0000083 | Regulation of transcription involved in G1/S transition of mitotic cell cycle | 7 | 3.62732×10^−8 | 5.85005×10^−5 |
| GO:0005737 | Cytoplasm | 77 | 6.06599×10^−8 | 7.77145×10^−5 |
| GO:0005876 | Spindle microtubule | 8 | 8.27121×10^−8 | 0.000105967 |
| GO:0005819 | Spindle | 11 | 8.34624×10^−8 | 0.000106928 |
| GO:0007059 | Chromosome segregation | 9 | 1.35128×10^−7 | 0.00021793 |
| GO:0015630 | Microtubule cytoskeleton | 11 | 2.69353×10^−7 | 0.000345081 |
| GO:0000082 | G1/S transition of mitotic cell cycle | 10 | 2.70795×10^−7 | 0.00043673 |
| GO:0008017 | Microtubule binding | 13 | 3.46922×10^−7 | 0.000470751 |
| GO:0031145 | Anaphase-promoting complex-dependent catabolic process | | | |
Table III.
Signaling pathway enrichment analysis of differentially expressed genes in endometrial cancer.
| ID | Term | Count | P-value | FDR |
|---|---|---|---|---|
| hsa04110 | Cell cycle | 17 | 7.65415×10^−14 | 9.04832×10^−11 |
| hsa04115 | p53 signaling pathway | 7 | 6.26158×10^−5 | 0.074039093 |
| hsa04914 | Progesterone-mediated oocyte maturation | 7 | 0.000268953 | 0.317664885 |
| hsa05166 | HTLV-I infection | 9 | 0.004891749 | 5.635320021 |
| hsa04114 | Oocyte meiosis | 6 | 0.005389396 | 6.192004384 |
| hsa01200 | Carbon metabolism | 6 | 0.006271196 | 7.171038253 |
| hsa05161 | Hepatitis B | 6 | 0.017244392 | 18.59650509 |
| hsa03460 | Fanconi anemia pathway | 4 | 0.017517104 | 18.86329792 |
| hsa05200 | Pathways in cancer | 10 | 0.020108452 | 21.35875474 |
| hsa01130 | Biosynthesis of antibiotics | 7 | 0.022606828 | 23.69795919 |
| hsa00010 | Glycolysis/Gluconeogenesis | 4 | 0.032322528 | 32.20200981 |
| hsa01230 | Biosynthesis of amino acids | 4 | 0.041557594 | 39.47199877 |
HTLV-1, human T-lymphotropic virus; FDR, false discovery rate.
PPI network construction and modular analysis reveal critical candidate genes and pathways
STRING and Cytoscape software were used to screen 100 of the 160 DEGs into a PPI network complex, which contained 3,140 edges and 100 nodes (Fig. 3A). The remaining 60 DEGs did not fit into the PPI network. Of the 100 nodes, 57 hub genes were identified with a cut-off degree value of >30 and the top 10 genes with the most significant nodes were CDK1, CCNB1, CCNB2, TOP2A, CCNA2, CDC20, MAD2L1, BUB1B, NCAPG and CDCA8. According to the degree of importance, a significant module was selected from the PPI network complex for further analysis using MCODE. A total of 51 DEGs, including 51 nodes and 2,392 edges, were then selected as hub genes from the module (Fig. 3B).
Figure 3.
DEG PPI network and modular analysis. (A) Using the STRING online database, a total of 100 DEGs were filtered into the PPI network. The highlighted circle area is the most significant module. (B) The module consists of 51 nodes and 2,393 edges. DEG, differentially expressed gene; PPI, protein-protein interaction.
Integrated network analysis of miRNA-mRNA interaction
A total of 35 DEMs were filtered from the GSE35794 dataset, of which 20, consisting of 14 upregulated and 6 downregulated miRNAs, were validated in TCGA data. As presented in Table IV, the most significantly upregulated miRNA was hsa-miR-200b, while the most significantly downregulated miRNA was hsa-miR-503. Subsequently, the predicted targets of DEMs were obtained on the basis of the miRecords database. Since an inverse association was observed between miRNA expression and that of its target mRNA, DEMs with target genes identified as DEGs were selected for network analysis. A total of 29 pairs of DEMs and DEGs with an inverse association of expression met this criterion, including 14 DEMs and 14 overlapping genes (Fig. 4, Table V). Hsa-miR-203, hsa-miR-429, hsa-miR-200a, hsa-miR-200c and hsa-miR-141 exhibited the highest degrees (degree ≥3) in the network (Table VI).
Table IV.
Top five differentially expressed miRNAs in endometrial cancer compared with normal tissue.
A, Upregulated
| miRNA | P-value | logFC |
|---|---|---|
| hsa-miR-200b | 0.000101 | 7.633409 |
| hsa-miR-205 | 0.001261 | 7.413916 |
| hsa-miR-200a | 0.000101 | 7.382017 |
| hsa-miR-141 | 0.000143 | 7.254374 |
| hsa-miR-200c | 0.000143 | 7.108838 |
B, Downregulated
| miRNA | P-value | logFC |
|---|---|---|
| hsa-miR-503 | 0.027533 | −3.923641 |
| hsa-miR-876-3p | 0.047710 | −3.048536 |
| hsa-miR-144 | 0.043335 | −2.710278 |
| hsa-miR-133a | 0.000100 | −2.596223 |
| hsa-miR-154 | 0.000100 | −2.588022 |
miRNA or miR, microRNA; FC, fold-change.
Figure 4.
The miRNA-gene regulatory network in endometrial cancer. An ellipse represents a gene and a rhombus represents an miRNA. Red nodes represent upregulated genes and miRNAs, and green nodes represent downregulated genes and miRNAs in endometrial cancer. miRNA, microRNA.
Table V.
Correlation between differentially expressed miRNAs and target genes.
| miRNA | miRNA expression | Target gene | Target gene expression | r | P-value |
|---|---|---|---|---|---|
| hsa-miR-96 | Up | MITF | Down | −0.66790 | 3.77×10^−22 |
| hsa-miR-449a | Up | SNCA | Down | −0.26599 | 0.00064878 |
| hsa-miR-429 | Up | PDS5B | Down | −0.48758 | 5.39×10^−11 |
| hsa-miR-429 | Up | MITF | Down | −0.55316 | 2.76×10^−14 |
| hsa-miR-203 | Up | SPARC | Down | −0.39811 | 1.70×10^−7 |
| hsa-miR-203 | Up | PDS5B | Down | −0.43891 | 5.75×10^−9 |
| hsa-miR-203 | Up | FGF2 | Down | −0.42727 | 1.58×10^−8 |
| hsa-miR-200c | Up | PDS5B | Down | −0.52345 | 1.05×10^−12 |
| hsa-miR-200c | Up | MITF | Down | −0.61052 | 8.09×10^−18 |
| hsa-miR-200c | Up | GPM6A | Down | −0.61152 | 6.92×10^−18 |
| hsa-miR-200b | Up | PDS5B | Down | −0.46714 | 4.19×10^−10 |
| hsa-miR-200b | Up | GPM6A | Down | −0.64805 | 1.51×10^−20 |
| hsa-miR-200a | Up | STAT5B | Down | −0.61687 | 2.96×10^−18 |
| hsa-miR-200a | Up | SPAG9 | Down | −0.45047 | 2.02×10^−9 |
| hsa-miR-200a | Up | PDS5B | Down | −0.46840 | 3.71×10^−10 |
| hsa-miR-200a | Up | C1orf21 | Down | −0.43953 | 5.44×10^−9 |
| hsa-miR-182 | Up | MITF | Down | −0.69192 | 2.90×10^−24 |
| hsa-miR-182 | Up | FOXN3 | Down | −0.60518 | 1.85×10^−17 |
| hsa-miR-141 | Up | STAT5B | Down | −0.66087 | 1.44×10^−21 |
| hsa-miR-141 | Up | SPAG9 | Down | −0.47761 | 1.49×10^−10 |
| hsa-miR-141 | Up | PDS5B | Down | −0.51944 | 1.66×10^−12 |
| hsa-miR-141 | Up | FOXN3 | Down | −0.58257 | 5.20×10^−16 |
| hsa-miR-141 | Up | C1orf21 | Down | −0.42973 | 1.28×10^−8 |
| hsa-miR-135b | Up | FOXN3 | Down | −0.64504 | 2.59×10^−20 |
| hsa-miR-429 | Up | GPM6A | Down | −0.66256 | 1.05×10^−21 |
| hsa-miR-136 | Down | BIRC5 | Up | −0.16483 | 3.67×10^−2 |
| hsa-miR-133a | Down | CENPF | Up | −0.39548 | 2.08×10^−7 |
| hsa-miR-144 | Down | BNC2 | Up | −0.19293 | 0.0142069 |
| hsa-miR-154 | Down | HJURP | Up | −0.16526 | 0.0361743 |
miRNA or miR, microRNA.
Table VI.
Node-degree analysis of miRNA-mRNA interactions.
| Node | Degree |
|---|---|
| hsa-miR-141 | 5 |
| hsa-miR-200a | 4 |
| hsa-miR-200c | 3 |
| hsa-miR-203 | 3 |
| hsa-miR-429 | 3 |
| hsa-miR-200b | 2 |
| hsa-miR-182 | 2 |
| hsa-miR-96 | 1 |
| hsa-miR-449a | 1 |
| hsa-miR-144 | 1 |
| hsa-miR-135b | 1 |
| hsa-miR-136 | 1 |
| hsa-miR-133a | 1 |
miRNA or miR, microRNA.
Survival analysis
The prognostic effects of the 51 hub genes in the PPI network were evaluated in OncoLnc. The OS of patients with EC was analyzed depending on low and high expression of each hub gene. TOP2A, CDCA8, AURKA, TTK, ASPM, CENPA, DLGAP5, RRM2, TPX2, KIF2C, UBE2C, CDC45, HMMR, FOXM1, KIF4A, TRIP13, SPAG5, MCM4, MKI67 and ESPL1 were significantly associated with worse OS (data not shown). The high mRNA expression levels of BIRC5, CENPF and HJURP were associated with worse OS of patients with EC (Fig. 5). In addition, BIRC5, CENPF and HJURP were identified as target genes of the DEMs (Table V). Furthermore, the BIRC5 expression level was significantly associated with tumor grade (P<0.01), while CENPF and HJURP expression levels were significantly associated with high tumor grade and recurrence (P<0.05; Fig. 6).
Figure 5.
Kaplan-Meier curves for patients with endometrial cancer. The prognostic values of (A) BIRC5, (B) TOP2A, (C) CENPF and (D) HJURP were obtained by Kaplan-Meier analysis. These data were all from The Cancer Genome Atlas.
Figure 6.
Associations between the expression levels of BIRC5, CENPF and HJURP, and clinical characteristics, including tumor grade and recurrence. (A and B) The association of BIRC5 expression level with tumor grade and recurrence. (C and D) The association of CENPF expression level with tumor grade and recurrence. (E and F) The association of HJURP expression level with tumor grade and recurrence. Data are presented as mean ± the standard error of the mean. *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. ns, not significant.
Discussion
The incidence of EC and EC-associated mortality rate have been increasing in recent years despite improvements in surgical therapies and chemotherapies (1). Therefore, it is important to elucidate the potential mechanisms of EC tumorigenesis and development, and identify the key pathogenic factors to improve prognosis and clinical outcome.
The current study integrated two microarray expression profiles from GEO with TCGA data and identified 160 DEGs between the normal and tumor samples, including 111 upregulated and 49 downregulated genes. As per the GO and KEGG enrichment analysis, most of the DEGs were predicted to be associated with cell cycle, HTLV-I infection and pathways in cancer. Following construction of the PPI network, 51 hub genes were identified. Similarly, 20 DEMs were identified from the GEO and TCGA databases. After integrating the target genes of these DEMs with the DEGs, 14 overlapping genes were identified, of which three hub genes (BIRC5, CENPF, HJURP) were associated with poor prognosis and aggressive grade of patients with EC.
The results of KEGG pathway analysis are noteworthy as several studies have previously demonstrated the involvement of the cell cycle in the development of EC (21,22). HTLV-1 has been identified to cause specific T cell leukemias and lymphoma (23). HTLV-1 infection is also associated with other diseases, including neuroinflammatory disease (24), dermatitis (25) and uveitis (26). In some populations, the development of aggressive cervical carcinomas is associated with high HTLV-1 seroprevalence (27). In addition, certain cancer types have been associated with HTLV-1-hematologic malignancies (28), including adenocarcinoma of the thyroid or stomach and squamous cell carcinoma of the larynx, lip or lung. Notably, one previous study revealed the occurrence of endometrial adenocarcinoma in a rabbit inoculated with HTLV-1 (29). These findings are consistent with the current study, indicating an important role of the HTLV-1 infection pathway in EC.
miRNAs are a group of endogenous non-coding RNA molecules that can repress gene expression by targeting the 3′-UTR of mRNAs. Recent studies have reported that miRNA dysregulation may serve important roles in cancer development (30,31). In the current study, 20 DEMs were identified in EC compared with normal tissues, including hsa-miR-203, hsa-miR-429, hsa-miR-200a, hsa-miR-200c and hsa-miR-141. Several studies have suggested that hsa-miR-203 functions not only as an oncogene, but also as a tumor suppressor. It is downregulated in several tumors, including non-small-cell lung cancer, gastric mucosa-associated lymphoid tissue lymphoma and myeloma, and can inhibit G protein signaling 17, as well as the oncogene, B-cell-specific Moloney murine leukemia virus insertion site-1 (32–34). As an oncogene, hsa-miR-203 is overexpressed in ovarian cancer tissues where it promotes glycolysis (35). One study has reported frequent hypermethylation of miR-203 in EC (36); however, the expression of miR-203 was upregulated in the current study, consistent with the findings of Benati et al (37). miRNAs are regulated by multiple mechanisms including epigenetic, transcriptional, post-transcriptional and degradation regulation (38). Although it is reported that miR-203 hypermethylation is associated with EC, to the best of our knowledge, no studies have investigated the association between miR-203 hypermethylation and its expression level. The upregulation of miR-203 in EC may be due to other mechanisms, which require further investigation.
Hsa-miR-429 has been revealed to act as a tumor suppressor in renal cell carcinoma, gastric cancer and glioblastoma, by inhibiting cell proliferation, invasion and metastasis (39–41). However, hsa-miR-429 was upregulated in the current study, implying that it may function as an oncogene in EC. Hsa-miR-141 downregulates transmembrane-4-L-six-family-1 to inhibit pancreatic cancer cell invasion and migration and is widely considered as a potential candidate for the post-transcriptional regulation of phospholipase A2 receptor 1 expression in mammary cancer cells (42,43). One study has demonstrated that hsa-miR-141 upregulation is important for EC growth (44). Based on the aforementioned findings, the current study hypothesizes that hsa-miR-203, hsa-miR-141 and hsa-miR-429 serve important roles in EC via different pathways.
Survival analysis of the overlapping DEGs and the target genes of the DEMs revealed that BIRC5, CENPF and HJURP were associated with poor prognosis of patients with EC. BIRC5 encodes survivin, which can regulate p21 expression in HeLa cells (45) and may be regulated by certain miRNAs (45,46). Chuwa et al (47) reported that a high expression level of BIRC5 is associated with poor prognosis of EC, while Li et al (48) demonstrated that low expression levels of CENPF are associated with better overall survival of patients with bladder cancer. HJURP encodes Holliday junction recognition protein, a centromeric histone chaperone involved in de novo histone H3 variant CenH3 recruitment, and may regulate proliferation and apoptosis in bladder cancer cells by dysregulating the cell cycle and reactive oxygen species metabolism via the peroxisome proliferator-activated receptor γ-sirtuin 1 feedback loop (49). Hu et al (50) identified that the overexpression of HJURP predicts a poor prognosis of hepatocellular carcinoma.
In conclusion, the current study identified 160 DEGs and 20 DEMs in EC, and 14 DEGs were identified as target genes of the DEMs. Network analysis indicated a co-regulatory association between hsa-miR-203, hsa-miR-429 and hsa-miR-141, as well as the corresponding target mRNAs. These findings may improve understanding of the pathogenesis and the potential molecular mechanisms involved in EC, and assist with the identification of novel diagnostic and therapeutic biomarkers. However, the current study has limitations. The regulation of DEGs is complicated and the current study has only investigated the regulators of DEGs at the post-transcriptional level (miRNA) and the epigenetic level (DNA methylation). Additional studies should be performed to identify the putative regulators of DEGs. For example, future studies may construct a transcription factor-mRNA network to identify regulators at the transcriptional level.