
Identification of significant genes signatures and prognostic biomarkers in cervical squamous carcinoma via bioinformatic data.

Yunan He1, Shunjie Hu2, Jiaojiao Zhong3, Anran Cheng4, Nianchun Shan5.   

Abstract

BACKGROUND: Cervical squamous cancer (CESC) is an intractable gynecological malignancy because of its high mortality rate and the difficulty of early diagnosis. Several biomarkers identified by bioinformatics methods have been proposed to predict the prognosis of CESC, but they still lack clinical effectiveness. Most existing bioinformatic studies focus only on changes in oncogenes and neglect differences at the protein level, and molecular biology validation is rarely conducted.
METHODS: Microarray datasets from the NCBI-GEO database were used to compare gene and protein levels between normal and cancer tissues through significant pathway selection and core gene signature analysis, in order to screen potential clinical biomarkers of CESC. Subsequently, the molecular and protein levels in clinical samples were verified by quantitative reverse transcription PCR, western blotting and immunohistochemistry.
RESULTS: Three differentially expressed genes (RFC4, MCM2, TOP2A) were significantly associated with survival (P < 0.05) and were highly expressed in CESC tissues. Molecular biological verification using quantitative reverse transcription PCR, western blotting and immunohistochemistry showed significant differences in RFC4 expression between CESC and para-cancerous tissues (P < 0.05).
CONCLUSION: This study identified three potential biomarkers of CESC (RFC4, MCM2, TOP2A) that may help clarify the underlying mechanisms of CESC and predict the prognosis of CESC patients. ©2020 He et al.

Keywords:  Bioinformatics; Prognostic biomarker; RFC4; Cervical squamous carcinoma

Year:  2020        PMID: 33344075      PMCID: PMC7718800          DOI: 10.7717/peerj.10386

Source DB:  PubMed          Journal:  PeerJ        ISSN: 2167-8359            Impact factor:   2.984


Introduction

Cervical cancer now ranks fourth among the most prevalent cancers and is the most common gynecological cancer in developing countries (Vu et al., 2018). Despite the increasing incidence of cervical adenocarcinoma, cervical squamous carcinoma (CESC) is still the most common type of cervical cancer (Wang et al., 2004; Galic et al., 2012). A large number of gene mutations have been shown to be related to the pathogenesis of cervical cancer and can be used as biomarkers for early detection, such as DNA mutations in the oncogenes tumor protein 53 (TP53) (Crook et al., 1992) and phosphatase and tensin homolog (PTEN) (Yang et al., 2015). However, because early detection and diagnosis remain difficult, the survival rate of CESC patients is still poor. Studies have also shown that some biological markers can explain the pathogenesis of CESC and predict its outcome (Mao et al., 2019). Therefore, more reliable biological markers should be explored to comprehensively understand the pathogenesis of CESC and to guide treatment and prognosis. With advances in bioinformatics and statistical analysis, potential marker genes can be detected effectively, which is a great strength in the discovery and prediction of tumor markers and helps guide the treatment and prognosis of the disease (Banwait & Bastola, 2015). Some biomarkers have been found in cervical cancer, such as microRNA-425-5p and microRNA-489, which have been proposed for prognostic prediction (Sun et al., 2017; Juan et al., 2018). However, the biomarkers available for clinical application are far from sufficient, and most previous bioinformatics studies focus only on changes in oncogenes, which increases the possibility of clinical inefficacy.
Building on the differential gene expression between cancer tissues and normal tissues, this study also analyzed and compared protein levels between cancer tissue and normal tissue, which provides stronger evidence for the validity of the biomarkers found in our bioinformatic research.

Materials & Methods

Information of the microarray data

NCBI-GEO (Gene Expression Omnibus) is a free public database of microarray data. The gene expression profiles of GSE27678, GSE39001 and GSE7803 were obtained for this study. The three datasets were based on the GPL570, GPL201 and GPL96 platforms and included 14 normal cervical tissues and 28 CESC tissues, 12 normal cervical tissues and 43 CESC tissues, and 10 normal cervical tissues and 21 CESC tissues, respectively.

Identification of differentially expressed genes

The differentially expressed genes (DEGs) were identified with GEO2R to obtain the numbers of up- and down-regulated genes (Barrett et al., 2013). Genes with |log fold change| ≥ 2 and P < 0.05 were screened as differentially expressed genes.
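
The screening rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GEO2R implementation: the `Gene` record, the function name `screen_degs` and the example values are hypothetical, and each gene is assumed to already carry a log fold change and a P value.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    symbol: str
    log_fc: float   # log fold change, tumor vs. normal
    p_value: float


def screen_degs(genes, fc_cutoff=2.0, p_cutoff=0.05):
    """Split genes passing |logFC| >= fc_cutoff and P < p_cutoff into up/down lists."""
    up, down = [], []
    for g in genes:
        if g.p_value < p_cutoff and abs(g.log_fc) >= fc_cutoff:
            (up if g.log_fc > 0 else down).append(g.symbol)
    return up, down


# Made-up placeholder values, for illustration only.
example = [
    Gene("GENE_A", 2.4, 0.001),
    Gene("GENE_B", -3.1, 0.010),
    Gene("GENE_C", 1.2, 0.001),   # fold change too small
    Gene("GENE_D", 2.8, 0.200),   # not significant
]
up, down = screen_degs(example)
print(up, down)
```

Both thresholds are keyword arguments, so the same function could be reused with a looser or stricter cutoff.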

Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses

Gene Ontology (GO) is an international standardized classification system of gene function, which provides a dynamically updated database describing the attributes of genes and gene products in organisms (Ashburner et al., 2000). The main biological functions of the differentially expressed genes were determined by GO functional enrichment analysis. GO terms with q < 0.05 were considered significantly enriched in the DEGs. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database is a bioinformatics resource for linking genomes to life and the environment (Kanehisa et al., 2017). Pathway enrichment analysis of the DEGs was carried out against the KEGG database to identify the important pathways.
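
As a rough illustration of what an enrichment P value and a q value mean, the sketch below computes a one-sided hypergeometric tail probability and a Benjamini-Hochberg adjustment. Both choices are assumptions made for this example: the text does not state which test or correction the enrichment tool applied, and the function names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_background genes, of which n_term are annotated to the term."""
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest p to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q
```

A term would then be called enriched when its adjusted value falls below the chosen cutoff (0.05 in the text).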

PPI & module analysis

Cytoscape 3.8.0 was used for visualization and analysis of complex networks (Shannon et al., 2003). Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) is an application for protein interaction, genome and proteome research (Doncheva et al., 2019). By mapping the DEGs to STRING, we evaluated the protein-protein interaction (PPI) information of the DEGs. Experimentally validated interactions with a combined score >0.4 were selected. Subsequently, we used Molecular Complex Detection (MCODE), a tool embedded in Cytoscape, to cluster the PPI network into functional modules (Bader & Hogue, 2003). Modules with an MCODE score greater than 10 and more than 6 nodes were retained. Functional and pathway enrichment analyses of the DEGs in the modules were also conducted, and P < 0.05 was considered significant.
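
The score filter described above amounts to keeping edges above a threshold and then inspecting the resulting network. A minimal sketch follows, assuming the interactions are available as `(protein_a, protein_b, combined_score)` tuples; the function names and the placeholder proteins `P1` to `P3` are hypothetical, and this does not reproduce STRING or MCODE themselves.

```python
from collections import defaultdict


def filter_interactions(edges, min_score=0.4):
    """Keep (protein_a, protein_b, combined_score) edges scoring above min_score."""
    return [(a, b, s) for a, b, s in edges if s > min_score]


def node_degrees(edges):
    """Count how many retained interactions each protein takes part in."""
    degree = defaultdict(int)
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


# Placeholder edges with made-up scores, for illustration only.
edges = [("P1", "P2", 0.9), ("P1", "P3", 0.3), ("P2", "P3", 0.7)]
kept = filter_interactions(edges)
print(node_degrees(kept))
```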

Survival analysis of significant genes in CESC and RNA expression of core genes

Kaplan–Meier (K-M) analysis is a widely used method for estimating the survival rate of cancer patients, and the "survival" package was applied in RStudio (Rich et al., 2010). To compare the magnitude of the difference in survival between the two groups, a univariate Cox hazard ratio (HR) was calculated. The clinical significance of each gene was also evaluated by single-gene survival analysis within the survival-related gene sets. A log-rank test was used to calculate the statistical significance of the survival difference between the two groups, and P < 0.05 was considered significant. Gene Expression Profiling Interactive Analysis (GEPIA) is a visualization tool for gene research (Tang et al., 2017). In this study, GEPIA was applied to analyze RNA expression of the selected genes on the basis of thousands of samples from the TCGA database.
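
For readers unfamiliar with the K-M estimator, the product-limit calculation can be sketched in plain Python. This is an illustrative re-implementation under simple assumptions (right-censored data, one group), not the R "survival" package used in the study; the function name and the toy follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy data: the patient at t=2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Computing one curve per expression group (for example high versus low expression of a gene) gives the two curves that a log-rank test would then compare.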

Specimen collection

The tissues or cells of CESC patients were collected at Xiangya Hospital of Central South University to verify the high expression of RFC4 in tumor tissues at the molecular and protein levels. This study was approved by the Medical Ethics Committee of Xiangya Hospital (No. 201912542). CESC patients and their kin signed a consent form agreeing to the use of cervical tissue for scientific research.

Molecular biological verification of differences in gene expression

CESC tissues and para-cancerous tissues (para-CT) were collected from CESC patients for molecular validation of RFC4. The expression levels of RFC4 in CESC patients at different pathological stages were also compared. Stages I and II were regarded as early stage and included 4 stage IB1, 7 stage IB2, 3 stage IB3, 3 stage IIA1 and 1 stage IIA2 patients. Stage III was regarded as advanced stage, and 17 stage IIIC1 patients were included. Total RNA was extracted from CESC tissues and para-CT using Trizol Reagent (RNAiso Plus, TaKaRa, 9109) according to the manufacturer's protocols and reverse transcribed into cDNA using a PrimeScript™ RT reagent Kit with gDNA Eraser (TaKaRa, RR047A-1). Gene expression levels were assessed by quantitative reverse transcription PCR (qRT-PCR) with TB Green™ Premix (Tli RNaseH Plus, TaKaRa, RR820A) and specific primers: RFC4 forward: 5′-GGCAGCTTTAAGACGTACCATGG-3′; RFC4 reverse: 5′-TCTGACAGAGGCTTGAAGCGGA-3′. β-actin expression was used as the normalization control. Relative mRNA levels were analyzed using the 2^−ΔΔCt method.
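
The 2^−ΔΔCt calculation can be written out as a short worked example. The sketch below assumes one Ct value per gene per tissue; the function name, argument names and Ct values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene in sample vs. control by 2^-ddCt."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the tumor sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the paired control
    delta_delta = delta_sample - delta_control           # ddCt
    return 2 ** (-delta_delta)


# Made-up Ct values: target gene and reference gene in tumor, then in control.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)
```

Here ΔCt is 6 in the tumor sample and 8 in the control, so ΔΔCt is −2 and the relative expression is 2^2 = 4.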

Verification of differences in protein expression

We used the cancerous tissues and para-CT of CESC patients to analyze differences in protein expression by western blotting (WB). The samples for WB analysis were separated by SDS-PAGE and transferred onto a PVDF membrane (Roche), which was blocked with 5% nonfat milk in Tris-buffered saline and incubated overnight at 4 °C with antibodies against the target proteins: anti-RFC4 antibody (ab156780, Abcam) and anti-β-actin antibody (ab115777, Abcam). After three washes with PBST (10 min each), the membrane was incubated with species-appropriate HRP-conjugated secondary antibodies, and the fluorescent signals were detected using the SageCapture™ imaging system (SAGECREATION). Immunohistochemistry (IHC) assays were also performed to detect protein levels in CESC tissues and para-CT. Formalin-fixed, paraffin-embedded tissues were cut into 5-µm-thick sections. These sections were deparaffinized with xylene and rehydrated with graded ethanol, heated in antigen retrieval solution (EDTA, pH 9.0), and treated with 3% H2O2 to inactivate endogenous peroxidase. After blocking, the samples were incubated overnight at 4 °C with anti-RFC4 antibody (1:100, ab156780, Abcam). The slides were then treated with the HRP-conjugated secondary antibody and stained with 3,3′-diaminobenzidine until brown granules appeared in the membrane, cytoplasm, or nucleus. Finally, the sections were counterstained with hematoxylin at room temperature.

Results

Screening for DEGs

In total, 92 cancer tissues and 36 normal tissues were selected from the three datasets. Using GEO2R, 211, 134 and 260 DEGs were extracted from GSE39001, GSE7803 and GSE27678, respectively. A Venn diagram was drawn with Venn diagram software to identify the DEGs common to all three datasets. There were 25 common DEGs in total, of which 18 were down-regulated and 7 were up-regulated (Fig. 1 and Table 1).
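
The overlap that a three-way Venn diagram displays is a set intersection. A minimal sketch, assuming each dataset's DEGs are available as a list of gene symbols; the function name and the placeholder symbols are hypothetical and are not the study's gene lists.

```python
def common_degs(*gene_lists):
    """Return the genes present in every list, sorted alphabetically."""
    sets = [set(genes) for genes in gene_lists]
    return sorted(set.intersection(*sets))


# Placeholder symbols, for illustration only.
dataset_1 = ["G1", "G2", "G3", "G4"]
dataset_2 = ["G2", "G3", "G5"]
dataset_3 = ["G3", "G2", "G6"]
shared = common_degs(dataset_1, dataset_2, dataset_3)
print(shared)
```

Running the same function separately on the up-regulated and down-regulated lists would give the two panels of such a diagram.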
Figure 1

Identification of 25 common DEGs in the three datasets (GSE39001, GSE7803 and GSE27678) using Venn diagram software.

Different colors represent different datasets. (A) Seven DEGs were up-regulated in the three datasets (logFC > 2). (B) Eighteen DEGs were down-regulated in the three datasets (logFC < −2).

Table 1

25 common DEGs identified from the three datasets.

Expression | Gene names
Up-regulated | M2M DTL CDKN2A TOP2A NUSAP1 RFC4 PLOD2
Down-regulated | EMP1 IGF1 ALOX12 EDN3 PTGDS KRT1 FOSB GREB1 ESR1 PAMR1 CXCL12 HPGD AR MAL CRNN CRISP3 CFD NDN

Significant pathways identified in CESC

We investigated the up-regulated and down-regulated DEGs to identify the most significantly enriched pathways in each group by GO and KEGG pathway analysis (Fig. 2 and Table 2). The GO analysis indicated that (1) for biological process (BP), the most significantly enriched terms of the DEGs were epidermis development, positive regulation of cell proliferation, peptide cross-linking, regulation of cell proliferation, positive regulation of cellular process, epidermal cell differentiation, skin development, keratinocyte differentiation, positive regulation of nuclear division and positive regulation of mitotic nuclear division; (2) for molecular function (MF), they were chemokine activity; chemokine receptor binding; calcium ion binding; collagen binding; CXCR chemokine receptor binding; growth factor activity; integrin binding; cytokine activity; peptidase activity, acting on L-amino acid peptides; and CCR chemokine receptor binding; (3) for cellular component (CC), the DEGs were significantly enriched in spindle, intercalated disc, intermediate filament, mitotic spindle, nuclear chromosome part, spindle midzone, condensed chromosome kinetochore, platelet alpha granule lumen, spindle microtubule and kinesin complex.
Figure 2

GO and KEGG results showing significant signaling pathways of DEGs.

(A) The results of GO analysis for pathways associated with molecular function (MF). (B) The results of GO analysis for pathways associated with cellular component (CC). (C) The results of GO analysis for pathways associated with biological process (BP). (D) The results of KEGG analysis.

Table 2

GO analysis of differentially expressed genes in CESC.

Expression | Category | Term | Count | % | p-Value | FDR
Up-regulated | GOTERM_MF_DIRECT | GO:0005524~ATP binding | 10 | 21.03 | 2.84E−4 | 0.270953
Up-regulated | GOTERM_MF_DIRECT | GO:0003678~DNA helicase activity | 2 | 4.21 | 0.012333 | 11.169161
Up-regulated | GOTERM_MF_DIRECT | GO:0003688~DNA replication origin binding | 2 | 4.21 | 0.018445 | 16.278756
Up-regulated | GOTERM_MF_DIRECT | GO:0003682~chromatin binding | 4 | 8.41 | 0.021525 | 18.752384
Up-regulated | GOTERM_CC_DIRECT | GO:0030496~midbody | 6 | 12.62 | 2.68E−7 | 2.65E−4
Up-regulated | GOTERM_CC_DIRECT | GO:0005737~cytoplasm | 14 | 29.44 | 1.50E−4 | 0.147574
Up-regulated | GOTERM_CC_DIRECT | GO:0005876~spindle microtubule | 3 | 6.31 | 0.001486 | 1.457975
Up-regulated | GOTERM_CC_DIRECT | GO:0005654~nucleoplasm | 8 | 16.82 | 0.005554 | 5.351061
Up-regulated | GOTERM_CC_DIRECT | GO:0000784~nuclear chromosome, telomeric region | 3 | 6.31 | 0.063091 | 9.946422
Up-regulated | GOTERM_CC_DIRECT | GO:0072687~meiotic spindle | 2 | 4.21 | 0.014282 | 13.241928
Up-regulated | GOTERM_CC_DIRECT | GO:0042555~MCM complex | 2 | 4.21 | 0.014282 | 13.241928
Up-regulated | GOTERM_CC_DIRECT | GO:0005680~anaphase-promoting complex | 2 | 4.21 | 0.035339 | 29.901598
Up-regulated | GOTERM_CC_DIRECT | GO:0005634~nucleus | 9 | 18.93 | 0.049470 | 39.406713
Up-regulated | GOTERM_CC_DIRECT | GO:0072686~mitotic spindle | 2 | 4.21 | 0.054262 | 42.356169
Up-regulated | GOTERM_CC_DIRECT | GO:0005819~spindle | 2 | 4.21 | 0.062745 | 47.263108
Up-regulated | GOTERM_CC_DIRECT | GO:0000776~kinetochore | 2 | 4.21 | 0.092682 | 61.726231
Up-regulated | GOTERM_BP_DIRECT | GO:0000910~cytokinesis | 3 | 6.31 | 9.78E−4 | 1.154575
Up-regulated | GOTERM_BP_DIRECT | GO:0044772~mitotic cell cycle phase transition | 2 | 4.21 | 0.007770 | 8.844187
Up-regulated | GOTERM_BP_DIRECT | GO:0051988~regulation of attachment of spindle microtubules to kinetochore | 2 | 4.21 | 0.011633 | 12.969466
Up-regulated | GOTERM_BP_DIRECT | GO:0031145~anaphase-promoting complex-dependent catabolic process | 2 | 4.21 | 0.013559 | 14.961799
Up-regulated | GOTERM_BP_DIRECT | GO:0006268~DNA unwinding involved in DNA replication | 2 | 4.21 | 0.015482 | 16.908679
Up-regulated | GOTERM_BP_DIRECT | GO:0007095~mitotic G2 DNA damage checkpoint | 2 | 4.21 | 0.019316 | 20.670192
Up-regulated | GOTERM_BP_DIRECT | GO:0007076~mitotic chromosome condensation | 2 | 4.21 | 0.021228 | 22.486821
Up-regulated | GOTERM_BP_DIRECT | GO:0000070~mitotic sister chromatid segregation | 2 | 4.21 | 0.034511 | 34.093744
Up-regulated | GOTERM_BP_DIRECT | GO:0001578~microtubule bundle formation | 2 | 4.21 | 0.038274 | 37.079667
Up-regulated | GOTERM_BP_DIRECT | GO:0000281~mitotic cytokinesis | 2 | 4.21 | 0.040151 | 38.521683
Up-regulated | GOTERM_BP_DIRECT | GO:0006270~DNA replication initiation | 2 | 4.21 | 0.040151 | 38.521683
Down-regulated | GOTERM_MF_DIRECT | GO:0005198~structural molecule activity | 9 | 6.88 | 5.12E−6 | 4.557253
Down-regulated | GOTERM_MF_DIRECT | GO:0004252~serine-type endopeptidase activity | 7 | 5.35 | 1.02E−4 | 0.004512
Down-regulated | GOTERM_MF_DIRECT | GO:0008201~heparin binding | 6 | 4.59 | 1.73E−4 | 0.187501
Down-regulated | GOTERM_MF_DIRECT | GO:0005496~steroid binding | 3 | 2.29 | 0.003639 | 3.880030
Down-regulated | GOTERM_MF_DIRECT | GO:0004962~endothelin receptor activity | 2 | 1.53 | 0.015293 | 11.655613
Down-regulated | GOTERM_MF_DIRECT | GO:0003707~steroid hormone receptor activity | 3 | 2.29 | 0.022939 | 31.330301
Down-regulated | GOTERM_CC_DIRECT | GO:0005615~extracellular space | 25 | 19.12 | 7.92E−11 | 7.97E−8
Down-regulated | GOTERM_CC_DIRECT | GO:0070062~extracellular exosome | 39 | 29.82 | 1.45E−10 | 4.43E−9
Down-regulated | GOTERM_CC_DIRECT | GO:0005576~extracellular region | 12 | 9.18 | 2.40E−6 | 0.002416
Down-regulated | GOTERM_CC_DIRECT | GO:0001533~cornified envelope | 5 | 3.82 | 4.01E−5 | 0.040335
Down-regulated | GOTERM_CC_DIRECT | GO:0005578~proteinaceous extracellular matrix | 7 | 5.35 | 2.35E−4 | 0.236029
Down-regulated | GOTERM_CC_DIRECT | GO:0045095~keratin filament | 4 | 3.06 | 0.002188 | 2.179044
Down-regulated | GOTERM_CC_DIRECT | GO:0042567~insulin-like growth factor ternary complex | 2 | 1.53 | 0.023333 | 21.135448
Down-regulated | GOTERM_CC_DIRECT | GO:0031012~extracellular matrix | 4 | 3.06 | 0.023982 | 21.661062
Down-regulated | GOTERM_CC_DIRECT | GO:0001527~microfibril | 2 | 1.53 | 0.034797 | 29.965502
Down-regulated | GOTERM_CC_DIRECT | GO:0042581~specific granule | 2 | 1.53 | 0.062878 | 47.957744
Down-regulated | GOTERM_CC_DIRECT | GO:0016323~basolateral plasma membrane | 3 | 2.29 | 0.087573 | 60.215342
Down-regulated | GOTERM_BP_DIRECT | GO:0018149~peptide cross-linking | 5 | 3.82 | 6.99E−4 | 0.098265
Down-regulated | GOTERM_BP_DIRECT | GO:0030216~keratinocyte differentiation | 5 | 3.82 | 3.21E−4 | 0.451079
Down-regulated | GOTERM_BP_DIRECT | GO:0007565~female pregnancy | 3 | 2.29 | 0.001635 | 2.274249
Down-regulated | GOTERM_BP_DIRECT | GO:0008284~positive regulation of cell proliferation | 7 | 5.35 | 0.002416 | 3.344514
Down-regulated | GOTERM_BP_DIRECT | GO:0045840~positive regulation of mitotic nuclear division | 3 | 2.29 | 0.004434 | 6.057636
Down-regulated | GOTERM_BP_DIRECT | GO:0048146~positive regulation of fibroblast proliferation | 3 | 2.29 | 0.015351 | 19.550011
Down-regulated | GOTERM_BP_DIRECT | GO:0006955~immune response | 5 | 3.82 | 0.016764 | 21.157726
Down-regulated | GOTERM_BP_DIRECT | GO:0001558~regulation of cell growth | 3 | 2.29 | 0.018030 | 22.573586
Down-regulated | GOTERM_BP_DIRECT | GO:0001755~neural crest cell migration | 3 | 2.29 | 0.0189638 | 23.602523
Down-regulated | GOTERM_BP_DIRECT | GO:0014826~vein smooth muscle contraction | 2 | 1.53 | 0.022138 | 27.006079
Down-regulated | GOTERM_BP_DIRECT | GO:0001775~cell activation | 2 | 1.53 | 0.033025 | 37.638886
Down-regulated | GOTERM_BP_DIRECT | GO:0014068~positive regulation of phosphatidylinositol 3-kinase signaling | 3 | 2.29 | 0.034030 | 38.544070
Down-regulated | GOTERM_BP_DIRECT | GO:0007267~cell–cell signaling | 3 | 2.29 | 0.042835 | 45.968556
Down-regulated | GOTERM_BP_DIRECT | GO:0021952~central nervous system projection neuron axonogenesis | 2 | 1.53 | 0.043793 | 46.724165
Down-regulated | GOTERM_BP_DIRECT | GO:0030198~extracellular matrix organization | 3 | 2.29 | 0.048207 | 50.079668
Down-regulated | GOTERM_BP_DIRECT | GO:0048484~enteric nervous system development | 2 | 1.53 | 0.049132 | 50.758144
Down-regulated | GOTERM_BP_DIRECT | GO:0043568~positive regulation of insulin-like growth factor receptor signaling pathway | 2 | 1.53 | 0.049132 | 50.758144
Down-regulated | GOTERM_BP_DIRECT | GO:0005978~glycogen biosynthetic process | 2 | 1.53 | 0.075392 | 66.786495
Down-regulated | GOTERM_BP_DIRECT | GO:0006885~regulation of pH | 2 | 1.53 | 0.075392 | 66.786495
Down-regulated | GOTERM_BP_DIRECT | GO:0010596~negative regulation of endothelial cell migration | 2 | 1.53 | 0.075392 | 66.786595
Down-regulated | GOTERM_BP_DIRECT | GO:0031290~retinal ganglion cell axon guidance | 2 | 1.53 | 0.080557 | 69.302524
Down-regulated | GOTERM_BP_DIRECT | GO:0010906~regulation of glucose metabolic process | 2 | 1.53 | 0.090803 | 73.777717
Down-regulated | GOTERM_BP_DIRECT | GO:0048662~negative regulation of smooth muscle cell proliferation | 2 | 1.53 | 0.095883 | 75.764591
Down-regulated | GOTERM_BP_DIRECT | GO:0048675~axon extension | 2 | 1.53 | 0.095883 | 75.764591

The results of KEGG analysis demonstrated that the most significant signaling pathways of DEGs were cell cycle, pathways in cancer, ECM-receptor interaction, arrhythmogenic right ventricular cardiomyopathy (ARVC), melanoma, PI3K-Akt signaling pathway, focal adhesion, vascular smooth muscle contraction, DNA replication and oocyte meiosis (Table 3).
Table 3

KEGG analysis of DEGs in CESC.

Pathway ID | Name | Count | p-Value | Genes
04110 | Cell cycle | 13 | 7.76E−6 | PCNA, CDKN2A, BUB1B, CDC7, TTK, SMC1B, CDC20, CCNB1, PTTG1, CDK1, MCM4, MCM5, MCM2
05200 | Pathways in cancer | 29 | 2.77E−5 | LAMA2, CKS1B, FGF7, EDNRA, EDNRB, RUNX1T1, PDGFRB, PDGFRA, JUP, CDKN2A, MMP1, ITGA2, PTCH1, FN1, IGF2, MITF, FOS, IGF1, WNT16, GNG11, ESR1, AR, CXCL12, GSTA4, CKS2, BIRC5, FGFR2, GSTM5, FGF10
04512 | ECM-receptor interaction | 9 | 1.36E−4 | TNXB, VWF, LAMA2, ITGA2, ITGA8, SPP1, FN1, HMMR, ITGA9
05412 | Arrhythmogenic right ventricular cardiomyopathy (ARVC) | 8 | 2.93E−4 | GJA1, LAMA2, JUP, ITGA2, ITGA8, DSG2, DSC2, ITGA9
05218 | Melanoma | 8 | 2.93E−4 | PDGFRB, PDGFRA, FGF7, CDKN2A, PDGFD, MITF, IGF1, FGF10
04151 | PI3K-Akt signaling pathway | 20 | 3.18E−4 | PDGFRB, PDGFRA, TNXB, VWF, LAMA2, ITGA2, IGF2, FN1, IGF1, GNG11, AREG, EREG, GYS2, FGF7, PDGFD, SPP1, ITGA8, FGFR2, FGF10, ITGA9
04510 | Focal adhesion | 13 | 9.45E−4 | PDGFRB, PDGFRA, TNXB, VWF, LAMA2, ITGA2, FN1, IGF1, MYLK, PDGFD, SPP1, ITGA8, ITGA9
04270 | Vascular smooth muscle contraction | 10 | 0.001189 | ACTA2, GUCY1A2, PPP1R14A, EDNRA, EDN3, MYH11, MRVI1, AVPR1A, ACTG2, MYLK
03030 | DNA replication | 5 | 0.001491 | PCNA, RFC4, MCM4, MCM5, MCM2
04114 | Oocyte meiosis | 9 | 0.002920 | CDC20, AR, CCNB1, PTTG1, CDK1, PGR, IGF1, SMC1B, AURKA

Systematic analysis of core genes by PPI network

The PPI network was used to investigate the systematic interactions among the DEGs obtained above. The 25 common DEGs (seven up-regulated and 18 down-regulated) were mapped to the PPI network, which contained 99 nodes and 270 edges. Cytoscape MCODE was then applied for further analysis of the DEGs in the PPI network and identified 15 nodes, all of which were up-regulated DEGs (Fig. 3).
Figure 3

PPI network of common DEGs constructed with the STRING online database, and module analysis.

(A) Nodes represent proteins; edges represent interactions between proteins. (B) Module analysis via the Cytoscape MCODE tool (degree cutoff = 2, node score cutoff = 0.2, k-core = 2, and max. depth = 100).

Analysis of core gene signature in CESC using K-M plotter and GEPIA

K-M plotter analysis of the survival data for the 15 identified genes indicated that three of them (TOP2A, RFC4, MCM2) were significantly associated with survival (P < 0.05), whereas the other 12 genes were not (P > 0.05) (Fig. 4 and Table 4). The expression of TOP2A, RFC4 and MCM2 in normal tissue and CESC tissue was examined with GEPIA. The expression of all three genes was higher in CESC tissue than in normal tissue (P < 0.05) (Fig. 5).
Figure 4

Prognostic information for the 15 core genes.

Three of the 15 genes (A–C) were associated with significantly better survival (P < 0.05), and twelve genes (D–O) showed no significant difference in OS (P > 0.05).

Table 4

The information of prognostic analysis of 15 core DEGs.

Category | Genes
Genes with significant (better) survival (P < 0.05) | TOP2A RFC4 MCM2
Genes without significant survival (P > 0.05) | UBE2C PRC1 NUSAP1 NEK2 MCM5 KIF20A HMMR FANCI ECT2 DTL AURKA ASPM

Figure 5

Expression level of three significantly expressed genes in CESC tissues and normal tissues.

(A) The expression level of TOP2A in CESC tissues and normal tissues. (B) The expression level of MCM2 in CESC tissues and normal tissues. (C) The expression level of RFC4 in CESC tissues and normal tissues. Red color means tumor tissues and grey means normal tissues.

RFC4 is validated to be overexpressed in CESC

Analysis of mRNA expression data from the NCBI-GEO database identified RFC4 as an overexpressed gene in CESC patients. To validate this finding, we collected paired tissues from 35 CESC patients for qRT-PCR, paired tissues from 6 CESC patients for WB, and 9 pairs of CESC tissues plus 4 normal cervical tissues for IHC. Total RNA was extracted from the 35 paired CESC and para-CT tissues, and qRT-PCR was conducted to measure the expression level of RFC4. The expression level of RFC4 in CESC tissues was significantly higher than in para-CT (P = 0.0197) (Fig. 6), and the expression of RFC4 in early stage CESC was significantly higher than that in advanced CESC (P = 0.0314) (Fig. 7). WB gave a consistent result: RFC4 was overexpressed in CESC tissues compared with para-CT (Fig. 8). IHC also showed a higher level of RFC4 expression in CESC tissues, with RFC4 protein mainly concentrated in the nucleus (Fig. 9).
Figure 6

qRT-PCR showed that the expression of RFC4 in CESC tissues differed significantly from that in para-cancerous tissues.

Figure 7

Expression levels of RFC4 in different pathological stages of CESC.

The expression of RFC4 in early stage CESC was significantly higher than that in advanced CESC (P = 0.0314).

Figure 8

WB analysis of RFC4 protein.

C: CESC tissues, P: para-cancerous tissues. (A) WB analysis of six pairs of CESC tissues. Except for case 4, where the result is not obvious, the results are consistent with the expected high expression of RFC4 in tumor tissues. (B) Grayscale analysis of multiple WB bands shows that the WB tests are reliable.

Figure 9

IHC test of CESC.

IHC showed that, in general, the RFC4 protein is highly expressed in tumor tissue sections and is mainly concentrated in the nucleus, whereas its expression is low in normal cervical tissue and para-cancerous tissues.

Discussion

To identify more effective prognostic biomarkers in CESC, we used several bioinformatics methods to analyze three datasets from the NCBI-GEO database, comprising 92 CESC tissues and 36 normal tissues. A total of 25 DEGs were selected by GEO2R and Venn software, including seven up-regulated genes and 18 down-regulated genes. GO and KEGG pathway analyses were then conducted, and the results indicated that the selected DEGs were significantly enriched in various cellular pathways. Previous research has reported that genes from these pathways could be associated with the pathogenesis and progression of cervical cancer. Nucleolar and spindle associated protein 1 (NUSAP1), a gene in the spindle-associated pathway, was reported to promote the metastasis of cervical cancer by activating Wnt/β-catenin signaling (Li et al., 2019). Studies also showed that the CXCL12/CXCR4 pathway is associated with HPV infection as a co-factor, which implies a high risk of cervical cancer (Meuris et al., 2016). Genes involved in epidermis development were also associated with high-risk HPV infection (Zhang et al., 2018; Chatterjee et al., 2019). A PPI network was then constructed using STRING, and MCODE analysis identified 15 DEGs. Furthermore, K-M plotter analysis showed that three of these 15 DEGs were associated with significantly better survival. The GEPIA results showed that the expression levels of the three selected genes were higher in CESC tissues than in normal tissues. For further validation, we performed molecular biology experiments on RFC4, and the results showed that RFC4 was highly expressed in CESC tissues compared with normal tissues. Replication factor C (RFC) is a structure-specific DNA-binding protein that acts as a primer recognition factor for DNA polymerase (Zhou & Hingorani, 2012) and comprises five subunits (RFC1-5).
Among the five subunits of the RFC complex, RFC4 has been reported to play an important role in the DNA damage checkpoint and DNA replication pathways (Ellison & Stillman, 2003). Arai et al. (2009) reported that RFC4 was closely related to the prognosis of liver cancer. Besides liver cancer, RFC4 has been reported to be associated with several other types of cancer, including prostate cancer, colon cancer, non-small cell lung cancer and leukemia (LaTulippe et al., 2002; Jung, Choi & Kim, 2009; Erdogan et al., 2009; Barfeld et al., 2014). RFC4 expression was found to be up-regulated in neck squamous cell carcinoma, at a level 3.4-fold higher than in normal tissues (Slebos et al., 2006). Garnett et al. (2012) showed that RFC4 can be regulated by mutated RB1 in several types of cancer, suggesting that RFC4 could be a potential biomarker associated with the occurrence and prognosis of various cancers. Moreover, RFC4 was reported as an independent predictor of overall survival in breast cancer (Fatima et al., 2017; Niu et al., 2017). In this study we identified RFC4 as a potential independent prognostic biomarker in CESC, and our results suggested that CESC patients with higher RFC4 expression may have better overall survival. A possible reason is that RFC4 is highly expressed throughout the cell cycle of proliferating cells, and in situ tumor proliferation slows as the disease develops (Szymanska et al., 2018; Chaplain & Sleeman, 1993), which implies a decrease in RFC4 expression. Therefore, high RFC4 expression may indicate early stage CESC, which is associated with better overall survival. Several studies have shown that these three genes are associated with numerous types of cancer, but studies of RFC4 in CESC are rare, and very few have included molecular biology validation.
Therefore, our study shows that RFC4 is a potential biomarker for predicting the prognosis of CESC and provides a direction for further study of CESC. This study has some limitations. Clinical samples from a single hospital may reflect regional or racial differences. The expression level of RFC4 in different stages of CESC should be examined further, and clinical investigations should be conducted in future studies to validate our results.

Conclusions

In conclusion, using bioinformatics analysis we identified three genes (TOP2A, RFC4, MCM2) based on three microarray datasets. These three genes were suggested to have a significant effect on the prognosis of CESC and could be key factors in the occurrence and progression of CESC. High expression of RFC4 in CESC tissues was validated using clinical samples. Although further investigation and experiments need to be conducted, the genes identified in our study could serve as clinical biomarkers that help us better understand the pathological process and predict the prognosis of CESC.

References (10 of 34 listed)

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  Contribution of bioinformatics prediction in microRNA-based cancer therapeutics.

Authors:  Jasjit K Banwait; Dhundy R Bastola
Journal:  Adv Drug Deliv Rev       Date:  2014-11-06       Impact factor: 15.470

3.  Computational Modelling of Cancer Development and Growth: Modelling at Multiple Scales and Multiscale Modelling.

Authors:  Zuzanna Szymańska; Maciej Cytowski; Elaine Mitchell; Cicely K Macnamara; Mark A J Chaplain
Journal:  Bull Math Biol       Date:  2017-06-20       Impact factor: 1.758

4.  Identification of a 26-lncRNAs Risk Model for Predicting Overall Survival of Cervical Squamous Cell Carcinoma Based on Integrated Bioinformatics Analysis.

Authors:  Yu Mao; Zhanzhao Fu; Lixin Dong; Yue Zheng; Jing Dong; Xin Li
Journal:  DNA Cell Biol       Date:  2019-01-30       Impact factor: 3.311

5.  Cytoscape StringApp: Network Analysis and Visualization of Proteomics Data.

Authors:  Nadezhda T Doncheva; John H Morris; Jan Gorodkin; Lars J Jensen
Journal:  J Proteome Res       Date:  2018-12-05       Impact factor: 4.466

6.  Meta-analysis of oncogenic protein kinase Ciota signaling in lung adenocarcinoma.

Authors:  Eda Erdogan; Eric W Klee; E Aubrey Thompson; Alan P Fields
Journal:  Clin Cancer Res       Date:  2009-02-17       Impact factor: 12.531

7.  Expression profiles of SV40-immortalization-associated genes upregulated in various human cancers.

Authors:  Hyun Min Jung; Seong-Jun Choi; Jin Kyeoung Kim
Journal:  J Cell Biochem       Date:  2009-03-01       Impact factor: 4.429

8.  Cervical adenocarcinoma and squamous cell carcinoma incidence trends among white women and black women in the United States for 1976-2000.

Authors:  Sophia S Wang; Mark E Sherman; Allan Hildesheim; James V Lacey; Susan Devesa
Journal:  Cancer       Date:  2004-03-01       Impact factor: 6.860

9.  Meta-analysis of prostate cancer gene expression data identifies a novel discriminatory signature enriched for glycosylating enzymes.

Authors:  Stefan J Barfeld; Philip East; Verena Zuber; Ian G Mills
Journal:  BMC Med Genomics       Date:  2014-12-31       Impact factor: 3.063

10.  GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses.

Authors:  Zefang Tang; Chenwei Li; Boxi Kang; Ge Gao; Cheng Li; Zemin Zhang
Journal:  Nucleic Acids Res       Date:  2017-07-03       Impact factor: 16.971
