
Bioinformatics Analysis of Key Genes and Pathways of Cervical Cancer.

Huan Chen1, Xi Wang1, Huanhuan Jia2, Yin Tao3, Hong Zhou3, Mingyuan Wang4, Xin Wang1, Xiaoling Fang1.   

Abstract

BACKGROUND AND OBJECTIVE: Globally, cervical cancer (CC) is the fourth most common cancer affecting women. Although effective screening reduces its incidence, it remains one of the most serious cancers threatening the health of women. Therefore, the purpose of this study is to find new genes that can be used as potential biomarkers for the prognosis of CC.
METHODS AND RESULTS: After downloading three datasets (GSE6791, GSE63678, and GSE63514) from the Gene Expression Omnibus (GEO), we combined the expression matrices and analyzed them to obtain the differentially expressed genes (DEGs). Next, using the STRING website, we performed protein interaction network analysis. Subsequently, hub genes were screened using R and Cytoscape. The expression differences and survival associations of the hub genes were then confirmed using GEPIA. We established that the KNTC1 gene was correlated with the overall survival of CC patients. In addition, KNTC1 expression in the GSE63514 dataset differed significantly among normal cervix, cervical pre-cancerous lesions, and CC. Immunohistochemistry analysis then showed that the results have definite diagnostic value.
CONCLUSION: The KNTC1 gene could be linked with the pathophysiology of CC and may be one of the early diagnostic markers for cervical pre-cancerous lesions.
© 2020 Chen et al.

Keywords:  KNTC1; bioinformatics analysis; cervical cancer; differentially expressed gene; pathways

Year:  2020        PMID: 33402836      PMCID: PMC7778384          DOI: 10.2147/OTT.S281533

Source DB:  PubMed          Journal:  Onco Targets Ther        ISSN: 1178-6930            Impact factor:   4.147


Introduction

Worldwide,1 cervical cancer (CC) is the fourth most common cancer affecting women. It accounts for about 371,000 cases and 190,000 deaths annually.2 Though its incidence has dropped year after year in developed countries, it is rising in developing countries. Studies have shown that approximately 90% of CC-related deaths occur in developing countries. On the other hand,3 cervical intraepithelial neoplasia (CIN) is a potentially pre-cancerous transformation of squamous cells on the surface of the cervix. According to the degree of abnormal hyperplasia, it is categorized into three grades: CIN I, II, and III. CIN I is also called a low-grade squamous intraepithelial lesion (LSIL), and CIN II and CIN III are high-grade squamous intraepithelial lesions (HSIL).4 It takes 8 years on average for CIN to develop into CC. According to the literature,5,6 squamous cell carcinoma accounts for about 80 to 90% of all CC cases. Pre-cancerous lesions first develop from LSIL to HSIL, and subsequently to squamous cell carcinoma. However,7,8 the pathogenesis of CC is unclear and could involve recurrent genetic alterations, the microenvironment, lifestyle, smoking, etc.9–12 Different studies have revealed that persistent HPV infection is the main trigger of CC. However, only a small proportion of patients with HPV infection will eventually develop CC.13 Etiological studies have shown that the occurrence of CC is a complicated process that involves multiple factors, and its molecular mechanism has not yet been fully elucidated. Therefore, there is a need to find an indicator that predicts which pre-cancerous lesions will progress to cancer. Currently,14 gene expression profile data are available, and their analysis using bioinformatics has become a promising area of new research.
Studies have shown that15 bioscience research increasingly depends on bioinformatics to efficiently organize and analyze biological data and experimental results, which16 provides a theoretical basis for a profound understanding of molecular events occurring in human biology and disease. Besides,17,18 it has been extensively used to study molecular changes in tumor progression, to better understand human carcinogenesis, and to discover novel predictive cancer markers. GEO is a publicly available database that hosts a large number of microarray expression datasets. Here, we downloaded and analyzed three established microarray datasets,19 GSE6791,20 GSE63678, and21 GSE63514, from the GEO database. Subsequently, we analyzed the DEGs between normal cervical and CC tissues using the R statistical software. Concurrently, we verified differential gene expression using22 the online analysis tool GEPIA, which is based on the GTEx and TCGA databases and contained 306 CC samples and 13 normal cervical tissues. Next, the protein interactions of the DEGs were analyzed on the STRING website. Our study established that KNTC1 expression levels were substantially related to the overall survival (OS) of CC patients and differed among normal cervix, cervical pre-cancerous lesions, and CC. Lastly, we used immunohistochemistry to verify this finding and concluded that the KNTC1 gene is a prospective key gene for CC pre-cancerous lesions and prognosis.

Materials and Methods

Microarray Data

We downloaded19–21 three CC GEO datasets, GSE6791, GSE63678, and GSE63514, from the Gene Expression Omnibus. Table 1 lists the details of the GEO cervical cancer datasets.
Table 1

Dataset Information

GEO        Platform  Sample  Normal  Tumor
GSE63678   GPL571    Cervix  5       5
GSE6791    GPL570    Cervix  8       20
GSE63514*  GPL570    Cervix  24      28

Notes: *The dataset also contains 14 CIN I, 22 CIN II, and 40 CIN III samples.

Data Normalization and Exploration

We first downloaded the series matrix files and converted the platform annotation using Perl scripts. Next, the probe IDs were converted into standard gene symbols and the data were merged into a single CSV file.23 Subsequently, we eliminated the batch effect using the Surrogate Variable Analysis (SVA) package of Bioconductor.24 Lastly, we identified the DEGs using the limma package, with cutoff criteria of adj.P.Val <0.05 and |log2 (fold change)| >2.
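The cutoff step can be illustrated with a minimal sketch. It assumes the limma results have already been exported as rows of gene symbol, log2 fold change, and adjusted P value; the `LimmaRow` type, the `select_degs` helper, and the gene names are hypothetical and serve only to show how the two thresholds combine.

```python
from typing import List, NamedTuple, Tuple


class LimmaRow(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log2_fc: float
    adj_p: float


def select_degs(rows: List[LimmaRow],
                fc_cutoff: float = 2.0,
                p_cutoff: float = 0.05) -> Tuple[List[str], List[str]]:
    """Split genes into up- and downregulated DEGs.

    A gene is kept only if adj_p < p_cutoff AND |log2_fc| > fc_cutoff.
    """
    up = [r.gene for r in rows if r.adj_p < p_cutoff and r.log2_fc > fc_cutoff]
    down = [r.gene for r in rows if r.adj_p < p_cutoff and r.log2_fc < -fc_cutoff]
    return up, down


# Made-up values for illustration only; they are not results from the study.
rows = [
    LimmaRow("GENE_A", 2.8, 0.001),    # passes both thresholds (up)
    LimmaRow("GENE_B", -2.4, 0.01),    # passes both thresholds (down)
    LimmaRow("GENE_C", 3.1, 0.20),     # large change, not significant
    LimmaRow("GENE_D", 1.5, 0.0001),   # significant, change too small
]
up, down = select_degs(rows)
print(up, down)
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant adjusted P value is excluded, as is a significant gene with a modest fold change.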

Protein–Protein Interaction (PPI) Network

We analyzed the PPI pairs of the screened DEGs using25 the online database STRING version 11 (. org/). Next, the pairs with a combined score >0.4 were used for the PPI network construction. We then removed the isolated and partially connected nodes, constructed a network of DEGs, downloaded the string-interactions table, and saved it in CSV format. Lastly, the DEG expression and string-interactions files were imported into Cytoscape for further processing, whereby we constructed a network and analyzed the interactions among the proteins encoded by the candidate DEGs in CC.

DEGs Basic Expression in Normal and Cancer Tissues

Here,22 the online tool GEPIA was used as a validation set for screening the DEGs. GEPIA is an interactive web server with a standard processing pipeline that analyzes the RNA sequencing expression data of 9736 tumors and 8587 normal samples from the TCGA and GTEx projects, based on data from the UCSC Xena Project.

Biomarker Discovery for Early Diagnosis

The potential application for early diagnosis was evaluated using the GSE63514 dataset by comparing expression in normal versus LSIL, LSIL versus HSIL, and HSIL versus CC samples.

Tissue Collection

From August 2020 to September 2020, normal cervical, CIN I–III, and cervical squamous cell carcinoma tissue samples were acquired from Zhuzhou Hospital, which is affiliated with Xiangya Medical College. All specimens were subjected to immunohistochemical evaluation and confirmed by two independent pathologists. This study was approved by the research and clinical trial ethics committee of Zhuzhou Hospital, and all eligible participants provided written informed consent. All clinical procedures were performed in accordance with the ethical standards of the Declaration of Helsinki and relevant policies in China.

Immunohistochemistry

To validate kinetochore-associated protein 1 (KNTC1), immunohistochemical staining was performed on specimen sections of normal, LSIL, HSIL, and CC samples. The procedure comprised section dewaxing, antigen repair, KNTC1 antibody (Abcam, Cambridge, UK) incubation, and secondary antibody incubation.26

Results

Identification of DEGs in CC

Our study involved 39 healthy individuals and 51 patients with CC, and the three mRNA expression profiles were merged into one dataset. Using cutoff criteria of adj.P.Val <0.05 and |log FC| >2, we extracted a total of 161 differentially expressed probe sets (63 downregulated and 98 upregulated) from the expression profile datasets. The volcano plot illustrates the differential genes (Figure 1A), whereas the heatmap shows the differentially expressed probe sets between normal and cancer tissues (Figure 1B).
Figure 1

(A) Volcano plot of the three datasets (GSE63514, GSE63678, GSE6791). Orange nodes represent DEGs with logFC >2 and p-value <0.05. Blue nodes represent DEGs with logFC <–2 and p-value <0.05. (B) A heat map of all DEGs of the three datasets. Each column represents a sample and each row represents one gene. The color gradient from blue to orange represents gene expression changing from downregulation to upregulation. The letter N represents normal tissue, and the letter T represents cancer tissue. (C) GO enrichment analysis of DEGs.


Pathway Analysis

Here, enrichment analyses were performed for Gene Ontology (GO) gene sets using the clusterProfiler27 and org.Hs.eg.db packages.28 As illustrated in Figure 1C, the 161 CC intrinsic genes were mostly involved in the cell cycle through processes such as DNA replication and recombination, RNA metabolism, and purine and pyrimidine metabolism. In addition, a similar analysis was performed against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database using the clusterProfiler package.27 However, only one pathway was found to be involved in DNA replication.
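Over-representation analysis of this kind is commonly based on a hypergeometric test. The sketch below shows that calculation for a single gene set, not the internal code of any package; the function name, parameter names, and the toy numbers are all illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(background: int, annotated: int,
                           selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric P value, P(X >= overlap).

    background: number of genes in the background universe
    annotated:  background genes annotated to the term
    selected:   number of genes in the DEG list
    overlap:    DEGs that are annotated to the term
    """
    upper = min(annotated, selected)
    total = comb(background, selected)
    # Sum the probability of every outcome at least as extreme as observed.
    tail = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers chosen so the result can be checked by hand:
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120
p = hypergeom_enrichment_p(background=10, annotated=4, selected=3, overlap=2)
print(round(p, 4))
```

In practice the resulting P values would be adjusted for multiple testing across all terms, which this sketch omits.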

PPI Network Construction

At a combined score >0.4, 136 of the 161 commonly altered DEGs (92 upregulated and 44 downregulated genes) were retained in the PPI network, which contained 136 nodes and 1363 edges (Figure 2A). Subsequently, the node degree29 was calculated using the cytoHubba app in Cytoscape. As illustrated in Figure 2B, the 20 genes with the highest node degree were selected as hub nodes, since they could play a vital role in the progression of CC.
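The thresholding and degree ranking described above can be sketched as follows. This is a simplified illustration and not the cytoHubba implementation; it assumes the interactions are available as (gene, gene, combined score) tuples with scores on a 0 to 1 scale, and the node names and scores are placeholders.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Edge = Tuple[str, str, float]


def hub_genes(edges: Iterable[Edge],
              min_score: float = 0.4,
              top_n: int = 20) -> List[Tuple[str, int]]:
    """Rank nodes by degree after filtering edges on combined score."""
    degree: Counter = Counter()
    seen = set()
    for a, b, score in edges:
        pair = frozenset((a, b))
        # Skip weak edges, self-loops, and duplicate (A,B)/(B,A) rows.
        if score <= min_score or a == b or pair in seen:
            continue
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    # Highest degree first; ties broken by name for a stable order.
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Placeholder network, not data from the study.
edges = [
    ("G1", "G2", 0.90),
    ("G1", "G3", 0.70),
    ("G2", "G3", 0.50),
    ("G3", "G4", 0.30),   # below threshold, dropped
    ("G1", "G4", 0.41),
    ("G2", "G1", 0.90),   # duplicate of G1-G2, ignored
]
top2 = hub_genes(edges, top_n=2)
print(top2)
```

Nodes whose edges all fall below the threshold never enter the counter, which mirrors the removal of isolated nodes from the network.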
Figure 2

(A) PPI network of DEGs; orange nodes represent upregulated genes and blue nodes represent downregulated genes. (B) The 20 genes with the highest node degree, calculated by the cytoHubba app of Cytoscape. (C) The top 20 genes selected by using the string-interactions file to calculate the counts between DEGs. (D) Venn diagram of the top 20 degree genes and the top 20 count genes.


Key Gene Signatures Identification in CC

We used the string-interactions file to calculate the interaction counts between DEGs and selected the top 20 genes (Figure 2C). Subsequently, we overlapped these 20 genes with the hub nodes and, as shown in Figure 2D, selected the following key genes: MELK, CDK1, BUB1B, NCAPG, KIF11, PBK, TOP2A, TTK, DLGAP5, ASPM, DTL, CCNB1, CDC6, RAD51AP1, KNTC1, KIF15, and NDC80. Also, as shown in Figure 3A and B, univariate analysis (P<0.05) showed that the expression levels of KNTC1 and TOP2A were significantly related to the OS of the CC patients. The high expression of KNTC1 and TOP2A could lead to a high rate of survival. Notably, the expression of these two genes was significantly higher in CC tissues than in normal tissues (P<0.01; Figure 3C and D). However, their expression was not correlated with the CC grade. Taken together, the high level of these two genes could represent a significant prognostic factor that predicts the survival of CC patients.
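The overlap step amounts to intersecting two ranked gene lists. A minimal sketch follows, assuming each ranking is a plain list of gene symbols; the `overlap_genes` helper and the placeholder names are hypothetical.

```python
from typing import List


def overlap_genes(ranked_by_degree: List[str],
                  ranked_by_count: List[str]) -> List[str]:
    """Return genes present in both lists, in the order of the first list."""
    in_second = set(ranked_by_count)
    return [gene for gene in ranked_by_degree if gene in in_second]


# Placeholder rankings, not the lists from Figure 2.
by_degree = ["G1", "G2", "G3", "G4"]
by_count = ["G3", "G1", "G5", "G6"]
key_genes = overlap_genes(by_degree, by_count)
print(key_genes)
```

Keeping the order of the first list makes the result reproducible, which a bare set intersection would not guarantee.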
Figure 3

(A and B) Expression boxplots of KNTC1/TOP2A genes using GEPIA website (P<0.05). (C and D) Survival analysis. Expression level of KNTC1/TOP2A was significantly related to the overall survival of patients with cervical squamous cancer (p < 0.05).


Validation of Hub Genes

To identify biomarkers for early discovery, we adopted a four-stage approach. We first extracted the KNTC1 expression profiles from the GSE63514 dataset, which contained 24 normal cervix, 14 CIN I, 22 CIN II, 40 CIN III, and 28 CC tissues. Next, we compared the expression profiles of normal, LSIL (CIN I), HSIL (CIN II, III), and CC samples using the Wilcoxon test. As illustrated in Figure 4A, KNTC1 expression differed significantly among normal cervix, cervical pre-cancerous lesions, and CC tissues.
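A pairwise comparison between two groups can be sketched with a rank-sum test. The version below assumes the unpaired (Wilcoxon rank-sum, or Mann-Whitney U) form and uses a normal approximation without tie or continuity correction, so its P values are approximate and may differ slightly from those of statistical packages. The function name and the expression values are illustrative only.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, approximate two-sided P value)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return u, p


# Made-up expression values for two groups, for illustration only.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
u, p = rank_sum_test(group_a, group_b)
print(u, round(p, 4))
```

With more than two groups, as in the comparison of normal, LSIL, HSIL, and CC samples, such a test would be applied to each pair of groups, and a correction for multiple comparisons would normally be considered.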
Figure 4

(A) The expression of KNTC1 in normal cervix, cervical precancerous lesions, and CC in dataset GSE63514. a represents normal cervix, b represents LSIL (CIN I), c represents HSIL (CIN II), d represents cervical squamous cell carcinoma. (B–F) Immunohistochemical staining of KNTC1 in normal cervical, LSIL (CIN I), HSIL (CIN II, III), and cervical squamous cell carcinoma specimens.

In addition, we performed immunohistochemical staining for KNTC1 in normal cervix, LSIL, HSIL, and CC tissues. Staining was weak in normal cervical tissue, significantly stronger in HSIL than in LSIL, and significantly stronger in cervical squamous cell carcinoma than in normal cervical tissue (Figure 4B–F).

Discussion

Studies have shown that hub genes play critical roles during the different stages of cancer progression. Although several studies on the development of CC are available, more research is needed to identify driver genes and candidate drugs for the treatment of CC. In brief, our study integrated three gene expression profile datasets, analyzed them using bioinformatics techniques, and identified a total of 161 commonly altered DEGs (98 upregulated and 63 downregulated). The pathway enrichment analysis showed that cell cycle and DNA replication pathways were over-represented among the upregulated genes. In addition, the PPI network that was constructed included 136 nodes and 1363 edges. Subsequently, 16 key genes were determined and chosen for survival analysis. We observed that the KNTC1 gene was related to the prognosis of CC patients. Also, small molecules that could provide new insights into the therapeutic studies of CC were identified. Studies have shown that the protein encoded by the kinetochore-associated protein 1 (KNTC1) gene is distributed in the cytoplasm, nucleus, chromosomal centromere, cytoskeleton, and spindle fibers. Its distribution varies with the cell cycle: during interphase it is evenly distributed throughout the cytoplasm and nucleus. In addition, a small fraction is concentrated at the centromere from the early stage of cell division to the late part of the middle stage.30 Previous studies have also shown that by the middle of cell division, it is found at the centromere and along the spindle fibers, mainly at the poles.
However, from the late stage of division to the final stage, it can no longer be detected at the centromere or spindle fibers, although it remains at the spindle poles.31 The KNTC1 gene acts in the G2-M phase of the cell cycle, and several studies have established that it is involved in the cell cycle process and thereby participates in the occurrence of disease. Moreover, it has a significant association with the progression of liver fibrosis.32 Notably, the level of this gene was lower in mycoplasma pneumonia than in the healthy control group. Also, KNTC1 has been investigated in various human malignancies and has been associated with tumor histopathological grade.33 A study explored the expression levels of KNTC1 mRNA in three esophageal squamous cell carcinoma (ESCC) cell lines and established that KNTC1 was expressed in all the tested cell lines. Moreover, knocking out this gene effectively inhibited cell viability and increased apoptosis. In an online ESCC dataset, gene set enrichment analysis showed that overexpression of the KNTC1 gene is related to increased mitotic spindle and hypoxia pathways and to decreased DNA repair and mismatch repair pathways. KNTC1 could play an important role in regulating the viability and apoptosis of ESCC cells and may become a novel therapeutic target for ESCC.34 Previous studies have shown that in patients with oral squamous cell carcinoma, the transcriptional activity of specific cell cycle genes, such as KNTC1, changes with the size of the tumor, which could reflect the molecular evolution and adaptation of the tumor to the microenvironment.
Also, compared with the healthy control group, KNTC1 expression in neuroblastoma samples was significantly increased.35 Although the increase in gene expression level is minimal, it can still play a significant role in promoting neuroblastoma.36 Furthermore, KNTC1 frameshift mutations could impair the regulation of damage signaling and repair and promote the development of gastric cancer (GC) and colorectal cancer (CRC) with microsatellite instability (MSI).37–39 Several bioinformatics studies have also established that the KNTC1 gene could be one of the vital genes associated with cancer development, including hepatocellular carcinoma (HCC), pancreatic cancer (PC), and nasopharyngeal carcinoma (NPC).40 The KNTC1 gene is a key component of the mitotic checkpoint, which ensures proper chromosome separation during cell division. These processes are both critical, highly ordered biological processes that depend on numerous evolutionarily conserved protein complexes.41 A study showed that many proteins that regulate the mitotic process are overexpressed in human malignancies compared with their normal counterparts, and some are reported to be oncogenes.42 To date, the known functions of the KNTC1 gene include: recruiting cytoplasmic dynein to centromeres; participating in chromosome movement during mitosis; maintaining functional metaphase checkpoints;40 regulating exit from mitosis;43 protein-containing complex assembly; cell division; separation of sister chromatids; resolution of sister chromatid cohesion; EML4 and NUDC in mitotic spindle formation;44 amplification of signal from unattached kinetochores through a MAD2 inhibitory signal; and mitotic prometaphase. Clinically, this study is significant for a better understanding of the development and prognosis of CC. However, further clinical research is needed to ascertain whether these genes can be used as biomarkers for the diagnosis and prognosis of CC.

Conclusion

Taken together, our study has established that the KNTC1 gene is closely related to the poor prognosis of CC. Its expression is also different in normal cervical, pre-cancerous lesions, and cancer tissues. Therefore, it could be an important biomarker for early diagnosis and prediction of cervical pre-cancerous lesions.
  42 in total

1.  Downregulation of CCR5 inhibits the proliferation and invasion of cervical cancer cells and is regulated by microRNA-107.

Authors:  Li-Fan Che; Su-Fang Shao; Li-Xin Wang
Journal:  Exp Ther Med       Date:  2015-12-04       Impact factor: 2.447

2.  Identification of core genes and outcomes in hepatocellular carcinoma by bioinformatics analysis.

Authors:  Shu Shen; Junjie Kong; Yiwen Qiu; Xianwei Yang; Wentao Wang; Lvnan Yan
Journal:  J Cell Biochem       Date:  2018-12-07       Impact factor: 4.429

Review 3.  Papillomaviruses and cancer: from basic studies to clinical application.

Authors:  Harald zur Hausen
Journal:  Nat Rev Cancer       Date:  2002-05       Impact factor: 60.716

4.  Calcium-activated potassium channels as potential early markers of human cervical cancer.

Authors:  Ana Ramírez; Eunice Vera; Armando Gamboa-Domínguez; Paul Lambert; Patricio Gariglio; Javier Camacho
Journal:  Oncol Lett       Date:  2018-03-07       Impact factor: 2.967

5.  Array-based gene expression, CGH and tissue data defines a 12q24 gain in neuroblastic tumors with prognostic implication.

Authors:  Maija Wolf; Miikka Korja; Ritva Karhu; Henrik Edgren; Sami Kilpinen; Kalle Ojala; Spyro Mousses; Anne Kallioniemi; Hannu Haapasalo
Journal:  BMC Cancer       Date:  2010-05-05       Impact factor: 4.430

6.  The ZW10 and Rough Deal checkpoint proteins function together in a large, evolutionarily conserved complex targeted to the kinetochore.

Authors:  F Scaërou; D A Starr; F Piano; O Papoulas; R E Karess; M L Goldberg
Journal:  J Cell Sci       Date:  2001-09       Impact factor: 5.285

7.  Association of traffic-related hazardous air pollutants and cervical dysplasia in an urban multiethnic population: a cross-sectional study.

Authors:  Michael E Scheurer; Heather E Danysh; Michele Follen; Philip J Lupo
Journal:  Environ Health       Date:  2014-06-13       Impact factor: 5.984

8.  Knockdown of NEAT1 induces tolerogenic phenotype in dendritic cells by inhibiting activation of NLRP3 inflammasome.

Authors:  Maomao Zhang; Yang Zheng; Yong Sun; Shuang Li; Liangqi Chen; Xiangyuan Jin; Xinyu Hou; Xianglan Liu; Qi Chen; Jing Li; Mingyang Liu; Xianghui Zheng; Yongxiang Zhang; Jian Wu; Bo Yu
Journal:  Theranostics       Date:  2019-05-24       Impact factor: 11.556

9.  UniProt: a worldwide hub of protein knowledge.

Authors: 
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

10.  Systematic prediction of key genes for ovarian cancer by co-expression network analysis.

Authors:  Mingyuan Wang; Jinjin Wang; Jinglan Liu; Lili Zhu; Heng Ma; Jiang Zou; Wei Wu; Kangkai Wang
Journal:  J Cell Mol Med       Date:  2020-04-21       Impact factor: 5.310
