Identification of key candidate genes for pancreatic cancer by bioinformatics analysis.

Kui Lv, Jianying Yang, Junfeng Sun, Jianguo Guan

Abstract

Although pancreatic cancer has the highest mortality rate among all neoplasms worldwide, its exact mechanism remains poorly understood. In the present study, three Gene Expression Omnibus (GEO) datasets were integrated to elucidate the potential genes and pathways that contribute to the development of pancreatic cancer. Initially, a total of 226 differentially expressed genes (DEGs) were identified in the three GEO datasets, comprising 179 upregulated and 47 downregulated DEGs. Furthermore, function and pathway enrichment analyses were performed to explore the functions and pathways of these genes, and the results indicated that the DEGs participated in extracellular matrix (ECM) processes. In addition, a protein-protein interaction network was constructed and 163 of the 226 DEGs were filtered into the network, resulting in a network complex of 163 nodes and 438 edges. Finally, 24 hub genes were identified in the network, and the top 2 most significant modules were selected for function and pathway analysis. The hub genes were involved in several processes, including activation of matrix metalloproteinases, degradation of the ECM and ECM organization. Taken together, the data revealed potential key genes and pathways in pancreatic cancer, which may provide novel insights into the mechanism of pancreatic cancer. In addition, these hub genes and pathways may be considered as targets for the treatment of pancreatic cancer.

Keywords:  bioinformatics analysis; differentially expressed gene; extracellular matrix; pancreatic cancer

Year:  2019        PMID: 31281439      PMCID: PMC6580103          DOI: 10.3892/etm.2019.7619

Source DB:  PubMed          Journal:  Exp Ther Med        ISSN: 1792-0981            Impact factor:   2.447


Introduction

Pancreatic cancer is one of the most prevalent and lethal malignancies worldwide (1). Although substantial progress has been made in adjuvant and neo-adjuvant chemotherapies during previous decades, pancreatectomy remains the most effective treatment, notably for early stage pancreatic cancer cases. However, a previous study demonstrated that only 20% of patients present with localized and non-metastatic disease, and are therefore suitable for initial resection (2). Due to its specific tumor biology, pancreatic cancer is characterized by early recurrence and metastasis and by resistance to chemotherapy and radiotherapy. The 5-year overall survival rate is <5% (3). Therefore, an improved understanding of the underlying mechanism of pancreatic cancer is required for the development of effective therapy and the improvement of patient survival. The development of high-throughput sequencing has resulted in the production of numerous gene expression profiles of neoplasms that are freely available via the Gene Expression Omnibus (GEO) database (4). Based on these data, different aspects of the mechanism of pancreatic neoplasm development and of resistance to chemotherapy may be investigated. However, only a small part of these data has been used, and the majority have merely been deposited. Using bioinformatic analysis, these data may be re-analysed to provide valuable information for subsequent investigation. During the re-analysis process, differentially expressed genes (DEGs) are initially identified, and the functions and pathways of the genes involved are subsequently investigated. Several such studies have previously been performed in pancreatic cancer (5,6). However, the majority of these studies focused only on the identification of the most significant genes, and the tumor and normal tissues were not paired in those analyses.
Therefore, in the present study, three GEO datasets were selected, which contained paired tumor tissues and corresponding normal tissues, and the microarray data were analysed. The analysis led to the identification of the DEGs, and Gene Ontology (GO) and pathway enrichment analyses were subsequently performed to explore the biological functions and pathways of these genes. Furthermore, a protein-protein interaction (PPI) network was constructed and a module analysis was performed to explore the hub genes in pancreatic cancer. The present study may provide novel insights into the mechanism of pancreatic cancer formation, and the corresponding hub genes and pathways may serve as potential targets for the treatment of this cancer type.

Materials and methods

Data source

The microarray data for the investigation of pancreatic cancer were downloaded from the GEO datasets (www.ncbi.nlm.nih.gov/geo) as follows: GSE15471, GSE16515 and GSE28735. The gene expression profiles of GSE15471 and GSE16515 were obtained using the GPL570 platform, and GSE28735 was obtained using the GPL6244 platform. GSE15471 and GSE28735 included 36 and 45 pairs of tumor and corresponding normal tissues, respectively. GSE16515 consisted of 36 tumor and 16 normal tissues. A total of 16 pairs of tissues were selected for subsequent analysis.

DEG identification

DEGs between tumor and normal tissue samples were identified using the online analysis tool GEO2R (www.ncbi.nlm.nih.gov/geo/geo2r), and the results were saved in .txt format. The cut-off criteria for DEG identification were defined as adjusted P<0.05 and |log fold change (FC)|>1. Subsequently, the DEGs of the three GEO datasets were processed to generate a Venn diagram using an online resource (bioinformatics.psb.ugent.be/webtools/Venn).
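
The selection step described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs (adjusted P<0.05 and |logFC|>1) followed by an intersection across datasets, not the GEO2R or Venn tool implementation. The record layout, the function names and the toy values are assumptions made for the example, as is the requirement that a shared gene changes in the same direction in every dataset.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (gene symbol, adjusted P-value, logFC).
# The values used below are invented for illustration only.
Record = Tuple[str, float, float]


def select_degs(records: Iterable[Record],
                p_cutoff: float = 0.05,
                lfc_cutoff: float = 1.0) -> Dict[str, str]:
    """Return {gene: 'up' or 'down'} for genes passing both cut-offs."""
    degs: Dict[str, str] = {}
    for gene, adj_p, log_fc in records:
        if adj_p < p_cutoff and abs(log_fc) > lfc_cutoff:
            degs[gene] = "up" if log_fc > 0 else "down"
    return degs


def common_degs(per_dataset: List[Dict[str, str]]) -> Dict[str, str]:
    """Keep genes that are DEGs in every dataset.

    Assumption: a gene is retained only if its direction of change
    agrees across all datasets.
    """
    first = per_dataset[0]
    shared = set(first)
    for degs in per_dataset[1:]:
        shared &= set(degs)
    return {g: first[g] for g in shared
            if all(d[g] == first[g] for d in per_dataset)}


# Toy input standing in for three GEO2R result tables.
toy = {
    "dataset_a": [("COL1A1", 0.001, 2.3), ("ALB", 0.002, -1.8), ("GENE_X", 0.20, 3.0)],
    "dataset_b": [("COL1A1", 0.010, 1.5), ("ALB", 0.030, -2.1), ("GENE_Y", 0.001, 0.4)],
    "dataset_c": [("COL1A1", 0.004, 1.2), ("ALB", 0.010, -1.1), ("GENE_X", 0.010, 1.4)],
}
result = common_degs([select_degs(rows) for rows in toy.values()])
```

In this toy input only COL1A1 (up) and ALB (down) pass the cut-offs in all three tables; GENE_X fails the P-value cut-off in one table and GENE_Y fails the fold-change cut-off.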

Gene Ontology (GO) and pathway enrichment analysis

To explore the primary functions and pathways of the DEGs, the Database for Annotation, Visualization and Integrated Discovery (DAVID; version 6.8; david.ncifcrf.gov) and the Panther (www.pantherdb.org) databases were employed to perform GO analysis. The Kyoto Encyclopedia of Genes and Genomes (KEGG) (7) and Reactome databases were used for the pathway enrichment analysis (8–10). P<0.05 and a DEG count ≥2 were set as the cut-off criteria.
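
The idea behind an enrichment P-value can be illustrated with a plain one-sided hypergeometric test. This is a generic sketch and not the statistic reported by DAVID, which uses its own modified Fisher exact test; the function names, the background size and the term size in the example call are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric P-value, P(X >= k).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the DEG list, k: DEGs annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability mass of observing k or more annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def is_enriched(p_value: float, count: int,
                p_cutoff: float = 0.05, min_count: int = 2) -> bool:
    """Apply the criteria stated in the text: P<0.05 and DEG count >= 2."""
    return p_value < p_cutoff and count >= min_count


# Hypothetical numbers: 179 DEGs, 20 of which hit a term with 300 members
# in a background of 20,000 genes.
p_example = hypergeom_enrichment_p(k=20, n=179, K=300, N=20000)
example_passes = is_enriched(p_example, count=20)
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for large gene lists a statistics library would normally be used instead.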

PPI network construction and module analysis

The Search Tool for the Retrieval of Interacting Genes (STRING; string-db.org) was used to build a PPI network, and subsequently the network was visualized using Cytoscape v3.6.1 software (11). The cut-off criterion for the PPI network was a combined score >0.4, and the hub genes were defined by a node degree >10. The modules of the PPI network were calculated using the Molecular Complex Detection (MCODE) plug-in of Cytoscape with default parameters. Subsequently, the hub genes in high-scored modules (degree cut-off=2, node score cut-off=0.2) were selected for additional GO and pathway enrichment analyses.
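
The hub-gene criterion can be sketched as follows, assuming the interactions are available as (gene, gene, combined score) triples with scores on a 0 to 1 scale. The edge list and gene names below are placeholders, and the sketch covers only the score filter and the degree count, not the MCODE module detection.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical edge layout: (gene_a, gene_b, combined_score on a 0-1 scale).
Edge = Tuple[str, str, float]


def node_degrees(edges: Iterable[Edge], min_score: float = 0.4) -> Dict[str, int]:
    """Count distinct interaction partners per gene for edges above the score cut-off."""
    neighbours = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


def hub_genes(degrees: Dict[str, int], min_degree: int = 10) -> List[str]:
    """Genes with node degree > min_degree, highest degree first."""
    hubs = [g for g, d in degrees.items() if d > min_degree]
    return sorted(hubs, key=lambda g: (-degrees[g], g))


# Toy network: one gene with 11 confident partners, plus one weak edge
# that the score cut-off removes.
toy_edges = [("HUB", f"P{i}", 0.9) for i in range(11)] + [("P0", "P1", 0.3)]
toy_degrees = node_degrees(toy_edges)
toy_hubs = hub_genes(toy_degrees)
```

Storing partners in sets means a duplicated edge is counted once, which matters when an interaction table lists each pair in both directions.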

Results

Identification of DEGs in pancreatic cancer

Gene expression profiles from three pancreatic cancer GEO datasets were analysed. Based on the cut-off criteria, 1,666, 1,501 and 412 DEGs were extracted from the GSE15471, GSE16515 and GSE28735 datasets, respectively. Subsequently, 226 DEGs were identified (Fig. 1) by integrated bioinformatics analysis, including 179 upregulated and 47 downregulated DEGs (Table I).
Figure 1.

DEGs in pancreatic cancer microarray datasets. Venn diagram represents DEGs identified using three pancreatic cancer GEO datasets (GSE15471, GSE16515 and GSE28735). DEGs were identified using GEO2R with P<0.05 and |logFC|>1. DEGs, differentially expressed genes. GEO, gene expression omnibus.

Table I.

Identification of DEGs. A total of 226 DEGs were identified, including 179 upregulated genes and 47 downregulated genes.

DEGs | Gene names
Upregulated | ABHD17C, ACSL5, ADAM28, ADAM9, ADAMTS12, ADGRF1, AEBP1, AGR2, AHNAK2, AK4, ANKRD22, ANLN, ANO1, ANTXR1, ANXA10, ANXA3, ANXA8, APOL1, AREG, ARNTL2, ASAP2, ASPM, BGN, CAPG, CCL18, CCL20, CD109, CDH11, CDH3, CEACAM1, CEACAM5, CEACAM6, CEMIP, CLDN18, COL10A1, COL11A1, COL12A1, COL1A1, COL1A2, COL3A1, COL5A2, COL8A1, COMP, CORIN, CP, CST1, CST2, CTHRC1, CTSE, CXCL5, DDX60, DGKH, DHRS9, DKK1, DPCR1, DPYSL3, ECT2, EDIL3, EDNRA, EFNA5, EFNB2, ENO2, EPHA4, EPYC, ERO1A, ESM1, ETV1, FAP, FBXO32, FERMT1, FGD6, FN1, FOXQ1, FXYD3, GABRP, GALNT5, GCNT3, GJB2, GPRC5A, GPX2, GPX8, GREM1, HEPH, HK2, IFI27, IFI44L, IGF2BP3, IGFBP5, IL1R2, INHBA, INPP4B, ITGA2, ITGA3, ITGB4, KCNN4, KRT19, KRT7, KYNU, LAMA3, LAMB3, LAMC2, LCN2, LEF1, LOXL2, LRRN1, MALL, MATN3, MBOAT2, MELK, MET, MICAL2, MLPH, MMP1, MMP11, MMP12, MMP14, MMP7, MMP9, MTMR11, MXRA5, MYOF, NMU, NOX4, NPR3, NQO1, NRP2, NT5E, NTM, OAS1, OAS2, OLR1, OSBPL3, PCDH7, PGM2L1, PKM, PLA2R1, PLAC8, PLAT, PLAU, PLPP4, PLS1, POSTN, RAI14, RHBDL2, RUNX2, S100A16, S100P, SCEL, SCNN1A, SDR16C5, SERPINB3, SERPINB5, SLC22A3, SLC2A1, SLC44A4, SLC6A14, SLC6A6, SLPI, SRPX2, ST6GALNAC1, STYK1, SULF1, SULF2, SULT1C2, SYTL2, TCN1, TFF1, TGFBI, THBS2, TMC5, TMEM45B, TMPRSS4, TNFAIP6, TOP2A, TRIM29, TSPAN1, TSPAN8, VCAN, VSIG1
Downregulated | ABAT, ACADL, ADAMTS6, ALB, ANPEP, AOX1, BACE1, BNIP3, BTG2, C5, CHRM3, CTNND2, DPP10, EGF, EPB41L4B, EPHX2, ERO1B, F11, F8, FAM129A, FAM150B, FGL1, GATM, GNMT, GP2, GSTA1, HOMER2, IAPP, KIAA1324, LIFR, MCOLN3, MT1G, NR5A2, NUCB2, PAIP2B, PDK4, PNLIPRP1, RBPJL, RGN, SERPINI2, SLC16A10, SLC1A2, SLC39A5, SLC43A1, SLC4A4, TMED6, TRHDE

DEGs, differentially expressed genes.

GO analysis of DEGs in pancreatic cancer

Functional enrichment analysis of the candidate DEGs was conducted with DAVID and Panther. The functions of the DEGs were classified into three groups as follows: ‘Cellular Component’; ‘Molecular Function’; and ‘Biological Process’ (Fig. 2). DEGs were primarily involved in binding and catalytic activity in the ‘Molecular Function’ group, cellular and metabolic processes in the ‘Biological Process’ group, and cell membrane in the ‘Cellular Component’ group. Furthermore, the top 30 GO terms of these DEGs (Fig. 3) and the top 10 GO terms of the upregulated and downregulated DEGs (Table II) were determined. The upregulated DEGs were primarily enriched in extracellular matrix (ECM) organisation, cell adhesion, collagen catabolic process and ECM disassembly in the ‘Biological Process’ group, and in ECM, extracellular space, extracellular region and extracellular exosome in the ‘Cellular Component’ group. The downregulated DEGs were primarily enriched in reactive oxygen species metabolic process, proteolysis and cellular response to starvation in the ‘Biological Process’ group, and in extracellular exosome, extracellular space, integral component of plasma membrane and extracellular region in the ‘Cellular Component’ group. These results indicated that the majority of DEGs were significantly enriched in terms including extracellular exosome, ECM, ECM organisation, extracellular space and extracellular region.
Figure 2.

GO analysis of the DEGs in pancreatic cancer. GO analysis classified the DEGs into three groups. Green, molecular function; red, cellular component, and blue, biological process. GO, gene ontology; DEG, differentially expressed genes.

Figure 3.

Significantly enriched GO terms in DEGs. Significantly enriched GO terms in DEGs identified in pancreatic cancer. Green, molecular function; red, cellular component, and blue, biological process. DEGs, differentially expressed genes; GEO, gene expression omnibus.

Table II.

Enrichment analysis of DEGs in pancreatic cancer.

DEGs | Term | Description | Category | P-value
Upregulated | GO:0030198 | Extracellular matrix organization | BP | 8.62×10^−16
Upregulated | GO:0031012 | Extracellular matrix | CC | 1.09×10^−14
Upregulated | GO:0007155 | Cell adhesion | BP | 2.15×10^−14
Upregulated | GO:0030574 | Collagen catabolic process | BP | 6.69×10^−14
Upregulated | GO:0005615 | Extracellular space | CC | 3.16×10^−13
Upregulated | GO:0005576 | Extracellular region | CC | 3.37×10^−11
Upregulated | GO:0005578 | Proteinaceous extracellular matrix | CC | 1.11×10^−10
Upregulated | GO:0035987 | Endodermal cell differentiation | BP | 1.85×10^−10
Upregulated | GO:0070062 | Extracellular exosome | CC | 1.37×10^−9
Upregulated | GO:0022617 | Extracellular matrix disassembly | BP | 4.93×10^−9
Downregulated | GO:0070062 | Extracellular exosome | CC | 5.19×10^−6
Downregulated | GO:0072593 | Reactive oxygen species metabolic process | BP | 9.67×10^−5
Downregulated | GO:0006508 | Proteolysis | BP | 1.62×10^−3
Downregulated | GO:0005615 | Extracellular space | CC | 5.84×10^−3
Downregulated | GO:0015171 | Amino acid transmembrane transporter | MF | 6.37×10^−3
Downregulated | GO:0009267 | Cellular response to starvation | BP | 6.44×10^−3
Downregulated | GO:0005887 | Integral component of plasma membrane | CC | 8.01×10^−3
Downregulated | GO:0031093 | Platelet α granule lumen | CC | 8.50×10^−3
Downregulated | GO:0016323 | Basolateral plasma membrane | CC | 1.05×10^−2
Downregulated | GO:0005576 | Extracellular region | CC | 1.77×10^−2

DEGs, differentially expressed genes; CC, cellular component; MF, molecular function; BP, biological process.

Pathway enrichment analysis of DEGs in pancreatic cancer

Pathway enrichment analysis of the DEGs was performed with DAVID. The 30 significantly enriched pathway terms are shown in Fig. 4. The data indicated that these enriched genes were involved in matrix formation, activation of matrix metalloproteinase enzymes, ECM organisation, degradation of the ECM, ECM proteoglycan formation, non-integrin membrane-ECM interactions and ECM-receptor interaction. The data suggested that the DEGs primarily participated in the regulation of the ECM.
Figure 4.

Significantly enriched pathway terms in DEGs. Significantly enriched pathway terms in DEGs identified in pancreatic cancer. Orange, Reactome analysis; blue, KEGG analysis. DEGs, differentially expressed genes; KEGG, Kyoto Encyclopedia of Genes and Genomes.

The PPI network of the DEGs was constructed using the STRING database and subsequently visualized and analysed using Cytoscape. The results indicated that 163 of the 226 DEGs, including 133 upregulated and 30 downregulated genes, were filtered into the PPI network, and the PPI network complex contained 163 nodes and 438 edges (Fig. 5). Among the 163 nodes, 24 hub genes were identified with the following filter: Node degree >10. The top 10 most significant hub genes were albumin, epidermal growth factor, matrix metalloproteinase (MMP) 9, collagen type I α 2 chain (COL1A2), fibronectin 1, collagen type I α 1 chain (COL1A1), collagen type III α 1 chain (COL3A1), tissue inhibitor of metalloproteinases 1 (TIMP1), integrin subunit α 2 and MMP2. Subsequently, MCODE was used to identify modules within the network, and the top 2 significant modules were selected for subsequent analysis. Module 1 consisted of 10 genes, whereas module 2 consisted of 7 genes. GO analysis indicated that module 1 was primarily associated with ECM disassembly and ECM formation, whereas module 2 was primarily associated with ECM organisation and ECM formation (Table III). Furthermore, pathway enrichment analysis demonstrated that module 1 was mainly enriched in the activation of matrix metalloproteinases and the degradation of the ECM, whereas module 2 was mainly enriched in ECM proteoglycan formation, ECM-receptor interaction and ECM organisation (Table IV). These data indicated that the hub genes served key roles in the regulation of the ECM.
Figure 5.

DEGs protein-protein interaction network complex. A total of 226 DEGs were screened for the PPI network complex, which contained 163 nodes (genes), including 133 upregulated and 30 downregulated genes. Red, upregulated genes; blue, downregulated genes. DEGs, differentially expressed genes; PPI, protein-protein interaction.

Table III.

Gene Ontology analysis of the top 2 module genes.

Modules | Term | Description | Category | P-value
Module 1 | GO:0004252 | Serine-type endopeptidase activity | MF | 7.34×10^−6
Module 1 | GO:0022617 | Extracellular matrix disassembly | BP | 3.13×10^−5
Module 1 | GO:0005576 | Extracellular region | CC | 5.57×10^−5
Module 1 | GO:0070062 | Extracellular exosome | CC | 8.66×10^−5
Module 1 | GO:0009986 | Cell surface | CC | 8.69×10^−5
Module 1 | GO:0006508 | Proteolysis | BP | 2.14×10^−4
Module 1 | GO:0005615 | Extracellular space | CC | 3.18×10^−4
Module 1 | GO:0031093 | Platelet α granule lumen | CC | 3.31×10^−4
Module 1 | GO:0031012 | Extracellular matrix | CC | 5.06×10^−4
Module 1 | GO:0030574 | Collagen catabolic process | BP | 7.34×10^−6
Module 2 | GO:0001501 | Skeletal system development | BP | 2.22×10^−10
Module 2 | GO:0030198 | Extracellular matrix organization | BP | 1.22×10^−9
Module 2 | GO:0031012 | Extracellular matrix | CC | 6.47×10^−9
Module 2 | GO:0005615 | Extracellular space | CC | 1.61×10^−7
Module 2 | GO:0005578 | Proteinaceous extracellular matrix | CC | 6.70×10^−7
Module 2 | GO:0005201 | Extracellular matrix structural | MF | 1.19×10^−6
Module 2 | GO:0007155 | Cell adhesion | BP | 7.91×10^−6
Module 2 | GO:0005576 | Extracellular region | CC | 2.97×10^−5
Module 2 | GO:0005518 | Collagen binding | MF | 1.85×10^−4
Module 2 | GO:0030574 | Collagen catabolic process | BP | 2.12×10^−4

CC, cellular component; MF, molecular function; BP, biological process.

Table IV.

Pathway enrichment analysis of the top 2 module genes.

Modules | Pathway | Name | P-value
Module 1 | R-HSA-1592389 | Activation of matrix | 3.63×10^−6
Module 1 | R-HSA-1442490 | Collagen degradation | 1.71×10^−3
Module 1 | R-HSA-1474228 | Degradation of the extracellular | 2.52×10^−3
Module 1 | R-HSA-114608 | Platelet degranulation | 6.86×10^−3
Module 1 | R-HSA-75205 | Dissolution of fibrin clot | 1.28×10^−2
Module 1 | hsa04510 | Focal adhesion | 2.10×10^−2
Module 1 | hsa04810 | Regulation of actin cytoskeleton | 2.30×10^−2
Module 1 | R-HSA-210991 | Basigin interactions | 2.45×10^−2
Module 1 | R-HSA-3000157 | Laminin interactions | 2.94×10^−2
Module 1 | R-HSA-2022090 | Assembly of collagen fibrils and other multimeric structures | 4.47×10^−2
Module 2 | R-HSA-3000178 | ECM proteoglycans | 2.14×10^−8
Module 2 | R-HSA-216083 | Integrin cell surface interactions | 8.11×10^−6
Module 2 | hsa04512 | ECM-receptor interaction | 1.89×10^−5
Module 2 | R-HSA-1474244 | Extracellular matrix organization | 2.54×10^−5
Module 2 | R-HSA-3000170 | Syndecan interactions | 8.48×10^−5
Module 2 | R-HSA-3000171 | Non-integrin membrane-ECM | 1.88×10^−4
Module 2 | R-HSA-2022090 | Assembly of collagen fibrils and other multimeric structures | 2.49×10^−4
Module 2 | hsa04510 | Focal adhesion | 2.50×10^−4
Module 2 | R-HSA-1442490 | Collagen degradation | 4.83×10^−4
Module 2 | hsa04151 | PI3K-Akt signaling pathway | 1.14×10^−3

CC, cellular component; MF, molecular function; BP, biological process; ECM, extracellular matrix; hsa, Homo sapiens.

Discussion

Pancreatic cancer is usually asymptomatic and is diagnosed in the advanced stages of progression due to the lack of specific and sensitive detection markers that may be used during the early stages of the disease (12). The cancer cells are able to readily invade blood vessels and lymph nodes and metastasize to distant organs (13). Furthermore, pancreatic cancer is often resistant to conventional treatment, and it is regarded as one of the most lethal neoplasms (14). Previously, numerous studies have been performed to explore the underlying mechanism of pancreatic cancer progression and its treatment resistance. However, the disease prognosis has not changed significantly (15). The majority of these studies have focused on a single gene to investigate its potential role in pancreatic cancer, although cancer is a complex disease that is not determined by only one or a few genes. Therefore, the present study integrated three pancreatic cancer microarray datasets from different studies, using bioinformatics analysis. Consequently, 226 DEGs were identified, which included 179 upregulated and 47 downregulated genes. Function and pathway enrichment analyses were conducted, and the results indicated that these genes primarily participated in ECM processes. Furthermore, a PPI network was constructed in order to determine the role of the hub genes, and MCODE was used to determine the interactions of these genes with several pathways. The results identified 24 hub genes among the 226 DEGs that were associated with ECM regulation. The ECM is a 3-dimensional non-cellular structure, which serves important roles in all tissues and biological processes and is primarily responsible for the maintenance of tissue homeostasis and the regulation of development (16). As a complex network, the ECM consists of extracellular macromolecules, including glycoproteins, collagen and enzymes, which provide structural support to the surrounding cells and segregate cells from one another.
This network sequesters a variety of cellular growth factors and regulates intercellular communication (17,18). Although the composition of the ECM varies between tissues and species, its common functions comprise cell differentiation, cell adhesion and cell-to-cell communication (19). Stem cells are able to actively detect the elasticity and rigidity of the surrounding ECM, and subsequently adjust gene expression, which determines the differentiation process (20). Cells bind to the ECM via intermediate actin filaments, a process regulated by specific cell adhesion molecules, including integrin, fibronectin and laminin (21). The ECM interacts biochemically with the surrounding cells by serving as a ligand to transmit signals, which mediate cell adhesion, differentiation, apoptosis, survival, proliferation and migration (22). Although the ECM is responsible for regulating normal tissue development and homeostasis, its dysregulation also contributes to neoplastic progression. The development of cancer is significantly affected by the microenvironment, and the ECM is considered the major component of the tumor-associated microenvironment (23). Although ECM remodelling is strictly regulated and mediated by the activities of specific enzymes, the dysregulation of these activities may account for the progression and development of certain disease conditions. This may lead to changes in the amount and composition of the ECM, which may significantly alter the biochemical properties of the ECM, promote the oncogenic effects of various pathways and deregulate cell behaviours during malignant transformation (24). Several studies have suggested that the ECM is important for the maintenance of the polarity and architecture of tissues, and that deregulated ECM promotes epithelial-mesenchymal transition, eventually facilitating tumor invasion (25). Furthermore, abnormal ECM function may promote tumor angiogenesis and lymphangiogenesis (26) and tumor-associated inflammation (27).
In summary, abnormal ECM may promote tumor progression through a number of mechanisms. The present study revealed that the specific DEGs identified primarily participated in ECM formation, as demonstrated by analysis of specific hub genes, including MMP2, MMP9, TIMP1, COL1A2, COL1A1 and COL3A1. MMPs are the primary enzymes that degrade the ECM and are synthesized by tumor cells (MMP7) or tumor stromal cells (MMP2 and MMP9). A previous study has demonstrated that MMPs serve important roles in cancer progression by increasing tumor cell migration, invasion, metastasis and angiogenesis (28). In pancreatic cancer, MMPs were correlated with prognosis, survival time, local invasion and distant metastasis (29). TIMPs are the endogenous inhibitors of MMPs, which bind to active and alternative sites of the activated MMP enzymes, prevent angiostatin and endostatin production, and serve a role in the promotion of tumor angiogenesis (30). The balance between MMPs and TIMPs may determine the ECM dynamics and affect tumor progression (31). COL1A1, COL1A2 and COL3A1 are genes that encode collagen proteins and are target genes of transforming growth factor-β (TGF-β)/SMAD3. Collagens mediate the tumor metastatic process via interactions with MMPs and may also serve significant roles in the immune response to cancer (32). TGF-β/SMAD3 may promote the expression of ECM-associated proteins (33). The aforementioned hub genes participated in ECM formation and may serve as potential targets for the treatment of pancreatic cancer.