
Identification of disease-associated pathways in pancreatic cancer by integrating genome-wide association study and gene expression data.

Jin Long, Zhe Liu, Xingda Wu, Yuanhong Xu, Chunlin Ge.

Abstract

To further understand the pathogenesis of pancreatic cancer (PC), the present study conducted pathway analysis based on genome-wide association study (GWAS) and gene expression data to predict genes that are associated with PC. GWAS data (accession no., pha002874.1) were downloaded from the National Center for Biotechnology Information (NCBI) database of Genotypes and Phenotypes, which included data concerning 1,896 patients with PC and 1,939 control individuals. Gene expression data [accession no., GSE23952; human pancreatic carcinoma Panc-1 transforming growth factor-β (TGF-β) treatment assay] were downloaded from the NCBI Gene Expression Omnibus. Gene set enrichment analysis was used to identify significant pathways in the GWAS or gene expression profiles. Meta-analysis was performed based on pathway analysis of the two data sources. In total, 58 and 280 pathways were identified to be significant in the GWAS and gene expression data, respectively, with 7 pathways significant in both data profiles. The hsa 04350 TGF-β signaling pathway had the smallest meta P-value. Other significant pathways in the two data sources were negative regulation of DNA-dependent transcription, the nucleolus, negative regulation of RNA metabolic process, the cellular defense response, exocytosis and galactosyltransferase activity. In the constructed gene-pathway network, 5 pathways were closely associated, the exceptions being the exocytosis and galactosyltransferase activity pathways. Among the 7 pathways, 11 key genes (2.9% out of a total of 380 genes) from the GWAS data and 43 genes (10.5% out of a total of 409 genes) from the gene expression data were differentially expressed. Only Abelson murine leukemia viral oncogene homolog 1 from the nucleolus pathway was significantly expressed in both data sources. 
Overall, the results of the present analysis provide possible factors for the occurrence of PC, and the identification of the pathways and genes associated with PC provides valuable data for investigating the pathogenesis of PC in future studies.

Keywords:  ABL1; GWAS; TGF-β signaling pathway; gene expression profile; meta-analysis; pancreatic cancer

Year:  2016        PMID: 27347177      PMCID: PMC4906788          DOI: 10.3892/ol.2016.4637

Source DB:  PubMed          Journal:  Oncol Lett        ISSN: 1792-1074            Impact factor:   2.967


Introduction

Pancreatic cancer (PC; OMIM, 260350) is a highly lethal disease; it is one of the deadliest cancers worldwide, with a mortality rate of 99% and a 5-year relative survival rate of <5% (1), and almost all patients with PC develop metastasis. The etiology of PC remains elusive; smoking is the best known risk factor (2). Advances in molecular biology have greatly improved understanding of the pathogenesis of PC. The development of PC requires the transformation of normal pancreatic cells to precursor pancreatic intraepithelial neoplasia, which is associated with gene mutations, continuous alterations in nuclei, loss of polarity and alterations to the architecture of cells (3). In addition, chromosome abnormalities, which usually present as a loss or gain of alleles in various chromosomes, are involved in the pathophysiology and development of PC (4). It has been reported that the development and progression of PC is caused by activation of oncogenes and inactivation of tumor suppressor genes, as well as deregulation of numerous signaling pathways, including the epidermal growth factor receptor, protein kinase B and nuclear factor kappa B pathways (5). In addition, the Hedgehog signaling pathway, an essential pathway during embryonic pancreatic development, is involved in several types of cancer and may be an important mediator in human PC (6). Previous studies indicate that PC has a complex genomic landscape with frequent copy number alterations and point mutations (7). It has been demonstrated that commonly mutated genes in PC include Kirsten rat sarcoma viral oncogene homolog (K-ras; 74–100%), p16INK4a (≤98%), p53 (43–76%), deleted in pancreatic cancer, locus 4 (~50%), human epidermal growth factor receptor (HER)-2/neu (~65%) and Fragile Histidine Triad (~70%) (8–12); K-ras and HER-2/neu are proto-oncogenes, while all the other genes are tumor suppressor genes (7). 
Through comprehensive genetic analysis of 24 samples of PC, Jones et al (13) demonstrated that PC contained an average of 63 genetic alterations, the majority of which were point mutations, and these alterations defined a core set of 12 cellular signaling pathways, which were genetically altered in 67–100% of PC tumors. Additionally, Biankin et al (14) defined 16 significant mutated genes, reaffirmed known mutations [K-RAS, tumor protein p53, cyclin-dependent kinase inhibitor 2A, SMAD4, myeloid/lymphoid or mixed-lineage leukemia 3, transforming growth factor, beta receptor II, AT-rich interaction domain (ARID) 1A and splicing factor 3b subunit 1] and uncovered novel mutated genes, including genes involved in chromatin modification (enhancer of polycomb homolog 1 and ARID2), DNA damage repair (ATM serine/threonine kinase) and other mechanisms in axon guidance (zinc finger imprinted 2, mitogen-activated protein kinase kinase 4, sodium leak channel, non-selective, solute carrier family 16 member 4 and MAGEA6). In a humanized genetically modified mouse model of pancreatic ductal adenocarcinoma, which accounts for >90% of PC, Rosenfeldt et al (15) revealed that loss of autophagy did not block tumor progression, but actually accelerated tumor onset. Genome-wide association study (GWAS) aims to detect variants at genomic loci associated with complex traits in a population and, in particular, detect associations between common single-nucleotide polymorphisms (SNPs) and common diseases (16). Gene expression is another source of gene data for investigating complex genetic disease, which describes the type and abundance of gene expression in specific cells or tissues under certain conditions (17). Gene Expression Omnibus Series (GSE) dataset GSE 23952 [human pancreatic carcinoma Panc-1 transforming growth factor-β (TGF-β) treatment assay] was used by the present study, which has also been used by previous studies. 
Kato et al (18) analyzed two datasets (GSE 17708 and 23952) to identify genes encoding secreted proteins on GenePattern. Xu and Liu (19) used several datasets to study the aberrant expression of cytoplasmic polyadenylation element binding protein 4. Additionally, Gröger (20) developed a comprehensive meta-analysis combining 24 epithelial mesenchymal transition (EMT) datasets, including GSE 23952, to investigate the effectors of EMT. However, none of these studies focussed on pathway analysis in PC. The present study combined GWAS and gene expression data to identify important pathways for the pathogenesis of PC. Gene set enrichment analysis (GSEA) was used to identify over-represented pathways in GWAS or gene expression profiles. Meta-analysis was performed to select significant pathways in GWAS and gene expression data.

Materials and methods

GWAS and gene expression profile

GWAS data (accession no., pha002874.1) were downloaded from the National Center for Biotechnology Information (NCBI) database of Genotypes and Phenotypes (www.ncbi.nlm.nih.gov/projects/SNP/gViewer/gView.cgi?aid=2874). The data were obtained by genotyping with the Illumina Hap 500 Infinium genotyping assay (Illumina, Inc., San Diego, CA, USA) on 1,896 PC patients and 1,939 control individuals drawn from 12 prospective cohorts plus one hospital-based case-control study (21). A total of 522,293 SNPs were used in this analysis. For gene expression analysis, the present study utilized the microarray data set submitted by Maupin et al (22). The data were downloaded from the NCBI Gene Expression Omnibus (www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS4106; accession no., GSE23952; human pancreatic carcinoma Panc-1 TGF-β treatment assay). In the study by Maupin et al, the human pancreatic adenocarcinoma Panc-1 cell line was treated with TGF-β to induce EMT, and the experiment was repeated three times. Samples were assayed using the Affymetrix Human Genome U133 Plus 2.0 Array (Affymetrix, Santa Clara, CA, USA), and probes with the largest differential expression value were selected. A total of 54,623 probe-sets were obtained following normalization. Differentially expressed genes (DEGs) derived from the probes were used for further analysis.

GWAS data analysis: Mapping SNPs to genes

SNPs in the GWAS data were mapped to their corresponding genes. The SNPs were annotated based on hg19 (hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/). Genes were assigned according to the location of the SNP, in the following order of priority: exon region > intron region > 5′ untranslated region (UTR) > 3′ UTR. If no gene was identified at a locus, the closest gene flanking the SNP was used. If more than one SNP was mapped to a gene, the smallest P-value among those SNPs in the GWAS data was assigned to the gene.
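
The mapping rule described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the input layout (one `(snp, gene, region)` tuple per annotation), the region labels, the `"nearest"` fallback label and the function name `map_snps_to_genes` are all assumptions made for the example.

```python
# Assumed region labels; lower rank means higher priority.
# "nearest" is a hypothetical label for the closest-gene fallback.
REGION_PRIORITY = {"exon": 0, "intron": 1, "5UTR": 2, "3UTR": 3, "nearest": 4}


def map_snps_to_genes(annotations, snp_pvalues):
    """Return {gene: smallest P-value among the SNPs assigned to that gene}.

    annotations: iterable of (snp_id, gene, region) tuples (hypothetical layout).
    snp_pvalues: dict mapping snp_id to its GWAS P-value.
    """
    # Step 1: for each SNP keep the gene whose region has the highest priority.
    best_annotation = {}
    for snp, gene, region in annotations:
        rank = REGION_PRIORITY[region]
        current = best_annotation.get(snp)
        if current is None or rank < current[0]:
            best_annotation[snp] = (rank, gene)

    # Step 2: for each gene keep the smallest P-value of its assigned SNPs.
    gene_pvalues = {}
    for snp, (_, gene) in best_annotation.items():
        p = snp_pvalues[snp]
        if gene not in gene_pvalues or p < gene_pvalues[gene]:
            gene_pvalues[gene] = p
    return gene_pvalues


# Made-up example data, for illustration only.
example_annotations = [
    ("rs1", "GENE_A", "exon"),
    ("rs1", "GENE_B", "intron"),   # lower priority than the exon hit
    ("rs2", "GENE_A", "intron"),
    ("rs3", "GENE_C", "nearest"),  # no overlapping gene, closest gene used
]
example_pvalues = {"rs1": 0.04, "rs2": 0.002, "rs3": 0.3}
print(map_snps_to_genes(example_annotations, example_pvalues))
```

In this toy input, rs1 is assigned to GENE_A because the exon annotation outranks the intron annotation, and GENE_A then takes the smaller of its two SNP P-values.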

GSEA pathway analysis

Pathway analysis was performed using GSEA on the GWAS and gene expression profile data. GSEA, which is provided by the Broad Institute (www.broadinstitute.org/gsea/index.jsp), statistically tests whether the members of a predefined gene set are randomly distributed throughout a ranked list of genes or cluster toward the top of the list (23–25). Gene sets were taken from Gene Ontology (GO) terms (c5.all.v4.0.symbols.gmt) in the Molecular Signatures Database (www.broadinstitute.org/msigdb) (23) and from Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (www.genome.jp/kegg/) (26).
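
To make the idea of genes clustering toward the top of a ranked list concrete, the following is a minimal sketch of the weighted running-sum enrichment score used by GSEA. It is not the Broad Institute implementation and omits the permutation step that produces P-values; the function name `enrichment_score` and the toy inputs are assumptions for illustration.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted Kolmogorov-Smirnov-like running sum over a ranked gene list.

    ranked_genes: gene symbols ordered from most to least associated.
    scores: ranking metric for each gene, in the same order.
    gene_set: set of gene symbols forming the pathway.
    p: weighting exponent (p=0 gives the unweighted statistic).
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_total = len(ranked_genes)
    n_hits = sum(hits)
    if n_hits == 0 or n_hits == n_total:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    # Normalising constant: total weight carried by the genes in the set.
    hit_weight = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if hit_weight == 0:
        raise ValueError("all genes in the set have a zero score")
    miss_step = 1.0 / (n_total - n_hits)

    running = 0.0
    best = 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** p / hit_weight  # step up on a set member
        else:
            running -= miss_step                 # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best


# Toy example: the set {"A", "B"} sits at the top of the list.
print(enrichment_score(["A", "B", "C", "D"], [4, 3, 2, 1], {"A", "B"}))
```

A score near +1 means the set members are concentrated at the top of the ranking, and a score near −1 means they are concentrated at the bottom.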

Meta-analysis

Pathways that were significant in both the GWAS and the gene expression data were selected for meta-analysis. Meta P-values were obtained by Fisher's combined probability test (27). The combined statistic was calculated by summing −2ln(P-value) over the two tests for a pathway. The statistic was then compared with a χ2 distribution with 4 degrees of freedom (2 per combined test) to obtain the meta P-value (28). P<0.05 was considered to indicate a statistically significant difference. All statistical tests were performed using Perl version 5.24.0 (www.perl.org/).
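
Fisher's combined probability test can be sketched in a few lines. The study reports using Perl; the Python version below is only an illustration, and the function name `fisher_combined_p` is an assumption. Because the χ2 distribution has an even number of degrees of freedom (2 per P-value), its upper tail has a closed form and no statistics library is needed.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis, where k = len(p_values).
    For even degrees of freedom the upper tail is
        exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    term = 1.0   # (X/2)^i / i!, starting at i = 0
    total = 1.0
    for i in range(1, len(p_values)):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Rounded pathway P-values for hsa 04350 as listed in Tables II and III.
meta_p = fisher_combined_p([0.033, 0.001])
print(f"{meta_p:.5f}")  # approximately 0.00037
```

With the rounded inputs 0.033 (GWAS) and 0.001 (gene expression), the result is about 0.00037, which agrees with the meta P-value listed for the TGF-β signaling pathway in Table I.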

Results

Meta-analysis of over-represented pathways in PC

The GWAS data contained 522,293 SNP loci, which were mapped to 18,910 genes. In total, 58 over-represented pathways were selected by GSEA pathway analysis. For the gene expression data, 54,623 probes were obtained and mapped to 31,620 genes. Subsequently, 230 pathways were identified by GSEA (P<0.05). A meta-analysis of over-represented pathways was performed to identify statistically significant pathways in the combined GWAS and gene expression PC data. A total of 7 over-represented pathways from the GWAS and gene expression data were identified (Fig. 1; Table I), of which 6 pathways were GO terms, and 1 pathway (KEGG ID, hsa 04350) was from the KEGG database (Fig. 2). The TGF-β signaling pathway (hsa 04350) had the smallest meta P-value. In this signaling pathway, transcription factor Dp-1 (TFDP1), activin A receptor type (ACVR) 2A and v-myc avian myelocytomatosis viral oncogene homolog (MYC) were differentially expressed in GWAS data, while noggin (NOG), inhibitor of DNA binding 1, HLH protein (ID1), left-right determination factor 1 (LEFTY1) and ACVR1 were differentially expressed in gene expression data. The other significant pathways in the two data sources were as follows: Negative regulation of DNA-dependent transcription (GO: 0045892); the nucleolus (GO: 0005730); negative regulation of RNA metabolic process (GO: 0051253); the cellular defense response (GO: 0006968); exocytosis (GO: 0006887); and galactosyltransferase activity (GO: 0008378).

Figure 1.

Gene-pathway network. Red arrows indicate signaling pathways; purple circles indicate genes differentially expressed in GWAS data; yellow circles indicate genes differentially expressed in gene expression profile; blue circles indicate normal genes in GWAS or gene expression data. GWAS, genome-wide association study.

Table I.

A total of 7 pathways were determined by meta-analysis using combined genome-wide association study and gene expression data.

Pathway | Term | Meta P-value
Hsa 04350 | TGF-β signaling pathway | 0.00037
GO: 0005730 | Nucleolus | 0.00045
GO: 0006968 | Cellular defense response | 0.00145
GO: 0008378 | Galactosyltransferase activity | 0.00182
GO: 0006887 | Exocytosis | 0.00219
GO: 0051253 | Negative regulation of RNA metabolic process | 0.00331
GO: 0045892 | Negative regulation of DNA-dependent transcription | 0.00333

TGF-β, transforming growth factor-β.

Figure 2.

TGF-β signaling pathway. Pink boxes indicate genes differentially expressed in genome-wide association study data; yellow boxes indicate genes differentially expressed in gene expression data; green boxes indicate normal genes in TGF-β signaling pathways. TGF-β, transforming growth factor-β.

According to the constructed gene-pathway network, exocytosis (GO: 0006887) and galactosyltransferase activity (GO: 0008378) had no connection with the other pathways, while the other 5 pathways were closely associated with each other (Fig. 1). As shown in the network, the TGF-β signaling pathway (hsa 04350) and nucleolus pathway (GO: 0005730) were closely connected, with numerous genes overlapping each other, as well as negative regulation of RNA metabolic process pathway (GO: 0051253) and negative regulation of DNA-dependent transcription pathway (GO: 0045892). Additionally, 4 pathways, including negative regulation of RNA metabolic process (GO: 0051253), negative regulation of DNA-dependent transcription (GO: 0045892), the nucleolus (GO: 0005730) and TGF-β signaling pathway (hsa 04350), were connected via recombination signal binding protein for immunoglobulin kappa J region (RBPJ) and MDM2 proto-oncogene (MDM2), while cellular defense response (GO: 0006968), negative regulation of RNA metabolic process (GO: 0051253) and negative regulation of DNA-dependent transcription (GO: 0045892) were associated via SMAD2, SMAD3, SMAD4, bone morphogenetic protein (BMP) 2 and follistatin.
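
The connectivity pattern described above can be checked with a small graph sketch in which two pathways are linked whenever they share a gene. The membership lists below are deliberately partial: they contain only genes named in the text and in Tables II and III (with FST used as the symbol for follistatin), whereas the published network was built from the full gene sets. The function name `pathway_components` is an assumption.

```python
# Partial, illustrative membership taken from genes named in this article.
MEMBERSHIP = {
    "hsa04350": {"RBPJ", "MDM2", "TFDP1", "MYC"},
    "GO:0005730": {"RBPJ", "MDM2", "ABL1"},
    "GO:0051253": {"RBPJ", "MDM2", "SMAD2", "SMAD3", "SMAD4", "BMP2", "FST"},
    "GO:0045892": {"RBPJ", "MDM2", "SMAD2", "SMAD3", "SMAD4", "BMP2", "FST"},
    "GO:0006968": {"SMAD2", "SMAD3", "SMAD4", "BMP2", "FST"},
    "GO:0006887": {"NLGN1", "RIMS1"},
    "GO:0008378": {"B3GALT4", "B4GALT5", "B3GALNT1"},
}


def pathway_components(membership):
    """Group pathways into connected components linked by shared genes."""
    pathways = list(membership)
    neighbours = {p: set() for p in pathways}
    for i, first in enumerate(pathways):
        for second in pathways[i + 1:]:
            if membership[first] & membership[second]:  # at least one shared gene
                neighbours[first].add(second)
                neighbours[second].add(first)

    components = []
    visited = set()
    for start in pathways:
        if start in visited:
            continue
        component = set()
        stack = [start]
        while stack:  # depth-first traversal from the start pathway
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        visited |= component
        components.append(component)
    return components


for component in pathway_components(MEMBERSHIP):
    print(sorted(component))
```

Under these partial memberships the sketch yields one component of five pathways, with exocytosis (GO: 0006887) and galactosyltransferase activity (GO: 0008378) each left on their own.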

Identification of key genes for PC pathways

To identify specific genes within the 7 pathways identified by the present study, individual genes with P<0.05 were selected from GWAS and gene expression data. Among the 7 pathways, 11 key genes (2.9% out of a total of 380 genes) from GWAS data were differentially expressed (Table II), including neuroligin 1, regulating synaptic membrane exocytosis 1, protein phosphatase 2 regulatory subunit 2C, BMP6, zinc finger protein 238, Kruppel-like factor 4, MYB binding protein (P160) 1a, ABL proto-oncogene 1, non-receptor tyrosine kinase (ABL1), ribosomal protein L11 and topoisomerase (DNA) I. For the gene expression data, 43 genes (10.5% out of a total of 409 genes) were identified as being significantly expressed (Table III). Among these significant genes, only ABL1 from the nucleolus pathway (GO: 0005730) was significantly expressed in both GWAS [P=0.002085; P-value < min (0.5*N), where N = gene number] and gene expression data [value=1.7271; value

Table II.

DEGs from genome-wide association study data of each pathway identified by meta-analysis.

Pathway | DEGs | P-value
GO: 0008378 | - | 0.005
GO: 0051253 | ZNF238, KLF4 | 0.031
Hsa 04350 | PPP2R2C, BMP6, TFDP1, ACVR2A, PPP2R2A, BMP7, MYC | 0.033
GO: 0045892 | ZNF238, KLF4 | 0.034
GO: 0006968 | - | 0.037
GO: 0006887 | NLGN1, RIMS1 | 0.039
GO: 0005730 | MYBBP1A, ABL1, RPL11, TOP1 | 0.040

DEGs, differentially expressed genes; -, no genes differentially expressed in the pathway.

Table III.

DEGs from gene expression profiles of each pathway identified by meta-analysis.

Pathway | DEGs | P-value
Hsa 04350 | NOG, ID1, LEFTY2, INHBB, LEFTY1, SMAD7, COMP, RBL2, ID2, SMURF2, ACVR1 | 0.001
GO: 0005730 | CD3EAP, NOL4, IMP4, DDX11, ZNF259, EMG1, RRP9, ABL1, LOC81691, DKC1, RPP40, UTP20, NOC4L, NOP2 | 0.001
GO: 0006968 | KLRC3, KLRC4, FOSL1, LGALS3BP, MICB | 0.004
GO: 0006887 | RAB26, SYT1, RABEPK, YKT6, SCIN | 0.006
GO: 0045892 | SNAI2, RYBP, GLIS3, IRF2, HEXIM2, KLF10, CIR1, BCOR, ZNF177, ZNF593, ARID4A | 0.011
GO: 0051253 | SNAI2, RYBP, GLIS3, IRF2, HEXIM2, KLF10, CIR1, BCOR, ZNF177, ZNF593, ARID4A | 0.012
GO: 0008378 | B3GALT4, B4GALT5, B3GALNT1 | 0.038

DEGs, differentially expressed genes.

Discussion

According to meta-analysis based on GSEA pathway analysis, 7 pathways were identified by the present study to be significant in both the GWAS and gene expression profiles. These pathways were the TGF-β signaling pathway (hsa 04350), negative regulation of DNA-dependent transcription (GO: 0045892), the nucleolus (GO: 0005730), negative regulation of RNA metabolic process (GO: 0051253), the cellular defense response (GO: 0006968), exocytosis (GO: 0006887) and galactosyltransferase activity (GO: 0008378). The TGF-β signaling pathway had the smallest meta P-value. In the constructed gene-pathway network, 5 pathways were closely connected, the exceptions being the exocytosis and galactosyltransferase activity pathways. Among the 7 pathways, 11 key genes (2.9% out of a total of 380 genes) from GWAS data and 43 genes (10.5% out of a total of 409 genes) from gene expression data were differentially expressed. Only ABL1 from the nucleolus pathway was significantly expressed in both data sources. In total, 3 genes (TFDP1, ACVR2A and MYC) from GWAS data and 4 genes (NOG, ID1, LEFTY1 and ACVR1) from the gene expression profile were differentially expressed in the TGF-β signaling pathway. TGF-β family members include TGF-βs, activins and BMPs, which are structurally related secreted cytokines (29). In addition, the TGF-β family regulates numerous cellular processes, including cell proliferation, recognition, differentiation and apoptosis (30). MYC, an over-expressed proto-oncogene in the TGF-β signaling pathway, encodes a DNA-binding factor that activates or represses transcription (31). Via this mechanism, MYC regulates the expression of numerous target genes, which control key cellular functions, such as cell growth and cell cycle progression (32). Therefore, deregulated MYC expression, resulting from various types of genetic alterations, leads to constitutive MYC activity in a variety of cancers and promotes oncogenesis (33). 
If PC cells are stably transfected with a dominant-negative mutant of MYC (c-Myc), their proliferation is markedly inhibited (34). Grippo et al (35) studied myc-associated acinar-to-ductal metaplasia in Ela-c-myc transgenic rats, and demonstrated that c-myc, which is associated with human pancreatic neoplasms, was sufficient to induce acinar hyperplasia. Additionally, Köenig et al (36) revealed a novel mechanism regulating cell growth in PC: Serum promotes the occurrence of PC through the induction of proliferative NFAT/c-Myc axis by impaired c-Myc expression and reduces tumor growth upon nuclear factor of activated T-cells depletion in vitro and in vivo. TFDP1 is the first member of the E2F transcription factor family that regulates the expression of various cellular promoters, particularly those involved in the cell cycle (37). Abba et al (38) identified that TFDP1 exhibits the highest frequency of amplification affecting primary breast cancer samples. In addition, meta-analysis reveals a strong association between a high expression of TFDP1 or NOG and decreased overall survival in patients with breast cancer (39). Furthermore, overexpression of TFDP1 may contribute to the progression of certain hepatocellular carcinomas by promoting the growth of tumor cells (40). Additionally, ID1, a gene associated with cell growth, senescence, differentiation and angiogenesis, participates in numerous tumor processes (41,42). Other genes, including ACVR2A, LEFTY1 and ACVR1, are mostly associated with pituitary tumors (43), left-right axis malformations (44) and fibrodysplasia ossificans progressiva (45), respectively. These genes, including TFDP1, NOG, ID1, ACVR2A, LEFTY1 and ACVR1, were identified as being associated with PC in the present study, an association that has rarely been reported in PC pathogenesis. Thus, further study is required to verify these genes in a PC context. 
According to the constructed gene-signal pathway network, 5 pathways were demonstrated to be closely connected in the present study, the exceptions being the exocytosis and galactosyltransferase activity pathways. RBPJ and MDM2 were bridges that connected 4 of these pathways. Masui et al (46) demonstrated that pancreas specific transcription factor, 1a (PTF1A), a basic helix-loop-helix transcription factor required for pancreatic development, interacts with RBPJ within a stable trimeric DNA-binding complex, PTF1, during early pancreatic development in mice. Introduction of a PTF1A mutant, which is unable to bind RBPJ, truncated pancreatic development at an immature stage, and acini or islets were not formed. MDM2 is an E3 ubiquitin ligase that targets the tumor suppressor p53 protein for proteasomal degradation. p53 induces cell cycle arrest or apoptosis in response to cellular stress (47). The MDM2 oncoprotein promotes cell survival and cell cycle progression by inhibiting the p53 tumor suppressor protein (48). ABL1 (OMIM entry, *189980) was first identified as an oncogene from the ABL family of nonreceptor tyrosine kinases, and it transduces diverse extracellular signals to protein networks that control proliferation, survival, migration and invasion of cells (49). ABL1 encodes a cytoplasmic and nuclear protein tyrosine kinase, which is involved in cell differentiation, division and adhesion, and the stress response (50). Alterations of ABL1 by chromosomal rearrangement or viral transduction lead to malignant transformation (51). The overexpression of microRNA (miR)-203 leads to poor survival in cancer, due to its oncogenic function; however, miR-203 exhibits tumor-suppressor qualities in PC by inhibiting the expression of ABL1 and BCR-ABL1, resulting in an inhibition of cell proliferation (52). Overall, through the combined pathway analysis of GWAS and gene expression data, 7 pathways were demonstrated to be significant by the meta-analysis performed in the present study. 
Among all the significantly expressed genes, only one gene, ABL1, was differentially expressed in both data sources. The present study identified MYC as the most probable gene associated with the TGF-β signaling pathway in PC. In conclusion, the results of the present analysis provide possible factors for the occurrence of PC, and the identification of pathways and genes provides valuable data for investigating the pathogenesis of PC. However, bioinformatics analysis generally lacks experimental support, so additional study is required to verify the results of the present study.
  49 in total

Review 1.  Mechanisms of TGF-beta signaling from cell membrane to the nucleus.

Authors:  Yigong Shi; Joan Massagué
Journal:  Cell       Date:  2003-06-13       Impact factor: 41.582

Review 2.  Pancreatic cancer: pathogenesis, prevention and treatment.

Authors:  Fazlul H Sarkar; Sanjeev Banerjee; Yiwei Li
Journal:  Toxicol Appl Pharmacol       Date:  2006-11-11       Impact factor: 4.219

3.  The FHIT gene is expressed in pancreatic ductular cells and is altered in pancreatic cancers.

Authors:  C Sorio; A Baron; S Orlandini; G Zamboni; P Pederzoli; K Huebner; A Scarpa
Journal:  Cancer Res       Date:  1999-03-15       Impact factor: 12.701

4.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

Review 5.  MYC on the path to cancer.

Authors:  Chi V Dang
Journal:  Cell       Date:  2012-03-30       Impact factor: 41.582

6.  Mutational analysis of activin/transforming growth factor-beta type I and type II receptor kinases in human pituitary tumors.

Authors:  F H D'Abronzo; B Swearingen; A Klibanski; J M Alexander
Journal:  J Clin Endocrinol Metab       Date:  1999-05       Impact factor: 5.958

7.  NFAT-induced histone acetylation relay switch promotes c-Myc-dependent growth in pancreatic cancer cells.

Authors:  Alexander Köenig; Thomas Linhart; Katrin Schlengemann; Kristina Reutlinger; Jessica Wegele; Guido Adler; Garima Singh; Leonie Hofmann; Steffen Kunsch; Thomas Büch; Eva Schäfer; Thomas M Gress; Martin E Fernandez-Zapico; Volker Ellenrieder
Journal:  Gastroenterology       Date:  2009-11-06       Impact factor: 22.682

8.  Chromosome abnormalities in pancreatic adenocarcinoma.

Authors:  C A Griffin; R H Hruban; P P Long; L A Morsberger; F Douna-Issa; C J Yeo
Journal:  Genes Chromosomes Cancer       Date:  1994-02       Impact factor: 5.006

Review 9.  Role of ABL family kinases in cancer: from leukaemia to solid tumours.

Authors:  Emileigh K Greuber; Pameeka Smith-Pearson; Jun Wang; Ann Marie Pendergast
Journal:  Nat Rev Cancer       Date:  2013-07-11       Impact factor: 60.716

10.  Identifying consensus disease pathways in Parkinson's disease using an integrative systems biology approach.

Authors:  Yvonne J K Edwards; Gary W Beecham; William K Scott; Sawsan Khuri; Guney Bademci; Demet Tekin; Eden R Martin; Zhijie Jiang; Deborah C Mash; Jarlath ffrench-Mullen; Margaret A Pericak-Vance; Nicholas Tsinoremas; Jeffery M Vance
Journal:  PLoS One       Date:  2011-02-22       Impact factor: 3.240
