Gastric Cancer Associated Genes Identified by an Integrative Analysis of Gene Expression Data.

Bing Jiang1, Shuwen Li2, Zhi Jiang3, Ping Shao1.   

Abstract

Gastric cancer is one of the most severe complex diseases, with high morbidity and mortality worldwide. The molecular mechanisms and risk factors for this disease are still not clear because of the heterogeneity caused by different genetic and environmental factors. As more and more expression data accumulate, integrative analysis of these data can help to understand the complexity of gastric cancer and to identify consensus players in this heterogeneous cancer. In the present work, we screened published gene expression data and analyzed them with an integrative tool, combined with pathway and gene ontology enrichment analysis. We identified several consensus differentially expressed genes, which were further examined by literature mining; finally, two genes, immunoglobulin J chain and C-X-C motif chemokine ligand 17, were identified as novel gastric cancer associated genes. Experimental validation is proposed to further confirm this finding.

Year:  2017        PMID: 28232943      PMCID: PMC5292384          DOI: 10.1155/2017/7259097

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Gastric cancer (GC) is one of the most severe cancers in the world, with high incidence and a low survival rate. According to the 2012 global cancer statistics report, GC is the fifth most common cancer in the world and causes more than seven hundred thousand deaths each year [1]. The number of GC patients among men is usually about twice that among women, and Eastern Asia, especially Korea, Japan, and China, has the highest incidence rate. Although reports show that the age-standardized incidence rate of gastric cancer has been decreasing in Japan and Korea in the last few years [2, 3], the number of new cases is still increasing owing to the aging of the population. The pathogenesis of gastric cancer is very complex and remains unclear. Recent basic studies mainly focus on three factors: environmental factors, Helicobacter pylori (H. pylori) infection, and gene expression dysregulation [4, 5]. Previous studies have demonstrated that an unhealthy lifestyle, such as an excessive diet, can raise the risk of gastric cancer [5-7]. Processed meat intake increases the risk of gastric non-cardia cancer in H. pylori antibody-positive individuals, whereas consumption of fresh fruits and vegetables protects against GC. At the molecular level, several host genetic factors, such as IL-1β, IL-10, TFF2, and CDH1, might play a key role in GC [8-10]. As research in this field deepens, the volume of research data keeps growing. Hundreds of gene expression profiles and diagnostic targets have been uploaded to various gene expression databases. These data can be further integrated to understand the complexity of diseases, including cancer heterogeneity, high-level consensus [11-13], biomarker discovery [14, 15], and the key players in cancer genesis and progression [16]. In this study, we used a meta-analysis approach to analyze multiple transcriptomic datasets.
We aim to integrate gene expression data collected from GC patients and normal controls to identify robust candidate genes, pathways, and functions, laying a foundation for personalized treatment of gastric cancer. The method used here is the INMEX (integrative meta-analysis of expression data) program [17]. Data processing and screening were performed to ensure that all datasets uploaded into the program were in a consistent format. Because microarray data contain outliers and variation, a combining rank orders algorithm based on the RankProd package [18] was used to carry out the meta-analysis.

2. Materials and Methods

The pipeline of the whole analysis in the present study is shown in Figure 1. We first extracted microarray gene expression data from the GEO database, then integrated and analyzed the expression data with the meta-analysis tool INMEX, and finally screened and validated the meta-analysis results with literature analysis and bioinformatics functional analysis.
Figure 1

The pipeline of the whole analysis in this study.

2.1. Dataset Collection and Data Screening

We used the keywords “gastric cancer,” with two filters, (a) organism: Homo sapiens and (b) type: expression profiling by array, to search for gene expression profiles in the Gene Expression Omnibus (GEO) database. We screened the search results using four inclusion criteria: (1) datasets published after 2010; (2) case-control studies; (3) sample numbers more than 20; (4) high similarity in sample background information (i.e., sources, patients' race and location, disease status, and platforms). Datasets meeting these criteria were selected for further analysis.
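The first three inclusion criteria are mechanical and can be expressed as a simple filter. The following Python sketch is illustrative only: the `SeriesRecord` type, its field names, and the example records are hypothetical, and treating the year 2010 as inclusive is an assumption. The fourth criterion (similar sample background) needs manual review and is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesRecord:
    """Hypothetical summary of one GEO series; field names are illustrative."""
    accession: str
    year: int
    case_control: bool
    n_samples: int


def passes_screening(rec: SeriesRecord, min_year: int = 2010, min_samples: int = 20) -> bool:
    # Criterion 1: publication year (2010 treated as inclusive, an assumption).
    # Criterion 2: case-control design.
    # Criterion 3: more than `min_samples` samples.
    return rec.year >= min_year and rec.case_control and rec.n_samples > min_samples


# Invented records for illustration; these are not real GEO accessions.
records = [
    SeriesRecord("GSE_A", 2014, True, 48),
    SeriesRecord("GSE_B", 2008, True, 60),   # too old
    SeriesRecord("GSE_C", 2015, False, 35),  # not case-control
    SeriesRecord("GSE_D", 2012, True, 12),   # too few samples
]

selected = [r.accession for r in records if passes_screening(r)]
print(selected)
```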

2.2. Meta-Analysis for Selected Datasets

Based on the expression data collected from each qualified microarray study, we conducted a global meta-analysis to identify differentially expressed (DE) genes in gastric cancer. We selected a web-based tool named INMEX (integrative meta-analysis of expression data, http://www.networkanalyst.ca/) for the meta-analysis. We first uploaded the normalized gene expression datasets into INMEX. We then processed and annotated the datasets to bring the data format and class labels into a consistent style. After the integrity check, we selected the combining rank orders method, which is based on the RankProd package, to carry out the meta-analysis. The number of permutations in this method was set to 20.
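The idea behind combining rank orders can be illustrated with a simplified rank product calculation. The sketch below is not the RankProd package or the INMEX implementation: it ranks genes by an invented per-dataset log fold change, takes the geometric mean of each gene's ranks across datasets, and estimates a p value by shuffling ranks within each dataset. All function names and input values are hypothetical; only the default of 20 permutations comes from the text.

```python
import math
import random


def rank_descending(values):
    """Ranks with 1 = largest value; ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def rank_product(rank_lists):
    """Geometric mean of each gene's ranks across datasets."""
    k = len(rank_lists)
    n_genes = len(rank_lists[0])
    return [
        math.exp(sum(math.log(ranks[g]) for ranks in rank_lists) / k)
        for g in range(n_genes)
    ]


def permutation_p_values(rank_lists, n_perm=20, seed=0):
    """Fraction of permuted rank products that are <= the observed one."""
    rng = random.Random(seed)
    observed = rank_product(rank_lists)
    null = []
    for _ in range(n_perm):
        # Shuffle gene labels independently within each dataset.
        shuffled = [rng.sample(ranks, len(ranks)) for ranks in rank_lists]
        null.extend(rank_product(shuffled))
    return [sum(v <= rp for v in null) / len(null) for rp in observed]


# Invented log fold changes: 3 datasets (rows) x 5 genes (columns).
fold_changes = [
    [2.1, 0.3, -1.5, 0.0, 1.2],
    [1.8, -0.2, -2.0, 0.1, 0.9],
    [2.5, 0.4, -1.1, -0.3, 1.0],
]
rank_lists = [rank_descending(fc) for fc in fold_changes]
rp = rank_product(rank_lists)
p_values = permutation_p_values(rank_lists)
```

A small rank product means a gene is consistently near the top of the list in every dataset. Ranking in ascending order instead would give the corresponding score for downregulated genes.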

2.3. Functional Enrichment Analysis of DE Genes

Functional enrichment analysis of these DE genes was further performed with the INMEX program using two approaches: Gene Ontology (GO) annotation and pathway analysis. In the GO annotation, we set a p value threshold of 0.05 to identify significantly enriched terms. In the pathway analysis, the KEGG pathway database was used, and a p value threshold of 0.05 was likewise set to identify significantly enriched pathways.
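Enrichment of a GO term or pathway among DE genes is commonly assessed with a one-sided hypergeometric test. The text does not state which statistic INMEX applies, so the following Python sketch assumes that test purely for illustration; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_term, n_de, n_overlap):
    """P(X >= n_overlap) when n_de genes are drawn without replacement
    from n_background genes, of which n_in_term belong to the term."""
    upper = min(n_in_term, n_de)
    total = comb(n_background, n_de)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 4 in the term, 3 DE genes, 2 of them in the term.
p_toy = hypergeom_enrichment_p(10, 4, 3, 2)

# A term would be called enriched here if its p value is below the 0.05 threshold.
is_enriched = p_toy < 0.05
print(p_toy, is_enriched)
```

In practice the p values from many terms would also be adjusted for multiple testing before applying the threshold.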

3. Results

3.1. Characteristics of Datasets Included in This Meta-Analysis

The dataset selection strategy and screening results are presented in Figure 2. A search of GEO datasets retrieved a total of 1722 studies. Of these, 1618 irrelevant studies were excluded: 1605 were not expression profiling by microarray technologies and 13 were animal studies. The remaining 104 studies were included for full-text review. Studies without case-control matches were then excluded. Because of platform limitations, we further excluded studies whose microarray platforms are not available in the INMEX program. After several rounds of screening, a final list of 3 microarray datasets [19, 20] was selected for meta-analysis.
Figure 2

Datasets selection strategy and results.

These 3 datasets (GSE79973, GSE19826, and GSE49051) contain a total of 25 cases and 25 controls. The numbers of cases and controls in each dataset are well matched. All the datasets were collected from Chinese hospitals, and the sample sources are consistent. Detailed information on these 3 datasets is listed in Table 1.
Table 1

Datasets selected in this meta-analysis.

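| Accession/ID | Platform | GC | Control | Materials | Year | Race | Region |
|---|---|---|---|---|---|---|---|
| GSE79973 | GPL570 | n = 10 | n = 10 | Gastric tissue | 2016 | Chinese | Hangzhou |
| GSE19826 | GPL570 | n = 12 | n = 12 | Gastric tissue | 2010 | Chinese | Shanghai |
| GSE49051 | GPL10332 | n = 3 | n = 3 | Gastric tissue | 2013 | Chinese | Shanghai |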

3.2. Results of Meta-Analysis

The meta-analysis was performed with the combining rank orders method. DE genes with p value < 0.05 were selected, giving a total of 1153 DE genes. Detailed DE gene information is listed in Table S1 (see Supplementary Material available online at https://doi.org/10.1155/2017/7259097). All of these genes were identified as differentially expressed across the three datasets rather than in individual samples. Among the 1153 DE genes, 787 were downregulated and 366 were upregulated. The top 10 most significantly downregulated and upregulated genes are listed in Tables 2 and 3, respectively. The genes with the smallest combined rank product (RP) in the upregulated and downregulated DE gene lists are COL6A3 (combinedRP = 59.02) and PGC (combinedRP = 22.38), respectively.
Table 2

Top 10 most significantly downregulated DE genes in gastric cancer.

| EntrezID | Gene full name | Gene symbol | CombinedRP | AveLogFC |
|---|---|---|---|---|
| 5225 | Progastricsin | PGC | 22.38 | −14454.48 |
| 57016 | Aldo-keto reductase family 1 member B10 | AKR1B10 | 22.86 | −6705.56 |
| 9992 | Potassium voltage-gated channel subfamily E regulatory subunit 2 | KCNE2 | 36.08 | −4314.33 |
| 284340 | C-X-C motif chemokine ligand 17 | CXCL17 | 49.18 | −3880.57 |
| 135656 | Diffuse panbronchiolitis critical region 1 | DPCR1 | 55.94 | −1892.04 |
| 51208 | Claudin 18 | CLDN18 | 57.35 | −3435.26 |
| 3512 | Immunoglobulin J polypeptide, linker protein for immunoglobulin alpha, and mu polypeptides | IGJ | 62.81 | −25978.05 |
| 1510 | Cathepsin E | CTSE | 64.51 | −4631.28 |
| 340547 | V-set and immunoglobulin domain containing 1 | VSIG1 | 69.66 | −1752.54 |
| 4499 | Metallothionein 1M | MT1M | 100.58 | −6787.14 |
Table 3

Top 10 most significantly upregulated DE genes in gastric cancer.

| EntrezID | Gene full name | Gene symbol | CombinedRP | AveLogFC |
|---|---|---|---|---|
| 1293 | Collagen type VI alpha 3 chain | COL6A3 | 59.02 | 3600.96 |
| 1278 | Collagen type I alpha 2 chain | COL1A2 | 62.06 | 3576.21 |
| 10562 | Olfactomedin 4 | OLFM4 | 150.67 | 3542.76 |
| 7058 | Thrombospondin 2 | THBS2 | 163.66 | 24.03 |
| 115908 | Collagen triple helix repeat containing 1 | CTHRC1 | 174.61 | 1204.56 |
| 4680 | Carcinoembryonic antigen related cell adhesion molecule 6 | CEACAM6 | 203.78 | 2542.02 |
| 3624 | Inhibin beta A subunit | INHBA | 219.12 | 368.69 |
| 1290 | Collagen type V alpha 2 chain | COL5A2 | 230.72 | 1064.68 |
| 54829 | Asporin | ASPN | 255.15 | 237.05 |
| 1366 | Claudin 7 | CLDN7 | 288.09 | 356.71 |

3.3. Functional Enrichment Analysis Results

Functional enrichment analysis was carried out to study these DE genes further, using two approaches: Gene Ontology (GO) analysis and KEGG pathway analysis. The GO analysis was done at three levels: biological process (BP), cellular component (CC), and molecular function (MF). The top 10 most significantly enriched terms (adj. p value < 0.05) were selected at each level, and histograms of these terms are shown in Figure 3. Most of the DE genes map well onto biological processes associated with gastric cancer. In the KEGG pathway analysis, we also selected the top 10 most significantly enriched pathways, as shown in Figure 4. All of the selected items were checked against the literature for further investigation.
Figure 3

Gene Ontology (GO) annotation for the DE genes in gastric cancer. The GO annotation was performed at three levels: biological process, cellular component, and molecular function. Panels (a), (b), and (c) show the top 10 most significantly enriched GO terms at these three levels, respectively. The adjusted p values of the terms were −log10 transformed.

Figure 4

The top 10 most significantly enriched pathways in the KEGG pathway analysis of the DE genes in gastric cancer. The adjusted p values were −log10 transformed.
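
The negative base-10 log transform mentioned in the captions of Figures 3 and 4 turns small adjusted p values into large bar heights. A minimal Python sketch follows; the term names and p values are invented for illustration and are not results from this study.

```python
import math


def neg_log10(p):
    """Return -log10(p) for a p value in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p value must be in (0, 1]")
    return -math.log10(p)


# Invented adjusted p values for three hypothetical terms.
adjusted_p = {"term_a": 0.05, "term_b": 1e-3, "term_c": 1e-10}
scores = {term: neg_log10(p) for term, p in adjusted_p.items()}

# Terms sorted from most to least significant, as they would appear in a bar chart.
ranked_terms = sorted(scores, key=scores.get, reverse=True)
print(ranked_terms)
```

On this scale the 0.05 significance threshold corresponds to a score of about 1.3, so any bar taller than that passes the cutoff.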

4. Discussion

In this study, we used publicly available microarray datasets to identify genes that are differentially expressed in tumor tissues from people with GC compared with people without GC. By combining the data from three separate gene expression datasets in a meta-analysis, we aimed to derive additional information that is unlikely to be established from individual studies in isolation. In general, we found this to be the case. Through PubMed literature mining, we found that 8 of the top 10 downregulated genes and all of the top 10 upregulated genes have been reported to be associated with gastric cancer through biological and clinical experimental validation. For example, the downregulated gene with the smallest combinedRP in this study is progastricsin (PGC). Many researchers have found that it plays a key role in gastric cancer and that PGC polymorphisms could serve as diagnostic biomarkers for GC [21-23]. In addition, Li et al. recently found that, in GC cells in which mitogen-activated protein kinase activator with WD40 repeats (MAWD) and MAWD-binding protein (MAWBP) were downregulated, the expression level of PGC was lower than that in control samples [24]. Among the upregulated genes, collagen VI α3 (COL6A3) has the smallest combinedRP. Previous research has found that the expression level of COL6A3 is significantly higher in GC patients [25, 26], so it could also serve as a diagnostic biomarker for GC. Other DE genes, such as COL1A2 [26], OLFM4 [27], THBS2 [28], CEACAM6 [29], CTSE [30], AKR1B10 [31], and KCNE2 [32], have also been reported to be differentially expressed in GC patients compared with controls.
Interestingly, among the top 10 downregulated DE genes, 2 genes (IGJ and CXCL17) have not been reported to have a direct association with GC. For IGJ, Tvarijonaviciute et al. observed that the amount of IGJ protein was decreased in obese dogs [33], and other research has shown that obesity increases the risk of gastric cancer [34]. For CXCL17, overexpression has been reported to be strongly connected with colon cancer and hepatocellular carcinoma [35, 36], and gene interactions have been reported to link GC with these two cancers [37, 38]. Because there are still no specific experiments on these two genes in GC, further biological and clinical research is needed.
To further investigate the functional mechanisms of these DE genes, we performed GO analysis and KEGG pathway analysis. We obtained 102 significantly enriched terms (p value < 0.05) at the biological process level, 157 at the cellular component level, and 31 at the molecular function level. As shown above, the top 10 significantly enriched terms in BP, CC, and MF have all been reported to be associated with GC. For example, the extracellular matrix term includes extracellular matrix protein 1 (ECM1). ECM1 plays a key role in lymphangiogenesis [39], which could promote cancer invasion and metastasis. Aberrant expression of ECM1 was found in GC samples in a recent study [40]. Likewise, in the translational elongation process, relevant genes, such as translation elongation factor EEF1B2, were upregulated in samples with poor prognosis [41].
In the KEGG pathway analysis, the most significantly enriched pathway is Ribosome. Genes such as RPL11, RPL23, RPS6, and MRPS21 were enriched in this pathway. The ribosomal protein family (RPL/RPS) has been demonstrated to have a strong connection with GC. For example, a recent study revealed that GLTSCR2 regulates the MDM2-TP53 pathway through RPL11, playing a key role in GC progression [42]. A previous study observed that reducing the phosphorylation of RPS6 could influence the sensitivity of gastric cancer cells to MEK inhibition [43]. Another important pathway in GC is the glycolysis/gluconeogenesis pathway. Reports revealed that microRNA-133b could silence the PKM-splicer PTBP1, leading to inhibition of the growth of human gastric cancer cells [44]. Hu and Chen also found that SIRT3 can strengthen glycolysis in SIRT3-expressing GC cells. Other pathways, such as ECM-receptor interaction and metabolism of xenobiotics by cytochrome P450, have been validated as associated with GC through bioinformatics approaches based on protein-protein interaction network analysis [45].

5. Conclusions

To summarize, our research provides novel angles on the pathogenesis of gastric cancer. We identified consistently DE genes in gastric cancer using the INMEX meta-analysis tool. The top 10 upregulated and downregulated genes could potentially serve as diagnostic biomarkers. GO annotation and KEGG pathway analysis indicated that these candidates have a strong relationship with gastric cancer. Moreover, we identified 2 novel GC associated genes, IGJ and CXCL17, which have not previously been reported to be associated with GC. Further experimental validation should be conducted to understand the role of these two genes in gastric cancer.

Table S1: Information for differentially expressed genes.
  45 in total

1.  Associations of THBS2 and THBS4 polymorphisms to gastric cancer in a Southeast Chinese population.

Authors:  Xiandong Lin; Don Hu; Gang Chen; Yi Shi; Hejun Zhang; Xiaojiang Wang; Xiaoyun Guo; Lu Lu; Dennis Black; Xiong-Wei Zheng; Xingguang Luo
Journal:  Cancer Genet       Date:  2016-04-13

2.  Identifying novel prostate cancer associated pathways based on integrative microarray data analysis.

Authors:  Ying Wang; Jiajia Chen; Qinghui Li; Haiyun Wang; Ganqiang Liu; Qing Jing; Bairong Shen
Journal:  Comput Biol Chem       Date:  2011-04-27       Impact factor: 2.877

3.  Gastric cancer in a Caucasian population: role of pepsinogen C genetic variants.

Authors:  Ana L Pinto-Correia; Hugo Sousa; Maria Fragoso; Luís Moreira-Dias; Carlos Lopes; Rui Medeiros; Mário Dinis-Ribeiro
Journal:  World J Gastroenterol       Date:  2006-08-21       Impact factor: 5.742

4.  Upregulated INHBA expression is associated with poor survival in gastric cancer.

Authors:  Quan Wang; Yu-Gang Wen; Da-Peng Li; Jun Xia; Chong-Zhi Zhou; Dong-Wang Yan; Hua-Mei Tang; Zhi-Hai Peng
Journal:  Med Oncol       Date:  2010-12-04       Impact factor: 3.064

5.  Mitogen-activated protein kinase activator with WD40 repeats (MAWD) and MAWD-binding protein induce cell differentiation in gastric cancer.

Authors:  Dongmei Li; Jun Zhang; Yu Xi; Lei Zhang; Wenmei Li; Jiantao Cui; Rui Xing; Yuanmin Pan; Zemin Pan; Feng Li; Youyong Lu
Journal:  BMC Cancer       Date:  2015-09-15       Impact factor: 4.430

6.  Identifying novel glioma associated pathways based on systems biology level meta-analysis.

Authors:  Yangfan Hu; Jinquan Li; Wenying Yan; Jiajia Chen; Yin Li; Guang Hu; Bairong Shen
Journal:  BMC Syst Biol       Date:  2013-12-17

7.  Identification of key genes associated with gastric cancer based on DNA microarray data.

Authors:  Hui Sun
Journal:  Oncol Lett       Date:  2015-11-17       Impact factor: 2.967

8.  PSMB8 and PBK as potential gastric cancer subtype-specific biomarkers associated with prognosis.

Authors:  Chae Hwa Kwon; Hye Ji Park; Yu Ri Choi; Ahrong Kim; Hye Won Kim; Jin Hwa Choi; Chung Su Hwang; So Jung Lee; Chang In Choi; Tae Yong Jeon; Dae Hwan Kim; Gwang Ha Kim; Do Youn Park
Journal:  Oncotarget       Date:  2016-04-19

9.  Integrative analysis reveals disease-associated genes and biomarkers for prostate cancer progression.

Authors:  Yin Li; Wanwipa Vongsangnak; Luonan Chen; Bairong Shen
Journal:  BMC Med Genomics       Date:  2014-05-08       Impact factor: 3.063

10.  MiR-133b inhibits growth of human gastric cancer cells by silencing pyruvate kinase muscle-splicer polypyrimidine tract-binding protein 1.

Authors:  Taro Sugiyama; Kohei Taniguchi; Nobuhisa Matsuhashi; Toshihiro Tajirika; Manabu Futamura; Tomoaki Takai; Yukihiro Akao; Kazuhiro Yoshida
Journal:  Cancer Sci       Date:  2016-12-19       Impact factor: 6.716
