Screening of key genes in gastric cancer with DNA microarray analysis.

Yong Jin, Wei Da.

Abstract

BACKGROUND: The aim of this study was to identify key genes and novel potential therapeutic targets related to gastric cancer (GC) by comparing cancer tissue samples and healthy control samples using DNA microarray analysis.
METHODS: Microarray data set GSE2685 was downloaded from Gene Expression Omnibus. Preprocessing and differential analysis were conducted with R statistical software packages, and a number of differentially expressed genes (DEGs) were obtained. Cluster analysis was also performed on the gene expression values. Functional enrichment analysis was performed for all the DEGs with DAVID tools. The most significantly up- and downregulated genes were selected, and their interactors were retrieved with STRING and HitPredict, followed by construction of interaction networks. For all the genes in the two networks, GeneCodis was used for gene function annotation.
RESULTS: A total of 638 DEGs were identified, and SPP1 and FABP4 were the most markedly up- and downregulated genes, respectively. Cell cycle and regulation of cell proliferation were the most significantly overrepresented functional terms in the up- and downregulated genes, respectively. In addition, extracellular matrix-receptor interaction was found to be significant in the SPP1-included interaction network.
CONCLUSIONS: A range of DEGs were obtained for GC. These genes not only provide insights into the pathogenesis of GC but may also serve as candidate biomarkers for diagnosis or treatment.

Year:  2013        PMID: 24093889      PMCID: PMC3852022          DOI: 10.1186/2047-783X-18-37

Source DB:  PubMed          Journal:  Eur J Med Res        ISSN: 0949-2321            Impact factor:   2.175


Background

Gastric cancer (GC) is one of the most prevalent cancers in the world. Recognized risk factors for GC include infection with Helicobacter pylori, dietary factors, smoking and other factors [1]. Molecular genetics and molecular biology studies have shown that the pathogenesis of GC is a progressive process involving multiple steps and factors. The activation, overexpression or amplification of oncogenes and the deletion or mutation of tumor suppressor genes play important roles in the development of GC [2]. Molecularly targeted therapy holds promise and thus has become a focus in the field of cancer treatment in recent years [3]. Biomarkers can be used clinically to predict the effectiveness and toxicity of anticancer drugs and thus help to achieve individualized treatment [4]. Ryu et al. found seven overexpressed proteins and seven underexpressed proteins in GC by using a proteomics approach [5]. Jang et al. also tried to identify biomarker candidates by analyzing proteome profiles [6]. Yasui et al. performed serial analysis of gene expression to search for new biomarkers [7]. Accordingly, quite a few potential biomarkers have been reported, such as regenerating gene family member 4 [8], olfactomedin [9], resistin and visfatin [10]. However, current knowledge is not sufficient to conquer the disease clinically. Microarray technology is a powerful tool with which to discover the comprehensive changes in the incidence and development of cancer [11]. Therefore, in this study, gene expression profiles of GC tissue samples and healthy controls were compared to identify differentially expressed genes (DEGs). By combining functional enrichment analysis and interaction network analysis in our study, we sought not only to provide insights into the pathogenesis of GC but also to discover potential biomarkers for the diagnosis and treatment of GC.

Methods

Microarray data

Microarray data set GSE2685 [12] was downloaded from Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) [GEO:GSE2685]; it includes 22 GC samples and 8 healthy controls. Expression was profiled on the GPL80 [Hu6800] Affymetrix Human Full Length HuGeneFL Array (Affymetrix, Santa Clara, CA, USA), and the probe annotation information for this platform was used in the analysis.

Differential expression analysis

Raw data were converted into a recognizable format, and missing values were imputed [13]. After data normalization [14], the multtest package [15] of R software was used to identify DEGs by statistically comparing GC samples with healthy tissues, and multiple testing correction was performed using the Benjamini-Hochberg method [16]. A false discovery rate (FDR) less than 0.05 and an absolute log fold change (|logFC|) greater than 1 were set as the significance cutoffs.
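The correction and cutoff step can be illustrated with a minimal Python sketch. It is not the multtest code used in the study: the function names, gene labels, p-values and fold changes below are hypothetical, and only the Benjamini-Hochberg procedure and the two cutoffs (FDR < 0.05, |logFC| > 1) come from the text.

```python
from typing import List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    genes: List[str],
    p_values: List[float],
    log_fcs: List[float],
    fdr_cutoff: float = 0.05,
    lfc_cutoff: float = 1.0,
) -> Tuple[List[str], List[str]]:
    """Split genes passing both cutoffs into (upregulated, downregulated)."""
    fdrs = benjamini_hochberg(p_values)
    up, down = [], []
    for gene, fdr, lfc in zip(genes, fdrs, log_fcs):
        if fdr < fdr_cutoff and abs(lfc) > lfc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical toy input, not data from the study.
toy_genes = ["gene_a", "gene_b", "gene_c", "gene_d", "gene_e"]
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_lfc = [2.3, -1.8, 0.4, 1.2, -3.0]
toy_up, toy_down = select_degs(toy_genes, toy_p, toy_lfc)
```

In this toy input, gene_d has a raw p-value below 0.05 but an adjusted value just above it, which shows why the cutoff is applied to the FDR rather than to the raw p-value.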

Cluster analysis

Cluster analysis [17] was conducted on the basis of the gene expression values in each sample to verify the difference in gene expression between GC tissue samples and healthy controls.
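As a rough illustration of sample-level clustering, the sketch below runs average-linkage agglomerative clustering on toy expression vectors. The text does not state which distance measure or linkage was used, so Euclidean distance and average linkage are assumptions here, and the sample names and values are hypothetical.

```python
import math
from typing import Dict, List


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(samples: Dict[str, List[float]], n_clusters: int = 2) -> List[List[str]]:
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in samples]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                dists = [
                    euclidean(samples[a], samples[b])
                    for a in clusters[i]
                    for b in clusters[j]
                ]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical expression values for three genes in four samples.
toy_samples = {
    "tumor_1": [8.1, 2.0, 7.9],
    "tumor_2": [7.8, 2.3, 8.2],
    "normal_1": [3.0, 6.9, 2.8],
    "normal_2": [3.2, 7.1, 3.1],
}
toy_clusters = average_linkage(toy_samples, n_clusters=2)
```

If the two groups differ clearly in expression, as in this toy input, the tumor and normal samples end up in separate clusters.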

Functional enrichment analysis for all differentially expressed genes

Functional enrichment analysis can reveal the biological functions associated with DEGs [18]. Therefore, in the present study, we used the web-based DAVID (Database for Annotation, Visualization and Integrated Discovery) tool [19] to perform functional enrichment analysis and Gene Ontology (GO) annotation. Terms with P < 0.05 were selected as significant functions.
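The idea behind an over-representation test can be sketched with a one-sided hypergeometric test. This is a generic textbook formulation, not DAVID's own scoring (DAVID reports a modified Fisher exact statistic), and the function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(hits: int, list_size: int, term_size: int, background: int) -> float:
    """Probability of seeing at least `hits` annotated genes by chance.

    background: number of genes in the reference set
    term_size:  genes in the reference set annotated with the term
    list_size:  number of genes in the submitted list (e.g. the DEGs)
    hits:       genes in the submitted list annotated with the term
    """
    upper = min(list_size, term_size)
    total = comb(background, list_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(term_size, k) * comb(background - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 2 listed genes carry a term held by 3 of 10 genes.
toy_p_value = hypergeom_enrichment_p(hits=2, list_size=2, term_size=3, background=10)
```

A term would then be called significant when this p-value (or its multiple-testing-adjusted counterpart) falls below the chosen cutoff, 0.05 in the text.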

Construction of interaction network

Proteins usually interact with each other to carry out their functions [20]. Therefore, interactors of the most significantly upregulated and downregulated DEGs were retrieved using STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) [21] and HitPredict [22], and separate interaction networks were then built for the upregulated and the downregulated DEGs together with their interactors. STRING connects major databases and predicts interactions based upon experiments, text mining and sequence homology. HitPredict collects interactions from databases such as IntAct (EMBL-European Bioinformatics Institute, Cambridge, UK) [23], BioGRID (Biological General Repository for Interaction Datasets) and HPRD (Human Protein Reference Database) [24], as well as interactions predicted by algorithms [22]. HitPredict interactions that were supported by experiments and had a likelihood score greater than 1 were considered high-confidence interactions [25]. For STRING, only interactions with a high degree of confidence were retained.
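The filtering and network-building step can be sketched as follows. The input layout (pairs of proteins with a score), the function name, the partner names and the scores are all hypothetical; only the rule of keeping interactions with a likelihood score greater than 1 comes from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_network(
    interactions: Iterable[Tuple[str, str, float]],
    min_score: float = 1.0,
) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from interactions scoring above min_score."""
    network: Dict[str, Set[str]] = defaultdict(set)
    for protein_a, protein_b, score in interactions:
        if score > min_score:
            # Store the edge in both directions because interactions are undirected.
            network[protein_a].add(protein_b)
            network[protein_b].add(protein_a)
    return dict(network)


# Hypothetical interaction records around one seed gene; scores are made up.
toy_interactions = [
    ("FABP4", "partner_A", 2.4),
    ("FABP4", "partner_B", 0.6),  # dropped: score is not above the cutoff
    ("partner_A", "partner_C", 1.7),
]
toy_network = build_network(toy_interactions)
```

The number of keys in the resulting map gives the number of genes in the network, which is how totals such as those reported for the two networks could be counted.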

Functional enrichment analysis for all genes in the network

To explore the biological functions of all genes in the network we obtained previously, we chose GeneCodis software [26] for functional enrichment analysis. P < 0.05 was applied as the cutoff value for significance. GeneCodis (Gene Annotations Co-occurrence Discovery) is a web-based tool used for gene functional analysis [27-29]. It integrates different information resources (GO, KEGG (Kyoto Encyclopedia of Genes and Genomes) and Swiss-Prot gene accession databases) to seek the annotation of genes and arrange their biological functions according to their significance.

Results

Differentially expressed genes

Normalized gene expression data are shown in Figure 1a. Good normalization performance was achieved. A total of 638 DEGs were screened out in GC samples compared with healthy controls, including 225 upregulated DEGs and 413 downregulated DEGs.
Figure 1

Boxplot for normalized gene expression data and cluster analysis results. (a) Boxplot of gene expression data. The medians are almost at the same level, indicating high normalization performance. (b) Cluster analysis results for gene expression data. The expression values clustered in the purple/magenta-shaded areas indicate overexpression, and the green-shaded areas indicate underexpression.

Cluster analysis results

Cluster analysis was performed with gene expression values, and the results are shown in Figure 1b. The GC samples were clearly separated from the healthy controls by their gene expression profiles, indicating that obvious differences existed between the two groups.

Functional enrichment analysis results for differentially expressed genes

Functional enrichment analysis was conducted separately for the upregulated and downregulated DEGs. The results showed that 15 and 13 terms, respectively, were significantly enriched (Table 1). Cell-cycle process (FDR = 1.50E-05), cell cycle (FDR = 3.70E-05), cell adhesion (FDR = 0.00146), cell motion (FDR = 0.001626) and regulation of apoptosis (FDR = 0.00271) were significantly enriched among upregulated genes. Regulation of cell proliferation (FDR = 3.72E-04), cellular ion homeostasis (FDR = 5.59E-04) and immune response (FDR = 0.001061657) were significantly enriched among downregulated genes. The cell-cycle process term included 30 upregulated DEGs, such as NIMA-related kinase 2 (NEK2), cohesin subunit (RAD21) and thrombospondin 1 (THBS1). The regulation of cell proliferation term included 48 downregulated DEGs, such as paired box 3 (PAX3).
Table 1

Functional enrichment analysis of the upregulated and downregulated differentially expressed genes

GO term                                                 Count  FDR
Upregulated DEGs
[GO:0022402] Cell-cycle process                         30     1.50E-05
[GO:0007049] Cell cycle                                 35     3.70E-05
[GO:0022403] Cell-cycle phase                           24     1.43E-04
[GO:0000278] Mitotic cell cycle                         22     3.82E-04
[GO:0007155] Cell adhesion                              30     0.00146
[GO:0022610] Biological adhesion                        30     0.001503
[GO:0006928] Cell motion                                24     0.001626
[GO:0042981] Regulation of apoptosis                    32     0.00271
[GO:0043067] Regulation of programmed cell death        32     0.003334
[GO:0010941] Regulation of cell death                   32     0.0036
[GO:0006259] DNA metabolic process                      24     0.004784
[GO:0009611] Response to wounding                       24     0.010324
[GO:0001501] Skeletal system development                18     0.013141
[GO:0051301] Cell division                              17     0.0199
[GO:0051726] Regulation of cell cycle                   18     0.021567
Downregulated DEGs
[GO:0042127] Regulation of cell proliferation           48     3.72E-04
[GO:0008284] Positive regulation of cell proliferation  32     4.67E-04
[GO:0006873] Cellular ion homeostasis                   30     5.59E-04
[GO:0006955] Immune response                            43     0.001061657
[GO:0055080] Cation homeostasis                         25     0.001479293
[GO:0019226] Transmission of nerve impulse              27     0.005126019
[GO:0019725] Cellular homeostasis                       32     0.005850539
[GO:0007610] Behavior                                   32     0.006669845
[GO:0007586] Digestion                                  13     0.009844162
[GO:0006875] Cellular metal ion homeostasis             19     0.010226535
[GO:0055065] Metal ion homeostasis                      19     0.019086885
[GO:0030003] Cellular cation homeostasis                21     0.031550799
[GO:0007268] Synaptic transmission                      23     0.033256699

DEG, differentially expressed gene; FDR, false discovery rate.

Interaction networks

The most upregulated gene, SPP1, and the most downregulated gene, FABP4, were selected from among the DEGs. Their expression values in each sample are shown in Figure 2. Interactors of the two genes were retrieved from STRING and HitPredict, then the interaction networks were constructed (Figure 3). In total, 55 and 13 genes were included in the networks of SPP1 and FABP4, respectively. The SPP1 network contained integrin α11 (ITGA11), integrin β5 (ITGB5), ITGA10, ITGB3 and other genes.
Figure 2

Gene expression levels of FABP4 and SPP1 in each sample. (a) FABP4 is downregulated in gastric cancer (GC) tissue. (b) SPP1 is upregulated in GC tissue.

Figure 3

Interaction networks including FABP4 or SPP1. (a) The network involving FABP4, based on the HitPredict database; green lines indicate high-confidence, small-scale binary interactions; blue lines indicate high-confidence, small-scale–derived interactions; black lines indicate high-confidence, high-throughput interactions; and dashed black lines indicate spurious small-scale or high-throughput interactions. (b) The network involving SPP1, based on the STRING database.

Functional enrichment analysis results for genes in the networks

GeneCodis was chosen to analyze the function of all genes in the two networks. Only eight functional annotations were revealed in the network that included SPP1 (Table 2), and the most significant one was extracellular matrix (ECM)-receptor interaction (FDR = 1.01E-31). SPP1 was the most overexpressed gene in the whole pathway and might play a key role in the pathogenesis of GC.
Table 2

Overrepresented functional annotation terms in the network including SPP1

KEGG pathway                                                             Count  FDR
[KEGG:hsa04512]: ECM-receptor interaction                                25     1.01E-31
[KEGG:hsa04510]: Focal adhesion                                          26     1.76E-23
[KEGG:hsa05410]: Hypertrophic cardiomyopathy (HCM)                       20     1.11E-21
[KEGG:hsa05414]: Dilated cardiomyopathy                                  20     5.79E-21
[KEGG:hsa05412]: Arrhythmogenic right ventricular cardiomyopathy (ARVC)  19     8.03E-21
[KEGG:hsa04810]: Regulation of actin cytoskeleton                        20     1.11E-13
[KEGG:hsa04640]: Hematopoietic cell lineage                              8      0.003148
[KEGG:hsa05200]: Pathways in cancer                                      13     0.003491

ECM, extracellular matrix; FDR, false discovery rate.

Discussion

Microarray data of GC samples and healthy controls were compared to identify DEGs in the present study. A total of 638 DEGs were obtained in GC samples. According to the functional enrichment analysis, cell-cycle process, cell adhesion, cell motion and regulation of apoptosis were significantly overrepresented among the upregulated genes, whereas regulation of cell proliferation, immune response and cellular ion homeostasis were enriched among the downregulated genes. Proliferation, cell cycle, immune response and apoptosis are closely associated with cancer. Many factors, such as oncogenes and tumor suppressors, are involved in the regulation of the cell cycle, and abnormalities in the relevant genes contribute to the incidence of cancer [30]. The immune system is a critical defense, and its dysfunction can contribute to cancer. Considerable effort has been devoted to elucidating the mechanisms of immune escape [31,32]. The enriched functions identified in this study are consistent with these known cancer-related processes, which supports the reliability of our findings; many of these functions have been implicated in various cancers. In addition, some key genes were identified among the DEGs and were involved in the significantly enriched functions. In the cell-cycle process, for example, NEK2 encodes a serine/threonine protein kinase that is involved in mitotic regulation. It has been associated with chromosome instability [33] and the incidence of cancers [34]. RAD21 is involved in the repair of DNA double-strand breaks, and its deregulation was previously reported in endometrial cancer and oral squamous cell carcinoma [35,36]. Atienza et al. also indicated that suppression of RAD21 gene expression can decrease the growth of breast cancer cells [37]. THBS1 encodes a glycoprotein that mediates cell-to-cell and cell-to-matrix interactions and plays a role in tumorigenesis. Lin et al. reported that the THBS1 rs1478604 A > G polymorphism in the 5′-untranslated region is associated with lymph node metastasis of GC [38].
PAX3, which is involved in the regulation of cell proliferation, was found to trigger neoplastic development by maintaining cells in a deregulated, undifferentiated and proliferative state, and it has become a target for cancer immunotherapy [39]. Thus, our findings might provide directions for future research.
SPP1 was the most significantly upregulated gene, and FABP4 was the most significantly downregulated gene; therefore, network analysis was conducted for these two genes to mine more information. ECM-receptor interaction was significantly enriched in the network including SPP1. The ECM is a macromolecular network comprising collagen, noncollagenous glycoprotein, glycosaminoglycan, proteoglycan, elastin and other components. The ECM has been found to influence cell survival, death, proliferation and differentiation as well as cancer metastasis [40]. In addition, several integrin subunits were included in the SPP1 network, such as ITGA11, ITGB5, ITGA10 and ITGB3. Integrins play important roles in cell adhesion and signal transduction. The integrin family regulates a range of cellular functions that are crucial to the initiation, progression and metastasis of solid tumors [41]. ITGB3 was identified as a key regulator in reactive oxygen species–induced migration and invasion of colorectal cancer cells [42]. ITGB1 showed some prognostic value for patients with GC [43]. ITGB8 silencing could reduce the metastatic potential of lung cancer cells [44]. Moreover, the ITGA2 gene C807T polymorphism was associated with the risk of GC [45]. Therefore, we consider these genes worthy of further research to uncover their potential roles in the diagnosis, prognosis and treatment of GC.

Conclusions

Overall, a range of DEGs were obtained by comparing gene expression profiles of GC samples with those of healthy controls. According to the functional enrichment analysis, these genes might play important roles in the pathogenesis of GC, especially SPP1, which was closely associated with ECM-receptor interaction. However, more research is needed to confirm their potential function in clinical applications.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

YJ conceived and designed the study and analyzed the data. WD wrote the paper. All authors read and approved the final manuscript.

References (10 of 41 shown)

1.  [The molecular mechanism of survivin expression in activated human peripheral lymphocytes].

Authors:  Yan Dong; Zhu-zhong Mei; Jun-jie Qian; Yi Song; Bao-lei Tian; Bin Liu; Zhi-xian Sun
Journal:  Xi Bao Yu Fen Zi Mian Yi Xue Za Zhi       Date:  2008-01

2.  Association between ITGA2 C807T polymorphism and gastric cancer risk.

Authors:  Jie Chen; Nan-Nan Liu; Jia-Qi Li; Li Yang; Ying Zeng; Xiao-Mei Zhao; Lin-Lin Xu; Xuan Luo; Bin Wang; Xue-Rong Wang
Journal:  World J Gastroenterol       Date:  2011-06-21       Impact factor: 5.742

3.  Proteomics identification of ITGB3 as a key regulator in reactive oxygen species-induced migration and invasion of colorectal cancer cells.

Authors:  Yunlong Lei; Kai Huang; Cong Gao; Quek Choon Lau; Hua Pan; Ke Xie; Jingyi Li; Rui Liu; Tao Zhang; Na Xie; Huey Shan Nai; Hong Wu; Qiang Dong; Xia Zhao; Edouard C Nice; Canhua Huang; Yuquan Wei
Journal:  Mol Cell Proteomics       Date:  2011-05-27       Impact factor: 5.911

4.  Gastric cancer: epidemiology and risk factors.

Authors:  Guenter J Krejs
Journal:  Dig Dis       Date:  2010-11-18       Impact factor: 2.404

5.  Cluster analysis and display of genome-wide expression patterns.

Authors:  M B Eisen; P T Spellman; P O Brown; D Botstein
Journal:  Proc Natl Acad Sci U S A       Date:  1998-12-08       Impact factor: 11.205

6.  Serum olfactomedin 4 (GW112, hGC-1) in combination with Reg IV is a highly sensitive biomarker for gastric cancer patients.

Authors:  Naohide Oue; Kazuhiro Sentani; Tsuyoshi Noguchi; Shinya Ohara; Naoya Sakamoto; Tetsutaro Hayashi; Katsuhiro Anami; Junichi Motoshita; Masanori Ito; Shinji Tanaka; Kazuhiro Yoshida; Wataru Yasui
Journal:  Int J Cancer       Date:  2009-11-15       Impact factor: 7.396

7.  A map of the interactome network of the metazoan C. elegans.

Authors:  Siming Li; Christopher M Armstrong; Nicolas Bertin; Hui Ge; Stuart Milstein; Mike Boxem; Pierre-Olivier Vidalain; Jing-Dong J Han; Alban Chesneau; Tong Hao; Debra S Goldberg; Ning Li; Monica Martinez; Jean-François Rual; Philippe Lamesch; Lai Xu; Muneesh Tewari; Sharyl L Wong; Lan V Zhang; Gabriel F Berriz; Laurent Jacotot; Philippe Vaglio; Jérôme Reboul; Tomoko Hirozane-Kishikawa; Qianru Li; Harrison W Gabel; Ahmed Elewa; Bridget Baumgartner; Debra J Rose; Haiyuan Yu; Stephanie Bosak; Reynaldo Sequerra; Andrew Fraser; Susan E Mango; William M Saxton; Susan Strome; Sander Van Den Heuvel; Fabio Piano; Jean Vandenhaute; Claude Sardet; Mark Gerstein; Lynn Doucette-Stamm; Kristin C Gunsalus; J Wade Harper; Michael E Cusick; Frederick P Roth; David E Hill; Marc Vidal
Journal:  Science       Date:  2004-01-02       Impact factor: 47.728

8.  HitPredict: a database of quality assessed protein-protein interactions in nine species.

Authors:  Ashwini Patil; Kenta Nakai; Haruki Nakamura
Journal:  Nucleic Acids Res       Date:  2010-10-14       Impact factor: 16.971

9.  Evaluating different methods of microarray data normalization.

Authors:  André Fujita; João Ricardo Sato; Leonardo de Oliveira Rodrigues; Carlos Eduardo Ferreira; Mari Cleide Sogayar
Journal:  BMC Bioinformatics       Date:  2006-10-23       Impact factor: 3.169

10.  GeneCodis: interpreting gene lists through enrichment analysis and integration of diverse biological information.

Authors:  Ruben Nogales-Cadenas; Pedro Carmona-Saez; Miguel Vazquez; Cesar Vicente; Xiaoyuan Yang; Francisco Tirado; Jose María Carazo; Alberto Pascual-Montano
Journal:  Nucleic Acids Res       Date:  2009-05-22       Impact factor: 16.971

Cited by (10 in total)

Review 1.  Microarray analysis in gastric cancer: a review.

Authors:  Giovanna D'Angelo; Teresa Di Rienzo; Veronica Ojetti
Journal:  World J Gastroenterol       Date:  2014-09-14       Impact factor: 5.742

2.  Retraction note: Screening of key genes in gastric cancer with DNA microarray analysis.

Authors: 
Journal:  Eur J Med Res       Date:  2015-03-26       Impact factor: 2.175

3.  Cellular Signaling Pathways in Insulin Resistance-Systems Biology Analyses of Microarray Dataset Reveals New Drug Target Gene Signatures of Type 2 Diabetes Mellitus.

Authors:  Syed Aun Muhammad; Waseem Raza; Thanh Nguyen; Baogang Bai; Xiaogang Wu; Jake Chen
Journal:  Front Physiol       Date:  2017-01-25       Impact factor: 4.566

4.  Genome wide meta-analysis of cDNA datasets reveals new target gene signatures of colorectal cancer based on systems biology approach.

Authors:  Umair Ilyas; Shaiq Uz Zaman; Reem Altaf; Humaira Nadeem; Syed Aun Muhammad
Journal:  J Biol Res (Thessalon)       Date:  2020-06-08       Impact factor: 1.889

5.  Study of Gene Expression Profiles of Breast Cancers in Indian Women.

Authors:  Shreshtha Malvia; Sarangadhara Appala Raju Bagadi; Dibyabhaba Pradhan; Chintamani Chintamani; Amar Bhatnagar; Deepshikha Arora; Ramesh Sarin; Sunita Saxena
Journal:  Sci Rep       Date:  2019-07-10       Impact factor: 4.379

6.  Genome-scale meta-analysis of breast cancer datasets identifies promising targets for drug development.

Authors:  Reem Altaf; Humaira Nadeem; Mustafeez Mujtaba Babar; Umair Ilyas; Syed Aun Muhammad
Journal:  J Biol Res (Thessalon)       Date:  2021-02-16       Impact factor: 1.889

7.  Quantitative Real-Time Analysis of Differentially Expressed Genes in Peripheral Blood Samples of Hypertension Patients.

Authors:  Fawad Ali; Arifullah Khan; Syed Aun Muhammad; Syed Shams Ul Hassan
Journal:  Genes (Basel)       Date:  2022-01-21       Impact factor: 4.096

8.  Screening for candidate genes related to breast cancer with cDNA microarray analysis.

Authors:  Yu-Juan Xiang; Qin-Ye Fu; Zhong-Bing Ma; De-Zong Gao; Qiang Zhang; Yu-Yang Li; Liang Li; Lu Liu; Chun-Miao Ye; Zhi-Gang Yu; Ming-Ming Guo
Journal:  Chronic Dis Transl Med       Date:  2015-03-05

9.  Identification of molecular biomarkers for the diagnosis of gastric cancer and lymph-node metastasis.

Authors:  Sharvesh Raj Seeruttun; Wing Yan Cheung; Wei Wang; Cheng Fang; Zhi-Min Liu; Jin-Qing Li; Ting Wu; Jun Wang; Chun Liang; Zhi-Wei Zhou
Journal:  Gastroenterol Rep (Oxf)       Date:  2018-08-13

10.  qPCR Analysis Reveals Association of Differential Expression of SRR, NFKB1, and PDE4B Genes With Type 2 Diabetes Mellitus.

Authors:  Waseem Raza; Jinlei Guo; Muhammad Imran Qadir; Baogang Bai; Syed Aun Muhammad
Journal:  Front Endocrinol (Lausanne)       Date:  2022-01-03       Impact factor: 5.555
