Hao Li1, Beiqin Yu2, Jianfang Li2, Liping Su2, Min Yan3, Jun Zhang1, Chen Li3, Zhenggang Zhu3, Bingya Liu2. 1. Shanghai Key Laboratory of Gastric Neoplasms, Shanghai Institute of Digestive Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People's Republic of China; Department of Oncology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People's Republic of China. 2. Shanghai Key Laboratory of Gastric Neoplasms, Shanghai Institute of Digestive Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People's Republic of China. 3. Shanghai Key Laboratory of Gastric Neoplasms, Shanghai Institute of Digestive Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People's Republic of China; Department of Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People's Republic of China.
Abstract
To explore the patterns of gene expression in gastric cancer, a total of 26 paired gastric cancer and noncancerous tissues from patients were enrolled for gene expression microarray analyses. Limma methods were applied to analyze the data, and genes were considered to be significantly differentially expressed if the False Discovery Rate (FDR) value was < 0.01, P-value was <0.01 and the fold change (FC) was >2. Subsequently, Gene Ontology (GO) categories were used to analyze the main functions of the differentially expressed genes. According to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, we found pathways significantly associated with the differential genes. Gene-Act network and co-expression network were built respectively based on the relationships among the genes, proteins and compounds in the database. 2371 mRNAs and 350 lncRNAs considered as significantly differentially expressed genes were selected for the further analysis. The GO categories, pathway analyses and the Gene-Act network showed a consistent result that up-regulated genes were responsible for tumorigenesis, migration, angiogenesis and microenvironment formation, while down-regulated genes were involved in metabolism. These results of this study provide some novel findings on coding RNAs, lncRNAs, pathways and the co-expression network in gastric cancer which will be useful to guide further investigation and target therapy for this disease.
To explore the patterns of gene expression in gastric cancer, a total of 26 paired gastric cancer and noncancerous tissues from patients were enrolled for gene expression microarray analyses. Limma methods were applied to analyze the data, and genes were considered to be significantly differentially expressed if the False Discovery Rate (FDR) value was < 0.01, P-value was <0.01 and the fold change (FC) was >2. Subsequently, Gene Ontology (GO) categories were used to analyze the main functions of the differentially expressed genes. According to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, we found pathways significantly associated with the differential genes. Gene-Act network and co-expression network were built respectively based on the relationships among the genes, proteins and compounds in the database. 2371 mRNAs and 350 lncRNAs considered as significantly differentially expressed genes were selected for the further analysis. The GO categories, pathway analyses and the Gene-Act network showed a consistent result that up-regulated genes were responsible for tumorigenesis, migration, angiogenesis and microenvironment formation, while down-regulated genes were involved in metabolism. These results of this study provide some novel findings on coding RNAs, lncRNAs, pathways and the co-expression network in gastric cancer which will be useful to guide further investigation and target therapy for this disease.
Gastric cancer (GC) is one of the most common cancers worldwide, and its incidence is particularly high in Eastern Asia, especially in China. Approximately 952,000 new cases of stomach cancer were diagnosed worldwide in 2012, and half of them occurred in Eastern Asia (mainly in China) [1]. In China, the majority of patients with GC are diagnosed at a late stage with poor prognosis. Therefore, elucidating the molecular mechanisms underlying GC progression is essential to identifying key biomarkers and developing effective targeted therapies.Over the last decade, gene expression microarrays have become a common tool for examining gene transcript levels in cancer research. Microarray data is used for a wide variety of analyses, such as unsupervised clustering, classification, differential expression analysis, and expression mapping of quantitative trait loci [2]. It not only helps to identify key dysfunctional genes in cancer but provides genome-wide information on gene expression at one time as well[3,4]. In this study, we performed a genome-wide survey of the expression of lncRNAs and mRNAs from paired samples of primary gastric cancer tissues and noncancerous tissues, to profile the differentially expressed lncRNAs and coding RNAs. Study of these data will provide valuable information on the mechanism of carcinogenesis and allow discovery of key genes that may act as future targets of anti-cancer therapy.
Methods and Materials
Ethical statement
Written informed consent was obtained from all participants. The study was approved by the Human Research Ethics Committee of Ruijin Hospital, Shanghai Jiao Tong University, School of Medicine.
Tissue samples
Tissues were taken from primary gastric carcinomas from untreated patients who underwent D2 radical gastrectomy in Shanghai Ruijin Hospital. For each cancer tissue, a paired noncancerous tissue sample was collected from the adjacent region at the same time. The size of each sample was around 0.1cm3. All the samples were placed in RNALater within 15 minutes after excision and stored in liquid nitrogen until RNA extraction. In this study, 32 paired tissues were collected for the microarray and 26 paired samples were enrolled for the next-step analysis of GO, pathway and network after quality control using 3D Principal component analysis (3D-PCA) and Cluster analysis.
Microarray experiments
Agilent SurePrint G3 Human GE 8x60K Microarray (Design ID: 028004) was employed in this study. Total RNA was isolated and amplified using a Low Input Quick Amp Labeling Kit, One-Color (Cat#5190–2305, Agilent technologies, US). Then, the labeled cRNAs were purified by a RNeasy mini kit (Cat#74106, QIAGEN, Germany).Based on the manufacturer’s instructions, each slide was hybridized with 600ng Cy3-labeled cRNA using a Gene Expression Hybridization Kit (Cat#5188–5242, Agilent technologies, US) and washed by the Gene Expression Wash Buffer Kit (Cat#5188–5327, Agilent technologies, US).An Agilent Microarray Scanner (Cat#G2565CA, Agilent technologies, US) and Feature Extraction software 10.7 (Agilent technologies, US) were applied to scan each slide with the same settings shown as follow, Dye channel: Green, Scan resolution = 3μm, 20bit. The raw data were normalized by the Quantile algorithm, Gene Spring Software 11.0 (Agilent technologies, US) (detailed in S5 Table).
Limma
Linear models and empirical Bayes methods were applied to analyze the data in this study. The resulting P-values were adjusted using the BH FDR algorithm. There were three standards for us to consider that a gene was significantly differentially expressed, the FDR value was <0.01, P-value was <0.01 and the fold change was >2. (detailed in S5 Table)
GO category
We performed Gene Ontology (GO) analyses to analyze the functions of the differentially expressed genes in our microarray according to the key functional classification of The National Center for Biotechnology Information (NCBI). Generally, Fisher’s exact test and the χ
2 test were applied to classify the GO category, and the false discovery rate (FDR, ) was calculated to correct the P-value (N
refers to the number of Fisher’s test P-values less than the χ
2 test P-values). The enrichment Re was given by: Re = (n
/n)/(N
/N) in the significant categories (N
is the number of differential genes within the particular category, n is the total number of genes within the same category, n
is the number of differential genes in the entire microarray, and N is the total number of genes in the microarray.)(detailed in S5 Table).
Pathway analyses
Pathway annotations of the differential exressed genes were obtained from KEGG (http://www.genome.jp/kegg/). Pathway categories with a FDR <0.01 were marked. The enrichment of significant pathways was given by: enrichment = /, which helped us to locate more significant pathways in our study (n
is the number of differential genes within the particular pathway, n
is the total number of genes within the same pathway, N
is the number of differential genes which have at least one pathway annotation, and N
is the number of genes which have at least one pathway annotation in the entire microarray.) (detailed in S5 Table).
Gene-Act network
According to the KEGG database, one gene may be involved in several pathways or interact with several other genes. All the gene—gene interactions were pooled together to build the Gene-Act network based on the differential pathways, which helped us to reveal the signaling pathways and key regulatory genes in GC.
Co-expression network
Gene co-expression Network was built according to the normalized signal intensity of specific expression genes. Degree centrality is defined as the number of links one node has to another, which determines the relative importance of genes. What’s more, k-cores were applied as a method of simplifying the graph topology analyses. Core regulatory factors (genes) which have the highest degrees connect most adjacent genes and build the structure of the network (detailed in S5 Table).
Real-time quantitative PCR
Total RNA was extracted from tissues using the Trizol reagent (Invitrogen) according to the manufacturer’s instructions. The quantitative real-time polymerase chain reaction (PCR) was performed by using SYBR-green PCR Master Mix in a Fast Real-time PCR 7500 System (Applied Biosystems). The primers of the 10 genes were showed in S4 Table. PCR reactions were performed at 50°C for 2 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. ΔCt was calculated by subtracting the Ct of β-actin RNA (control) from the Ct of the RNA of sample, respectively. ΔΔCt was then calculated by subtracting the ΔCt of the control from the ΔCt of the sample. Fold change was calculated by the equation 2-ΔΔCt.
Statistical analysis
SPSS software 19 and Microsoft Excel 2010 was used to analyze the data. Expression levels between cancer tissues and adjacent noncancerous tissues were analyzed by paired-sample t-tests. P-values below 0.05 were regarded as statistically significant.
Results
Microarray analyses
In total, 42,405 human genes were profiled in our study by using an Agilent G3 Human GE 8x60K microarray. We have submitted our dataset in the repository of “Gene Expression Omnibus” and the accession number was “GSE65801” (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE65801). We used linear models and empirical Bayes methods to analyze the data (see Methods). There were 2371 mRNAs and 350 lncRNAs considered as the differentially expressed genes by limma for the next-step analysis (Fig 1A).
Fig 1
Differentially expressed genes in a gene expression microarray of 26 pairs of gastric cancer and noncancerous tissues.
A) Volcano plot showing the differential genes (red dots) in the expression microarray (P-value <0.01, FDR <0.01). B) Clustering heatmap showing the differential lncRNAs. Each column represents one sample and each row represents one differential lncRNA. C) Clustering heatmap showing the differential mRNAs. Each column represents one sample and each row represents one differential mRNA.
Differentially expressed genes in a gene expression microarray of 26 pairs of gastric cancer and noncancerous tissues.
A) Volcano plot showing the differential genes (red dots) in the expression microarray (P-value <0.01, FDR <0.01). B) Clustering heatmap showing the differential lncRNAs. Each column represents one sample and each row represents one differential lncRNA. C) Clustering heatmap showing the differential mRNAs. Each column represents one sample and each row represents one differential mRNA.Among all 2371 differential mRNAs, there are 1142 mRNAs down-regulated and 1229 mRNAs up-regulated in our observation on alterations of gene expression between gastric cancer and control tissues (Fig 1C). Most of the differential mRNAs have been proven to be correlated with carcinogenesis and metastasis in most types of cancer (Table 1). The genes such as GKN2, PGC, MUC6, CHIA, PSCA and FBP2 were among the top 20 down-regulated genes, while KLK8, SFRP4, INHBA, CLDN1, CST1, FAP, SPP1, OLFM4, and KRT17 were among the top 20 up-regulated genes (Table 1). However, some genes such as HOXC9, FNDC1, STRA6, KCNE2, PGA3 and KCNJ16 haven’t been reported in gastric cancer and their roles remain unknown (Table 1).
Table 1
top 40 differential expressed mRNAs in gastric cancer
Log2FC<0: down-regulated (left panel), Log2FC>0: up-regulated (right panel)In addition, we found 193 down-regulated lncRNAs and 156 up-regulated lncRNAs among a total of 350 differential lncRNAs based on the profiling (Fig 1B). Most of the lncRNAs have not been given an official names and their functions remain unknown. However, some have been reported playing critical roles in cancer, such as H19, GUCY1B2, MEG3 and AKR7L (Table 2).
Table 2
top 60 differential expressed lncRNAs in gastric cancer
Log2FC<0: down-regulated (left panel), Log2FC>0: up-regulated (right panel)In our previous report [36], the fold change (FC) of H19 in 74 gastric cancer versus paired noncancerous tissues was 6.015, with a P-value of 0.017. This result was consistent with the data of H19 (Absolute FC = 6.06) in this microarray analyses. Furthermore, over-expression of H19 contributes to the proliferation, migration, invasion and metastasis of gastric cancer.
Gene Ontology categories
All the differentially expressed genes were classified into different functional categories according to the Gene Ontology (GO) project for biological processes. Based on our microarray data, GO analyses indicated that 208 GO terms were enriched (P<0.01, FDR<0.01) (S1 Table). The primary GO categories for 170 up-regulated GO terms were focused on cell adhesion, angiogenesis, multicellular organism development, axon guidance, skeletal system development, collagen fibril organization, positive regulation of angiogenesis, wounding and negative regulation of cell proliferation (Fig 2A). The main GO categories for down-regulated genes were digestion, xenobiotic metabolic process, transmembrane transport, ion transport, small molecule metabolic process, negative regulation of growth, glutathione metabolic process, cellular response to cadmium ion and metabolic process (Fig 2B).
Fig 2
GO categories based on differential genes in the expression microarray.
A) The significant GO categories for up-regulated genes. B) The significant GO categories for down-regulated genes. P-value <0.01 and FDR <0.01 were used as a threshold to enroll significant GO categories.
GO categories based on differential genes in the expression microarray.
A) The significant GO categories for up-regulated genes. B) The significant GO categories for down-regulated genes. P-value <0.01 and FDR <0.01 were used as a threshold to enroll significant GO categories.According to the differential genes and functions, we built a GO Tree to explore the interactions among all the differential GO categories. The diversity in these categories when comparing cancerous and control tissues suggested that gastric cancer may be associated with significantly up-regulated cell migration, cell proliferation, angiogenesis, cell—cell adhesion and cell surface receptor signaling pathways, while cell metabolism processes and ion transmembrane transport are down-regulated (Fig 3).
Fig 3
Interaction of GO categories (GO Tree) based on analyses of differential GO categories.
Red dots represent up-regulated GO categories and green dots represent down-regulated GO categories.
Interaction of GO categories (GO Tree) based on analyses of differential GO categories.
Red dots represent up-regulated GO categories and green dots represent down-regulated GO categories.Pathway analyses were used to identify the significant pathways associated with the differentially expressed genes according to KEGG. There were 32 up-regulated pathways and 31 down-regulated pathways based on our data (Fig 4). Furthermore, the pathway profiling was consistent with the results for the GO categories in cancer-related biological functions. Our data showed some differential genes highly up-regulated which suggested their involved pathways were activiated. For example, SFRP4, WNT11, FZD2, MYC were highly expressed in cancer tissues which represent the Wnt pathway was activiated and BCL2A1, ICM1, TNFSF14 in NF-κB pathway were highly expressed as well. Most of the cancer-related signaling pathways such as JAK/STAT, Wnt, NF-κB, PI3K, mTOR, Hedgehog and Notch pathways were activated in gastric cancer compared with noncancerous tissues based on our data (S2 Table). The up-regulated pathways which were focused on cell adhesion, transcriptional dysregulation, carcinogenesis and differentiation were correlated with tumorogenesis and metastasis (Fig 4A). However, the down-regulated pathways were generally responsible for metabolism (Fig 4B).
Fig 4
Pathway analysis of differentially expressed genes according to the KEGG database.
A) Top-ranking up-regulated pathways identified by KEGG. B) Top-ranking down-regulated pathways identified by KEGG. Differential pathways are listed according to P-value <0.01 and FDR <0.01.C) Pathway-Act network showing the interaction of differential pathways. The red dots represent up-regulated pathways and the green dots represent down-regulated pathways.
Pathway analysis of differentially expressed genes according to the KEGG database.
A) Top-ranking up-regulated pathways identified by KEGG. B) Top-ranking down-regulated pathways identified by KEGG. Differential pathways are listed according to P-value <0.01 and FDR <0.01.C) Pathway-Act network showing the interaction of differential pathways. The red dots represent up-regulated pathways and the green dots represent down-regulated pathways.Based on GO categories and pathway analysis, one gene may be involved in several pathways or interact with several other genes. We pooled the differential genes and built a network of the interactions of differentially expressed genes. A high degree protein regulates or is regulated by many other proteins, which implies an important role in the Gene-Act network (S3 Table). The glutathione S-transferase (GST) family, cytochrome P450 (CYP) family, UDP glucuronosyltransferase 2 (UGT2) family, Epidermal Growth Factor Receptor (EGFR) family and cAMP-dependent protein kinase catalytic beta (PRKACB) were at the core of the gene—gene interaction network. They may play key roles in the network because they possessed the strongest degree (degree >25) centralities (gene-gene interactions) (Fig 5). It has been reported that GST, EGFR and PRKACB are responsible for signal transduction pathways involved in tumor growth and differentiation in different type of cancers [42,43].
Fig 5
Gene-Act network of differential genes according to pathways in the database.
Red dots represent up-regulated genes and green dots represent down-regulated genes. The arrows indicate the connection and regulatory relationship between two genes. Genes that have more connections with other genes have a higher degree score.
Gene-Act network of differential genes according to pathways in the database.
Red dots represent up-regulated genes and green dots represent down-regulated genes. The arrows indicate the connection and regulatory relationship between two genes. Genes that have more connections with other genes have a higher degree score.
Gene co-expression network
We produced a gene co-expression network based on the differentially expressed genes, proteins and protein complex in cancer tissues and noncancerous (control) tissues, respectively. Compared with the control, the connections between genes in cancer tissues were less, which suggested that most of the physiological gene—gene interactions and linkages in normal tissues had been broken or lost in the cancer tissues (Fig 6A and 6B). The genes with high degree and k-core which means they possessed most of the interactions with other geneswere known as key genes in the interaction network (Fig 6B) including TRO, GPR124, TIMP2, EMCN, SLIT3, HTRA1, SPARC, LAMA4 and MEOX2 (Table 3). They were responsible for cell signaling, adhesion, angiogenesis, migration, growth and metastasis.
Fig 6
Co-expression network of genes differentially expressed between normal and cancer tissues.
A) Co-expressed genes and their network in noncancerous tissue. B) Co-expressed genes and their network in gastric cancer. The greater the value of K-score, the stronger the differentially expressed genes are co-expressed. The scale of the K-score is from 1 to 21 in normal tissue but from 1 to 5 in cancer tissue.
Table 3
top 20 differential genes with highest Degree and K-Core in co-expression network
Gene Name
Description
Degree
K-core
HEG1
HEG homolog 1 (zebrafish)
15
5
COL14A1
collagen, type XIV, alpha 1
14
5
TRO[44]
trophinin
14
5
GPR124[45]
G protein-coupled receptor 124
12
5
IGFBP4
insulin-like growth factor binding protein 4
11
5
ECSCR
endothelial cell-specific chemotaxis regulator
10
3
EMCN[47]
endomucin
10
3
SERPINF1
serpin peptidase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 1
Co-expression network of genes differentially expressed between normal and cancer tissues.
A) Co-expressed genes and their network in noncancerous tissue. B) Co-expressed genes and their network in gastric cancer. The greater the value of K-score, the stronger the differentially expressed genes are co-expressed. The scale of the K-score is from 1 to 21 in normal tissue but from 1 to 5 in cancer tissue.
Confirmation of microarray results by qPCR
We performed Quantitative Real-time PCR (qPCR) on 6 up-regulated genes (COL1A, BGN, SPP1, MELK, IGFBP4, SPARC) and 4 down-regulated genes (PGC, SST, MT1X, S100P) to verify our data in gastric cancer tissues (Tumor) and noncancerous tissues (Normal). The expression ratios of these 10 genes (Tumor/Normal) from qPCR are consistent with those from microarray (S4 Table). It suggested the data of differential genes expression from microarray was reliable. What’s more, our team has been worked on some of the differential genes such as PHF10[55], CEACAM6[56], SFRP1[57], SOX11[58], CLDN1[59] to investigate their expression and functions in gastric cancer and the results perfect proved our microarray data.
Discussion
Microarray gene-expression analyses on gastric cancer have previously been used to predict diagnostic markers [60] and to identify gene expression patterns associated with prognosis [61,62], but it hasn’t been used to reveal molecular interactions among lncRNAs and mRNAs in GC. In this study, we analyzed 26 gastric cancer tissues with paired noncancerous tissues and profiled the genes differentially expressed according to their GO categories, pathways, Gene-Act network and Co-Expression network.The gene expression results were obtained by using an Agilent G3 Human GE 8x60K microarray, which not only covers the transcriptome databases for mRNA targets but also includes probes for lncRNAs (long non-coding RNAs). With the combination of mRNA and lncRNAs, it can perform two experiments on a single microarray and predict lncRNA function and interaction with mRNAs. The analyses revealed a set of genes that were differentially expressed between gastric cancer and normal tissue. Some of them have been reported previously in gastric or other cancers. For example, expression of gastrokine-2 (GKN2) was significantly down-regulated or absent in gastric cancer cell lines, gastric intestinal metaplasia, and tumor tissues. Over-expression of GKN2 contributed to cell proliferation, migration, and invasion of gastric cancer and arrested the cell cycle at the G1–S transition phase [6]. In contrast, levels of expression of inhibin beta A (INHBA) were significantly higher in cancer tissue than in adjacent normal mucosa, and it is regarded as an independent prognostic factor in gastric cancer [22]. In addition, we discovered some novel genes, such as TMEM184A, PSAPL1, KIAA1199, CLRN3 and FNDC1, which have not been reported in gastric cancer previously, and their roles in cancer remain unknown.One of the advantages of our gene expression microarray analysis is that it represented the expression of lncRNAs and mRNAs so that both could be investigated together. Our previously report on the role of lncRNA H19 and its network in GC[36] was based on this microarray data. However, most of the lncRNAs such as DRD5, FMO6P, SNAR-A3 and TPRXL showed in our microarray haven’t been identified and need further investigation to clarify their roles in gastric cancer.Based on our gene expression profiling data, the genes and their functions activated in gastric cancer were responsible for proliferation, adhesion, migration and metastasis, which was consistent with the results from pathway analyses. Interestingly, we discovered that most of the cancer-related signaling pathways reported previously such as Notch, mTOR and Hedgehog were activated in GC based on our data. These results support the viewpoint that heterogeneity is the characteristic of GC. Comparison of the co-expression network between normal tissues and cancer suggested that the expression, functions and interactions of the majority of physiological gene were lost or damaged in gastric cancer, whereas proliferation, migration and metastasis were abnormally enhanced. These interesting findings match the characteristics of cancer, such as anaplasia and dedifferentiation. These differentially expressed genes involved in signaling pathways acted as key genes in co-expression network might be the potential targets of anti-cancer therapy or diagnostic markers in the future.
Pathway analyses of differential genes
(XLSX)Click here for additional data file.
GO analyses of differential genes
(XLSX)Click here for additional data file.
Gene-Act network of differential genes
(XLSX)Click here for additional data file.
Primers and verification
(DOCX)Click here for additional data file.(DOCX)Click here for additional data file.
Authors: F B Fahlbusch; M Ruebner; H Huebner; G Volkert; C Zuern; F Thiel; M Koch; C Menendez-Castro; D L Wachter; A Hartner; W Rascher Journal: Placenta Date: 2013-08-28 Impact factor: 3.481
Authors: Lina Seiz; Julia Dorn; Matthias Kotzsch; Axel Walch; Nicolai I Grebenchtchikov; Apostolos Gkazepis; Barbara Schmalfeldt; Marion Kiechle; Jane Bayani; Eleftherios P Diamandis; Rupert Langer; Fred C G J Sweep; Manfred Schmitt; Viktor Magdolen Journal: Biol Chem Date: 2012-04 Impact factor: 3.915
Authors: Olga Kim; Jung Hwan Yoon; Won Suk Choi; Hassan Ashktorab; Duane T Smoot; Suk Woo Nam; Jung Young Lee; Won Sang Park Journal: J Cell Physiol Date: 2014-06 Impact factor: 6.384
Authors: Caroline E Ford; Eve Jary; Sean Si Qian Ma; Sheri Nixdorf; Viola A Heinzelmann-Schwarz; Robyn L Ward Journal: PLoS One Date: 2013-01-11 Impact factor: 3.240
Authors: A Takeno; I Takemasa; Y Doki; M Yamasaki; H Miyata; S Takiguchi; Y Fujiwara; K Matsubara; M Monden Journal: Br J Cancer Date: 2008-09-30 Impact factor: 7.640