Shifeng Kan1, Song Chai1, Wenhua Chen1, Bo Yu1. 1. Department of Rehabilitation, Shanghai General Hospital, Shanghai Jiaotong University, Shanghai 200080, P.R. China.
Abstract
Glioblastoma multiforme (GBM) is one of the most lethal and damaging types of human cancer. The current study was conducted to identify differentially methylated genes (DMGs) between GBM and normal controls, and to improve our understanding of GBM at the epigenetic level. The DNA methylation profile of GBM was downloaded from the Gene Expression Omnibus (GEO) database using the accession number GSE50923. The MethyAnalysis package was applied to identify DMGs between GBM and controls, which were then analyzed by functional enrichment analysis. Protein-protein interaction (PPI) networks were constructed using the hypermethylated and hypomethylated genes. Finally, transcription factors (TFs) that can regulate the hypermethylated and hypomethylated genes were predicted, followed by construction of transcriptional regulatory networks. Furthermore, another relevant dataset, GSE22867, was downloaded from the GEO database for data validation. A total of 476 hypermethylated and 850 hypomethylated genes were identified, which were mainly associated with the functions of 'G-protein-coupled receptors ligand binding', 'cytokine activity', 'cytokine-cytokine receptor interaction', and 'D-glutamine and D-glutamate metabolism'. The hypermethylated gene neuropeptide Y (NPY) and the hypomethylated gene tumor necrosis factor (TNF) demonstrated high degrees in the PPI network. Forkhead box protein A1 (FOXA1), potassium voltage-gated channel subfamily C member 3 (KCNC3) and caspase-8 (CASP8) exhibited high degrees in the transcriptional regulatory networks. In addition, the methylation profiles of NPY, TNF, FOXA1, KCNC3 and CASP8 were confirmed by another dataset. In summary, the present study systematically analyzed the DNA methylation profile of GBM using bioinformatics approaches and identified several abnormally methylated genes, providing insight into the molecular mechanism underlying GBM.
Glioblastoma multiforme (GBM) is one of the most lethal and damaging types of humancancer. The current study was conducted to identify differentially methylated genes (DMGs) between GBM and normal controls, and to improve our understanding of GBM at the epigenetic level. The DNA methylation profile of GBM was downloaded from the Gene Expression Omnibus (GEO) database using the accession number GSE50923. The MethyAnalysis package was applied to identify DMGs between GBM and controls, which were then analyzed by functional enrichment analysis. Protein-protein interaction (PPI) networks were constructed using the hypermethylated and hypomethylated genes. Finally, transcription factors (TFs) that can regulate the hypermethylated and hypomethylated genes were predicted, followed by construction of transcriptional regulatory networks. Furthermore, another relevant dataset, GSE22867, was downloaded from the GEO database for data validation. A total of 476 hypermethylated and 850 hypomethylated genes were identified, which were mainly associated with the functions of 'G-protein-coupled receptors ligand binding', 'cytokine activity', 'cytokine-cytokine receptor interaction', and 'D-glutamine and D-glutamate metabolism'. The hypermethylated gene neuropeptide Y (NPY) and the hypomethylated gene tumor necrosis factor (TNF) demonstrated high degrees in the PPI network. Forkhead box protein A1 (FOXA1), potassium voltage-gated channel subfamily C member 3 (KCNC3) and caspase-8 (CASP8) exhibited high degrees in the transcriptional regulatory networks. In addition, the methylation profiles of NPY, TNF, FOXA1, KCNC3 and CASP8 were confirmed by another dataset. In summary, the present study systematically analyzed the DNA methylation profile of GBM using bioinformatics approaches and identified several abnormally methylated genes, providing insight into the molecular mechanism underlying GBM.
Glioblastoma multiforme (GBM), which develops from astrocytes, is the most common primary brain tumor in adults and one of the most damaging types of humancancer. Despite aggressive multimodal treatment with surgery, radiotherapy and chemotherapy, the prognosis of GBM is extremely poor (1,2). Additionally, GBM demonstrates a high potential to infiltrate the brain parenchyma, which poses a challenge to the available treatment strategies. Typically, GBM results in mortality at 12–15 months post-diagnosis (1). Therefore, novel therapies for GBM are urgently required.Identifying the mechanisms underlying the development of GBM is important for the development of new treatments. The genetic alterations that affect genes controlling cell growth, apoptosis and invasion have been widely examined in GBM (3). Epigenetic alterations have also been identified to be involved in GBM by affecting the expression of cancer-associated genes alone or in combination with genetic mechanisms (4). Aberrant methylation of gene promoters is the most widely studied epigenetic change that occurs during oncogenesis. It is understood that increased methylation, termed hypermethylation, in the CpG island promotes carcinogenesis by silencing tumor suppressor genes, while loss of methylation, termed hypomethylation, enhances the transcriptional activation of oncogenes and induces chromosomal instability (5,6). Decreased expression of tumor suppressor genes, including retinoblastoma gene, phosphatase and tensin homolog, and TP53, is associated with CpG island promoter hypermethylation and has been reported in GBM (7–9). Epigenetic silencing of O6-methylguanine-DNA methyltransferase, a DNA repair gene that can protect cancer cells from chemotherapeutic alkylating agents, has been revealed to be significantly associated with longer survival times in patients with GBM who are treated with alkylating agents (10,11). Such epigenetic changes may be promising targets for epigenetic anticancer treatments. Indeed, the DNA-demethylating agents 5-azacitidine and 5-aza-2′-deoxycitidine have been approved by the Food and Drug Administration for the treatment of myelogenous leukemia and myelodysplastic syndromes (12).Bioinformatics tools and algorithms assist the processing and analysis of high-throughput DNA methylation data (13,14). For example, a previous study used a joint analysis of DNA methylation and gene expression data of GBM to demonstrate that changes in DNA methylation can be associated with survival outcome (15). In addition, a recent study used a computational approach to integrate gene expression and genomic or methylation data to investigate biological networks in GBM (16). The current study used bioinformatics approaches to reanalyze the DNA methylation data deposited by Lai et al (17). The present results may improve understanding of the epigenetic regulation mechanism of GBM and provide potential gene methylation biomarkers for GBM, which may contribute to the development of treatments.
Materials and methods
Microarray data
DNA methylation data from the GBM study by Lai et al (17) were retrieved from the National Center for Biotechnology Information Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/projects/geo/) with the accession number GSE50923. The DNA methylation profiles of 54 GBM samples and 24 control brain samples were previously investigated using Illumina HumanMethylation27 BeadChip. The patient details are presented in the original study by Lai et al (17). Only data from 31 GBM and 24 control brain samples were present in the GEO database, therefore, only these data were analyzed in the current study.
Identification of differentially methylated regions (DMRs)
The original methylation data were processed using the Bioconductor lumi package version 2.18.0, which is designed to process the Illumina microarray data (18,19). Following quality control and background correction, these data were scaled by quantile normalization as implemented in the lumi package. The methyAnalysis package version 1.8.0 (20) was applied to identify DMRs between GBM and controls. P<0.001 and a difference in methylation levels, calculated as the M-value (19), of >1 or <-1 were considered to indicate a statistically significant DMR. The identified DMRs were then annotated using the methyAnalysis package and DMRs located at gene promoter regions, within 2 kb upstream of the transcription start site, were selected for further analysis.
Functional enrichment analysis
Functional enrichment analysis was conducted to explore the most significant differentially methylated genes (DMGs) with relevant biological functions. Gene Ontology (GO) is a bioinformatics tool for annotating genes, gene products and sequences using defined GO terms (21). The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a comprehensive database linking genomic data, stored in the GENES database, with higher order functional information, stored in the PATHWAY database (22). Reactome is a free online database of biological pathways (23). The KEGG pathway, Reactome pathway and GO functional terms for the DMGs were identified using the following cut-off criteria: P<0.05 and number of over-represented genes >2.
Construction of a protein-protein interaction (PPI) network
The PPI data were retrieved from the STRING database (http://string.embl.de/). Subsequently, the DMGs were mapped into these interactions, and DMG pairs with an interaction score >0.9 were selected to construct the PPI networks, which were visualized using Cytoscape software version 3.1.0 (24).
Classification of DMGs
TSGene (25) is a database for tumor suppressor genes (TSGs) and the TAG database (26) provides information on well-characterized oncogenes and TSGs. The known TSGs and oncogenes were identified from the list of DMGs based on information retrieved from the TSGene and TAG databases.
Identification of transcription factors (TFs) regulating DMGs and construction of a transcriptional regulatory network
The TF regulation data were downloaded from the Encyclopedia of DNA Elements data portal (27,28), and TFs that regulate the hypermethylated genes and hypomethylated genes were identified. The hypergeometric distribution was used to assign a P-value for the prediction of these TFs and an adjusted P-value of <0.05 was considered significant. Furthermore, transcriptional regulatory networks of hypermethylated genes and hypomethylated genes were constructed using Cytoscape (version 3.6.0) (24).
Data validation
To validate the identified DMGs from GSE50923, another relevant DNA Methylation Profiling dataset, GSE22867 (29), was downloaded from the GEO database and used for data validation. GSE22867 included 55 GBM samples and 3 control brain samples. The platform was GPL8490 Illumina HumanMethylation27 BeadChip (HumanMethylation27_270596_v.1.2). The β-value was calculated based on the methylated and unmethylated signal of the sample data, and a t-test was implemented using the genefilter package (version 1.56.0) (30) to identify significant differentially methylated CpGs (P<0.05). In addition, selected CpG sites exhibited a mean methylation (β-value) difference ≥0.05 between the disease group and the control group. Subsequently, the genes covering the differentially methylated CpGs were identified and compared with the identified DMGs from GSE50923.
Results
Identification of DMGs
Based on the methyAnalysis package, a total of 2,407 DMRs were identified between the GBM samples and normal controls. Among these DMRs, 476 hypermethylated and 850 hypomethylated regions were located in gene promoters, which clearly distinguished GBM from normal samples (Fig. 1).
Figure 1.
Cluster analysis of differentially methylated genes between patients with glioblastoma multiforme and controls. Each row represents a single gene and each column represents a single sample. Red represents hypermethylated genes and green indicates hypomethylated genes.
To gain insight into the dysregulated biological processes induced by DMGs, separate functional enrichment analyses were performed for the hypermethylated and hypomethylated genes. The data indicated that certain KEGG pathways, including ‘neuroactive ligand-receptor interaction’, and ‘D-glutamine and D-glutamate metabolism’, were enriched by hypermethylated genes, while ‘cytokine-cytokine receptor interaction’ and ‘hematopoietic cell lineage’ were enriched by hypomethylated genes (Table I). GO terms were grouped into the following three categories: Biological process, molecular function and cellular component. The most significantly enriched GO biological process and molecular function terms were ‘anatomical structure development’ and ‘sequence-specific DNA-binding transcription factor activity’, respectively, while no statistically significant enriched cellular component terms were identified (Table II). For hypomethylated genes, the most significant GO terms were ‘defense response’, ‘cytokine activity’ and ‘extracellular region’ (Table II). Reactome pathway analysis demonstrated that both hypermethylated and hypomethylated genes were closely associated with the ‘G-protein-coupled receptor (GPCR) ligand binding pathway’ (Table III).
Table I.
Top five KEGG pathways enriched by hypermethylated genes and hypomethylated genes.
A, Hypermethylated genes
KEGG ID
KEGG pathway
P-value
4080
Neuroactive ligand-receptor interaction
1.18×10−3
471
D-Glutamine and D-glutamate metabolism
3.62×10−3
4114
Oocyte meiosis
6.76×10−3
250
Alanine, aspartate and glutamate metabolism
7.87×10−3
4971
Gastric acid secretion
1.02×10−2
B, Hypomethylated genes
KEGG ID
KEGG pathway
P-value
4060
Cytokine-cytokine receptor interaction
1.22×10−10
4640
Hematopoietic cell lineage
1.49×10−5
5150
Staphylococcus aureus infection
1.35×10−4
4610
Complement and coagulation cascades
2.72×10−4
KEGG, Kyoto Encyclopedia of Genes and Genomes.
Table II.
Top five GO terms enriched by hypermethylated genes and hypomethylated genes.
A, Hypermethylated genes
GO ID
GO term
P-value
Biological process
GO:0048856
Anatomical structure development
3.19×10−8
GO:0032502
Developmental process
4.06×10−8
GO:0044707
Single-multicellular organism process
4.85×10−8
GO:0032501
Multicellular organismal process
8.26×10−8
GO:0007275
Multicellular organismal development
3.47×10−7
Molecular function
GO:0003700
Sequence-specific DNA binding transcription factor activity
Top five Reactome pathways enriched by hypermethylated genes and hypomethylated genes.
A, Hypermethylated genes
Reactome ID
Name
P-value
500792
GPCR ligand binding
8.88×10−2
112316
Neuronal system
1.61×10−1
373076
Class A/1 (Rhodopsin-like receptors)
3.55×10−1
418597
G α (z) signaling events
4.98×10−1
1296072
Voltage-gated potassium channels
9.92×10−1
B, Hypomethylated genes
Reactome ID
Name
P-value
500792
GPCR ligand binding
5.94×10−4
211897
Cytochrome P450-arranged by substrate type
1.05×10−3
373076
Class A/1 (Rhodopsin-like receptors)
5.68×10−3
211945
Phase 1-Functionalization of compounds
6.96×10−3
GPCR, G-protein-coupled receptor.
Construction of PPI networks
A total of 476 hypermethylated and 850 hypomethylated genes were mapped to the STRING database and significant interactions with scores >0.9 were selected. A total of 83 hypermethylated genes and 204 hypomethylated genes were screened to separately construct PPI networks involving hypermethylated genes (Fig. 2) and hypomethylated genes (Fig. 3). The DMGsadenylate cyclase type 2 (ADCY2; degree, 18), neuropeptide Y (NPY; degree, 11) and somatostatin (SST; degree, 11) demonstrated the highest degrees, determined by the number of interactions in the hypermethylated gene network. Kininogen 1 (KNG1; degree, 27), proto-oncogene tyrosine protein kinase (SRC; degree, 18) and tumor necrosis factor (TNF; degree, 15) exhibited the highest degrees in the hypomethylated gene network.
Figure 2.
Protein-protein interaction networks for hypermethylated genes. White nodes with blue circles represent the hypermethylated genes and the node size indicates the degree value (larger nodes possess a higher degree). The degree display format of the node is gene_D: num, where ‘gene’ indicates the gene name, ‘D’ stands for degree and ‘num’ is the degree value. Lines/edges represent interactions.
Figure 3.
Protein-protein interaction network for hypomethylated genes. White nodes with blue circles represent the hypermethylated genes and the node size indicates the degree value (larger nodes possess a higher degree). The degree display format of the node is gene_D: num, where ‘gene’ indicates the gene name, ‘D’ stands for degree and ‘num’ is the degree value. Lines/edges represent interactions.
Classification of the DMGs
A total of 43 cancer-associated genes were identified among the hypermethylated genes, including 8 oncogenes and 25 TSGs, while 41 hypomethylated genes were revealed to be cancer-associated genes, including 10 oncogenes and 24 TSGs.
Prediction of TFs that target the DMGs and construction of transcriptional regulatory networks
A total of 55 TFs that could regulate the hypermethylated genes were identified, including polycomb repressive complex 2 subunit, neuron-restrictive silencer factor and cohesin complex component. Additionally, 55 TFs that could regulate the hypomethylated genes were identified, including signal transducer and activator of transcription 3, and forkhead box protein A1 (FOXA1). Notably, FOXA1 was also revealed to be hypermethylated. The transcriptional regulatory networks of hypermethylated genes and hypomethylated genes are presented in Figs. 4 and 5, respectively. The top 10 hypermethylated genes and hypomethylated genes with high node degrees in the regulatory networks are listed in Table IV, including the hypermethylated genes, FOXA1 (degree, 35), adenine phosphoribosyltransferase (degree, 17) and potassium voltage-gated channel subfamily C member 3 (KCNC3; degree, 15), and the hypomethylated genes, proliferating cell nuclear antigen-associated factor (degree, 14) and caspase-8 (CASP8; degree, 12).
Figure 4.
Transcriptional regulatory network of hypermethylated genes. Red nodes represent the hypermethylated gene, green arrows represent transcription factors and the size of node indicates the degree value. Lines/edges represent interactions.
Figure 5.
Transcriptional regulatory network of hypomethylated genes. Red nodes represent the hypermethylated gene, green arrows represent transcription factors and the size of node indicates the degree value. Lines/edges represent interactions.
Table IV.
Top 10 hypermethylated and hypomethylated genes with high node degree in the transcriptional regulatory networks.
Gene
Node degree
Hypermethylated
FOXA1
35
APRT
17
KCNC3
15
LSR
15
SNAI1
13
NUP98
13
PDE4D
12
RPL31
12
RPL31
12
CREM
12
Hypomethylated
KIAA0101
14
CASP8
12
TMA16
12
FAM107B
11
LINC00263
11
RXRB
11
PHLDA1
10
DEGS1
10
KAZALD1
10
C1RL-AS1
9
From the GSE22867 validation dataset, 2,151 DMGs were identified. As demonstrated in Fig. 6, a total of 690 overlapping DMGs were revealed. In particular, the hypermethylated genes NPY, FOXA1 and KCNC3, as well as the hypomethylated genes TNF and CASP8, were among the overlapping DMGs.
Figure 6.
A Venn diagram illustrating data validation using another dataset.
Discussion
The current study systemically analyzed the DNA methylation profile of GBM samples to improve our understanding of GBM at the epigenetic level and identify potential biomarkers as therapeutic targets for GBM.Differential analysis revealed that the methylation levels of 1,326 genes were altered in patients with GBM compared with those in the controls. The biological functions most associated with the DMGs were explored by functional enrichment analysis. The results revealed that DMGs were closely associated with the functions of ‘GPCR ligand binding’, ‘cytokine activity’, ‘cytokine-cytokine receptor interaction’, and ‘D-glutamine and D-glutamate metabolism’. Both ‘cytokine activity’ and ‘cytokine-cytokine receptor interaction’ are associated with inflammatory responses, and there is a close association between inflammation and cancer. Unresolved inflammation resulting from a failure in the regulation of the immune response creates a tumor microenvironment, an important aspect of tumorigenesis proliferation, survival and migration (31,32). GPCR ligands bind to GPCRs and activate the downstream signaling that regulates cellular physiology. Aberrant G-protein signaling is closely associated with cancer development and progression (33).To investigate whether abnormally methylated genes are associated with GBM, PPI networks of the screened genes were constructed and the degrees of the nodes were analyzed in the present study. The results demonstrated that the top three genes with the highest node degrees were ADCY2, NPY and SST in the hypermethylated genes network, and KNG1, SRC and TNF in the hypomethylated genes network. Among these genes, NPY and TNF were validated in the alternative dataset. NPY encodes a 36-amino acid neuropeptide that acts as a neurotransmitter in the brain and autonomic nervous system of humans. Using bioinformatic analysis, abnormal methylation of NPY has been observed in numerous types of cancer (34–36). Furthermore, a previous study reported that NPY is expressed in various types of intracranial tumor in humans and that NPY mRNA is detectable in the temporal lobe in higher quantities compared with that in tumors (37). In addition, the present study identified that TNF was hypomethylated, which may lead to upregulation of the gene. The expression of TNF-α has been identified to be increased in GBM (38), which corresponds with the current study. Therefore, NPY and TNF may be involved in GBM due to their abnormal methylation and may result in the disturbance of biological processes.The hypermethylated genes FOXA1 and KCNC3, as well as the hypomethylated gene CASP8, exhibited high degrees in the transcriptional regulatory networks, which was also confirmed by the second dataset. FOXA1 acts as a TF and has been demonstrated to be a potential regulator of humanglioma progression (39). KCNC3, encoding the Kv3. 3 voltage-gated potassium channel, is expressed in various neuronal cell types that are involved in motor function (40). Previous studies have demonstrated an association between KCNC3 expression and the poor prognosis of patients with GBM (41). Patients with GBM with higher KCNC3 expression exhibit improved survival times (42). Genomic loss of CASP8 by DNA methylation may result in tumor resistance to therapies targeting TNF-related apoptosis-inducing ligand-associated apoptosis pathways (43). In summary, the methylation of FOXA1, KCNC3 and CASP8 in GBM should be investigated in future studies, as this may promote the development of therapeutic approaches.In conclusion, the current study comprehensively analyzed the DNA methylation profile of GBM using bioinformatics approaches and identified several abnormally methylated genes, including NPY, TNF, FOXA1, KCNC3 and CASP8. The findings of the present study improve the understanding of the molecular mechanism underlying GBM, and provide potential biomarkers for GBM and the development of novel treatment strategies. However, the number of samples included in this study was relatively small. Experimental verification and additional studies with larger sample sizes are required to confirm the present results.
Authors: M A Harris; J Clark; A Ireland; J Lomax; M Ashburner; R Foulger; K Eilbeck; S Lewis; B Marshall; C Mungall; J Richter; G M Rubin; J A Blake; C Bult; M Dolan; H Drabkin; J T Eppig; D P Hill; L Ni; M Ringwald; R Balakrishnan; J M Cherry; K R Christie; M C Costanzo; S S Dwight; S Engel; D G Fisk; J E Hirschman; E L Hong; R S Nash; A Sethuraman; C L Theesfeld; D Botstein; K Dolinski; B Feierbach; T Berardini; S Mundodi; S Y Rhee; R Apweiler; D Barrell; E Camon; E Dimmer; V Lee; R Chisholm; P Gaudet; W Kibbe; R Kishore; E M Schwarz; P Sternberg; M Gwinn; L Hannick; J Wortman; M Berriman; V Wood; N de la Cruz; P Tonellato; P Jaiswal; T Seigfried; R White Journal: Nucleic Acids Res Date: 2004-01-01 Impact factor: 16.971
Authors: Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker Journal: Genome Res Date: 2003-11 Impact factor: 9.043
Authors: I Knerr; S Schuster; P Nomikos; M Buchfelder; J Dötsch; E Schoof; R Fahlbusch; W Rascher Journal: Neuropathol Appl Neurobiol Date: 2001-06 Impact factor: 8.090
Authors: Elise P W Jenkins; Alina Finch; Magda Gerigk; Iasonas F Triantis; Colin Watts; George G Malliaras Journal: Adv Sci (Weinh) Date: 2021-07-22 Impact factor: 16.806