Haiguo Zhang1,2, Guanli Song3, Guanbo Song4, Ruolei Li5, Min Gao5, Ling Ye2, Chengfang Zhang5. 1. Department of Hematology, Qilu Hospital, Shandong University, Jinan, Shandong, P.R. China. 2. Department of Hematology, Jining NO.1 People's Hospital, Jining, Shandong, P.R. China. 3. Department of Preventive and Health Care, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, P.R. China. 4. Department of Clinical Laboratory, Jining Chinese Medicine Hospital, Jining, Shandong, P.R. China. 5. Department of Clinical Laboratory, Jining NO.1 People's Hospital, Jining, Shandong, P.R. China.
Abstract
BACKGROUND: The aim of this study is to find the potential survival related DNA methylation signature capable of predicting survival time for acute myelocytic leukemia (AML) patients. METHODS: DNA methylation data were downloaded. DNA methylation signature was identified in the training group, and subsequently validated in an independent validation group. The overall survival of DNA methylation signature was performed. Functional analysis was used to explore the function of corresponding genes of DNA methylation signature. Differentially methylated sites and CpG islands were also identified in poor-risk group. RESULTS: A DNA methylation signature involving 8 DNA methylation sites and 6 genes were identified. Functional analysis showed that protein binding and cytoplasm were the only two enriched Gene Ontology terms. A total of 70 differentially methylated sites and 6 differentially methylated CpG islands were identified in poor-risk group. CONCLUSIONS: The identified survival related DNA methylation signature adds to the prognostic value of AML.
BACKGROUND: The aim of this study is to find the potential survival related DNA methylation signature capable of predicting survival time for acute myelocytic leukemia (AML) patients. METHODS: DNA methylation data were downloaded. DNA methylation signature was identified in the training group, and subsequently validated in an independent validation group. The overall survival of DNA methylation signature was performed. Functional analysis was used to explore the function of corresponding genes of DNA methylation signature. Differentially methylated sites and CpG islands were also identified in poor-risk group. RESULTS: A DNA methylation signature involving 8 DNA methylation sites and 6 genes were identified. Functional analysis showed that protein binding and cytoplasm were the only two enriched Gene Ontology terms. A total of 70 differentially methylated sites and 6 differentially methylated CpG islands were identified in poor-risk group. CONCLUSIONS: The identified survival related DNA methylation signature adds to the prognostic value of AML.
Acute myeloid leukemia (AML) is a highly aggressive hematologic malignancy characterized by a vast proliferation of immature myeloid blasts that accumulate in the bone marrow and blood. AML is closely correlated to cytokine networks of proliferation, differentiation and apoptosis of leukemic cells [1]. AML is caused by different factors including radiation, mutations and carcinogens [2-5]. The disease can progress quickly and can become fatal in a short period of time without treatment. Known prognostic factors of AML include age, mutations, complex karyotype, the antecedent hematologic disease, presence of elevated white blood cell counts and prior chemo or radiotherapy for another malignancy [6]. Intensive chemotherapy is initially effective in most patients with AML, however, the surviving LIC clones repopulate the disease and lead to subsequent disease relapse and poor prognosis [7]. In addition, the treatment of elderly patients and patients with relapsed refractory remains a challenge [8-10]. Therefore, an improved understanding of the molecular mechanisms underlying of AML could be helpful to improve the treatment efficacy to prolong the survival time for patients.DNA methylation is an important epigenetic mechanism in regulating gene expression [11]. It is pointed out that epigenetic disturbances have been involved in the pathogenesis of leukemia [12]. Jiang Y et al found that progression from myelodysplastic syndrome to AML was correlated to increased aberrant DNA methylation [13]. Previous studies have investigated genome-wide methylation in AML [14]. In AML, the presence of common methylation patterns in p15 and E-cadherin has been described [15, 16]. In addition, methylation of secreted frizzled related protein (sFRP)1, sFRP12, sFRP13 and sFRP15 with corresponding transcriptional silencing has been found in AML cell lines [17]. The survival analysis showed that GATA binding protein 4 (GATA4) promoter methylation was significantly associated with shorter overall survival of pediatric AML [18]. Thus it can be seen that DNA methylation may play a crucial role in the development of AML. In this study, we aimed to find potential survival-related DNA methylation signature in AML, which may pave the way for the development of novel tumor markers and therapeutic targets for AML.
Materials and methods
DNA methylation data retrieval and analysis
DNA methylation data were downloaded from the TCGA dataset (http://tcga-data.nci.nih.gov/tcga). The data was derived from blood of AML. Among which, there were 195 samples with DNA methylation information. There were 188 samples with follow-up information. Finally, we selected 182 samples with both DNA methylation information and follow-up information. In order to improve data accuracy, the DNA methalytion sites were first preprocessed. DNA methalytion sites on sex chromosomes were excluded. Considering the heterogeneity of blood, based on the DNA methylation data, blood components were then predicted using the R packages in RefFreeEWAS. Finally, all samples were divided into the training group (127 cases) and validation group (55 cases) randomly. There were no overlapped cases between two groups. The chi-square test and t-test were used to analyze the statistical difference of clinical index between the two groups. And there was no significant difference in age, race, gender, vital status, survival time and disease risk result between the two groups. The clinical characteristics of these two groups were shown in Table 1.
Table 1
The clinical characteristics for training group and validation group.
Clinical index
Training group (n = 127)
Validation group (n = 55)
P value
Age
Mean ± SD
55.28346±16.05244
55.65455±16.52632
0.8886992
Median
59
58
Race
Asia, Black or African
1
1
0.4152429
American
6
5
White
119
48
NA
1
1
Gender
Female
54
29
0.268031
Male
73
26
Vital status
Alive
50
14
0.1017496
Dead
77
41
Survival time
Mean ± SD
543.4724± 601.9671
570.3091 ±539.5071
0.7667396
Median
334
365
Disease risk
Favorable
26
9
0.7268791
Intermediate/Normal
73
33
Poor
25
13
NA: Not applicable.
NA: Not applicable.
Identification of survival related DNA methylation sites
In order to select the survival associated DNA methylation sites, all DNA methylation sites were analyzed by the single factor Cox proportional hazard (CoxPH) regression after adjustment of age, race, gender, blood constituent and cytogenetic risk. Similarly, after further adjustment of age, race, gender, blood constituent and cytogenetic risk, the multi-factor CoxPH regression analysis was used for identification of DNA methylation signature in the survival evaluation. The statistical significance was set at p<0.001. In order to investigate the characteristic of DNA methylation signature, the Illumina Infinium HumanMethylation450 BeadChips Assay was utilized for DNA methylation sites annotation.
Functional analysis of survival related DNA methylation signature genes
In order to study the biological function of survival-related DNA methylation signature genes, the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed by using the online software GeneCodis3 (http://genecodis.cnb.csic.es/analysis). And the threshold of false discovery rate (FDR) < 0.05 was set as the criteria of statistical significance.
Survival time analysis of DNA methylation signature in the training group and validation group
The risk score (RS) of identified DNA methylation signature was calculated by the following equation:
N presents the number of DNA methylation; Methi presents the spectrum of ith DNA methylation site; Ci represents regression coefficient of ith DNA methylation site in the multi-factor CoxPH regression analysis. Kaplan-Meier survival curves were drawn and compared among subgroups using log-rank tests.In addition, the receiver operating characteristic (ROC) analysis was performed to assess the 5 years’ survival value of DNA methylation signature by using pROC package in R language. The area under the curve (AUC) under binomial exact confidence interval was calculated and the ROC curve was generated.
Identification of risk related DNA methylation sites
In order to identify the risk-related DNA methylation sites between patients with poor-risk (38 cases) and favorable-risk (35 cases), the related DNA methylation data were downloaded. We selected CpG sites based on differential methylation value calculated as mean (β case) − mean (β normal) (Δβ) combined with the false discovery rate (FDR) values. Finally, the threshold of |Δβ|>0.2 and FDR<0.05 was set as the criteria of statistical significance.
Results
Survival related DNA methylation signature
After original data preprocess, a total of 3884 DNA methylation sites were identified in the training group. These DNA methylation sites were used for the single factor and multi-factor CoxPH regression analysis. The result showed that 8 DNA methylation sites were identified. 8 DNA methylation sites were located in 4 CpG islands (chr12:81102034–81102716, chr17:78863569–78863813, chr3:10183305–10183941 and chr6:29600192–29600661) and 6 genes (MYF6, RPTOR, MMP10, SH3PXD2B, VHL and GABBR1). The annotation of 8 DNA methylation sites was shown in Table 2.
Table 2
The annotation of 8 DNA methylation sites.
Site
Strand
Gene
Gene context
CpG island
CpG island context
cg26400830
F
cg20171297
R
MYF6
1stExon; 5'UTR
chr12:81102034–81102716
N_Shore
cg09891288
R
RPTOR
Body; Body
chr17:78863569–78863813
Island
cg02061229
R
MMP10
Body
cg19979108
R
SH3PXD2B
Body
cg20916523
R
VHL
Body; Body
chr3:10183305–10183941
S_Shore
cg21644740
F
GABBR1
Body; Body
chr6:29600192–29600661
N_Shore
cg26182859
R
Functional annotation of survival related DNA methylation signature genes
In order to further study the biological function of survival-related DNA methylation signature genes (MYF6, RPTOR, MMP10, SH3PXD2B, VHL and GABBR1), GO and KEGG functional annotation were performed. The result showed that only 2 GO terms were obtained. Protein binding was the most significantly enriched molecular function (FDR = 0.0103025) involving RPTOR, VHL and SH3PXD2B; cytoplasm (FDR = 0.00664319) was the most significantly enriched cellular component involving RPTOR, GABBR1, VHL and SH3PXD2B. Enriched GO terms of survival related DNA methylation signature genes were shown in Table 3.
Table 3
Enriched GO terms of survival related DNA methylation signature genes.
Survival time analysis of DNA methylation signature
In order to explore the association between the risk score and identified DNA methylation signature, the clustering analysis map of methylation value in the DNA methylation signature sites was performed (Fig 1). In the risk score calculation, the patients were dichotomized into either low risk or the high risk group. In the training group, a highly significant difference was observed between the high risk and the low risk group (p = 1.1e-06), which was shown in Fig 2. When the same DNA methylation signature equation was applied to the validation group, a similar significant difference was also observed between the high risk and the low risk group (p = 0.054), which was shown in Fig 3. Additionally, we performed 5 years’ survival analysis of DNA methylation signature by ROC and calculated the AUC to assess the discriminatory ability of DNA methylation signature in the training and validation group, respectively (Figs 4 and 5). The AUC of the DNA methylation signature in the training group was 0.8441558, and the validation was 0.8170732. Our result suggested that the DNA methylation signature could be the prognosis model for predicting the survival situation of AML.
Fig 1
The clustering analysis map of methylation value in the DNA methylation signature sites.
Diagram presents the result of a two-way hierarchical clustering of DNA methylation sites and risk score. The clustering is constructed using the complete-linkage method together with the Euclidean distance. Each row represents a DNA methylation site, and each column, a risk score value. The risk score clustering tree is shown on the right. The colour scale illustrates the relative value of the risk score: red, below the reference channel; green, higher than the reference.
Fig 2
Kaplan-Meier curves showing AML patients dichotomized based on risk score in the training group.
High risk is defined as a risk score ≥ the median, and low risk is defined as a risk score < the median in the training group.
Fig 3
Kaplan-Meier curves showing AML patients dichotomized based on risk score in the validation group.
High risk is defined as a risk score ≥ the median, and low risk is defined as a risk score < the median in the validation group.
Fig 4
5 years’ ROC curves of DNA methylation signature in the training group.
The ROC curves were used to show the diagnostic ability of DNA methylation signature with 1-Specificity and sensitivity. The x-axis shows 1-specificity and y-axis shows sensitivity.
Fig 5
5 years’ ROC curves of DNA methylation signature in the validation group.
The ROC curves were used to show the diagnostic ability of DNA methylation signature with 1-Specificity and sensitivity. The x-axis shows 1-specificity and y-axis shows sensitivity.
The clustering analysis map of methylation value in the DNA methylation signature sites.
Diagram presents the result of a two-way hierarchical clustering of DNA methylation sites and risk score. The clustering is constructed using the complete-linkage method together with the Euclidean distance. Each row represents a DNA methylation site, and each column, a risk score value. The risk score clustering tree is shown on the right. The colour scale illustrates the relative value of the risk score: red, below the reference channel; green, higher than the reference.
Kaplan-Meier curves showing AML patients dichotomized based on risk score in the training group.
High risk is defined as a risk score ≥ the median, and low risk is defined as a risk score < the median in the training group.
Kaplan-Meier curves showing AML patients dichotomized based on risk score in the validation group.
High risk is defined as a risk score ≥ the median, and low risk is defined as a risk score < the median in the validation group.
5 years’ ROC curves of DNA methylation signature in the training group.
The ROC curves were used to show the diagnostic ability of DNA methylation signature with 1-Specificity and sensitivity. The x-axis shows 1-specificity and y-axis shows sensitivity.
5 years’ ROC curves of DNA methylation signature in the validation group.
The ROC curves were used to show the diagnostic ability of DNA methylation signature with 1-Specificity and sensitivity. The x-axis shows 1-specificity and y-axis shows sensitivity.In order to identify the DNA methylation sites between patients with poor-risk and favorable-risk, the differentially methylated sites and CpG islands were analyzed based on the threshold of |Δβ|>0.2 and FDR<0.05. The result showed that 70 differentially methylated sites (64 hypermethylation and 6 hypomethylation sites) and 6 differentially methylated CpG islands (3 hypermethylation and 3 hypomethylation CpG islands) were identified. The top 20 differentially methylated sites in the poor-risk and favorable-risk group were shown in Table 4.
Table 4
The top 20 differentially methylated sites in the poor-risk and favorable-risk group.
Site
Gene
Island
Δβ
FDR
cg27321949
LRPAP1
chr4:3516456–3516844
0.597435476
4.82581E-17
cg04857395
LRPAP1
chr4:3516456–3516844
0.603433175
4.82581E-17
cg25278298
MAEA
chr4:1303490–1303835
0.41009799
6.33572E-17
cg24973755
MAEA
chr4:1304768–1305114
0.539319729
6.33572E-17
cg06466348
ADCY7
0.529336818
4.87433E-16
cg09936008
ZNF213
chr16:3190765–3191389
0.522768451
3.94334E-15
cg01360627
TNF
chr6:31548436–31549277
0.556695472
5.62805E-15
cg18349022
ZNF498
0.574523405
6.55958E-15
cg13458384
SARDH
chr9:136567684–136568146
0.410375297
1.34354E-14
cg17741993
TNF
chr6:31548436–31549277
0.587861991
1.34354E-14
cg25197194
CCDC48
0.55303526
3.2255E-14
cg26996656
0.558266306
3.37347E-14
cg17427926
LOC283999
chr17:76228110–76228380
0.468902001
8.05456E-14
cg02340818
KIAA1688
chr8:145806258–145806713
0.480393004
9.56514E-14
cg06250720
0.483657658
1.96516E-13
cg11227278
KLHL29
chr2:23749086–23749291
0.509376312
cg17115419
0.493501781
cg27049094
HK2
0.567439864
cg01824603
ARHGEF10
0.511145159
cg25605731
CALR
chr19:13056458–13057125
0.588599806
Δβ: mean (β case) − mean (β normal)
FDR: false discovery rate
Δβ: mean (β case) − mean (β normal)FDR: false discovery rate
Discussion
AML is the most common form of adult leukemia and the survival rate is very low [19-Cytometry Part B, Clinical cytometry. 2014 ">21]. Therefore, it is urgent to understand the pathological mechanism and find potential survival related genes in the development of AML. In this study, we found a DNA methylation signature involving MYF6, RPTOR, MMP10, SH3PXD2B, VHL and GABBR1, which could be a valuable tool in guiding treatment decisions for AML.Myogenic factor 6 (MYF6, also called MRF4 or Herculin) is expressed in skeletal muscle and is related to myogenesis [22-25]. It is found that the MYF6 is a differentially methylation gene in plasma cf-DNA in the different stage of hepatocellular carcinoma development [26]. In addition, the methylation frequency of MYF6 in stage I non-small cell lung cancer is obviously higher than that of non-cancerous lung disease control, which suggested that MYF6 could offer a specificity and a sensitivity in the stage I non-small cell lung cancer diagnosis [27]. Thus it can be seen that MYF6 methylation was associated with the development of cancer. In this study, we first found the DNA methylation of MYF6 in the blood of AML. Our result showed that MYF6 was significantly associated with survival time of AML and could be a diagnostic and prognostic marker of AML.Regulatory associated protein of MTOR complex 1 (RPTOR) is a signal transduction gene important for hematopoiesis [28]. RPTOR, mechanistic target of rapamycin kinase and MTOR associated protein (mTOR), LST8 homolog (MLST8) constitute the core subunits of the mammalianTORC1 complex which play an important role in controlling cell growth, survival and metabolism and is often deregulated in cancer [29-32]. It is reported that RPTOR is hypermethylated in human hepatocytes [33]. Previous reports have demonstrated that RPTOR is methylated gene in breast tumors [34, 35]. This suggested that RPTOR methylation may play an important role in the process of cancer. In the present study, we found that RPTOR was methylated in the blood of AML and significantly associated with the survival time of AMLpatients. Our result suggested that RPTOR played a crucial role in AML and may be a diagnostic and prognostic marker in the process of AML.Matrix metallopeptidase 10 (MMP10), an enzyme promoting angiogenesis, promotes cell growth and invasion, and exerts anti-apoptotic property in vitro [36, 37]. It is noted that MMP10 is essential to the tumor microenvironment. In colorectal cancer, MMP10 is up-regulated in cancerous tissue and adversely associated with patients survival [38, 39]. In cutaneous melanoma, MMP10 is potentially modified by modification of histone acetylation [40]. In head and neck squamous cell cancer, MMP10 is a differentially methylated gene in radiation-sensitive and -resistant tumors [41]. In immature teratomas, reduced methylation of MMP10 significantly enriched in axonal guidance signaling pathway [42]. Previous reports indicated that the epigenomics changes of MMP10, especially DNA methylation may be involved in the occurrence of cancer. Herein, we found the methylation of MMP10 in the blood of AML. Moreover, MMP10 was related to the survivability of AMLpatients. Our result may provide a new field in the diagnosis and prognosis of AML.SH3 and PX domains 2B (SH3PXD2B) is associated with growth and production. It is found that SH3PXD2B is involved in the protein binding, cytoplasm and cell junction in hepatocellular carcinoma [43]. In addition, SH3PXD2B positively regulates the differentiation of fat cell and shows differences in DNA methylation in gene body or intergenic region [44, 45]. However, there are no reports about the relationship between SH3PXD2B methylation and cancer. In this study, we first found methylated SH3PXD2B in the blood of AML. Moreover, it was observably associated with the survival time of AMLpatients. In a word, SH3PXD2B may play a crucial role in AML and could be a diagnostic and prognostic marker in the development of AML.The gene product of the Von Hippel-Lindau tumor suppressor (VHL) plays a key role in regulation of metabolic genes expression, erythropoiesis, angiogenesis, proliferation and apoptosis. VHL-associated tumors are highly vascularized and overproduce VEGF [46]. It is indicated that VHL promoter hypermethylation may play an important role in pheochromocytoma and abdominal paraganglioma development [47]. Herman and Zhong et al found that VHL was a methylated gene in renal carcinoma cell lines [48, 49]. Hypomethylation of VHL has been reported in head and neck cancer and lung squamous carcinoma [50, 51]. Thus, VHL functions as a methylated gene in different cancers. It is pointed out that the mutations in the VHL gene are involved in the pathogenesis of AML [46]. Herein, we also found the relationship between VHL and AML. Our result showed that VHL was methylated and was significantly associated with survival time of AMLpatients.Gamma-aminobutyric acid type B receptor subunit 1 (GABBR1) encodes the G protein-coupled receptor that can form the heterodimer with GABAB receptor 2, which triggering the proliferation, differentiation and migration of cancer cells. It is showed that multiple loci of GABBR1 within 6p21.3 are related to nasopharyngeal carcinoma [52, 53]. GABBR1 is a different methylated gene in plasma cf-DNA in different early stage of hepatocellular carcinoma development [26]. In addition, GABBR1 is associated with the survival time of patients with gastric cancer [54]. It is also a survival-associated methylation marker for oral squamous cell carcinoma [55]. Thus it can be seen that GABBR1 methylation play a crucial role in various cancers. In the present study, we found that GABBR1 was methylated and associated with AML survival. It suggested that GABBR1 may be a survival-associated methylation marker for AML.Previous studies have reported that the 5-year survival rate was 55% for patients with favorable cytogenetics, 24% for patients with intermediate risk, and 5% for patients with poor-risk cytogenetics [56]. Thus it can be seen that the survival time was associated with AML risk. In this study, we identified numbers of survival-associated differentially methylated genes such as LRPAP1, MAEA and TNF between poor-risk and favorable-risk patients.LDL receptor related protein associated protein 1 (LRPAP1) is a gene that involved in the cell proliferation in cancer [57]. It is noted that LRPAP1 is a biologically relevant gene found in leukemia and was associated with different biological processes including cell apoptosis, signaling pathway and cell cycle checkpoint [58, 59]. Macrophage erythroblast attacher (MAEA) is a 36-kD transmembrane protein that expressed by erythroblasts and macrophage cells and plays a crucial role in hematopoiesis [60, 61]. It is reported that MAEA is a differentially methylated gene in Type-2 diabetes and idiopathic pulmonary fibrosis [62, 63]. Tumor necrosis factor (TNF) is a proinflammatory cytokine. It has been identified that TNF is elevated in serum of patients with aplastic anemia and myelodysplastic syndromes, suggesting that the hematopoietic repressive activity of TNF may contribute to the cytopenic phenotype of these patients [64-67]. Interestingly, TNF levels are significantly higher in the peripheral blood of AMLpatients of M3, M4, and M5 subtypes when compared with healthy donors [68]. Furthermore, the increased level of TNF is associated with poor prognosis of patients with AML, especially older adults [7, 69–71]. Our result further indicated the important role of LRPAP1, MAEA and TNF in AML.
Conclusions
In a word, we have identified and successfully validated a DNA methylation signature in patients with AML. This signature adds to the potential predictive role in the survival time of AMLpatients. Utilization of a prognostic DNA methylation signature would enrich for potentially sensitive patients, thereby improving clinical outcome for future patients with AML. However, there is a limitation to our study. In the present study, we didn’t perform the deeper mechanism study based on the identified methylated genes. Some animal model and cell experiments are further needed to explore the potential mechanism of AML.
Authors: S Deneberg; M Grövdal; M Karimi; M Jansson; H Nahi; A Corbacioglu; V Gaidzik; K Döhner; C Paul; T J Ekström; E Hellström-Lindberg; S Lehmann Journal: Leukemia Date: 2010-03-18 Impact factor: 11.528