This study aimed to explore the underlying mechanism of relapsed acute lymphoblastic leukemia (ALL). Datasets GSE28460 and GSE18497 were downloaded from the Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) between diagnostic and relapsed ALL samples were identified using the Limma package in R, and a Venn diagram was drawn. Next, functional enrichment analyses of co-regulated DEGs were performed. Based on the STRING database, protein-protein interaction network and module analyses were also conducted. Moreover, transcription factors and miRNAs targeting co-regulated DEGs were predicted using the WebGestalt online tool. A total of 71 co-regulated DEGs were identified, including 56 co-upregulated genes and 15 co-downregulated genes. Functional enrichment analyses showed that upregulated DEGs were significantly enriched in cell cycle-, DNA replication-, and DNA repair-related pathways. POLD1, MCM2, and PLK4 were hub proteins in both the protein-protein interaction network and the module, and might be potential targets of E2F. Additionally, POLD1 and MCM2 were found to be regulated by miR-520H via E2F1. High expression of POLD1, MCM2, and PLK4 might play positive roles in the recurrence of ALL, and these genes could serve as potential therapeutic targets for the treatment of relapsed ALL.
Acute lymphoblastic leukemia (ALL) is a heterogeneous group of disorders originating from B and T progenitor cells, and is the most frequent blood malignancy of childhood. Although the survival rate of this childhood malignancy has increased from 10% in the 1960s to approximately 90% with the development of medical technology, relapsed ALL remains the leading cause of cancer-related mortality during childhood. It has been reported that the overall survival rate of relapsed B-ALL is only 35% to 40%, even when treated with stem cell transplantation or intensified chemotherapy, and this condition shows lower survival in adults than in children. Therefore, it is important to reveal the molecular mechanisms of relapsed ALL to develop more effective therapeutic methods for improving the survival rate of patients suffering from relapsed ALL.
With the development of sequencing techniques, genome sequencing has been widely used to identify potential biomarkers and therapeutic targets based on variations in gene expression. Yang et al have identified that children with ALL with the CC genotype at rs116855232 of NUDT15 have higher mercaptopurine resistance (83.5%) than those with the TT and TC genotypes. Perez-Andreu et al have reported that the risk allele at rs3824662 in GATA3 is one of the most frequent in Philadelphia chromosome (Ph)-like ALL, and also increases susceptibility to non-Ph-like ALL in adults and adolescents. Additionally, Paulsson et al have documented that the RTK-RAS pathway and its modifiers play critical roles in high hyperdiploid (51–67 chromosomes) ALL, which is one of the most frequent types of ALL. Moreover, Fischer et al have demonstrated that enriched stem cell and myeloid characteristics in TCF3-HLF signatures may result in strong drug resistance to traditional chemotherapeutics, but sensitivity to glucocorticoids in ALL. In addition, microRNAs (miRNAs) have been found to be involved in the pathogenesis of ALL. 
Agirre et al have demonstrated that miRNA-124a confers a poor prognosis in ALL, and Schotte et al have documented that miR-196b and miR-708 are closely associated with the subtypes of ALL. However, few studies have examined relapsed ALL, and only a very small number of genes have been identified as differentially expressed between diagnosis and relapse of ALL.
To reveal the potential molecular mechanism of relapsed ALL, 2 datasets, GSE28460 and GSE18497, were deposited by Hogan et al and Staal et al, respectively. For GSE28460, Hogan et al revealed diverse genetic changes from diagnosis to relapse, and methylation analysis showed that the Wnt and mitogen-activated protein kinase pathways may be involved in these variations. Additionally, for GSE18497, Staal et al found that differentially expressed genes (DEGs) between diagnosis and relapsed ALL are strongly associated with changes in cell cycle, DNA replication, and DNA repair, and that upregulated genes in ALL are involved in colon cancer and ubiquitination. Other studies utilized these 2 datasets to identify DEGs, potential markers, and therapeutic methods for B-ALL. However, how these changes occur remains unclear. In the present study, to further uncover the underlying mechanism of relapsed ALL, DEGs between diagnosis and relapse were screened based on the GSE28460 and GSE18497 datasets; functional enrichment and transcription factor prediction were performed to provide insight into the understanding and treatment of relapsed ALL.
Materials and methods
Data sourcing
The gene expression files for GSE28460 and GSE18497 were downloaded from the Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) database. Specifically, GSE28460 included 98 ALL bone marrow samples, comprising 49 diagnosis cases and 49 relapse cases. Construction of this dataset was approved by the institutional review boards of all participating institutions, and informed consent was obtained from all patients. GSE18497 included 41 matched diagnosis and relapse pairs of ALL bone marrow samples, and microarrays were performed according to the consensus guidelines described for leukemia analyses by 3 European networks. Both datasets were generated on the GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array platform.
Identification of DEGs
Raw data in CEL format were downloaded from the GEO database, and the Affy package in R (version 1.54.0, http://www.bioconductor.org/packages/release/bioc/html/affy.html) was used for data preprocessing, including background correction, normalization, and expression calculation. According to the annotation files, probes that did not match a gene were removed, and the expression of matched genes was calculated. When several probes matched the same gene, the mean value across those probes was used as the expression value of the gene. Next, the empirical Bayes method provided by the Limma package in R (version 3.10.3, http://www.bioconductor.org/packages/2.9/bioc/html/limma.html) was used to compare gene expression between diagnosis and relapse samples, and genes with P < .05 were considered DEGs.
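The probe-to-gene summarization and the P < .05 filter described above can be illustrated with a minimal Python sketch. This is not the Affy or Limma code used in the study: the probe identifiers and values are hypothetical, and splitting DEGs into up- and downregulated sets by the sign of the log fold change is an assumption, since no fold-change criterion is stated in the text.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes_to_genes(probe_expr, probe_to_gene):
    """Average the expression of all probes that map to the same gene.

    probe_expr: probe ID -> list of expression values (one per sample).
    probe_to_gene: probe ID -> gene symbol; probes without an entry are dropped.
    """
    rows_by_gene = defaultdict(list)
    for probe, values in probe_expr.items():
        gene = probe_to_gene.get(probe)
        if not gene:
            continue  # unmatched probe: removed, as in the text
        rows_by_gene[gene].append(values)
    # Column-wise mean across the probes of each gene.
    return {
        gene: [mean(column) for column in zip(*rows)]
        for gene, rows in rows_by_gene.items()
    }


def select_degs(p_values, log_fc, alpha=0.05):
    """Split genes with P < alpha into up- and downregulated sets.

    Using the sign of the log fold change for direction is an assumption.
    """
    up = {g for g, p in p_values.items() if p < alpha and log_fc[g] > 0}
    down = {g for g, p in p_values.items() if p < alpha and log_fc[g] < 0}
    return up, down


if __name__ == "__main__":
    # Hypothetical probes p1..p4; p4 has no gene annotation.
    expr = {"p1": [2.0, 4.0], "p2": [4.0, 6.0], "p3": [1.0, 1.0], "p4": [5.0, 5.0]}
    annotation = {"p1": "MCM2", "p2": "MCM2", "p3": "PLK4"}
    print(collapse_probes_to_genes(expr, annotation))
```

In practice the P values would come from the moderated t-statistics computed by Limma; the sketch only shows the bookkeeping around them.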
Venn diagram
Venn diagrams can be generated using an R package that allows control over all aspects of the generated diagram. In the present study, the online tool VENNY (version 2.1, http://bioinfogp.cnb.csic.es/tools/venny/index.html) was used for Venn diagram analysis to identify co-regulated DEGs between the 2 datasets.
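The overlap that the Venn diagram reports amounts to a set intersection taken separately for each direction of regulation. The following Python sketch shows that operation; it does not reproduce VENNY itself, and the gene sets in the example are toy placeholders rather than the study's DEG lists.

```python
def co_regulated_degs(up_a, down_a, up_b, down_b):
    """Return genes regulated in the same direction in both datasets.

    Each argument is a set of gene symbols from one dataset.
    """
    co_up = set(up_a) & set(up_b)
    co_down = set(down_a) & set(down_b)
    return co_up, co_down


if __name__ == "__main__":
    # Toy gene sets for illustration only.
    co_up, co_down = co_regulated_degs(
        up_a={"POLD1", "MCM2", "GENE_A"},
        down_a={"GENE_X", "GENE_Y"},
        up_b={"POLD1", "MCM2", "GENE_B"},
        down_b={"GENE_Y", "GENE_Z"},
    )
    print(sorted(co_up), sorted(co_down))
```

A gene that is upregulated in one dataset and downregulated in the other is excluded by construction, which matches the notion of co-regulated DEGs used here.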
Gene ontology (GO) and pathway enrichment analyses
DAVID is an online tool commonly used for bioinformatics analysis. In the present study, DAVID (version 6.8, https://david-d.ncifcrf.gov/) was used to perform GO biological process (BP) enrichment analysis of co-regulated DEGs. GO terms with a gene count ≥2 and P < .05 were considered significantly enriched. Additionally, based on the Reactome Pathway database (Reactome version 61, http://reactome.org/), pathway enrichment analysis of co-regulated DEGs was conducted using DAVID.
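To make the enrichment criterion concrete, the sketch below computes an over-representation P value with a plain hypergeometric test and applies the gene count ≥2 and P < .05 filters. It is an illustration under stated assumptions, not DAVID's implementation: DAVID reports a more conservative modified Fisher exact P value (the EASE score), and the function names, term names, and background size here are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from a background of N genes,
    K of which are annotated to the term."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enrich(deg_genes, term_to_genes, background_size, min_count=2, alpha=0.05):
    """Return (term, overlap count, P value) for terms passing both filters."""
    deg_genes = set(deg_genes)
    hits = []
    for term, members in term_to_genes.items():
        overlap = deg_genes & set(members)
        if len(overlap) < min_count:
            continue  # gene count filter
        p = hypergeom_tail(len(overlap), len(deg_genes), len(members), background_size)
        if p < alpha:
            hits.append((term, len(overlap), p))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    # Toy example: 4 DEGs, background of 10 genes, one term with 3 members.
    print(hypergeom_tail(2, 4, 3, 10))
```

The sketch assumes every term member belongs to the background gene set; a real analysis would also need to account for testing many terms at once.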
Protein–protein interaction (PPI) network and module analyses
According to the PPI pairs provided by the STRING (version 10.0, http://www.string-db.org/) database, PPIs among proteins encoded by co-regulated DEGs were predicted, with a PPI score ≥0.15 as the threshold to obtain the greatest number of PPI pairs. Subsequently, the PPI network was visualized using Cytoscape (version 3.2.0, http://www.cytoscape.org/). Using the unweighted parameter setting, the topological properties of nodes in this network were analyzed using the CytoNCA plug-in (version 2.1.6, http://apps.cytoscape.org/apps/cytonca) in Cytoscape, and hub nodes were screened according to the final score. Functional modules in the PPI network were screened using the MCODE plug-in (version 1.4.2, http://apps.cytoscape.org/apps/MCODE) in Cytoscape with a module score >5 as the threshold.
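The thresholding and hub-ranking steps can be sketched in Python as follows. The sketch assumes that hubs are ranked by unweighted degree, which is the measure reported in the Results; CytoNCA offers several centrality measures and the text does not specify which one defines the final score. The edge list is a toy example, not STRING output, and the sketch does not attempt MCODE module detection.

```python
from collections import defaultdict


def build_ppi(edges, min_score=0.15):
    """Build an undirected, unweighted adjacency map from scored PPI pairs.

    edges: iterable of (protein_a, protein_b, combined_score) tuples.
    Pairs below min_score and self-interactions are discarded.
    """
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def top_hubs(adjacency, n=10):
    """Rank nodes by degree (ties broken alphabetically) and return the top n."""
    ranked = sorted(
        ((node, len(neighbours)) for node, neighbours in adjacency.items()),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:n]


if __name__ == "__main__":
    toy_edges = [("A", "B", 0.9), ("A", "C", 0.2), ("B", "C", 0.1), ("A", "D", 0.15)]
    print(top_hubs(build_ppi(toy_edges), n=2))
```

Storing neighbours in sets means that a pair listed twice is counted once, so the degree reflects distinct interaction partners.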
Construction of TF-target regulatory network
The WEB-based GEne SeT AnaLysis Toolkit (WebGestalt, http://www.webgestalt.org/option.php) is an online functional enrichment analysis tool used for bioinformatics analysis. In the present study, WebGestalt was used to perform transcription factor (TF) enrichment analysis of co-regulated DEGs, and P < .05 was considered significant. Based on the significantly enriched regulatory pairs, the TF-target regulatory network was visualized using Cytoscape.
Construction of miRNA-TF-target regulatory network
Using WebGestalt, co-regulated DEG-related miRNAs were predicted using the threshold of P < .05. Additionally, regulatory relationships between miRNAs and TFs were predicted. Next, the miRNA-TF-target regulatory network was visualized using Cytoscape.
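Combining the miRNA-TF pairs with the TF-target pairs yields three-node regulatory paths of the form miRNA, TF, target gene. The Python sketch below shows one way to enumerate such paths; the function name is hypothetical, and the example pairs are limited to relationships stated in this article (miR-520H targeting E2F1, and E2F1 targeting POLD1 and MCM2).

```python
from collections import defaultdict


def mirna_tf_target_paths(mirna_tf_pairs, tf_target_pairs):
    """Enumerate (miRNA, TF, target gene) paths from two lists of pairs.

    mirna_tf_pairs: iterable of (miRNA, TF) regulatory pairs.
    tf_target_pairs: iterable of (TF, target gene) regulatory pairs.
    """
    targets_by_tf = defaultdict(set)
    for tf, gene in tf_target_pairs:
        targets_by_tf[tf].add(gene)
    paths = []
    for mirna, tf in mirna_tf_pairs:
        # TFs without predicted targets contribute no paths.
        for gene in sorted(targets_by_tf.get(tf, ())):
            paths.append((mirna, tf, gene))
    return paths


if __name__ == "__main__":
    paths = mirna_tf_target_paths(
        mirna_tf_pairs=[("miR-520H", "E2F1")],
        tf_target_pairs=[("E2F1", "POLD1"), ("E2F1", "MCM2")],
    )
    for path in paths:
        print(" -> ".join(path))
```

Each path can then be exported as two edges for visualization in a network tool such as Cytoscape.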
Results
Identification of DEGs and co-regulated DEGs
According to the selected criterion, DEGs of the 2 datasets were screened. Specifically, a total of 1674 DEGs were identified in GSE28460, including 1014 upregulated genes and 660 downregulated genes. Additionally, 508 DEGs were identified in the GSE18497 dataset, including 234 upregulated genes and 274 downregulated genes.
Subsequently, based on the identified DEGs, co-regulated DEGs were screened between these 2 datasets using the VENNY tool. A total of 71 co-regulated DEGs were identified, including 56 co-upregulated DEGs (Fig. 1A) and 15 co-downregulated DEGs (Fig. 1B).
Figure 1
Venn diagram for co-regulated DEGs. (A) Co-upregulated DEGs and (B) co-downregulated DEGs. DEGs = differentially expressed genes.
GO and pathway enrichment analyses
For further analysis, functional enrichment analyses of co-regulated DEGs were conducted using the DAVID online tool. Co-upregulated DEGs were significantly enriched in 15 GO-BP terms, most of which were mitosis- and cell cycle-related biological processes, including the G1/S transition of mitotic cell cycle (P = 2.20 × 10^−4), cell cycle (P = 4.13 × 10^−4), DNA replication (P = .0011), and so on. The top 10 are tabulated in Table 1. However, only 1 GO-BP term, protein glycosylation (P = .040), was significantly enriched by the co-downregulated DEGs (Fig. 2A).
Table 1
The top 10 GO-BP terms enriched by co-upregulated DEGs.
Figure 2
Functional enrichment for co-regulated DEGs. (A) The top 10 significantly enriched GO-BP terms for upregulated DEGs and 1 term for downregulated DEGs. (B) The top 10 significantly enriched pathways for upregulated DEGs and 1 pathway for downregulated DEGs. Red represents the upregulated terms, and green represents the downregulated terms. DEGs = differentially expressed genes, GO = gene ontology, and BP = biological process.
The co-upregulated DEGs were significantly enriched in 43 pathways, mostly mitosis- and cell cycle-related pathways, such as cell cycle, mitotic (P = 3.28 × 10^−9), cell cycle (P = 4.25 × 10^−9), and mitotic prometaphase (P = .00012). The top 10 are tabulated in Table 2. As in the GO-BP results, only 1 pathway was significantly enriched by co-downregulated DEGs, the genetic transcription pathway (P = .041) (Fig. 2B).
Table 2
The top 10 KEGG pathways enriched by co-upregulated DEGs.
PPI network and module analyses
Based on the STRING database, the PPI network was constructed, comprising 50 nodes (proteins encoded by co-regulated DEGs) and 253 interaction pairs (Fig. 3A). Additionally, a significant functional module was screened out from this PPI network with a module score of 18.423 (Fig. 3B). According to the node degrees, the top 10 nodes in the PPI network are tabulated in Table 3, including FOXM1 (degree = 25), TYMS (degree = 25), POLD1 (degree = 25), MCM2 (degree = 22), PLK4 (degree = 22), and so on. The module contained 20 nodes and 175 interaction pairs, and all proteins in this module were encoded by co-upregulated DEGs. The top 10 nodes in the module by degree are also tabulated in Table 3, including FOXM1 (degree = 25), TYMS (degree = 25), POLD1 (degree = 25), MCM2 (degree = 22), PLK4 (degree = 22), and so on.
Figure 3
PPI and module analyses for co-regulated DEGs. (A) PPI network and (B) module. Red triangle represents the upregulated protein, and green arrow represents the downregulated protein. DEGs = differentially expressed genes, PPI = protein–protein interaction.
Table 3
The top 10 nodes involved in PPI network and module.
Construction of TF-target regulatory network
Based on the relationship pairs predicted by WebGestalt, the TF-target regulatory network was constructed using Cytoscape (Fig. 4). There were 20 nodes and 35 regulatory relationships included in the TF-target regulatory network. Specifically, 8 of 20 were TFs, including NRF1 (degree = 7), E2F (degree = 6), E2F1 (degree = 4), E2F1DP1 (degree = 4), E2F1DP2 (degree = 4), E2F4DP2 (degree = 4), E2F4DP1 (degree = 3), and E2F1DP1RB (degree = 3).
Figure 4
TF-target regulatory network. Red triangle represents the upregulated protein, and blue hexagon represents TF. TF = transcription factor.
MiRNA-TF-target regulatory network
According to the results predicted by WebGestalt, the miRNA-TF-target regulatory network was constructed using Cytoscape (Fig. 5). In this network, 2 miRNAs, miR-520G and miR-520H, were significantly enriched, and both CKS1B and WDR1 could be targeted by these 2 miRNAs. Moreover, E2F1 was the common target TF of miR-520G and miR-520H.
Figure 5
miRNA-TF-target regulatory network. Red triangle represents the upregulated protein, blue hexagon represents TF, and orange diamond represents miRNA. miRNA = microRNA, TF = transcription factor.
Discussion
In the present study, a total of 71 co-regulated DEGs were identified between GSE28460 and GSE18497, including 56 upregulated genes and 15 downregulated genes. Functional enrichment of these DEGs indicated that co-upregulated genes were significantly enriched in cell cycle-, DNA replication-, and DNA repair-related GO-BP terms, as well as cell cycle- and mitosis-related pathways. Additionally, downregulated DEGs were significantly enriched in the protein glycosylation GO-BP term and the genetic transcription pathway. Further analyses showed that POLD1, MCM2, and PLK4 were hub nodes in the PPI network and module and could be upregulated by E2F.
POLD1 codes for DNA polymerase delta 1 (POLD1), the catalytic subunit of DNA polymerase δ, which is reported to be an important target of the p53 tumor suppressor. Functional enrichment analyses showed that POLD1 was significantly enriched in cell cycle- and DNA replication-related pathways, which is consistent with the results described above. A previous study showed that a deficiency in DNA polymerase δ proofreading is strongly associated with a high incidence of epithelial cancers. Germline mutations in POLD1 play a critical role in familial colorectal cancer. Moreover, Staal et al found that ALL and colon cancer share some upregulated genes, and the colon was identified as an important location for relapsed ALL. Thus, POLD1 might play a crucial role in relapsed ALL, although few studies have examined the role of POLD1 in ALL. 
In the present study, POLD1 was found to be significantly upregulated in relapsed ALL compared with diagnosis, and was predicted to be a target of E2F, a transcription factor that is a target of the RB protein. A previous study demonstrated that E2F participates in the regulation of gene promoter methylation, which may explain the upregulation of POLD1.
MCM2, which codes for minichromosome maintenance complex component 2 (MCM2), was also identified as a target of E2F and was significantly upregulated in relapsed ALL in this study. Richet et al identified that MCM2 acts as a chaperone for histone interactions with ASF1 at the replication fork. Moreover, several studies demonstrated that MCM2 plays a critical role in forming the prereplication complex and replication fork. In the present study, functional enrichment analyses showed that MCM2 and POLD1 were significantly enriched in the mitotic G1-G1/S phases and S phase-related pathways. Liu et al documented that the long noncoding RNA FTX inhibits the proliferation and metastasis of hepatocellular carcinoma by binding MCM2. Thus, MCM2 might also play an important role in the replication of tumor cells involved in relapsed ALL. Additionally, in the present study, MCM2 was found to interact with multiple proteins in the PPI network, such as FOXM1, POLD1, and PLK4, but whether these proteins could form a replication complex requires further analysis.
PLK4, which codes for polo-like kinase 4 (PLK4), is another upregulated gene targeted by E2F. Dementyeva et al found that expression of PLK4 is significantly elevated in multiple myeloma. Ward et al showed that deregulated methylation of PLKs, including PLK4, is a potential biomarker in hematological malignancies. Therefore, high expression of PLK4 in ALL might be attributable to the deregulation of methylation on its promoter by E2F, which was also suggested as a cause of the upregulation of POLD1 in the current study. 
Kazazian et al reported that PLK4 promotes cancer invasion and metastasis via Arp2/3 complex regulation of the actin cytoskeleton. Moreover, several studies revealed that PLK4 can interact with Cep192, Cep152, STIL, and CDK1, or be regulated by other proteins, such as the E3 ubiquitin ligase Mib1 and KAT2, to affect centriole biogenesis, resulting in abnormalities in cell proliferation. Further functional enrichment analysis also showed that PLK4 was significantly enriched in mitosis-related pathways. These findings suggested that high levels of PLK4 may promote the relapse of ALL by facilitating the cell cycle. Thus, PLK4 might be considered a potential diagnostic marker or therapeutic target for treating relapsed ALL.
E2F1 is another E2F family TF found to target both POLD1 and MCM2 in the current study. Nagel et al revealed that overexpression of miR-17-92 suppresses the apoptosis of ALL by decreasing the expression of E2F1. Additionally, Kojima et al showed that E2F1 and P53 can be significantly induced by the tryptamine derivative JNJ-26854165 to increase the apoptosis of ALL. These findings indicated that E2F1 plays a negative role in the development of ALL. In this study, E2F1 was identified as being regulated by miR-520H. Previous studies revealed that miR-520H plays an important role in stem cell maintenance and the differentiation of HSC, with lower expression in T lymphocytes and higher expression in CD34+ cells. Su et al identified that a high level of miR-520H is closely related to the poor prognosis of breast cancer. Moreover, downregulation of miR-520H via E1A has an anticancer effect. Bioinformatics analysis showed that the expression of miR-520H is inversely correlated with the expression levels of its targets. Hence, these findings suggested that miR-520H might downregulate the expression of E2F1 to increase the expression of POLD1 and MCM2, promoting ALL relapse.
The present study has some limitations. 
First, the results were obtained using a bioinformatics analysis. Therefore, some potentially important genes/proteins might have been missed because the parameters were selected manually. Second, analysis of the event-free survival rate of relapsed ALL was limited because of deficiencies in the clinical data. Finally, because clinical samples were difficult to collect from relapsed ALL patients, the expression levels of important DEGs, such as PLK4, POLD1, and MCM2, were not experimentally validated.
Conclusion
In conclusion, POLD1, MCM2, and PLK4 played important roles in regulating cell cycle- and DNA replication-related pathways. E2F could upregulate the expression levels of POLD1, MCM2, and PLK4 by deregulating the methylation of their promoters to promote the relapse of ALL. Additionally, miR-520H could upregulate the expression levels of POLD1 and MCM2 via E2F1. Hence, POLD1, MCM2, and PLK4 might serve as potential diagnostic markers and therapeutic targets for treating relapsed ALL.
Laura E Hogan; Julia A Meyer; Jun Yang; Jinhua Wang; Nicholas Wong; Wenjian Yang; Gregory Condos; Stephen P Hunger; Elizabeth Raetz; Richard Saffery; Mary V Relling; Deepa Bhojwani; Debra J Morrison; William L Carroll. Blood. 2011-09-14.
Fernando Bellido; Marta Pineda; Gemma Aiza; Rafael Valdés-Mas; Matilde Navarro; Diana A Puente; Tirso Pons; Sara González; Silvia Iglesias; Esther Darder; Virginia Piñol; José Luís Soto; Alfonso Valencia; Ignacio Blanco; Miguel Urioste; Joan Brunet; Conxi Lázaro; Gabriel Capellá; Xose S Puente; Laura Valle. Genet Med. 2015-07-02.