A six-gene-based prognostic model predicts complete remission and overall survival in childhood acute myeloid leukemia.

Nan Zhang1, Ying Chen1, Shifeng Lou1, Yan Shen1, Jianchuan Deng1.   

Abstract

OBJECTIVE: Acute myeloid leukemia (AML) is a malignant clonal disorder. Despite enormous progress in its diagnosis and treatment, the mortality rate of AML remains high. The aim of this study was to identify prognostic biomarkers using a gene expression profile dataset from a public database, and to improve the risk-stratification criteria of survival for patients with AML.
MATERIALS AND METHODS: The gene expression data and clinical parameters were acquired from the Therapeutically Applicable Research to Generate Effective Treatment (TARGET) database. A total of 856 differentially expressed genes (DEGs) were obtained from childhood AML patients classified into a first complete remission (CR1) group (n=791) and a not CR group (n=249). We performed a series of bioinformatics analyses to screen key genes and pathways, and characterized these DEGs through Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses.
RESULTS: Six genes (SLC17A7, MSX2, CDC26, MSLN, CTSZ and DEFA3) identified by univariate, Kaplan-Meier survival and multivariate Cox regression analyses were used to develop the prognostic model. Survival analysis based on the model showed that the high-risk group had an increased risk of death compared with the low-risk group. The area under the receiver operator characteristic curve of the prognostic model for predicting overall survival was 0.729, indicating good predictive performance. We also constructed a nomogram to provide an individual patient with the overall survival probability, and performed internal validation in the TARGET cohort.
CONCLUSION: We identified a six-gene prognostic signature for risk stratification in patients with childhood AML. The risk classification model can be used to predict CR and may assist clinicians in providing individualized treatment in this patient population.

Keywords:  bioinformatics; childhood acute myeloid leukemia; gene expression profiling; prognosis; remission induction; survival analysis

Year:  2019        PMID: 31496748      PMCID: PMC6701647          DOI: 10.2147/OTT.S218928

Source DB:  PubMed          Journal:  Onco Targets Ther        ISSN: 1178-6930            Impact factor:   4.147


Introduction

Acute myeloid leukemia (AML) is a malignant clonal disorder characterized by abnormal proliferation of immature myeloid cells at various stages of maturation.1 About 4% of AML cases occur in children and adolescents. The 5-year overall survival (OS) rate for patients under 19 is about 65%, but drops to 50%, 32%, and 6% for patients aged 20–49, 50–64, and 65 years and older, respectively.2 The cytogenetic karyotype and molecular abnormalities at diagnosis are considered the most significant prognostic factors and are highly predictive of complete remission (CR) rates, OS, risk of relapse and disease-free survival.3–5 Over the last decades, accumulating evidence has indicated that aberrant expression and mutation of many genes are involved in the carcinogenesis and progression of AML. Mutated genes with reported prognostic significance include KIT, WT1, RUNX1, FLT3, CEBPA, NPM1, and MYC.6,7 However, only aberrations in NPM1, WT1, CEBPA and FLT3 are widely utilized in clinical practice.8 Despite extensive research to identify prognostic markers, the mortality rate of AML remains high. Prognostic risk stratification therefore needs to be improved, because better stratification can support the development of effective diagnostic and therapeutic strategies. With recent developments in microarray technology and bioinformatic analysis, the complex molecular architecture of AML has been widely used to inform disease classification, prognostic stratification and novel drug target discovery.
Multiple studies have suggested that patients whose leukemic blasts carry an NPM1 mutation without FLT3-ITD have a favorable prognosis, whereas patients with TET2 or ASXL1 mutations have a poor prognosis.9,10 In addition, patients with CBF rearrangements or CEBPA mutations are assigned to the low-risk subgroup.11,12 Recently, Ng et al13 developed a risk-stratification model that generates a prognostic score from the expression of 17 genes for rapid risk determination in patients with acute leukemia, and Patel et al14 proposed a somatic-mutation model for risk stratification based on microarray technology applied to a set of 18 genes. These models were found to have prognostic value in their studies. However, even with this progress, pediatric AML risk classification remains suboptimal, as a large number of patients with AML fail to achieve CR regardless of the known high-risk factors. To improve the risk-stratification criteria for predicting prognosis in patients with childhood AML, we analyzed the differentially expressed genes (DEGs) based on first CR15,16 using mRNA-seq datasets from TARGET. We performed a systematic evaluation of mRNAs for the diagnosis of childhood AML by univariate analysis of gene expression and Cox regression analysis. We pooled the specificity and sensitivity of all genes and constructed a time-dependent receiver operator characteristic (ROC) curve, then ranked and screened the genes with high diagnostic accuracy based on area under the curve (AUC) values. The final risk-stratification model represents a potentially useful tool for predicting CR, but needs to be further evaluated in clinical practice.

Materials and methods

Data sources and processing

Gene mRNA expression data and clinical parameters of childhood AML patients, available up to April 29, 2019, were downloaded from the NCI TARGET database (https://ocg.cancer.gov/). Series matrix files were extracted to assess mRNA expression, and the mRNA-seq datasets were preprocessed by quantile normalization or log2 transformation. According to the annotation platform file, we translated the mRNA IDs into gene symbols. We then divided the patients into a CR1 group (791 samples) and a not CR group (249 samples) based on the sample annotation (Table 1). The flow chart of the analysis procedure is shown in Figure 1. Because the gene mRNA expression data and clinical characteristics are publicly available and open access, this study did not require approval from an ethics committee.
Table 1

Clinical characteristics of patients with CR1 and not CR

Characteristic | CR1 (n=791) | Not CR (n=249) | χ² | P-value
Age | | | 2.969 | 0.085
 <14 | 549 (69.4%) | 187 (75.1%) | |
 ≥14 | 242 (30.6%) | 62 (24.9%) | |
Gender | | | 0.609 | 0.435
 Male | 416 (52.6%) | 138 (55.4%) | |
 Female | 375 (47.4%) | 111 (44.6%) | |
White blood cell | | | 4.567 | 0.033
 <150 | 720 (91%) | 215 (86.3%) | |
 ≥150 | 71 (9%) | 34 (13.7%) | |
Bone marrow leukemic blast | | | 0.386 | 0.534
 <90% | 603 (76.2%) | 185 (74.3%) | |
 ≥90% | 188 (23.8%) | 64 (25.7%) | |
Peripheral blasts | | | 1.657 | 0.198
 <90% | 715 (90.4%) | 218 (87.6%) | |
 ≥90% | 76 (9.6%) | 31 (12.4%) | |
CNS disease | | | 2.560 | 0.110
 Yes | 47 (5.9%) | 22 (8.8%) | |
 No | 744 (94.1%) | 227 (91.2%) | |
Chloroma | | | 4.110 | 0.043
 Yes | 86 (10.9%) | 39 (15.7%) | |
 No | 705 (89.1%) | 210 (84.3%) | |
FAB category | | | 24.741 | 0.001
 M0 | 16 (2%) | 15 (6%) | |
 M1 | 86 (10.9%) | 33 (13.3%) | |
 M2 | 178 (22.5%) | 52 (20.9%) | |
 M3 | 2 (0.3%) | 0 (0%) | |
 M4 | 192 (24.3%) | 33 (13.3%) | |
 M5 | 148 (18.7%) | 46 (18.5%) | |
 M6 | 13 (1.6%) | 4 (1.6%) | |
 M7 | 31 (3.9%) | 15 (6%) | |
 Unknown | 125 (15.8%) | 51 (20.5%) | |
Primary cytogenetic code | | | 25.819 | <0.001
 inv(16) | 115 (14.5%) | 12 (4.8%) | |
 MLL | 146 (18.5%) | 44 (17.7%) | |
 t(8;21) | 123 (15.5%) | 29 (11.6%) | |
 Other | 189 (23.9%) | 85 (34.1%) | |
 Normal | 180 (22.8%) | 68 (27.3%) | |
 Unknown | 38 (4.8%) | 11 (4.4%) | |
FLT3/ITD positive | | | 7.974 | 0.005
 Yes | 133 (16.8%) | 62 (24.9%) | |
 No | 655 (82.8%) | 187 (75.1%) | |
 Unknown | 3 (0.4%) | 0 (0%) | |
FLT3 PM | | | 0.057 | 0.811
 Yes | 54 (6.8%) | 16 (6.4%) | |
 No | 733 (92.7%) | 233 (93.6%) | |
 Unknown | 4 (0.5%) | 0 (0%) | |
NPM mutation | | | 5.225 | 0.022
 Yes | 77 (9.7%) | 13 (5.2%) | |
 No | 698 (88.2%) | 236 (94.8%) | |
 Unknown | 16 (2%) | 0 (0%) | |
CEBPA mutation | | | 5.156 | 0.023
 Yes | 52 (6.6%) | 7 (2.8%) | |
 No | 727 (91.9%) | 241 (96.8%) | |
 Unknown | 12 (1.5%) | 1 (0.4%) | |
WT1 mutation | | | 12.147 | <0.001
 Yes | 45 (5.7%) | 31 (12.4%) | |
 No | 731 (92.4%) | 218 (87.6%) | |
 Unknown | 15 (1.9%) | 0 (0%) | |
c-Kit mutation exon 8 | | | 0.457 | 0.499
 Yes | 36 (4.6%) | 5 (2%) | |
 No | 189 (23.9%) | 37 (14.9%) | |
 Not done | 566 (71.6%) | 207 (83.1%) | |
c-Kit mutation exon 17 | | | 2.321 | 0.128
 Yes | 24 (3%) | 8 (3.2%) | |
 No | 200 (25.3%) | 34 (13.7%) | |
 Not done | 567 (71.7%) | 207 (83.1%) | |
MRD at end of course 1 | | | 161.121 | <0.001
 Yes | 126 (15.9%) | 143 (57.4%) | |
 No | 492 (62.2%) | 68 (27.3%) | |
 Unknown | 173 (21.9%) | 38 (15.3%) | |

Abbreviations: CR, complete remission; FAB, French-American-British; MRD, minimal residual disease.
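The group comparisons in Table 1 can be checked with a standard Pearson chi-square test. The sketch below is written in Python rather than the software used in the study, and it assumes (the text does not state this) that no continuity correction was applied. The function name `chi_square_2x2` is illustrative only.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for observed, row, col in ((a, row1, col1), (b, row1, col2),
                               (c, row2, col1), (d, row2, col2)):
        expected = row * col / n          # expected count under independence
        chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Age row of Table 1: <14 vs >=14, split by CR1 / not CR.
chi2, p = chi_square_2x2(549, 187, 242, 62)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Under that assumption, the age comparison should land close to the χ² of 2.969 and P-value of 0.085 reported in the table. Rows with more than two categories need the general r×c form and a chi-square tail with more degrees of freedom, which this sketch does not cover.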

Figure 1

Flow diagram of the analysis procedure.

Abbreviations: OS, overall survival; ROC, receiver operator characteristic; WHO, World Health Organization.

Identifying genes of differential expression

All data were analyzed with R 3.5.2 (https://www.r-project.org/). Differential mRNA expression in childhood AML (260 CR1 and 93 not CR samples with full survival information and mRNA-seq data) was calculated using the R/Bioconductor package edgeR.17 We defined the cut-off criteria for DEGs as |log2 fold-change (log2FC)|>1.5 and adjusted P-value (adj.P) <0.01. Finally, a heat map based on hierarchical cluster analysis and a volcano plot of the two groups were drawn using the gplots package in R.
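As an illustration of the cut-off step only, the following Python sketch applies the two thresholds to a table of per-gene results. It is not the edgeR workflow itself: the `GeneResult` record, the `select_degs` helper and the toy values are hypothetical, and the log2FC and adjusted P-values are assumed to have been computed upstream.

```python
from typing import NamedTuple

class GeneResult(NamedTuple):
    symbol: str
    log2_fc: float   # log2 fold-change between the two groups
    adj_p: float     # multiple-testing-adjusted P-value

def select_degs(results, lfc_cutoff=1.5, adj_p_cutoff=0.01):
    """Split results into up- and down-regulated DEGs using |log2FC| and adj.P cut-offs."""
    significant = [r for r in results if r.adj_p < adj_p_cutoff]
    up = [r.symbol for r in significant if r.log2_fc > lfc_cutoff]
    down = [r.symbol for r in significant if r.log2_fc < -lfc_cutoff]
    return up, down

# Hypothetical values, for illustration only.
toy_results = [
    GeneResult("GENE_A", 2.3, 0.0004),   # passes both cut-offs, up-regulated
    GeneResult("GENE_B", -1.8, 0.003),   # passes both cut-offs, down-regulated
    GeneResult("GENE_C", 1.2, 0.0001),   # fold-change too small
    GeneResult("GENE_D", 3.0, 0.04),     # adjusted P-value too large
]
up, down = select_degs(toy_results)
print("up:", up, "down:", down)
```

Both conditions must hold at once, so a gene with a large fold-change but a weak adjusted P-value is excluded, as is a highly significant gene with a modest fold-change.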

Functional and pathway enrichment analysis

To explore the biological effects and pathways of the identified DEGs, Gene Ontology (GO) enrichment analyses were conducted using the R/Bioconductor package clusterProfiler,18 and the top 10 terms were reported. Significant results for biological process (BP), cellular component (CC), and molecular function (MF) were based on a threshold of P<0.05. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis19 was performed for the selected genes using the Database for Annotation, Visualization and Integrated Discovery (DAVID; https://david.ncifcrf.gov/). P<0.05 was considered statistically significant.
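Over-representation analyses of this kind are commonly built on a hypergeometric (one-sided Fisher) test. The Python sketch below shows that general idea under stated assumptions; it does not reproduce the exact statistic, background set or multiple-testing correction used by clusterProfiler or DAVID, and the function name and numbers are hypothetical.

```python
from math import comb

def enrichment_p_value(n_background, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn at random from a background
    of n_background genes, n_term of which are annotated to the term."""
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 10 background genes, 4 annotated to the term,
# 3 DEGs of which at least 2 carry the annotation.
p = enrichment_p_value(n_background=10, n_term=4, n_degs=3, n_overlap=2)
print(f"p = {p:.4f}")
```

A term is called enriched when this tail probability falls below the chosen threshold (P<0.05 in the study), usually after adjusting for the number of terms tested.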

Integration of the protein–protein interaction (PPI) network

The Search Tool for the Retrieval of Interacting Genes (STRING, version 11.0; https://string-db.org/) was used to explore potential interactions among the DEGs at the protein level.20 In the present study, interactions with an interaction score >0.55 were considered significant, and disconnected nodes were hidden in the network. Cytoscape (version 3.7.1; https://cytoscape.org/) was then used to construct and visualize a PPI network of common DEGs.21 The Cytoscape plug-in Molecular Complex Detection (MCODE, version 1.5.1) was used to cluster the network based on topology and find densely connected regions. The selection criteria were as follows: degree cut-off=2, node score cut-off=0.2, k-core=2, and max depth=100.
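Two of the steps described above, the score filter and the k-core setting, can be sketched in a few lines of Python. This is only a simplified illustration: MCODE also weights vertices and grows clusters from seed nodes, which is not implemented here, and the edge list and helper names are hypothetical.

```python
from collections import defaultdict

def build_network(scored_edges, min_score=0.55):
    """Keep edges above the score threshold; nodes left without edges are dropped."""
    graph = defaultdict(set)
    for a, b, score in scored_edges:
        if score > min_score and a != b:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)

def k_core(graph, k=2):
    """Repeatedly remove nodes with fewer than k neighbours until none remain."""
    core = {node: set(nbrs) for node, nbrs in graph.items()}
    changed = True
    while changed:
        changed = False
        for node in [n for n, nbrs in core.items() if len(nbrs) < k]:
            for nbr in core.pop(node):
                core[nbr].discard(node)
            changed = True
    return core

# Hypothetical interaction scores, for illustration only.
edges = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.7),
         ("C", "D", 0.6), ("D", "E", 0.4)]
network = build_network(edges)
print(sorted(k_core(network, k=2)))
```

In this toy network, node E disappears at the filtering step because its only edge scores below 0.55, and node D is pruned from the 2-core because it has a single neighbour.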

Hub genes selection and analysis

To identify DEGs predictive of clinical factors and survival outcomes, the 138 hub genes in the training dataset were subjected to univariate Cox regression analysis using the survival package in R (version 2.44-1.1). The HR with 95% CI was calculated, and a log-rank test (P<0.01) was conducted to select the most significant candidate genes. OS analyses of the candidate genes were performed with Kaplan–Meier plots using a Bioconductor R package. Gene expression was labeled as high or low by dichotomization, with P<0.05 considered significantly different. A multivariate Cox proportional hazards regression model was then fitted to the 12 potentially relevant genes from the preliminary screening, together with the OS information, to calculate the risk score (RS). The RS of each sample was calculated as RS = β1Exp1 + β2Exp2 + … + βnExpn (βi: the regression coefficient of gene i; Expi: the expression level of gene i). The childhood AML patients were classified into low-risk and high-risk groups according to the median RS, and survival analysis and the log-rank test were performed to evaluate the differences between the two groups. ROC analyses were performed using the survivalROC package in R (version 1.0.3), based on the prognostic model that incorporates gene expression factors, to predict the probability of 3- and 5-year OS. After identifying prognostic genes between CR1 and not CR samples, we built a nomogram-based model to predict survival probability using the R package “rms” (version 5.1-3.1).22 We divided the patients into eight groups by French-American-British (FAB) category from the database to analyze the expression levels of the six candidate genes in different subtypes of childhood AML. Statistical analysis in this study was performed using GraphPad Prism (version 8.0.2; GraphPad Software, Inc., La Jolla, CA, USA).
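The Kaplan–Meier curves used throughout the survival analysis rest on the product-limit estimator. The following is a minimal Python sketch of that estimator, given as an illustration rather than the R implementation used in the study; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed   # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up data (arbitrary time units), for illustration only.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t = {t}: S(t) = {s:.2f}")
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimate changes only at observed death times.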

Results

Identification of differential molecules in childhood AML

A gene expression dataset generated by RNA-Seq was downloaded from TARGET. The dataset included the expression levels detected in childhood AML samples, with clinical information on whether the patient achieved first CR. A total of 856 differential genes met the criteria of |log2FC|>1.5 and adj.P<0.01, including 543 up-regulated genes and 313 down-regulated genes in the not CR group compared with the CR1 group. The heat map and volcano plot illustrating the differential distribution between the groups are shown in Figure 2A and B.
Figure 2

(A) Heat map for potential mRNAs based on the expression profiles of significantly differentially expressed genes. (B) Volcano plot of genes detected in childhood AML; red dots represent up-regulated genes and green dots represent down-regulated genes.

DEGs functional and pathway enrichment analysis

To explore the biological function of the DEGs, GO enrichment analysis was performed separately for the up-regulated and down-regulated DEGs, and the top 10 terms are shown in Figure 3A and B. The up-regulated genes were mostly associated with the BP terms response to lipopolysaccharide, response to molecule of bacterial origin, leukocyte chemotaxis, and chemokine-mediated signaling pathway, while the down-regulated genes were mostly enriched in cell fate commitment, pattern specification process, regionalization, and morphogenesis of a branching structure. In addition, CC analysis showed that the up-regulated genes were related to extracellular matrix, receptor complex, proteinaceous extracellular matrix, and apical plasma membrane, and the down-regulated genes were mostly found in the postsynapse, extracellular matrix, neuron projection membrane, and dendrite membrane. For MF terms, up-regulated genes were enriched in channel activity, passive transmembrane transporter activity, and G-protein coupled peptide receptor activity, while the down-regulated genes were relevant to transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific DNA binding, and passive transmembrane transporter activity.
Figure 3

GO enrichment analysis of aberrantly differentially expressed genes with no complete remission. GO analysis of the top 10 up-regulated (A) and down-regulated (B) genes (the size of each dot represents the gene count; the color represents the adj-P).

KEGG pathway enrichment analysis was performed using DAVID. Table 2 shows the most significant KEGG pathway of the up-regulated and down-regulated DEGs, including cytokine-cytokine receptor interaction, neuroactive ligand-receptor interaction, cell adhesion molecules (CAMs), hematopoietic cell lineage, and signaling pathways regulating pluripotency of stem cells, etc.
Table 2

KEGG pathway enrichment analysis of aberrantly differentially expressed genes in childhood AML with no complete remission

Pathway ID | Description | P-value | Gene count | Genes
hsa04060 | Cytokine-cytokine receptor interaction | 3.231E-05 | 25 | CSF3, CXCL1, IL1R2, CSF2, CXCL5, IL7, TNFSF15, CXCR1, CXCR2, IL24, PF4V1, CCL7, ACVR2A, CCR8, TNFRSF10C, CCR7, TNFRSF11B, CCR6, CXCL14, CCL20, PRLR, IFNB1, IL12B, BMP7, PRL
hsa04080 | Neuroactive ligand-receptor interaction | 3.293E-03 | 22 | OPRM1, F2RL2, GABRG1, CGA, GABRA2, GLRB, GRIK1, GABRA4, GRIN3B, BDKRB2, GRM4, GABRR1, PRLR, ADRA2A, AVPR1A, CHRND, UTS2R, PRL, ADRA1D, GLP1R, CHRNG, GRID1
hsa04514 | CAMs | 1.083E-02 | 13 | PTPRM, CD8B, CD276, LRRC4B, CLDN22, CLDN10, CLDN11, CDH5, NCAM2, SIGLEC1, CDH15, CLDN2, CNTNAP2
hsa05033 | Nicotine addiction | 2.033E-02 | 6 | SLC17A7, GABRG1, GABRA2, GABRR1, GABRA4, GRIN3B
hsa04640 | Hematopoietic cell lineage | 2.203E-02 | 9 | CSF3, CSF2, IL1R2, CD3G, CD3D, DNTT, IL7, CD8B, ITGA1
hsa04530 | Tight junction | 2.203E-02 | 9 | PARD6B, MPDZ, CLDN22, CRB3, CLDN2, ACTN2, CLDN10, MYH14, CLDN11
hsa04978 | Mineral absorption | 2.953E-02 | 6 | TF, MT1M, HMOX1, SLC26A9, MT1H, MT1G
hsa04550 | Signaling pathways regulating pluripotency of stem cells | 0.052215 | 11 | WNT5A, ACVR2A, HNF1A, OTX1, DLX5, PAX6, NEUROG1, IGF1, WNT6, WNT8A, KLF4
hsa04950 | Maturity onset diabetes of the young | 0.082815 | 4 | HNF1A, MNX1, PAX6, NEUROG3
hsa05410 | HCM | 0.089722 | 7 | ACE, CACNG8, CACNG6, ITGA1, CACNB2, IGF1, TTN

Abbreviations: KEGG, Kyoto Encyclopedia of Genes and Genomes; AML, acute myeloid leukemia; CAMs, cell adhesion molecules; HCM, hypertrophic cardiomyopathy.

PPI network construction and module analysis

STRING was used to construct the PPI network of the DEGs (Figure 4A), and the Cytoscape plug-in MCODE was used to identify the most significant module. The resulting module comprised 138 nodes and 885 edges, including 88 up-regulated genes and 50 down-regulated genes (Figure 4B). These genes possibly play an important role in childhood AML progression and may serve as predictors of CR.
Figure 4

(A) PPI network of significantly differentially expressed genes. (B) The most significant module established from the PPI network, with 138 nodes and 885 edges; up-regulated genes are marked in light red and down-regulated genes in light blue.

Abbreviation: PPI, protein–protein interaction.

Prognostic gene marker screening

To assess the prognostic value of the 138 genes in the most significant module, we performed Cox regression analysis, OS analysis and ROC curve analyses along with calculations of the area under the curve (AUC). The log-rank test showed that 17 genes were significantly associated with OS (Table 3). Fourteen of these genes had HR>1, indicating that higher expression was associated with a higher risk of death, whereas LYPD2, MSLN and CTSZ had HR<1. Next, we analyzed the association between the expression of these candidate genes and OS in patients with childhood AML by Kaplan-Meier analysis. The expression of 12 genes (RAMP3, LYPD2, CHIT1, CXCR2, SLC17A7, MSX2, DEFA4, CDC26, MMP8, MSLN, CTSZ, DEFA3) was associated with OS in childhood AML (Figure 5).
Table 3

Univariate Cox regression analysis for the candidate genes in the training dataset

Gene | HR | Lower 95% CI | Upper 95% CI | z | P-value
RAMP3 | 1.107431 | 1.035415 | 1.184456 | 2.974399 | 0.002936
LYPD2 | 0.896117 | 0.834845 | 0.961886 | −3.035332 | 0.002403
FBXO2 | 1.127005 | 1.032227 | 1.230485 | 2.667657 | 0.007638
CHIT1 | 1.100148 | 1.032955 | 1.171711 | 2.968346 | 0.002994
CXCL1 | 1.114886 | 1.043231 | 1.191462 | 3.208677 | 0.001333
FBXO21 | 1.198487 | 1.062074 | 1.352422 | 2.936787 | 0.003316
CXCR2 | 1.153750 | 1.069802 | 1.244286 | 3.710558 | 0.000207
SLC17A7 | 1.094191 | 1.024202 | 1.168963 | 2.669037 | 0.007607
FFAR2 | 1.099628 | 1.023025 | 1.181966 | 2.577853 | 0.009942
MSX2 | 1.071317 | 1.020138 | 1.125063 | 2.758274 | 0.005811
DEFA4 | 1.055876 | 1.013570 | 1.099949 | 2.605988 | 0.009161
CDC26 | 1.390649 | 1.168735 | 1.654700 | 3.717825 | 0.000201
DEFA1B | 1.082348 | 1.028912 | 1.138560 | 3.063311 | 0.002189
MMP8 | 1.071428 | 1.026945 | 1.117837 | 3.188925 | 0.001428
MSLN | 0.940093 | 0.908923 | 0.972332 | −3.590855 | 0.000330
CTSZ | 0.908571 | 0.846955 | 0.974670 | −2.676011 | 0.007450
DEFA3 | 1.053126 | 1.015175 | 1.092494 | 2.764295 | 0.005705
Figure 5

Prognostic value of twelve key genes (A) RAMP3 (B) LYPD2 (C) CHIT1 (D) CXCR2 (E) SLC17A7 (F) MSX2 (G) DEFA4 (H) CDC26 (I) MMP8 (J) MSLN (K) CTSZ (L) DEFA3 in childhood AML from TARGET database.

Genetic risk score model construction and ROC curve analysis

Multiple stepwise Cox regression was performed on the 12 prognostic genes to explore their effect on survival time and patient outcome, and six gene markers were retained as independent predictors in childhood AML patients (Table 4). These six genes were therefore selected to build the predictive model. Patient RSs were determined from the coefficients in Table 4: RS = 0.108993×SLC17A7 + 0.110729×MSX2 + 0.319005×CDC26 − 0.048660×MSLN − 0.068123×CTSZ + 0.045649×DEFA3, where each gene symbol denotes the expression level of that gene.
Table 4

A six-gene signature identified by multivariate Cox regression analysis

Gene | coef | exp(coef) | se(coef) | z | P-value
SLC17A7 | 0.108993 | 1.115154 | 0.034211 | 3.185898 | 0.001443
MSX2 | 0.110729 | 1.117092 | 0.028377 | 3.902084 | 0.000095
CDC26 | 0.319005 | 1.375758 | 0.085776 | 3.719048 | 0.000200
MSLN | −0.048660 | 0.952505 | 0.019963 | −2.437433 | 0.014792
CTSZ | −0.068123 | 0.934146 | 0.039113 | −1.741674 | 0.081566
DEFA3 | 0.045649 | 1.046707 | 0.020474 | 2.229620 | 0.025773
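The columns of Table 4 are linked by the usual Wald relations for a Cox model: the hazard ratio is exp(coef), z is coef divided by its standard error, and the two-sided P-value is the normal tail beyond |z|. The Python sketch below reproduces these relations for one row. The helper name `wald_summary` is illustrative, and the 95% CI it returns uses the conventional normal approximation, which is an assumption because Table 4 does not report intervals.

```python
import math

def wald_summary(coef, se):
    """Hazard ratio, Wald z, two-sided P-value and approximate 95% CI for a Cox coefficient."""
    hazard_ratio = math.exp(coef)
    z = coef / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided standard normal tail
    ci = (math.exp(coef - 1.96 * se), math.exp(coef + 1.96 * se))
    return hazard_ratio, z, p_value, ci

# SLC17A7 row of Table 4: coef = 0.108993, se(coef) = 0.034211.
hr, z, p, ci = wald_summary(0.108993, 0.034211)
print(f"HR = {hr:.6f}, z = {z:.4f}, p = {p:.6f}")
print(f"approx. 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Applied to the CTSZ row, the same relations give a z of about −1.74 and a P-value above 0.05, in line with the value listed in the table.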
A total of 295 patients were classified into a high-risk group and a low-risk group using the median RS as the cut-off point. Survival differed significantly between the two groups, with an increased risk of death in the high-risk group, and the 3- and 5-year survival rates were significantly different between the high-risk and low-risk groups (Figure 6A). The prognostic capacity of the six-gene signature was evaluated using the AUC of a time-dependent ROC curve; the AUC of the prognostic model was 0.729 (Figure 6B). The RS distribution, expression heat map, and patient survival status for the six prognostic genes in the two groups are shown in Figure 6C, indicating that the predictive model had a high sensitivity and specificity. We developed a nomogram to predict the probability of 1-, 3- and 5-year OS. The predictors in the nomogram were the six independent prognostic factors SLC17A7, MSX2, CDC26, MSLN, CTSZ and DEFA3 (Figure 7). We also analyzed the expression levels of the six candidate genes in different subtypes of childhood AML based on the FAB category, although M3 data were scarce (Figure 8).
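The risk score and the median split can be sketched as follows, using the coefficients listed in Table 4. This is a Python illustration, not the authors' code: the patient identifiers and expression values are hypothetical, expression is assumed to be on the same scale as the data used to fit the model, and assigning scores equal to the median to the low-risk group is an assumption.

```python
from statistics import median

# Coefficients of the six-gene signature, taken from Table 4.
COEFFICIENTS = {
    "SLC17A7": 0.108993,
    "MSX2": 0.110729,
    "CDC26": 0.319005,
    "MSLN": -0.048660,
    "CTSZ": -0.068123,
    "DEFA3": 0.045649,
}

def risk_score(expression):
    """RS = sum of coefficient times expression level over the six genes."""
    return sum(beta * expression[gene] for gene, beta in COEFFICIENTS.items())

def stratify(scores):
    """Label each patient high or low risk relative to the median RS."""
    cutoff = median(scores.values())
    return {pid: ("high" if rs > cutoff else "low") for pid, rs in scores.items()}

# Hypothetical expression profiles, for illustration only.
baseline = {gene: 1.0 for gene in COEFFICIENTS}
patients = {
    "P1": baseline,
    "P2": {**baseline, "CDC26": 5.0},   # high CDC26 raises the score
    "P3": {**baseline, "MSLN": 6.0},    # high MSLN lowers the score
    "P4": {gene: 2.0 for gene in COEFFICIENTS},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores)
print(scores)
print(groups)
```

Because the coefficients for MSLN and CTSZ are negative, higher expression of these two genes pulls a patient towards the low-risk group, while higher expression of the other four genes pushes the score up.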
Figure 6

Prognostic risk score model analysis of six prognostic genes. (A) The Kaplan–Meier curves for low-risk and high-risk groups. (B) The ROC curves for predicting OS by the risk score. (C) The distribution of risk score, expression heat map, and survival status.

Abbreviations: AUC, area under the curve; ROC, receiver operator characteristic; OS, overall survival.
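To make the AUC in panel B more concrete, the sketch below computes a simplified AUC at a fixed time horizon in Python. It is not the estimator implemented in the survivalROC package: patients who died by the horizon are treated as cases, patients still followed after the horizon as controls, and patients censored before the horizon are simply dropped, which is a strong simplifying assumption. The function name and all data are hypothetical.

```python
def auc_at_horizon(scores, times, events, horizon):
    """Fraction of case/control pairs in which the case has the higher risk score.

    Cases    : died (event == 1) at or before the horizon.
    Controls : followed beyond the horizon.
    Patients censored before the horizon are excluded (simplification).
    """
    cases = [s for s, t, e in zip(scores, times, events) if e == 1 and t <= horizon]
    controls = [s for s, t in zip(scores, times) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")
    concordant = sum(
        1.0 if case > control else 0.5 if case == control else 0.0
        for case in cases
        for control in controls
    )
    return concordant / (len(cases) * len(controls))

# Hypothetical risk scores and follow-up (years), for illustration only.
auc = auc_at_horizon(
    scores=[2.0, 0.8, 0.5, 1.0, 0.2],
    times=[1, 2, 6, 7, 3],
    events=[1, 1, 0, 1, 0],
    horizon=5,
)
print(f"AUC at 5 years = {auc:.2f}")
```

An AUC of 0.5 corresponds to a score that ranks cases and controls no better than chance, and 1.0 to perfect ranking; the value of 0.729 reported for the six-gene model lies between the two.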

Figure 7

Nomogram for predicting 1-, 3-, and 5-year survival rate in childhood AML patients. By adding up the points identified on the point scale for each variable, the total score on the bottom scale shows the probability of survival.

Figure 8

Expression levels of the six candidate genes in different French-American-British (FAB) subtypes of childhood AML. (A) SLC17A7 (B) MSX2 (C) CDC26 (D) MSLN (E) CTSZ (F) DEFA3.

To validate the prognostic model, we incorporated WHO risk-stratification criteria such as cytogenetics and genetics.23 A total of 295 patients were analyzed for favorable or adverse factors for internal validation (Table 5). The CR rate was lower in children in the high-risk group (68.7%) than in those in the low-risk group (87.8%) (P<0.01). In addition, the distribution of some favorable or adverse factors8 such as RUNX1-RUNX1T1, CBFB-MYH11, CEBPA mutation, cytogenetic complexity, and FLT3-ITD combined with WT1 mutation was consistent with the results of previous studies.
Table 5

Evaluation of the prognostic model by WHO classification (cytogenetics or genetics)

Factor | High risk (n=147) | Low risk (n=148) | P-value
CR status at end of course 1 | 101 (68.7%) | 130 (87.8%) | 6.7E-05*
Favorable factors | | |
 t(8;21)(q22;q22)/RUNX1-RUNX1T1 | 5 (3.4%) | 41 (27.7%) | 1.5E-08*
 inv(16)(p13.1q22)/CBFB-MYH11 | 0 (0%) | 42 (28.4%) | 5.8E-12*
 CEBPA mutation | 4 (2.7%) | 16 (10.8%) | 5.7E-03*
 NPM mutation | 14 (9.5%) | 14 (9.4%) | 9.8E-01
Adverse factors | | |
 Cytogenetic complexity (3 or more) | 31 (21.1%) | 18 (12.1%) | 3.9E-02*
 t(10;11)(p12;q23)/MLLT10-MLL | 5 (3.4%) | 2 (1.3%) | 2.2E-01
 t(6;9)(p23;q34)/DEK-NUP214 | 2 (1.4%) | 1 (0.6%) | 5.3E-01
 FLT3-ITD/combined with WT1 mutation | 12/21 (57.1%) | 2/14 (14.2%) | 8.6E-03*

Note: *Difference between the two groups was significant (P<0.05).

Discussion

AML is one of the most common malignancies, with multiple types of molecular and cellular heterogeneity in childhood.3 Hematopoietic stem cell transplantation combined with chemotherapy is the basic means of treating AML, but the prognosis of childhood AML remains suboptimal because of high recurrence and high mortality.24,25 In particular, refractory acute leukemia responds poorly to treatment, with a short survival period and a low rate of second complete remission (CR2) after relapse.26 Recently, many studies have reported that the prognosis of childhood AML is partly driven by genetic factors, and that the expression of multiple genes may be beneficial for predicting prognosis and selecting treatment regimens.8,27,28 The clinical implementation of an improved childhood AML risk classification model is likely to provide more relevant information for clinical decisions and to improve the prognosis of childhood AML patients by refining risk stratification.29,30 Therefore, understanding the etiological factors and molecular mechanisms of childhood AML progression is essential for the diagnosis and treatment of this disease. Microarray technology has been widely applied to identify potential therapeutic targets. Previously, Luo et al31 analyzed the GSE8970 dataset and suggested that ubiquitin-conjugating enzyme E2E1 (UBE2E1) may be a prognostic factor in AML. Zhang et al32 analyzed the GSE12417 dataset and suggested that the long non-coding RNA H19 may play a role in AML. Niu et al33 analyzed the TCGA dataset and constructed a risk prediction model based on relapse information, but the AML cohort was small and more specimens are needed to validate the model. In contrast, the TARGET database has the advantage of a large number of AML samples and complete clinical information for children.
To reduce mortality and improve the risk-stratification criteria, there is an urgent need for the molecular screening of biomarkers of childhood AML. In this study, we identified significant DEGs between childhood AML samples with first CR and those without CR from the TARGET database. Furthermore, we performed a series of bioinformatics analyses to screen key genes and pathways. As a result, a total of 856 DEGs were identified, consisting of 543 up-regulated genes and 313 down-regulated genes. GO function and KEGG pathway analyses were performed to acquire an in-depth understanding of these DEGs. The functional enrichment analyses demonstrated that the up-regulated genes were enriched in GO terms such as leukocyte chemotaxis, chemokine-mediated signaling pathway, receptor complex, apical plasma membrane, G-protein coupled peptide receptor activity, and channel activity. In addition, the down-regulated genes were mostly enriched in cell fate commitment, morphogenesis of a branching structure, neuron projection membrane, transcriptional activator activity, and RNA polymerase II transcription regulatory region sequence-specific DNA binding. These results are consistent with previous findings that gain or loss of these functions plays an important role in AML tumorigenesis and progression. The KEGG pathway analysis revealed that the DEGs were significantly associated with cytokine-cytokine receptor interaction, neuroactive ligand–receptor interaction, hematopoietic cell lineage, and signaling pathways regulating pluripotency of stem cells. Our results suggest that these DEGs may be involved in the onset and progression of childhood AML. Based on these findings, the hub genes were screened, and univariate and multivariate Cox analyses were conducted to build a risk model to predict childhood AML prognosis. We identified six genes: SLC17A7, MSX2, CDC26, MSLN, CTSZ and DEFA3.
High expression levels of SLC17A7, MSX2, CDC26 and DEFA3 were associated with a poor prognosis in childhood AML patients, whereas high expression of MSLN and CTSZ was associated with a good prognosis. The AUC of the ROC curve of the prognostic model for predicting OS was 0.729, indicating that the six-gene signature performed well for survival prediction. With the gene expression risk scoring prognostic model, the patients with childhood AML were divided into a high-risk group and a low-risk group. According to the results predicted by the model, the clinician can adjust the treatment plan and provide individualized treatment for childhood AML patients. There is a need to develop strategies to improve CR in the high-risk group. Patients in the high-risk group should be followed more frequently, and bone marrow aspiration and biopsy should be performed regularly to facilitate early detection of disease recurrence. Our prognostic model is independent of other factors in childhood AML, and may have implications for guiding hematopoietic stem cell transplantation. Similarly, a nomogram is a statistical tool that provides an individual patient with the overall probability of a particular outcome. Whether this model is applicable to adult AML34 warrants further investigation. The protein encoded by SLC17A7 is a vesicle-bound, sodium-dependent phosphate transporter that is particularly expressed in neuron-rich regions of the brain. Wan et al35 identified SLC17A7 as a potential diagnostic and prognostic biomarker of uveal melanoma using co-expression modules. Homeobox-containing (HOX) genes encode transcription factors, which play an important regulatory role in processes such as cell development, migration, and differentiation, and are frequently found to be aberrantly expressed in cancer.36 Up-regulation of muscle segment homeobox gene 2 (MSX2), a member of the homeobox gene family, has been found in pancreatic cancer and prostate cancer patients.
Many clinical studies have shown that MSX2 is involved in the occurrence and development of tumors.37,38 Zhai et al39 found that MSX2 is a direct downstream target of WNT signaling and correlates with the invasiveness of endometrioid adenocarcinoma. Moreover, MSX2 has been identified as a physiological NK-like (NKL) homeobox gene in hematopoietic cells. It is involved in NOTCH3 signaling, and this pathway links physiological and oncogenic homeobox signaling in T-ALL.40 Cell division control protein 26 (CDC26) takes part in protein modification through the protein ubiquitination pathway. It is a component of the anaphase-promoting complex, which catalyzes the formation of protein-ubiquitin conjugates that are subsequently degraded by the proteasome.41 Mesothelin (MSLN) is a glycosylphosphatidylinositol-anchored cell-surface protein and may function as a cell adhesion molecule (CAM). Steinbach et al42 prospectively evaluated the prognostic value of monitoring treatment response in AML by measuring the expression of 7 leukemia-related genes. Among them, MSLN is regarded as an important prognostic indicator. Cathepsin Z (CTSZ), a lysosomal cysteine protease and a member of the peptidase C1 family, is widely expressed in tumor cell lines and primary tumors. Like other members of the family, it may be involved in the occurrence of tumors.43 Defensin alpha 3 (DEFA3) is present in the bactericidal granules of neutrophils and may play a role in phagocyte-mediated host defense. Defensin stimulation has been reported to affect the proliferation rate of tumor cell lines.44
The six-gene prognostic model may facilitate the development of new prognostic predictors for childhood AML. In addition, because it relies on only six genes, the model could reduce sequencing costs, making gene-specific targeted sequencing more cost-effective and routine. In the future, we plan to use single-cell transcriptome sequencing of bone marrow to measure the expression of these six genes in patients who are poor candidates for transplantation.
Prognostic assessment is crucial for selecting a suitable treatment. Because patients with the same subtype and stage can have different clinical outcomes, we developed this predictive model for risk stratification in childhood AML, and the model may come into routine use in the future.
Our study has several limitations. First, our results were derived from the TARGET dataset and generated by bioinformatic analysis. The TARGET database does not provide information about the specific treatments received by each patient. Thus, our results need to be validated in other databases, and further investigations based on childhood AML samples and clinical data are needed. Second, the number of samples without CR was smaller than the number with CR. Therefore, more specimens need to be included to validate the predictive capability of the model.
In conclusion, our results indicate that the six-gene prognostic model is a reliable tool for predicting the OS of childhood AML, and that a nomogram incorporating the prognostic model can serve as a predictor of CR and may assist clinicians in providing individualized treatment in this patient population. This finding has the potential to provide new therapeutic targets for childhood AML.
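As a note on the discrimination measure quoted in this discussion (AUC of 0.729), the area under a ROC curve can be read as the probability that a randomly chosen patient who had the event receives a higher risk score than a randomly chosen patient who did not. The sketch below computes that quantity for a plain binary outcome. It is an illustration under simplifying assumptions: it ignores censoring and follow-up time, so it is not the time-dependent ROC analysis typically used for survival data, and it is not claimed to be the authors' procedure. The function name and toy values are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: risk scores, higher meaning higher predicted risk.
    labels: 1 for patients with the event, 0 for patients without.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with made-up scores: three of the four positive/negative
# pairs are ranked correctly.
toy_auc = auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.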
  44 in total

1.  Differential expression of MSX2 in nodular hyperplasia, high-grade prostatic intraepithelial neoplasia and prostate adenocarcinoma.

Authors:  Chee-Wai Chua; Yung-Tuen Chiu; Hiu-Fung Yuen; Kwok-Wah Chan; Xianghong Wang; Ming-Tat Ling; Yong-Chuan Wong
Journal:  APMIS       Date:  2010-10-13       Impact factor: 3.205

2.  Impact of cytogenetics on outcome of matched unrelated donor hematopoietic stem cell transplantation for acute myeloid leukemia in first or second complete remission.

Authors:  Martin S Tallman; Gordon W Dewald; Sharavi Gandham; Brent R Logan; Armand Keating; Hillard M Lazarus; Mark R Litzow; Jayesh Mehta; Tanya Pedersen; Waleska S Pérez; Jacob M Rowe; Meir Wetzler; Daniel J Weisdorf
Journal:  Blood       Date:  2007-03-20       Impact factor: 22.113

Review 3.  Molecular genetics of adult acute myeloid leukemia: prognostic and therapeutic implications.

Authors:  Guido Marcucci; Torsten Haferlach; Hartmut Döhner
Journal:  J Clin Oncol       Date:  2011-01-10       Impact factor: 44.544

Review 4.  Prognostic factors in pediatric acute myeloid leukemia.

Authors:  Mohamed Radhi; Soheil Meshinchi; Alan Gamis
Journal:  Curr Hematol Malig Rep       Date:  2010-10       Impact factor: 3.952

5.  Outcome for children treated for relapsed or refractory acute myelogenous leukemia (rAML): a Therapeutic Advances in Childhood Leukemia (TACL) Consortium study.

Authors:  Matthew F Gorman; Lingyun Ji; Richard H Ko; Phillip Barnette; Bruce Bostrom; Raymond Hutchinson; Elizabeth Raetz; Nita L Seibel; Clare J Twist; Elena Eckroth; Richard Sposto; Paul S Gaynon; Mignon L Loh
Journal:  Pediatr Blood Cancer       Date:  2010-09       Impact factor: 3.167

Review 6.  Homeobox gene expression in cancer: insights from developmental regulation and deregulation.

Authors:  Shaija Samuel; Honami Naora
Journal:  Eur J Cancer       Date:  2005-09-30       Impact factor: 9.162

7.  Insights into anaphase promoting complex TPR subdomain assembly from a CDC26-APC6 structure.

Authors:  Jing Wang; Billy T Dye; Kanagalaghatta R Rajashankar; Igor Kurinov; Brenda A Schulman
Journal:  Nat Struct Mol Biol       Date:  2009-08-09       Impact factor: 15.369

Review 8.  Advances in molecular genetics and treatment of core-binding factor acute myeloid leukemia.

Authors:  Krzysztof Mrózek; Guido Marcucci; Peter Paschka; Clara D Bloomfield
Journal:  Curr Opin Oncol       Date:  2008-11       Impact factor: 3.645

9.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

10.  NK-like homeodomain proteins activate NOTCH3-signaling in leukemic T-cells.

Authors:  Stefan Nagel; Letizia Venturini; Grzegorz K Przybylski; Piotr Grabarczyk; Corinna Meyer; Maren Kaufmann; Karin Battmer; Christian A Schmidt; Hans G Drexler; Michaela Scherr; Roderick Af Macleod
Journal:  BMC Cancer       Date:  2009-10-19       Impact factor: 4.430

