Elevated HOX gene expression in acute myeloid leukemia is associated with NPM1 mutations and poor survival.

Ádám Nagy1,2, Ágnes Ősz1,2, Jan Budczies3, Szilvia Krizsán4, Gergely Szombath5, Judit Demeter6, Csaba Bödör4, Balázs Győrffy1,2.   

Abstract

Acute myeloid leukemia (AML) is a clonal disorder of hematopoietic progenitor cells and the most common malignant myeloid disorder in adults. Several gene mutations such as in NPM1 (nucleophosmin 1) are involved in the pathogenesis and progression of AML. The aim of this study was to identify genes whose expression is associated with driver mutations and survival outcome. Genotype data (somatic mutations) and gene expression data including RNA-seq, microarray, and qPCR data were used for the analysis. Multiple datasets were utilized as training sets (GSE6891, TCGA, and GSE1159). A new clinical sample cohort (Semmelweis set) was established for in vitro validation. Wilcoxon analysis was used to identify genes with expression alterations between the mutant and wild type samples. Cox regression analysis was performed to examine the association between gene expression and survival outcome. Data analysis was performed in the R statistical environment. Eighty-five genes were identified with significantly altered expression when comparing NPM1 mutant and wild type patient groups in the GSE6891 set. Additional training sets were used as a filter to condense the six most significant genes associated with NPM1 mutations. Then, the expression changes of these six genes were confirmed in the Semmelweis set: HOXA5 (P = 3.06E-12, FC = 8.3), HOXA10 (P = 2.44E-09, FC = 3.3), HOXB5 (P = 1.86E-13, FC = 37), MEIS1 (P = 9.82E-10, FC = 4.4), PBX3 (P = 1.03E-13, FC = 5.4) and ITM2A (P = 0.004, FC = 0.4). Cox regression analysis showed that higher expression of these genes - with the exception of ITM2A - was associated with worse overall survival. Higher expression of the HOX genes was identified in tumors harboring NPM1 gene mutations by computationally linking genotype and gene expression. In vitro validation of these genes supports their potential therapeutic application in AML.

Keywords:  AML, acute myeloid leukemia; Acute myeloid leukemia; Clinical samples; FAB classification, French–American–British classification; FC, fold change; Gene expression; HOX genes; HOX, homeobox; HR, hazard ratio; ITD, internal tandem duplication; MEIS, myeloid ecotropic viral integration site; Mutation; NCBI GEO, National Center for Biotechnology Gene expression Omnibus; OS, overall survival; PBX, pre-B-cell leukemia homeobox; Survival; TCGA, The Cancer Genome Atlas; WHO, World Health Organization; qPCR, quantitative polymerase chain reaction

Year:  2019        PMID: 31333881      PMCID: PMC6614546          DOI: 10.1016/j.jare.2019.05.006

Source DB:  PubMed          Journal:  J Adv Res        ISSN: 2090-1224            Impact factor:   10.479


Introduction

Acute myeloid leukemia (AML) is characterized by clonal proliferation of myeloid blasts. Based on statistical data, AML represents approximately 1.1% of all new cancer cases in the U.S. and is more common in older adults and males. The death rate is higher among patients over 65 years and unfortunately, the rate has failed to decrease in recent years [1]. Chromosomal structural variations and genetic abnormalities play an essential role in the pathogenesis of AML [2]. According to The Cancer Genome Atlas project, the five most common mutated genes in AML comprise NPM1, IDH1, IDH2, DNMT3A, and FLT3 [3]. Isocitrate dehydrogenase 1/2 (IDH1/2) mutations occur in approximately 15% of AML patients, and the frequency increases with age [4]. Mutations in IDH1/2 are associated with DNA and histone hypermethylation, altered gene expression and blocked differentiation of hematopoietic progenitor cells [5]. The FMS-like tyrosine kinase 3 (FLT3) gene encodes a class III receptor tyrosine kinase that regulates hematopoiesis, including differentiation and proliferation of stem cells [6]. FLT3 mutations are correlated with worse clinical outcome in younger adults [7]. Activating mutations in the tyrosine kinase domain (TKD) of FLT3 exist in 15% of patients with AML. The nucleophosmin gene (NPM1) is one of the most frequently mutated genes in AML [8]. The normal function of NPM1 is to control ribosome formation and export, stabilize the oncosuppressor p14Arf protein in the nucleolus and regulate centrosome duplication [9]. Mutations in NPM1 were found in 20–30% of AML patients. These alterations induce abnormal cytoplasmic localization of the protein which is a critical step in leukemogenesis [8]. NPM1 mutations are restricted to myeloid cells, and aberrant cytoplasmic dislocation was not observed in lymphoid cells, including the reactive lymph nodes or B and T cells from bone marrow biopsies or peripheral blood [10]. 
NPM1 mutations are frequently associated with internal tandem duplication (ITD) of FLT3 and DNMT3A mutations [11], [12]. In addition, besides the FLT3-ITD and DNMT3A mutations, NPM1 mutations also co-occur with IDH1, IDH2, and TET2 mutations [13]. There are mutations that rarely occur with NPM1 mutations, such as partial tandem duplication in the mixed lineage leukemia (MLL) gene and mutations in RUNX1, CEBPA, and TP53 genes [3]. FLT3 tyrosine kinase domain (TKD) mutations are rarely accompanied by NPM1 mutations [14]. A previous study described favorable prognosis of NPM1 mutated AML patients with normal karyotype [15]. Another study demonstrated that karyotype, age, NPM1 mutation status, white blood cell count, lactate dehydrogenase, and CD34 expression were independent prognostic markers for overall survival [16]. A previous study also demonstrated that IDH1 mutations are associated with favorable survival outcome in NPM1 mutant/FLT3-ITD-negative patients [17]. Currently, chemotherapy in younger and fit patients is still the primary treatment for AML patients. Chemotherapy generally includes a combination of an anthracycline, such as daunorubicin [18] or idarubicin [19], and cytarabine [20] agents. Of note, NPM1 mutated AML is highly responsive to induction chemotherapy [21], and up to 80% of patients experience complete remission with clearance of leukemic cells 16 days after starting a treatment [22]. In the last decade, several molecularly targeted agents were proposed for the treatment of AML, including tyrosine kinase inhibitors, such as sorafenib [23], midostaurin [24], quizartinib [25], and crenolanib [26] which inhibit the tyrosine kinase domain of the FLT3 kinase. STAT3 inhibitors, including C188-9 [27] and OPB-31121 [28], specifically inhibit the phosphorylation of STAT3 protein, which is highly upregulated in up to 50% of AML patients and is associated with poor prognosis. 
There are several additional targeted agents, such as IDH1 and IDH2 inhibitors [29], [30], nuclear export inhibitors [31] and CD33 and CD123 antigen specific inhibitors [32]. The aim of this study was to examine the transcriptomic fingerprint of NPM1 gene mutations to shed light on transformed molecular pathways. First, genes showing altered expression in NPM1 mutated patients were identified, and these findings were correlated with survival outcome in multiple genome-wide training sets. The best hits were then validated in an independent set of patients.

Material and methods

The analysis was based on utilizing a training and a validation set (Fig. 1A). Data processing was performed in the R v3.2.3 statistical environment (http://www.r-project.org).
Fig. 1

Training set setup. Summary of the analysis workflow (A). Proportion of driver mutations and clinical characteristics of the training sets GSE6891 (B) and TCGA (C). Distribution of the NPM1 mutation localizations in the TCGA samples (D).

Preprocessing of the training set

The NCBI GEO repository (http://www.ncbi.nlm.nih.gov/geo/) was searched for a suitable AML training dataset with available gene expression and clinical data. The keywords “AML,” “GPL570” and “GPL96” were utilized, and the results were filtered for datasets that included raw gene expression data and clinical information for the same patients. Array quality control was performed for all samples using the “yaqcaffy” (http://bioconductor.org/packages/yaqcaffy/) library. The background, the raw Q, the percentage of present calls, the presence of BioB-/C-/D- spikes, the GAPDH 3’ to 5’ ratio and the beta-actin 3’ to 5’ ratio were assessed, and only those arrays that passed the preset quality criteria were used. The MAS5 algorithm in the “affy” (http://bioconductor.org/packages/affy/) library was used to normalize the data. A second scaling normalization was then applied to set the mean expression on each array to 1000. For genes measured by multiple probe sets, JetSet was employed to choose the most trustworthy probe set [33].
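The second scaling step can be illustrated with a minimal sketch. It is written in Python rather than the R environment used in the study, and the function name `scale_to_target_mean` and the toy signal values are hypothetical; it only shows what setting the mean expression on each array to 1000 amounts to.

```python
def scale_to_target_mean(intensities, target=1000.0):
    """Rescale one array's expression values so that their mean equals `target`."""
    if not intensities:
        raise ValueError("array has no probe-set values")
    current_mean = sum(intensities) / len(intensities)
    factor = target / current_mean
    return [value * factor for value in intensities]


# Hypothetical MAS5 signal values for a single array (not data from the study).
array_values = [125.0, 250.0, 375.0, 1250.0]
scaled = scale_to_target_mean(array_values)
```

Because every value on an array is multiplied by the same factor, the ranking of genes within the array is unchanged; only the overall scale is aligned between arrays.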

RNA-seq and mutation data of AML patients

Two additional datasets were used for training, a gene-chip dataset (processed as described above) and an RNA-seq dataset. For the RNA-seq dataset, the somatic mutation data were obtained from The Cancer Genome Atlas (TCGA, https://cancergenome.nih.gov/). The preprocessed and annotated MAF (Mutation Annotation Format) data files generated by the MuTect2, MUSE, VarScan and SomaticSniper pipelines were used. The “maftools” package (http://bioconductor.org/packages/maftools/) was applied for aggregation and visualization of mutation data. The HTSeq count RNA-seq data generated on the Illumina HiSeq 2000 RNA Sequencing version 2 platform were used for gene expression estimation. The “AnnotationDbi” package (http://bioconductor.org/packages/AnnotationDbi/) was applied to annotate Ensembl transcript IDs with gene symbols (n = 25,228). The “DESeq” package, which is based on the negative binomial distribution, was used to normalize the raw read counts [34].
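The read-count normalization can be sketched as follows. This is a minimal Python illustration of the median-of-ratios size-factor idea that DESeq-style normalization relies on, not the package itself and not the authors' code; the function names and the toy counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping gene -> list of raw counts, one entry per sample.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for gene_counts in counts.values():
        if any(c <= 0 for c in gene_counts):
            continue  # a zero count makes the geometric mean zero; skip the gene
        log_geo_mean = sum(math.log(c) for c in gene_counts) / n_samples
        for j, c in enumerate(gene_counts):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # The median is taken on the log scale and then transformed back.
    return [math.exp(median(sample)) for sample in log_ratios]


def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    factors = size_factors(counts)
    return {g: [c / f for c, f in zip(v, factors)] for g, v in counts.items()}


# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
raw = {"g1": [10, 20], "g2": [100, 200], "g3": [5, 10], "g4": [0, 3]}
factors = size_factors(raw)
normalized = normalize(raw)
```

In this toy case the second size factor is twice the first, so the depth difference is removed and genes g1 to g3 end up with equal normalized values in both samples.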

Semmelweis set

Clinical samples diagnosed at the 1st Department of Pathology and Experimental Cancer Research, Semmelweis University, Budapest, Hungary were utilized in the in vitro validation. All materials and protocols were approved by the Institutional Scientific and Research Ethics Committee of Semmelweis University (TUKEB – 14383-2/2017/EKU). Mutation status was determined by Sanger sequencing, and quantitative PCR measurement was utilized to examine the gene expression changes. DNA was isolated from peripheral blood and bone marrow samples using the High Pure PCR Template Preparation Kit (Roche, Basel, Switzerland) following the manufacturer’s protocol. DNA concentration was measured by UV spectrophotometry (NanoDrop; Thermo Fisher Scientific, Waltham, Massachusetts, USA).

RNA isolation

The peripheral blood and bone marrow samples were homogenized for 2 h using hemolysis solution containing 0.15 M NH4Cl, 10 M NH4HCO3, and 0.1 M EDTA with a pH of 7.4 (Sigma-Aldrich, St. Louis, MO, USA). After hemolysis, samples were centrifuged at 1800 RPM for 10 min and washed with 1x phosphate-buffered saline (PBS; Lonza, Basel, Switzerland). Total RNA was isolated from cells using TRIzol Reagent (Invitrogen, Waltham, Massachusetts, USA) following the manufacturer’s protocol. RNA concentration was measured by UV spectrophotometry (NanoDrop; Thermo Fisher Scientific, Waltham, Massachusetts, USA).

Sanger sequencing

The amplification of NPM1 was performed using AmpliTaqGold (Thermo Fisher Scientific, Waltham, Massachusetts, USA) polymerase mix in a PE 2720 GeneAmp (Perkin-Elmer, Waltham, Massachusetts, USA) PCR machine. Forward (5′-TTC CAT ACA TAC TTA AAA CCA A-3′) and reverse (5′-TGG TTC CTT AAC CAC ATT TCT TT-3′) primers were employed in a 25 µL final volume. The reaction mix contained 2x AmpliTaqGold mix, 400 nM of each primer and 100 ng of DNA. Amplification started with denaturation for 10 min at 95 °C, followed by 40 cycles of 95 °C for 30 sec, 56 °C for 60 sec and 72 °C for 60 sec. The PCR products were cleaned using ExoSAP-IT PCR Product Cleanup (Affymetrix, Santa Clara, California, USA), and trailed using the Big Dye Terminator kit v3.1 (Thermo Fisher Scientific, Waltham, Massachusetts, USA) direct sequencing reaction following the manufacturer’s protocol. For sequencing analysis an ABI 3500 Genetic Analyzer (Thermo Fisher Scientific, Waltham, Massachusetts, USA) machine was used, and the results were visualized using SeqA6 (Thermo Fisher Scientific, Waltham, Massachusetts, USA) software.

Quantitative PCR measurement

For qPCR analysis, 1 µg of total RNA from each sample was reverse transcribed in a final volume of 25 µL using the High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Quantitative PCR was performed using the CFX96 Real-Time PCR Machine (Bio-Rad Laboratories, Hercules, California, USA) and SensiFAST SYBR No-ROX Kit (Bioline Reagents, London, UK). Primers were designed to span exon-exon junctions and to cover all transcript variants of each selected gene. GAPDH and TBP genes were used as reference genes (Table 1).
Table 1

Quantitative PCR primers for selected and reference genes.

Mutation | Gene | NCBI nucleotide sequence | Primer sequence | Length (bp) | Temp (°C)
IDH1 | RASGRP3 | NM_015376.2 | F: 5′-CAAGCCAACCTTCTGCGAAC-3′; R: 5′-TGGCTCCACAGTCTTTGCAT-3′ | 83 | 60
IDH2 | NPDC1 | NM_015392.3 | F: 5′-GACTACGCCACTGCGAAGG-3′; R: 5′-CTTTATGCCGCTCCAGGCAC-3′ | 139 | 60
NPM1 | HOXA5 | NM_019102.3 | F: 5′-AGCTGCACATAAGTCATGACAACA-3′; R: 5′-TCAATCCTCCTTCTGCGGGT-3′ | 136 | 60
NPM1 | HOXB5 | NM_002147.3 | F: 5′-AACTCCTTCTCGGGGCGTTAT-3′; R: 5′-CATCCCATTGTAATTGTAGCCGT-3′ | 138 | 60
NPM1 | HOXA10 | NM_018951.3 | F: 5′-GAGAGCAGCAAAGCCTCGC-3′; R: 5′-CCAGTGTCTGGTGCTTCGTG-3′ | 127 | 60
NPM1 | ITM2A | NM_001171581.1 | F: 5′-TGTTGCTGGGGAACTGCTAT-3′; R: 5′-GATATCTGCCACTCGCCAGTTT-3′ | 102 | 60
NPM1 | MEIS1 | NM_002398.2 | F: 5′-CACGGGACTCACCATCCTTC-3′; R: 5′-TGACTTACTGCTCGGTTGGAC-3′ | 99 | 60
NPM1 | PBX3 | NM_006195.5 | F: 5′-CACACCTCAGCAACCCCTAC-3′; R: 5′-ACCAATTGGATACCTGTGACACT-3′ | 90 | 60
 | GAPDH | NM_002046.6 | F: 5′-AAATCAAGTGGGGCGATGCT-3′; R: 5′-CAAATGAGCCCCAGCCTTCT-3′ | 86 | 60
 | TBP | NM_003194.4 | F: 5′-GCACAGGAGCCAAGAGTGAA-3′; R: 5′-TCACAGCTCCCCACCATGT-3′ | 127 | 60

Annealing temperature (Temp) calculation was executed using NCBI Primer Blast (www.ncbi.nlm.nih.gov/tools/primer-blast/).

The reactions were performed in a 20 µL final volume, containing 1 µL of cDNA, diluted 2-fold, and 125 nM of each primer. After a preliminary denaturation step of 2 min at 95 °C, 40 cycles with three steps were performed: 95 °C for 15 sec, 60 °C for 15 sec and 72 °C for 30 sec. Each sample was measured in triplicate, and the threshold cycle (Ct) was determined for each gene. The ΔCt method was employed to evaluate gene expression changes, and the 2^(−ΔCt) values were used in the analysis. WinSTAT (http://www.winstat.com) was used to analyze the data.
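The ΔCt calculation can be sketched as below. This is a minimal Python illustration under two stated assumptions that the text does not confirm: the triplicate Ct values are averaged per gene, and the two reference genes (GAPDH and TBP) are combined by averaging their mean Ct values. The function name and the Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(target_cts, reference_cts):
    """Return 2^(-dCt), where dCt = mean target Ct - mean reference Ct.

    target_cts: replicate Ct values of the gene of interest.
    reference_cts: dict mapping reference gene -> replicate Ct values.
    Assumption: reference genes are combined by averaging their mean Cts.
    """
    target_mean = mean(target_cts)
    reference_mean = mean(mean(reps) for reps in reference_cts.values())
    delta_ct = target_mean - reference_mean
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate Ct values for one sample (not data from the study).
hoxa5_cts = [24.0, 24.1, 23.9]
references = {"GAPDH": [18.0, 18.1, 17.9], "TBP": [26.0, 25.9, 26.1]}
hoxa5_expression = relative_expression(hoxa5_cts, references)
```

With these toy values the target amplifies two cycles later than the averaged references, so its relative expression is about one quarter of the reference level.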

Statistical computations

First, patients were divided into a mutated and a wild-type cohort based on the somatic mutation status of NPM1. Normality of the data was checked using the Shapiro-Wilk W test. Then, Wilcoxon analysis was used to identify differentially expressed genes between the mutant and wild type cohorts. In addition, the median fold change (FC) was computed for each gene to determine the direction of the expression change. Genes were considered significant when the FC was less than 0.5 or higher than 2 and P < 0.05. Correlation between gene expression and overall survival (OS) was assessed using Cox proportional hazards regression and Kaplan-Meier survival plots. To calculate the prognostic effect of a gene, each percentile of gene expression between the lower and upper quartiles was evaluated as a candidate cutoff, and the best performing threshold was used as the final cutoff in the Cox regression analysis [35]. The “survival” R package (http://CRAN.R-project.org/package=survival) was applied for Cox regression analysis and the “survplot” R package (http://www.cbs.dtu.dk/~eklund/survplot/) to generate Kaplan-Meier plots. Finally, the q-value (the minimum false discovery rate at which the test may be called significant) was computed to correct for multiple hypothesis testing.
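The per-gene selection rule can be sketched as follows. This is a minimal Python illustration, not the R code used in the study: the Wilcoxon rank-sum test is implemented with a tie-corrected normal approximation and no continuity correction, so its P values can differ slightly from an exact test. The function names and expression values are hypothetical.

```python
import math
from statistics import median


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney) test, normal approximation."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the tied ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a shift
    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2.0))


def is_differential(mutant, wild, fc_low=0.5, fc_high=2.0, alpha=0.05):
    """Apply the FC < 0.5 or FC > 2 and P < 0.05 rule to one gene."""
    fc = median(mutant) / median(wild)
    _, p = rank_sum_test(mutant, wild)
    return fc, p, (fc < fc_low or fc > fc_high) and p < alpha


# Hypothetical expression values for one gene (not data from the study).
mutant = [900, 1100, 1000, 1200, 950, 1050, 1300, 1150]
wild = [400, 500, 450, 550, 480, 520, 430, 470]
u_stat, _ = rank_sum_test(mutant, wild)
fc, p_value, selected = is_differential(mutant, wild)
```

The fold change here is the ratio of the group medians, which matches how the FC column in the result tables relates to the mutant and wild type medians.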

Results

Analysis of the first training cohort

The training cohort was based on 536 patients from the GSE6891 dataset [36]. The gene expression profiles of these samples were determined using Affymetrix Human Genome U133 Plus 2.0 Arrays (GPL570), and we obtained both mutation and gene expression data for 460 of the 536 patients. The median follow-up for overall survival (OS) was 18.7 months. Fig. 1B and Table 2 show the clinico-pathological parameters, including age, gender, and FAB subtype. NPM1 was the most frequently mutated gene as 30% of patients harbored a mutation. When correlating survival length in the training cohort and NPM1 mutation status, no significant correlation was observed (P = 0.3).
Table 2

Clinical characteristics of datasets.

Characteristic | GSE6891 | TCGA | GSE1159 | Semmelweis set
Total number of samples | 536 | 200 | 293 | 169
Samples with mutation & expression data | 460 | 116 | 247 | 169
Age range (median) | 15–60 (43) | 18–89 (58) | 15–60 (42) | 0–85 (59)
Sex (F/M) | 230/230 | 91/109 | 128/119 | 84/85
Median survival time (months) | 18.7 | 12 | 17 | 6.92
Karyotype (good/intermediate/poor/unknown) | 97/261/92/86 |  | 60/136/48/49 | 12/97/25/35
FAB subtype (M0/M1/M2/M3/M4/M5/M6) | 16/95/105/24/84/104/6 |  | 6/55/54/17/43/62/3 | 

F: female, M: male, PB: peripheral blood, BM: bone marrow.

Wilcoxon analysis across all genes (12,205) identified 85 genes showing significantly altered expression in NPM1 mutant patients compared to the NPM1 wild type cohort. Of these, 57 genes were upregulated and 28 genes were downregulated. The full list of significantly altered genes is displayed in Table 3. Cox regression analysis performed for the significant genes identified a correlation with overall survival for 47 genes at an FDR below 10% (Table 4).
Table 3

List of genes showing significantly altered expression when comparing NPM1 mutant and wild type cohorts in the training set.

Gene | Mutant median | Wild median | FC | P-value
HOXB3 | 598.5 | 189 | 3.17 | 5.12E−45
HOXA5 | 2799 | 100 | 27.99 | 1.87E−44
HOXB2 | 2282 | 220.5 | 10.35 | 2.85E−43
HOXB6 | 1017 | 83.5 | 12.18 | 4.55E−43
HOXA10 | 2952 | 683.5 | 4.32 | 2.22E−39
PBX3 | 3544.5 | 654 | 5.42 | 5.45E−39
MEIS1 | 2264.5 | 431 | 5.25 | 1.12E−38
HOXB5 | 840.5 | 321.5 | 2.61 | 1.35E−38
PDGFD | 665.5 | 227.5 | 2.93 | 2.30E−33
SMC4 | 4415 | 2043.5 | 2.16 | 2.75E−32
COL4A5 | 1342.5 | 100.5 | 13.36 | 1.00E−31
DMXL2 | 4371.5 | 1398 | 3.13 | 3.00E−31
PLA2G4A | 593.5 | 262.5 | 2.26 | 6.11E−29
CD34 | 257.5 | 1854 | 0.14 | 7.04E−29
APP | 49 | 839 | 0.06 | 3.44E−28
BAALC | 78.5 | 611 | 0.13 | 3.49E−28
ITM2C | 834.5 | 2579 | 0.32 | 2.45E−27
CD200 | 77.5 | 664.5 | 0.12 | 3.38E−27
H2AFY2 | 588.5 | 235.5 | 2.5 | 1.41E−25
CCND2 | 2266.5 | 4802.5 | 0.47 | 2.54E−24
GYPC | 803.5 | 2440.5 | 0.33 | 5.68E−23
RASGRP3 | 1022.5 | 278.5 | 3.67 | 2.54E−22
JUP | 702 | 1944 | 0.36 | 6.90E−22
PRKAR2B | 2554 | 871.5 | 2.93 | 5.88E−21
TSPAN13 | 343.5 | 1157.5 | 0.3 | 1.59E−20
MAN1A1 | 1746.5 | 3552.5 | 0.49 | 2.11E−20
ITM2A | 977.5 | 2989 | 0.33 | 3.81E−20
H1F0 | 562.5 | 2117 | 0.27 | 1.45E−18
C3AR1 | 1880 | 831.5 | 2.26 | 2.43E−18
BAHCC1 | 1864 | 770 | 2.42 | 2.77E−18
LPAR6 | 318 | 964 | 0.33 | 3.72E−18
IFITM1 | 1370 | 2974.5 | 0.46 | 4.47E−18
SEL1L3 | 1668.5 | 766.5 | 2.18 | 2.28E−17
LGALS3BP | 2999.5 | 794 | 3.78 | 3.47E−17
MEST | 986 | 3028 | 0.33 | 3.88E−17
HIST2H2BE | 3068 | 1500 | 2.05 | 5.65E−16
CPVL | 1442.5 | 553.5 | 2.61 | 1.03E−15
SLC38A1 | 818.5 | 1878.5 | 0.44 | 2.49E−15
EGFL7 | 276.5 | 728 | 0.38 | 3.33E−15
PRKD3 | 331 | 805 | 0.41 | 6.67E−15
VNN1 | 1144 | 261 | 4.38 | 9.17E−15
TLR4 | 1193 | 524 | 2.28 | 3.39E−14
CTSG | 3670 | 948.5 | 3.87 | 1.66E−13
JAG1 | 1095.5 | 480.5 | 2.28 | 2.63E−13
TNFAIP2 | 2286.5 | 1114 | 2.05 | 5.73E−13
CD36 | 2778 | 1155 | 2.41 | 2.74E−12
CCNA1 | 1382.5 | 476.5 | 2.9 | 7.85E−12
TARP | 4965.5 | 2317.5 | 2.14 | 1.03E−11
PPBP | 1487.5 | 332 | 4.48 | 1.08E−11
EREG | 1391.5 | 255 | 5.46 | 1.39E−11
EMP1 | 433 | 1063 | 0.41 | 2.96E−11
SPINK2 | 2270 | 589.5 | 3.85 | 3.75E−11
CX3CR1 | 2901.5 | 893 | 3.25 | 5.75E−11
MARCKS | 1786.5 | 635.5 | 2.81 | 9.32E−11
TREM1 | 1000.5 | 447 | 2.24 | 1.19E−10
BCL2A1 | 993 | 446 | 2.23 | 1.35E−09
WASF1 | 452 | 911.5 | 0.5 | 2.60E−09
PTX3 | 766 | 368.5 | 2.08 | 2.63E−09
MAFB | 1597.5 | 385.5 | 4.14 | 6.14E−09
PF4 | 514.5 | 197 | 2.61 | 1.17E−08
PROM1 | 320 | 1699.5 | 0.19 | 1.96E−08
LILRB2 | 976 | 382.5 | 2.55 | 2.19E−08
CYTL1 | 342.5 | 751.5 | 0.46 | 3.27E−08
NPR3 | 479.5 | 1440 | 0.33 | 3.50E−08
SERPINA1 | 4521 | 1940.5 | 2.33 | 8.33E−08
HK3 | 1125 | 432.5 | 2.6 | 3.45E−07
TMEM176B | 744 | 263 | 2.83 | 4.79E−07
SLC4A1 | 470 | 1161.5 | 0.4 | 6.02E−07
HBB | 6031 | 19,089 | 0.32 | 1.43E−06
VCAN | 2036 | 491.5 | 4.14 | 1.81E−06
TMEM176A | 619.5 | 302.5 | 2.05 | 3.33E−06
BASP1 | 2885 | 1120 | 2.58 | 3.68E−06
MPO | 6784.5 | 15,838 | 0.43 | 4.05E−06
CPA3 | 3423.5 | 1255.5 | 2.73 | 1.83E−05
MYCN | 839 | 390.5 | 2.15 | 2.42E−05
MYOF | 736.5 | 303.5 | 2.43 | 3.17E−05
IFI30 | 4928 | 1872.5 | 2.63 | 3.24E−05
CA1 | 764.5 | 1800 | 0.42 | 2.42E−04
FCN1 | 2595.5 | 869 | 2.99 | 4.39E−04
FGL2 | 2020 | 893 | 2.26 | 7.20E−04
FPR1 | 1097 | 478.5 | 2.29 | 9.26E−04
C5AR1 | 1231.5 | 609 | 2.02 | 1.48E−03
ELANE | 2086.5 | 4984 | 0.42 | 2.26E−03
CD14 | 1211 | 359 | 3.37 | 5.38E−03
S100A12 | 765 | 358 | 2.14 | 2.23E−02
Table 4

NPM1 mutation-associated genes whose expression was correlated with OS in the training set.

Gene | HR | P-value | q-value
MPO | 2.17 | 2.85E−07 | 2.42E−05
HOXA5 | 0.55 | 1.15E−05 | 4.41E−04
HOXA10 | 0.54 | 1.56E−05 | 4.41E−04
CD34 | 0.55 | 2.78E−05 | 5.71E−04
TARP | 0.61 | 3.36E−05 | 5.71E−04
SPINK2 | 0.63 | 6.59E−05 | 9.34E−04
MYOF | 0.62 | 2.27E−04 | 2.76E−03
MEIS1 | 0.59 | 3.12E−04 | 3.31E−03
SEL1L3 | 0.61 | 3.63E−04 | 3.43E−03
PRKAR2B | 0.66 | 5.22E−04 | 4.44E−03
H2AFY2 | 0.67 | 8.56E−04 | 6.62E−03
PRKD3 | 0.66 | 1.10E−03 | 7.81E−03
PPBP | 0.68 | 1.35E−03 | 8.85E−03
MEST | 1.53 | 2.10E−03 | 1.25E−02
PF4 | 0.68 | 2.21E−03 | 1.25E−02
SMC4 | 0.7 | 2.75E−03 | 1.25E−02
PLA2G4A | 0.7 | 2.81E−03 | 1.25E−02
ELANE | 1.54 | 2.91E−03 | 1.25E−02
BASP1 | 0.66 | 2.94E−03 | 1.25E−02
MARCKS | 0.69 | 3.31E−03 | 1.25E−02
LILRB2 | 0.66 | 3.34E−03 | 1.25E−02
H1F0 | 0.68 | 3.36E−03 | 1.25E−02
JUP | 1.5 | 3.38E−03 | 1.25E−02
TSPAN13 | 0.69 | 3.83E−03 | 1.36E−02
FCN1 | 0.71 | 4.58E−03 | 1.50E−02
ITM2A | 1.46 | 4.65E−03 | 1.50E−02
PBX3 | 0.69 | 4.76E−03 | 1.50E−02
BAALC | 0.69 | 7.04E−03 | 2.14E−02
IFI30 | 0.68 | 7.80E−03 | 2.24E−02
CPVL | 0.71 | 8.09E−03 | 2.24E−02
VNN1 | 0.69 | 8.18E−03 | 2.24E−02
CD14 | 0.71 | 8.83E−03 | 2.34E−02
HOXB5 | 0.73 | 9.86E−03 | 2.54E−02
LGALS3BP | 0.72 | 1.13E−02 | 2.81E−02
TNFAIP2 | 0.72 | 1.21E−02 | 2.88E−02
SLC38A1 | 0.74 | 1.22E−02 | 2.88E−02
CD200 | 0.73 | 1.38E−02 | 3.16E−02
GYPC | 1.34 | 1.41E−02 | 3.16E−02
MYCN | 0.73 | 1.48E−02 | 3.23E−02
COL4A5 | 0.75 | 1.54E−02 | 3.27E−02
HOXB6 | 0.76 | 1.75E−02 | 3.59E−02
FPR1 | 0.72 | 1.77E−02 | 3.59E−02
RASGRP3 | 0.76 | 1.90E−02 | 3.75E−02
EREG | 0.76 | 2.12E−02 | 4.10E−02
MAFB | 0.73 | 2.22E−02 | 4.19E−02
EMP1 | 0.73 | 2.61E−02 | 4.83E−02
HOXB3 | 0.77 | 2.71E−02 | 4.90E−02
CTSG | 0.76 | 3.22E−02 | 5.71E−02
CYTL1 | 1.35 | 3.33E−02 | 5.78E−02
HOXB2 | 0.77 | 4.19E−02 | 7.02E−02
EGFL7 | 0.76 | 4.21E−02 | 7.02E−02
IFITM1 | 0.77 | 4.36E−02 | 7.09E−02
MAN1A1 | 1.28 | 4.42E−02 | 7.09E−02
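
The relationship between the P-value and q-value columns of Table 4 can be explored with a short sketch. The q-values listed there appear consistent with a Benjamini-Hochberg adjustment over the 85 tested genes; this is an inference from the reported numbers, not a procedure stated in the text, and the function below is a hypothetical Python helper rather than the authors' code.

```python
def bh_adjust(sorted_pvalues, n_tests=None):
    """Benjamini-Hochberg adjusted P values for P values in ascending order.

    n_tests is the total number of hypotheses tested; it defaults to the
    length of the list. Passing it separately allows a truncated list.
    """
    m = n_tests if n_tests is not None else len(sorted_pvalues)
    adjusted = [min(1.0, p * m / rank)
                for rank, p in enumerate(sorted_pvalues, start=1)]
    # Enforce monotonicity, working down from the largest P value.
    for i in range(len(adjusted) - 2, -1, -1):
        adjusted[i] = min(adjusted[i], adjusted[i + 1])
    return adjusted


# The six smallest P values from Table 4 (MPO, HOXA5, HOXA10, CD34, TARP, SPINK2).
top_p = [2.85e-07, 1.15e-05, 1.56e-05, 2.78e-05, 3.36e-05, 6.59e-05]
q_values = bh_adjust(top_p, n_tests=85)
```

Only the first six rows are used, so the monotonicity step sees a truncated list. Within that limitation the adjusted values agree with the tabulated q-values to within rounding, including the shared values for HOXA5/HOXA10 and for CD34/TARP.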

Selecting genes for qPCR analysis

Two additional datasets, the TCGA and the GSE1159, were used to filter the results to obtain the most reliable genes. The TCGA repository has 200 AML patients of which 152 patients had RNA-seq gene expression data and 149 patients had somatic mutation data (Table 2). Overall survival data were available for 175 patients, and the median follow-up time was 12 months. There were 116 patients who had both gene expression and mutation data. Survival analysis was not performed for this dataset because less than half of the patients had simultaneous survival, mutation and gene expression data. The clinical characteristics of the TCGA dataset are found in Fig. 1C and Table 2. The GSE1159 dataset [37] includes 293 patients measured using Affymetrix Human Genome U133A Arrays (GPL96). Follow-up with overall survival data was available for 260 patients. There were 247 patients with simultaneous gene expression and mutation data (Table 2). In the TCGA dataset, NPM1 mutations were found in 17% of patients, of which 75% of the mutations were frame shift insertions, 20% were missense and 5% were in frame deletions (Fig. 1D). Most of the frame shift insertions were localized at the nucleolar localization signal region in the C-terminal DNA/RNA binding domain of the NPM1 gene (Fig. 1D). In the TCGA and GSE1159 datasets, 49 of the previously identified 85 genes reached statistical significance. The results of the Wilcoxon test are listed in Table 5, and the results of the survival analysis in Table 6.
Table 5

List of genes whose expression was significantly altered between NPM1 mutant and wild type cohorts in the TCGA (A) and GSE1159 (B) datasets.

(A) TCGA
Gene | Mutant median | Wild median | FC | P-value
BAALC | 41.5 | 1010 | 0.04 | 4.75E−06
HOXA5 | 1651.5 | 175.5 | 9.41 | 1.15E−05
CD34 | 89 | 9587 | 0.01 | 1.18E−05
GYPC | 752.5 | 2596.5 | 0.29 | 1.26E−05
HOXB3 | 6453 | 729 | 8.85 | 1.54E−05
HOXB5 | 426.5 | 5 | 85.3 | 2.75E−05
HOXB6 | 714 | 7.5 | 95.2 | 3.61E−05
RASGRP3 | 2853 | 693.5 | 4.11 | 5.14E−05
MAN1A1 | 1577.5 | 4319.5 | 0.37 | 5.91E−05
PBX3 | 3952 | 895.5 | 4.41 | 6.29E−05
HOXB2 | 750.5 | 199 | 3.77 | 6.48E−05
CD200 | 37 | 869 | 0.04 | 7.68E−05
PDGFD | 377 | 85.5 | 4.41 | 1.10E−04
COL4A5 | 1769 | 54 | 32.76 | 1.26E−04
PROM1 | 118 | 3421 | 0.03 | 1.26E−04
HOXA10 | 1164.5 | 318.5 | 3.66 | 1.46E−04
DMXL2 | 9338 | 4220 | 2.21 | 1.51E−04
MEIS1 | 4178 | 1235 | 3.38 | 1.96E−04
SMC4 | 5938.5 | 3471 | 1.71 | 2.14E−04
NPR3 | 561.5 | 3175 | 0.18 | 3.67E−04
ITM2C | 2335 | 3929 | 0.59 | 4.83E−04
MEST | 678 | 1710 | 0.4 | 1.27E−03
BAHCC1 | 14,302 | 5990 | 2.39 | 1.49E−03
TSPAN13 | 133.5 | 405 | 0.33 | 2.20E−03
TMEM176B | 28.5 | 105.5 | 0.27 | 2.90E−03
TMEM176A | 17 | 65.5 | 0.26 | 3.04E−03
JUP | 2023 | 4307 | 0.47 | 3.22E−03
APP | 230 | 4225.5 | 0.05 | 4.07E−03
PTX3 | 177 | 99 | 1.79 | 5.66E−03
PLA2G4A | 845.5 | 542.5 | 1.56 | 7.47E−03
CTSG | 3846.5 | 891 | 4.32 | 7.55E−03
IFITM1 | 208.5 | 405 | 0.51 | 8.51E−03
LPAR6 | 330 | 649 | 0.51 | 8.60E−03
CCND2 | 4057.5 | 6980 | 0.58 | 8.98E−03
SEL1L3 | 2942.5 | 1823.5 | 1.61 | 1.41E−02
ITM2A | 730.5 | 2173 | 0.34 | 1.45E−02
SLC38A1 | 2730.5 | 5749.5 | 0.47 | 1.70E−02
EMP1 | 478 | 698 | 0.68 | 1.93E−02
EGFL7 | 799 | 1628 | 0.49 | 2.28E−02
JAG1 | 1032 | 701.5 | 1.47 | 2.56E−02
CCNA1 | 866.5 | 392.5 | 2.21 | 2.60E−02
ELANE | 2815 | 1644 | 1.71 | 3.66E−02
TREM1 | 1238 | 565.5 | 2.19 | 4.07E−02
TNFAIP2 | 5289 | 3448 | 1.53 | 4.25E−02
SLC4A1 | 255 | 1005.5 | 0.25 | 4.29E−02
PRKD3 | 898 | 1510.5 | 0.59 | 4.33E−02
LGALS3BP | 5023.5 | 1190 | 4.22 | 4.60E−02
TARP | 1053.5 | 503 | 2.09 | 4.72E−02
HBB | 3253 | 11122.5 | 0.29 | 4.72E−02

(B) GSE1159
Gene | Mutant median | Wild median | FC | P-value
BAALC | 105 | 527 | 0.2 | 1.20E−14
HOXA5 | 3320 | 167.5 | 19.82 | 3.79E−26
CD34 | 310 | 1862 | 0.17 | 1.02E−13
GYPC | 814 | 2218.5 | 0.37 | 7.62E−12
HOXB3 | 395 | 93 | 4.25 | 1.19E−25
HOXB5 | 687 | 245.5 | 2.8 | 1.53E−23
HOXB6 | 952 | 14.5 | 65.66 | 1.06E−22
RASGRP3 | 743 | 197 | 3.77 | 1.93E−11
MAN1A1 | 1025 | 2469.5 | 0.42 | 7.51E−11
PBX3 | 3406 | 647 | 5.26 | 1.58E−22
HOXB2 | 2268 | 245 | 9.26 | 2.71E−23
CD200 | 69 | 538 | 0.13 | 1.63E−16
PDGFD | 573 | 205.5 | 2.79 | 2.23E−18
COL4A5 | 1161 | 99 | 11.73 | 1.03E−17
PROM1 | 288 | 1468 | 0.2 | 2.68E−05
HOXA10 | 1842 | 304.5 | 6.05 | 8.81E−23
DMXL2 | 3644 | 1164.5 | 3.13 | 1.03E−17
MEIS1 | 1761 | 352.5 | 5 | 4.04E−21
SMC4 | 3502 | 1565.5 | 2.24 | 5.65E−20
NPR3 | 440 | 1493 | 0.29 | 7.34E−06
ITM2C | 712 | 2538 | 0.28 | 3.27E−17
MEST | 948 | 2877 | 0.33 | 1.20E−12
BAHCC1 | 2543 | 1273 | 2 | 1.15E−08
TSPAN13 | 252 | 732 | 0.34 | 4.82E−11
TMEM176B | 651 | 170 | 3.83 | 2.21E−03
TMEM176A | 831 | 435.5 | 1.91 | 8.10E−04
JUP | 510 | 1762.5 | 0.29 | 1.40E−14
APP | 43 | 335.5 | 0.13 | 1.90E−14
PTX3 | 722 | 286 | 2.52 | 3.17E−07
PLA2G4A | 400 | 187.5 | 2.13 | 3.73E−15
CTSG | 3909 | 837 | 4.67 | 1.05E−08
IFITM1 | 1295 | 2301 | 0.56 | 1.06E−08
LPAR6 | 220 | 805 | 0.27 | 1.24E−11
CCND2 | 2137 | 5490.5 | 0.39 | 3.31E−16
SEL1L3 | 1650 | 791 | 2.09 | 2.16E−09
ITM2A | 647 | 1967 | 0.33 | 1.75E−11
SLC38A1 | 831 | 1893.5 | 0.44 | 4.41E−10
EMP1 | 281 | 906.5 | 0.31 | 3.13E−09
EGFL7 | 376 | 965 | 0.39 | 1.51E−09
JAG1 | 888 | 403 | 2.2 | 8.53E−08
CCNA1 | 1514 | 583.5 | 2.59 | 2.04E−05
ELANE | 2466 | 5811 | 0.42 | 1.27E−02
TREM1 | 1158 | 597.5 | 1.94 | 2.99E−07
TNFAIP2 | 2196 | 1215.5 | 1.81 | 2.72E−08
SLC4A1 | 284 | 794 | 0.36 | 4.45E−04
PRKD3 | 269 | 558.5 | 0.48 | 6.82E−08
LGALS3BP | 2623 | 996.5 | 2.63 | 2.70E−09
TARP | 5095 | 2815.5 | 1.81 | 3.96E−05
HBB | 4514 | 21338.5 | 0.21 | 5.18E−05
Table 6

NPM1 mutation-associated genes whose expression was correlated with OS in the GSE1159 dataset.

Gene | HR | P-value | q-value
HOXA10 | 0.48 | 1.63E−05 | 7.99E−04
TARP | 0.53 | 1.31E−04 | 2.22E−03
HOXA5 | 0.51 | 1.69E−04 | 2.22E−03
SEL1L3 | 0.53 | 1.81E−04 | 2.22E−03
MEIS1 | 0.49 | 2.85E−04 | 2.76E−03
ITM2A | 1.96 | 3.38E−04 | 2.76E−03
PLA2G4A | 0.59 | 1.19E−03 | 8.33E−03
ELANE | 1.8 | 1.39E−03 | 8.51E−03
MEST | 1.77 | 2.06E−03 | 1.12E−02
CD34 | 0.58 | 2.48E−03 | 1.22E−02
JUP | 1.76 | 3.73E−03 | 1.58E−02
GYPC | 1.57 | 3.86E−03 | 1.58E−02
LGALS3BP | 0.62 | 4.49E−03 | 1.67E−02
SMC4 | 0.62 | 4.76E−03 | 1.67E−02
MAN1A1 | 1.54 | 5.38E−03 | 1.76E−02
PBX3 | 0.65 | 6.02E−03 | 1.84E−02
HOXB5 | 0.63 | 7.11E−03 | 2.05E−02
CTSG | 0.63 | 8.96E−03 | 2.44E−02
TSPAN13 | 0.65 | 1.03E−02 | 2.66E−02
SLC38A1 | 0.67 | 1.26E−02 | 3.09E−02
IFITM1 | 1.49 | 1.49E−02 | 3.48E−02
HOXB2 | 0.68 | 1.98E−02 | 4.41E−02
RASGRP3 | 0.71 | 2.79E−02 | 5.94E−02
CCND2 | 1.38 | 4.21E−02 | 8.24E−02
LPAR6 | 1.41 | 4.42E−02 | 8.24E−02
HOXB3 | 0.72 | 4.45E−02 | 8.24E−02
EGFL7 | 1.46 | 4.54E−02 | 8.24E−02
For qPCR measurement, only those genes were selected that showed a significant gene expression change and a fold change over 2.0 or below 0.5 in each training set (n = 32). Correlation to survival was used as an additional filter (n = 19), and the pipeline of gene selection for qPCR measurement is depicted in Fig. 2A.
Fig. 2

A–G. Best genes in the training set. Workflow of selecting differentially expressed genes (A). The best performing genes linked to NPM1 mutations in the training set (B–G). Hazard ratios with 95% confidence intervals are shown.

The best performing genes discriminating NPM1 mutant and wild-type samples were HOXA5, HOXB5, HOXA10, PBX3, MEIS1, and ITM2A. Of these, ITM2A was the only downregulated gene (Fig. 2G). Kaplan-Meier curves show that high expression of these genes was correlated with poor survival (Fig. 2B–F). In the case of ITM2A, lower expression was associated with worse outcome (Fig. 2G). For these genes, the correlation between mutation status and expression and the correlation between expression and survival in the TCGA and GSE1159 datasets are provided in Figs. 3 and 4, respectively.
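The Kaplan-Meier curves referred to above rest on a simple product-limit estimate, sketched here in Python. This is a generic illustration with hypothetical follow-up data, not the “survplot” code used in the study; in practice one curve would be computed for the high-expression group and one for the low-expression group defined by the chosen cutoff.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient; events: 1 for death, 0 for censoring.
    Returns a list of (time, survival probability) at each time with a death.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months for six patients (not data from the study).
months = [2, 4, 4, 6, 9, 12]
died = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(months, died)
```

Censored patients contribute to the number at risk until their last follow-up but do not lower the curve themselves, which is why the step at 4 months is computed over five patients at risk.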
Fig. 3

Validation of NPM1-associated differentially expressed genes in the GSE1159 (A) and TCGA datasets (B).

Fig. 4

The expression of HOXA5 (A), HOXB5 (B), HOXA10 (C), PBX3 (D), MEIS1 (E) and ITM2A (F) genes was significantly correlated with OS in the GSE1159 dataset. HRs with 95% confidence intervals are shown.

Correlation between NPM1 mutation and mutations in other genes

The prevalence of NPM1 mutation was compared to IDH1, IDH2, and FLT3 mutation status in the training and validation sets by Chi-square analysis. In the training set, the correlation to IDH1 and FLT3 was significant (chi-stat = 44.7, P < 0.00001 and chi-stat = 9.2, P = 0.0024, respectively) while the correlation to IDH2 was not significant. Similarly, in the validation set, the correlation to IDH1 and FLT3 was significant (chi-stat = 5.03, P = 0.024 and chi-stat = 8.2, P = 0.0041, respectively), and the correlation to IDH2 was not significant. Note that only 89 patients in the validation set had mutation status available for every gene.
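
The co-occurrence test can be sketched for a 2×2 table of mutation status. This Python illustration assumes a Pearson chi-square test with one degree of freedom and no continuity correction; the text does not state whether a correction was applied, and the counts below are hypothetical.

```python
import math


def chi_square_p(stat):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction.

    Rows: NPM1 mutant / wild type; columns: second gene mutant / wild type.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    return stat, chi_square_p(stat)


# Hypothetical patient counts (not data from the study).
stat, p_value = chi_square_2x2(30, 20, 20, 30)

# P value implied by the reported training-set statistic for FLT3.
p_flt3 = chi_square_p(9.2)
```

Under the one-degree-of-freedom assumption, the reported statistic of 9.2 corresponds to a P value of about 0.0024, in line with the value given in the text.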

Validation of target genes by qPCR in the Semmelweis set

Mutation data were available for all patients in our clinical sample cohort. In this group, the NPM1 gene was mutated in 25% of patients (Fig. 5A). The FLT3, IDH2, and IDH1 genes harbored a mutation in 25%, 14%, and 5% of patients, respectively. The mutation frequency was independent of the sample origin, including bone marrow and blood (data not shown).
Fig. 5

A–J. Validation in an independent clinical set. Clinical characteristics of the Semmelweis set (A–D). RT-qPCR for differentially expressed genes with validated expression linked to NPM1 mutations and survival in the clinical set (E–J). Hazard ratios with 95% confidence intervals are shown.

The Semmelweis set contains 169 AML patients (Fig. 1A); 52.6% of the samples were obtained from bone marrow and 47.4% were collected from peripheral blood. All samples have overall survival data with a median follow-up time of 6.92 months. Similar to the training sets, most patients have intermediate cytogenetic risk (Fig. 5A). Additional clinico-pathological characteristics of the samples are displayed in Fig. 5A–D and Table 2. When analyzing the mutation status of NPM1 in the Semmelweis set, no significant correlation to overall survival was observed (P = 0.4). The most significant genes associated with NPM1 mutations in the training sets were validated by qPCR. The expression of the HOXA5 (P = 3.06E−12, FC = 8.3), HOXA10 (P = 2.44E−09, FC = 3.3), HOXB5 (P = 1.86E−13, FC = 37), MEIS1 (P = 9.82E−10, FC = 4.4) and PBX3 (P = 1.03E−13, FC = 5.4) genes was significantly higher, while the expression of the ITM2A (P = 0.004, FC = 0.4) gene was significantly lower in the NPM1 mutant patient cohort (Fig. 5E–J). Finally, the survival analysis showed a significant association between the expression of the HOXA5, HOXA10, PBX3, and MEIS1 genes and overall survival in the validation cohort (Fig. 5E–I).

Correlation between HOX genes and co-factors

Pearson’s rank correlation was computed to examine the relation of gene expression between HOX, MEIS, and PBX genes. All the P-values were less than 2.2E−16. High correlation was found between HOXA5 and HOXA10, HOXA5 and MEIS1, HOXA10 and MEIS1, HOXA10 and PBX3, and MEIS1 and PBX3 genes (Fig. 6A). In Fig. 6B, the potential interplay between HOX genes and co-factors (PBX3 and MEIS1) in the cell is displayed.
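The text refers to "Pearson's rank correlation". Pearson's coefficient is computed on the raw values, whereas the rank-based coefficient is Spearman's, so the following minimal sketch shows both without asserting which one the authors used. The expression values are invented, the function names are hypothetical, and the code is in Python rather than the R environment used in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the midranks."""
    return pearson(midranks(x), midranks(y))


# Invented expression values for two genes across six samples (not study data).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]

r_pearson = pearson(gene_a, gene_b)
r_spearman = spearman(gene_a, gene_b)
print(f"Pearson r = {r_pearson:.3f}, Spearman rho = {r_spearman:.3f}")
```

For a monotonic but non-linear relationship such as this one, the rank-based coefficient reaches 1 while Pearson's stays slightly below it, which is why the choice of coefficient matters when reporting co-expression.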
Fig. 6

Correlation between top target genes. Scatterplot and Pearson rank correlation coefficients of gene expression (P < 2.2E−16 for each correlation) (A). HOX genes and identified cofactors act in concert to influence multiple features of a cancer cell (B).

Discussion

Genes showing altered expression with NPM1 somatic mutations and altered survival were identified in AML. Interestingly, NPM1 mutation status per se was not correlated with survival in either the training sets or the validation set. The final set of NPM1-associated genes was established in four independent datasets (three previously published genomic sets and one clinical sample set collected at the Semmelweis University). The results demonstrate that the HOXA5, HOXB5, HOXA10, PBX3, MEIS1, and ITM2A genes show the highest expression change when comparing NPM1 mutant and wild type cohorts. Of these genes, HOXA5, HOXB5, HOXA10, PBX3, and MEIS1 were upregulated, and the ITM2A gene was downregulated in the NPM1 mutant tumors. With the exception of ITM2A, higher expression was also correlated with poor prognosis.
Homeobox genes are members of transcription factor families that are grouped into four main clusters (HOXA-D) on four different chromosomes. HOX genes play central roles in embryonic development, differentiation, and proliferation of hematopoietic cells [38]. Expression changes of HOX genes are also highly correlated with the development of hematologic malignancies [39]. In a genome-wide analysis, several HOXA and HOXB genes and their co-factors were overexpressed in AML with normal karyotype [40]. HOX expression in AML is restricted to specific genes in the HOXA or HOXB loci, and is highly correlated with recurrent cytogenetic abnormalities [41]. Overexpression of HOX genes results in the expansion of progenitor cell populations and a simultaneous blockade of the differentiation of these cells [42]. Here, three homeobox (HOX) genes (HOXA5, HOXB5, and HOXA10) were found to show significantly higher expression in NPM1 mutant tumor samples. A previous study revealed that high expression of HOXA5 is linked with worse survival in AML [38].
In pediatric AML cases, NPM1 mutations affected the expression of the HOXA4, HOXA6, HOXA7, HOXA9, and HOXB9 genes and of the MEIS1 and PBX3 genes [43]. The mechanism by which HOX genes are upregulated in NPM1 mutated patients remains uncertain. NPM1 might directly modify the expression of HOX genes, or NPM1 mutations might inhibit the differentiation of early hematopoietic progenitors in which HOX expression is upregulated [44]. The results of the present study also provide robust clinical support for recent cell-culture based observations establishing the connection between NPM1 and HOX expression in AML. In their study, Brunetti and coworkers showed the key role of mutant NPM1 and its aberrant cytoplasmic localization in inducing HOX expression. Nuclear re-localization of the mutated protein (NPM1c) induced immediate downregulation of HOX genes, followed by cell differentiation [45].
HOX transcription factors frequently co-operate with PBX (pre-B-cell leukemia homeobox) and MEIS (myeloid ecotropic viral integration site homeobox) family genes [46]. These genes encode homeodomain-containing transcription cofactors, which have an essential role in some HOX-dependent developmental programs [47]. HOX proteins from paralog groups 1 to 10 interact with PBX proteins, whereas interaction with MEIS proteins is limited to HOX paralogs 9 to 13 [48]. PBX proteins were identified as fusion proteins from chromosome translocations causing pre-B cell leukemia in humans [49]. The interaction between PBX and HOX proteins is essential for HOX function [50] (see Fig. 6B). Earlier studies showed that the DNA binding affinity of HOX proteins is higher when PBX proteins are present [51]. In addition, these co-factors can mediate the DNA target selection of HOX proteins [52]. PBX proteins also bind to additional factors, such as histone deacetylases (HDACs) and histone acetyltransferases (HATs), to recruit these factors to the HOX complexes [53].
MEIS proteins are members of the HMP (homothorax, meis and prep) protein family and have been identified as proto-oncogenes coactivated with HOX genes in leukemia [54]. Previous studies demonstrated that HMP proteins can form complexes with PBX and HOX proteins [55] (Fig. 6B). MEIS proteins also counteract HDAC activity [56]. PBX-HOX complexes can bind HDACs and repress transcription; however, this repression can be blocked by MEIS proteins capable of initiating transcription [56].
ITM2A (integral membrane protein 2A) is a type II membrane protein that belongs to the ITM2 family [57]. ITM2A is involved in myogenic differentiation, mesenchymal stem cell differentiation, and autophagy [58]. A patent describing a monoclonal antibody against ITM2A for the potential treatment of AML by inducing ADCC was recently submitted [59]. Decreased ITM2A expression in AML was described previously, but its function in the progression of AML is still unclear [14].
These results support the idea of targeting the HOX transcription complex in the therapy of NPM1 mutated AML. HXR9 is a cell-penetrating peptide that inhibits HOX proteins by blocking their interaction with PBX cofactors; it has shown potent activity in several solid cancers, including lung [60], breast [61], prostate [62], and melanoma [63], as well as in AML cell lines [64]. Alharbi et al. evaluated the mechanism of HXR9-induced cell death and found that HXR9 promotes apoptosis and necroptosis, and that its cytotoxicity can be enhanced by inhibiting protein kinase C (PKC) in AML cell lines [65].

Conclusions

In summary, by connecting mutation status with a gene expression signature, we identified HOX genes and their co-factors as significantly upregulated in NPM1 mutant tumors. The expression of these genes also correlated with survival outcome. The strength of this study is the use of several different training sets for feature selection, followed by validation using an independent method. Based on these results, the complex involving the HOX genes with the PBX3 and MEIS1 co-factors may serve as a therapeutic target in NPM1 mutated AML patients.

Availability of data and material

The NCBI Gene Expression Omnibus datasets are available using the following links: GSE6891: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE6891. GSE1159: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1159. TCGA (The Cancer Genome Atlas) dataset is available using the following link: https://portal.gdc.cancer.gov/projects/TCGA-LAML.

Conflict of interest

The authors have declared no conflict of interest.