The mutational landscape in pediatric acute lymphoblastic leukemia deciphered by whole genome sequencing.

Carl Mårten Lindqvist, Jessica Nordlund, Diana Ekman, Anna Johansson, Behrooz Torabi Moghadam, Amanda Raine, Elin Övernäs, Johan Dahlberg, Per Wahlberg, Niklas Henriksson, Jonas Abrahamsson, Britt-Marie Frost, Dan Grandér, Mats Heyman, Rolf Larsson, Josefine Palle, Stefan Söderhäll, Erik Forestier, Gudmar Lönnerholm, Ann-Christine Syvänen, Eva C Berglund.

Abstract

Genomic characterization of pediatric acute lymphoblastic leukemia (ALL) has identified distinct patterns of genes and pathways altered in patients with well-defined genetic aberrations. To extend the spectrum of known somatic variants in ALL, we performed whole genome and transcriptome sequencing of three B-cell precursor patients, of which one carried the t(12;21)ETV6-RUNX1 translocation and two lacked a known primary genetic aberration, and one T-ALL patient. We found that each patient had a unique genome, with a combination of well-known and previously undetected genomic aberrations. By targeted sequencing in 168 patients, we identified KMT2D and KIF1B as novel putative driver genes. We also identified a putative regulatory non-coding variant that coincided with overexpression of the growth factor MDK. Our results contribute to an increased understanding of the biological mechanisms that lead to ALL and suggest that regulatory variants may be more important for cancer development than recognized to date. The heterogeneity of the genetic aberrations in ALL renders whole genome sequencing particularly well suited for analysis of somatic variants in both research and diagnostic applications.
© 2014 WILEY PERIODICALS, INC.

Keywords:  RNA sequencing; acute lymphoblastic leukemia; clonal heterogeneity; whole genome sequencing

Year:  2015        PMID: 25355294      PMCID: PMC4309499          DOI: 10.1002/humu.22719

Source DB:  PubMed          Journal:  Hum Mutat        ISSN: 1059-7794            Impact factor:   4.878


Introduction

Acute lymphoblastic leukemia (ALL) is the most common pediatric cancer and arises from the malignant transformation of lymphocyte progenitor cells to leukemic cells in the B- and T-cell lineages. B-cell precursor ALL (BCP-ALL) is the most common immunophenotype and is divided into genetic subtypes with therapeutic and prognostic importance based on recurrent large-scale chromosomal aberrations that are detected in about 75% of the patients. Hyperdiploidy and the t(12;21)ETV6-RUNX1 (MIM#s 600618, 151385) rearrangement characterize the most common subtypes, which are associated with a favorable outcome [Pui et al., 2011]. The high-risk T-cell immunophenotype represents about 12% of the patients. Genomic characterization of T-ALL, BCP-ALL samples carrying the t(12;21) rearrangement, and several of the high-risk BCP-ALL subtypes using microarrays and next generation sequencing has revealed distinct patterns of genetic lesions in these subtypes [Mullighan, 2013; Papaemmanuil et al., 2014], whereas few studies have addressed the genetics of BCP-ALL patients without a known primary genetic aberration. The focus of most studies has been on protein-coding regions, where the lesions have been found to affect hematopoietic development, cell cycle regulation, Ras and tyrosine signaling, cytokine receptors, tumor suppression, and epigenetic regulation [Mullighan, 2013]. The findings that most driver genes in acute myeloid leukemia are involved in gene regulation [The Cancer Genome Atlas Research Network, 2013], and that the TERT promoter is recurrently mutated across several types of human cancer [Vinagre et al., 2013] suggest that variants in non-coding regulatory regions might be more important for cancer development than recognized to date.
To determine the full range of genetic lesions in pediatric ALL, we sequenced the whole genomes and transcriptomes of two patients belonging to well-characterized groups (t(12;21) and T-ALL) and two patients without a known primary genetic aberration. We analyzed coding and non-coding regions of the genome and identified putative driver genes by targeted sequencing in 168 additional patients.

Materials and Methods

Patient Samples

The pediatric ALL patients analyzed in this study were diagnosed and treated at Swedish centers (Uppsala, Umeå, Stockholm and Gothenburg) according to the Nordic Society for Pediatric Haematology and Oncology (NOPHO) protocols [Schmiegelow et al., 2010]. ALL diagnosis was established by analysis of leukemic cells with respect to morphology, immunophenotype, and cytogenetics. ALL lineage (BCP-ALL or T-ALL) was defined according to the European Group for the Immunological Characterization of Leukemias. Fluorescence in situ hybridization (FISH) or reverse transcriptase PCR (RT-PCR) analyses were used to screen for gene fusions. Karyotypes were based on the International System for Human Cytogenetic Nomenclature [Shaffer et al., 2013]. Bone marrow aspirates collected at diagnosis of ALL and matched germline peripheral blood or bone marrow samples collected in first continuous complete remission (CCR1) from four patients were subjected to whole genome sequencing (WGS) (Table 1). Targeted sequencing of 168 additional samples collected at ALL diagnosis and 159 matched CCR1 samples from the same patients was performed (Table 2 and Supp. Table S1). For nine patients, no CCR1 sample was available. An in-house RNA-seq dataset containing 27 BCP-ALL samples and 18 T-ALL samples (Nordlund, Dahlberg et al., unpublished data, Supp. Table S2) was used as a control to assess potential effects of somatic variants on gene expression. This dataset provides a better control than matched remission samples, which consist of a mixture of mononuclear cells (T-cells, B-cells, monocytes, etc.) and are therefore expected to display substantial expression differences compared with ALL cells that are unrelated to somatic variants. The study was approved by the Regional Ethical Review Board in Uppsala, Sweden. The study was conducted according to the guidelines of the Declaration of Helsinki, and all patients and/or guardians provided written informed consent.
Table 1

Clinical Characteristics of Whole Genome Sequenced ALL Patients

Patient | Sex | Age (a) | WBC (b) | % Blasts | Immunophenotype | Genetic subtype (c) | Treatment group (d) | Clinical follow-up (e) | Remission tissue
ALL_458 | Male | 3.6 | 12.3 | 90 | BCP-ALL | t(12;21) | IR | CCR1 (8.5) | PB
ALL_559 | Male | 5.9 | 128.0 | 95 | T-ALL | T-ALL | HR | CCR1 (7) | PB
ALL_707 | Male | 1.6 | 9.6 | 80–90 | BCP-ALL | Other | SR | CCR1 (5) | PB
ALL_501 | Female | 6.7 | 1.4 | 80 | BCP-ALL | Normal | SR | CCR1 (8) | BM

(a) Age at diagnosis in years.

(b) White blood cell count at diagnosis (10^9 cells/l).

(c) The full karyotypes are:

ALL_458: 46,XY.ish.t(12;21)(p13;q22),del(12)(p13p13),del(21)(q22q22)

ALL_559: 46,XY,t(7;9)(q3?4;q3?2)[10].ish.del(9)(p21p21)x2,der(11)t(7;11)(q3?4;p1?3)/46,XY[15]. The t(7;9) was detected by G-banding in 10 of 25 metaphases. Remaining aberrations, and t(7;9), were detected by FISH. The question marks indicate that the breakpoint on sub-band level is uncertain.

ALL_707: 46,XY,der(7)t(7;9)(q11;p13)del(9)(p21p24),der(9)t(7;9)(q11;p13),del(19)(q13)[24]/46,XY[1]

ALL_501: 46,XX[20]. Hyperdiploidy and the most common rearrangements (BCR-ABL1, PBX1-TCF3, ETV6-RUNX1, and MLL) were excluded by FISH and DNA index analysis.

(d) The patients were treated according to the Nordic Society for Pediatric Haematology and Oncology (NOPHO) protocols [Schmiegelow et al., 2010].

(e) The follow-up time in years is given in parentheses.

BCP-ALL, B-cell precursor ALL; SR, standard risk; IR, intermediate risk; HR, high risk; CCR1, first continuous complete remission; PB, peripheral blood; BM, bone marrow.

Table 2

Cytogenetic Representation of ALL Patients Included in the Validation Cohort

Immunophenotype | Genetic subtype (a) | Number of patients (b) | Population (%) (c)
BCP-ALL | HeH | 47 (28.0) | 26.3
BCP-ALL | t(12;21) | 35 (20.8) | 16.7
BCP-ALL | Other | 21 (12.5) | 13.4
BCP-ALL | Normal / no result | 18 (10.7) | 21.7
BCP-ALL | t(9;22) | 8 (4.8) | 2.2
BCP-ALL | 11q23/MLL | 4 (2.4) | 3.6
BCP-ALL | iAMP21 | 4 (2.4) | 0.5
BCP-ALL | t(1;19) | 4 (2.4) | 1.9
BCP-ALL | dic(9;20) | 3 (1.8) | 1.8
BCP-ALL | > 67 chr | 1 (0.6) | 0.4
T-ALL | T-ALL | 23 (13.7) | 10.5
Total | | 168 (100) | 99.0

(a) HeH, high hyperdiploidy (51–67 chromosomes); t(12;21), translocation between the chromosomes (12;21)(p13;q22)ETV6-RUNX1; t(9;22), translocation between the chromosomes (9;22)(q11;q34)BCR-ABL1; 11q23/MLL, translocation between MLL and various other genes; iAMP21, intrachromosomal amplification of chromosome 21; dic(9;20), dicentric chromosome (9;20)(p13;q11); > 67 chr, > 67 chromosomes; Other, other clonal aberrations; Normal, no genetic aberrations detected and a normal karyotype observed in at least 5 of 25 metaphases; No result, no karyotype reported or the cytogenetic analysis failed.

(b) The percentage of samples of each subtype in the validation cohort is given in parentheses.

(c) Percentage of each subtype in 2367 patients diagnosed with ALL in the Nordic countries during 1996–2008. The frequency of subtypes changes with time, since new subtypes are discovered and new analysis methods are added. The population does not sum to 100% because the rare subtype hypodiploidy is not represented in the validation cohort.

Mononuclear cells were isolated with 1.077 g/ml Ficoll-Isopaque (Pharmacia, Uppsala, Sweden) density-gradient centrifugation. The proportion of leukemic cells was estimated to be ≥80% in all samples by light microscopy in May-Grünwald-Giemsa-stained cytocentrifugate preparations. DNA and RNA were extracted from vital frozen cells or cell pellets containing 1–15 million cells using the QIAamp DNA Blood or AllPrep DNA/RNA Mini Kit (Qiagen GmbH, Hilden, Germany). RNA samples were treated with DNase using the RNase-Free DNase Set (Qiagen). DNA and RNA were quantified using the Qubit dsDNA Broad-Range assay and Qubit RNA Broad-Range assay, respectively, on a Qubit 2.0 fluorometer (Invitrogen, Carlsbad, CA, USA). The integrity of the RNA was examined by capillary electrophoresis with a Bioanalyzer using RNA 6000 Nano Labchips (Agilent Technologies, Santa Clara, CA, USA).

WGS and Analysis of WGS Data

Four to eight WGS libraries were prepared from diagnostic and remission DNA from each of the four selected ALL patients. The libraries were sequenced paired-end with 100 or 150 bp reads using a HiSeq2000 or GAIIx instrument (Illumina Inc, San Diego, CA, USA). Sequence reads were aligned to the human reference genome hg19 using BWA [Li and Durbin, 2009]. Single nucleotide variants (SNVs) were called with SomaticSniper [Larson et al., 2012] and insertion-deletions (indels) were called with GATK [DePristo et al., 2011]. A variant was considered conserved if it overlapped a conserved non-exonic element (CNEE) [Lowe et al., 2011] or a region present in the phastCons track from the UCSC table browser. DNase hypersensitive (DHS) regions and active histone marks (H3K4me1, H3K27ac, H3K4me3, H3K36me3) in lymphoid cells were determined using data from the NIH Epigenomics project [Bernstein et al., 2010] as previously described [Nordlund et al., 2013]. The experiment IDs are listed in Supp. Table S3. Somatic copy number alterations (CNAs) were called using BIC-Seq [Xi et al., 2011], ControlFREEC [Boeva et al., 2011], dwac-seq (https://github.com/Vityay/DWAC-Seq), and Patchwork [Mayrhofer et al., 2013]. CNAs detected by at least two programs were retained. All variants were annotated against the Ensembl database.
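The rule that a CNA is retained only when at least two programs detect it can be illustrated with a minimal sketch. The overlap criterion used here (same chromosome, same type, at least one shared base) is an assumption made for illustration, since the text does not state how agreement between programs was defined. The tool names and coordinates are hypothetical.

```python
from __future__ import annotations

from typing import NamedTuple


class Cna(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"


def overlaps(a: Cna, b: Cna) -> bool:
    """True if two calls of the same kind share at least one base (assumed criterion)."""
    return (
        a.chrom == b.chrom
        and a.kind == b.kind
        and a.start <= b.end
        and b.start <= a.end
    )


def supported_calls(
    calls_by_tool: dict[str, list[Cna]], min_tools: int = 2
) -> list[tuple[str, Cna]]:
    """Keep every call that is supported by at least `min_tools` tools."""
    kept = []
    for tool, calls in calls_by_tool.items():
        for call in calls:
            # Count this tool plus every other tool with an overlapping call.
            n_tools = 1 + sum(
                any(overlaps(call, other) for other in other_calls)
                for other_tool, other_calls in calls_by_tool.items()
                if other_tool != tool
            )
            if n_tools >= min_tools:
                kept.append((tool, call))
    return kept


# Hypothetical calls, not taken from the study.
example = {
    "tool_a": [Cna("9", 100, 200, "del"), Cna("3", 10, 20, "dup")],
    "tool_b": [Cna("9", 150, 300, "del")],
    "tool_c": [],
}
kept = supported_calls(example)
```

In this example the chromosome 9 deletion is kept for both tools that report it, while the chromosome 3 duplication reported by a single tool is dropped. Merging the supporting calls into one consensus segment is left out of the sketch.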

RNA Sequencing and Analysis of RNA Sequence Data

Strand-specific libraries for RNA-sequencing (RNA-seq) were prepared from 1 μg RNA using the ScriptSeq V2 RNA-seq Library Preparation Kit (Epicentre, Madison, WI, USA) after ribosomal RNA depletion with the Ribo-Zero method (Epicentre). Paired-end sequencing was performed on a HiSeq2000/2500 (Illumina). The read length was 100 bp for the four whole genome sequenced samples and 50 bp for the samples in the control dataset. Alignment of sequence reads and quantification of expression levels was performed with TopHat and Cufflinks [Trapnell et al., 2012]. The expression levels were normalized to fragments per kilobase per million mapped reads (FPKM) with Cufflinks. Fusion genes were identified using FusionCatcher (Nicorici et al., submitted manuscript).
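For orientation, the basic definition behind the FPKM unit can be written as a short function. This is only the textbook normalization, not a reimplementation of Cufflinks, which estimates transcript abundances with a statistical model before reporting FPKM. The function and parameter names are illustrative.

```python
def fpkm(fragments: float, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Dividing by (length / 1e3) and (library size / 1e6) gives a factor of 1e9.
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Illustrative numbers: 500 fragments on a 2 kb transcript in a library
# of 25 million mapped fragments.
value = fpkm(500, 2000, 25_000_000)
```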

HumanOmni2.5 BeadChip Genotyping

Two hundred and fifty nanograms of DNA from diagnostic and remission samples from the four whole genome sequenced patients was genotyped using the HumanOmni2.5 BeadChip (Illumina). The diploid coverage of the WGS data was estimated as the percentage of heterozygous sites detected using genotyping that were also heterozygous in the sequence data. CNAs were predicted using ASCAT [Van Loo et al., 2010].
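The diploid coverage estimate described above reduces to a simple ratio, sketched here under the assumption that genotypes are available as pairs of alleles keyed by genomic position. The data structures and names are hypothetical and are not taken from the authors' pipeline.

```python
def is_het(genotype: tuple) -> bool:
    """A genotype is heterozygous if its two alleles differ."""
    return genotype[0] != genotype[1]


def diploid_coverage(array_genotypes: dict, seq_genotypes: dict) -> float:
    """Percentage of array-heterozygous sites that are also heterozygous in the sequence data."""
    het_sites = [site for site, gt in array_genotypes.items() if is_het(gt)]
    if not het_sites:
        raise ValueError("no heterozygous sites in the genotyping data")
    recovered = sum(
        1
        for site in het_sites
        if site in seq_genotypes and is_het(seq_genotypes[site])
    )
    return 100.0 * recovered / len(het_sites)


# Hypothetical genotypes keyed by (chromosome, position).
array_calls = {
    ("1", 100): ("A", "G"),
    ("1", 200): ("C", "C"),  # homozygous on the array, so ignored
    ("2", 50): ("T", "C"),
    ("2", 60): ("G", "A"),
    ("3", 5): ("A", "T"),
}
seq_calls = {
    ("1", 100): ("A", "G"),
    ("2", 50): ("T", "T"),  # one allele missed in the sequence data
    ("2", 60): ("G", "A"),
}
coverage = diploid_coverage(array_calls, seq_calls)
```

Sites that are heterozygous on the array but missing or homozygous in the sequence data both count against the estimate, which is what makes it a measure of how often both alleles were sampled.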

Target Capture Experiment

The following categories of genomic regions were selected for target capture and resequencing (Supp. Fig. S1): (1) 51 bp regions flanking all candidate SNVs identified by WGS. (2) Exons of genes with an SNV or indel in an exon or an untranslated region (UTR) (Supp. Table S4). (3) Non-coding conserved or DHS regions containing a candidate SNV and regions flanking candidate SNVs with score ≤2 in RegulomeDB [Boyle et al., 2012]. Target capture was performed using 200 ng DNA and reagents from a HaloPlex Target Enrichment kit (Agilent), according to the HaloPlex Target Enrichment System Automation Protocol Version D.3. All leukemic samples (n = 172) and the four whole genome sequenced remission samples were enriched individually. Remaining remission samples were enriched in pools of 10 samples. In addition, 84 samples from healthy Swedish blood donors, enriched in pools of 21 samples, were included as population controls. Paired-end sequencing with read length of 100 bp was performed on a HiSeq2000/2500 (Illumina). The average sequence depth in the target region was 638× for ALL samples in the validation cohort, 162× for remission samples, and 133× for Swedish blood donors (Supp. Fig. S2).

Validation of Somatic Variants Identified by WGS

Exonic indels (n = 2 of 6 tested) and a subset of candidate SNVs (n = 50 of 60 tested) identified by WGS were validated by PCR and Sanger sequencing. Fifty nanograms of genomic DNA was whole genome amplified using the REPLI-g Midi Kit (Qiagen). PCR primers were designed using Primer3Plus [Untergasser et al., 2007]. ALL and remission samples were amplified by PCR and the products were sequenced with BigDye Terminator 3.1 chemistry using an Applied Biosystems 3730XL DNA sequencer. The sequence traces were analyzed with the Sequencher software (Applied Biosystems, Foster City, CA, USA). For validation of remaining SNVs, the allele fraction (AF) from the target capture experiment was calculated with a custom Python script (available at https://github.com/Molmed/Berglund-Lindqvist-2013). A candidate SNV was considered validated if the AF was ≥ 0.1 in the ALL sample, the AF was < 0.01 in the matched remission sample, and the sequence depth was ≥10 in both samples. Bases with a Phred quality score < 20 and bases in a read with mapping quality = 0 were disregarded. Positions where an SNV was called in more than one patient were manually inspected in the Integrative Genomics Viewer [Thorvaldsdottir et al., 2013].
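The validation thresholds stated above can be expressed as a short sketch. It is not the authors' script (which is available at the URL given in the text). It assumes that each site is represented as a list of (base, base quality, mapping quality) observations and that the allele fraction is computed over all bases passing the filters. Both are illustrative choices.

```python
def allele_fraction(observations, alt_base, min_base_quality=20):
    """Return (AF, depth) using only bases with Phred >= 20 and mapping quality > 0."""
    passing = [
        base
        for base, base_quality, mapping_quality in observations
        if base_quality >= min_base_quality and mapping_quality > 0
    ]
    depth = len(passing)
    af = passing.count(alt_base) / depth if depth else 0.0
    return af, depth


def is_validated(tumor_obs, remission_obs, alt_base,
                 min_tumor_af=0.1, max_remission_af=0.01, min_depth=10):
    """Apply the stated criteria: AF >= 0.1 in ALL, AF < 0.01 in remission, depth >= 10 in both."""
    tumor_af, tumor_depth = allele_fraction(tumor_obs, alt_base)
    remission_af, remission_depth = allele_fraction(remission_obs, alt_base)
    return (
        tumor_depth >= min_depth
        and remission_depth >= min_depth
        and tumor_af >= min_tumor_af
        and remission_af < max_remission_af
    )


# Hypothetical pileups for a C>T candidate.
tumor = (
    [("T", 30, 60)] * 4     # high-quality alternative bases
    + [("C", 30, 60)] * 8   # high-quality reference bases
    + [("T", 10, 60)] * 5   # dropped: base quality below 20
    + [("T", 30, 0)] * 5    # dropped: mapping quality 0
)
remission = [("C", 30, 60)] * 12
result = is_validated(tumor, remission, "T")
```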

Variant Calling in the Validation Cohort

SNVs and indels in the validation cohort were called with FreeBayes (http://arxiv.org/abs/1207.3907) and the GATK HaplotypeCaller [DePristo et al., 2011], respectively. Variants were filtered based on sequence coverage and quality scores. Germline variants were excluded using remission samples and population variation.

Accessibility of Reported Variants

All variants identified in the whole genome sequenced patients and the validation cohort are listed in the Supporting Information. Putative driver variants have been submitted to COSMIC with COSP ID 37259. Further details on the materials and methods are available in the Supporting Information.

Results

Somatic Variants in Whole Genome Sequenced ALL Patients

In this study, four patients selected to be representative of pediatric ALL were subjected to WGS. ALL_458 (BCP-ALL) carried the recurrent t(12;21)(p13;q22)ETV6-RUNX1 translocation, ALL_559 (T-ALL) had two translocations involving chromosome 7, ALL_707 (BCP-ALL) had several cytogenetic aberrations, including a t(7;9)(q11;p13) translocation, and ALL_501 (BCP-ALL) showed a normal karyotype (Table 1). Each of the patients responded to treatment and had remained in first continuous complete remission (CCR1) for at least 5 years. We sequenced the genomes of diagnostic and remission samples from these patients to an average depth of 31× high quality aligned data and the transcriptomes from the diagnostic samples (Supp. Table S5). In the genomes of the four patients, we identified between 713 and 851 candidate somatic SNVs and exonic insertion–deletions (indels) in non-repeated regions (Supp. Table S6). We validated close to 200 somatic SNVs and indels per BCP-ALL patient, and 305 in the T-ALL patient (Supp. Tables S7–S8) using PCR and Sanger sequencing and/or target capture and deep sequencing. A relatively low validation rate of 29% was obtained for SNVs (Supp. Table S7), which can be attributed to the non-stringent criteria for inclusion of candidate SNVs in the HaloPlex experiment. In addition, most comparable studies have focused on validation in exons, which are more conserved than the remaining part of the genome and less prone to alignment artifacts. Indeed, exonic SNVs were associated with both higher somatic scores, as defined by SomaticSniper, (averages of 87 and 72 for exonic and non-exonic candidates, respectively) and a substantially higher validation rate of 66% (Supp. Table S7). The candidate SNVs that failed to validate were either false positives, or had similar AFs in the leukemic and remission samples, suggesting that they were germline variants or alignment artifacts (Supp. Fig. S3). All further analysis includes only the validated variants. 
The validated SNVs were evenly distributed over the genome with no evidence of hypermutated regions (Fig. 1A). The most common somatic mutation in all patients except ALL_707 (subtype “other”) was C>T, whereas C>A was most frequent in ALL_707 (Fig. 1B).
Figure 1

A: Circos [Krzywinski et al., 2009] plots showing the genomic location of validated somatic single nucleotide variants (SNVs), insertion-deletions (indels), copy number alterations (CNAs), copy neutral loss of heterozygosity (LOH) events, and translocations in the whole genome sequenced ALL patients. SNVs and indels are shown as red (original clone) or blue (subclone) dots in the circle closest to the chromosomes. Inside the SNVs and indels, deletions are shown with red, duplications with blue and LOH with yellow circle segments. Black arcs indicate translocations. Gene names are color-coded as follows: gray, expressed genes with exonic indels or nsSNVs that were predicted to be damaging and genes that were highlighted as putative drivers in the validation cohort; black, genes involved in translocations; red, selected genes in CNA or LOH regions; green, selected differentially expressed genes that are located near breakpoints for translocations or putatively regulatory SNVs. B: Mutational patterns in the whole genome sequenced patients. The higher frequency of C>A mutations compared to C>T mutations in ALL_707 is significantly different from the other patients (chi-square test, P < 0.001).

We validated 23 exonic variants in the four patients, including three loss of function mutations, 18 nonsynonymous SNVs (nsSNVs), and two synonymous SNVs (Table 3). Prediction of functional effects suggested that 11 out of 18 nsSNVs were damaging, including nsSNVs in KRAS (MIM# 190070) and NOTCH1 (MIM# 190198), which are known drivers of BCP-ALL [Liang et al., 2006] and T-ALL [Weng et al., 2004], respectively. We validated 74 SNVs in non-coding putative regulatory regions, of which 50 were located in conserved regions, 16 in DNase hypersensitive (DHS) regions and eight had a high regulatory potential according to RegulomeDB (Supp. Table S7 and Supp. Fig. S4).
In comparison with the remaining non-coding SNVs, these putatively functional SNVs were more often located in introns and within 1 kb from a gene and less often in intergenic regions (Supp. Fig. S4).
Table 3

Validated Exonic Single Nucleotide Variants and Insertion–Deletions in Whole Genome Sequenced ALL Patients

Sample | Chr | Position | cDNA change (a) | Protein change | Gene | Gene description | Effect | SIFT (b) | PP2 (c) | AF DNA (d) | AF RNA | FPKM sample | FPKM control (e)
ALL_458 | 2 | 178416648 | c.844G>A | p.E282K | TTC30B | Tetratricopeptide repeat domain 30B | nsSNV | D | D | 0.12 | 0.14 | 2.2 | 0.8
ALL_458 | 3 | 51749797 | c.2008C>T | p.R670W | GRM2 | Glutamate receptor, metabotropic 2 isoform a | nsSNV | D | D | 0.44 | NA | 0.1 | 0.1
ALL_458 | 12 | 11285961 | c.883C>T | p.R295W | TAS2R30 | Type 2 taste receptor member 30 | nsSNV | T | B | 0.33 | 0.00 | 0.2 | 1.6
ALL_559 | 1 | 10363472 | c.2229T>G | p.I743M | KIF1B | Kinesin family member 1B isoform b | nsSNV | D | D | 0.34 | 0.20 | 2.5 | 3.3
ALL_559 | 1 | 11186751 | c.1069C>T | p.R357C | MTOR | FK506 binding protein 12-rapamycin associated | nsSNV | T | B | 0.56 | 0.60 | 12.3 | 10.8
ALL_559 | 1 | 32745467 | c.1067A>C | p.E356A | LCK | Lymphocyte-specific protein tyrosine kinase | nsSNV | D | D | 0.61 | 0.64 | 306.8 | 67.2
ALL_559 | 3 | 48617469 | c.5119C>T | p.R1707W | COL7A1 | Alpha 1 type VII collagen precursor | nsSNV | D | P | 0.47 | 1.00 | 0.0 | 0.3
ALL_559 | 5 | 180053029 | c.1261C>T | p.P421S | FLT4 | fms-related tyrosine kinase 4 isoform 2 | nsSNV | D | P | 0.35 | NA | 0.0 | 1.0
ALL_559 | 6 | 39864708 | c.2462G>A | p.R821Q | DAAM2 | Dishevelled associated activator of | nsSNV | T | P | 0.40 | 0.00 | 0.1 | 1.6
ALL_559 | 6 | 84055977 | c.515G>T | p.G172V | ME1 | Cytosolic malic enzyme 1 | nsSNV | D | D | 0.39 | 0.00 | 0.1 | 0.0
ALL_559 | 9 | 139397768 | c.5033T>C | p.L1678P | NOTCH1 | Notch1 preproprotein | nsSNV | D | D | 0.31 | 0.29 | 9.5 | 7.8
ALL_559 | 15 | 41165511 | c.456C>T | p.N152N | RHOV | ras homolog gene family, member V | sSNV | NA | NA | 0.22 | NA | 0.1 | 0.1
ALL_559 | 17 | 4536237 | c.1459G>A | p.V487M | ALOX15 | Arachidonate 15-lipoxygenase | nsSNV | D | P | 0.45 | NA | 0.0 | 0.0
ALL_559 | 19 | 13041102 | c.438_439insG | p.G146fs | FARSA | Phenylalanyl-tRNA synthetase, alpha subunit | frameshift ins | NA | NA | 0.62 | 0.36 | 8.8 | 9.5
ALL_707 | 12 | 25398281 | c.38G>A | p.G13D | KRAS | c-K-ras2 protein isoform a precursor | nsSNV | D | NA | 0.38 | 0.25 | 14.5 | 10.9
ALL_707 | 20 | 10030811 | c.1594G>T | p.A532S | ANKEF1 | Ankyrin repeat and EF-hand domain containing 1 | nsSNV | D | D | 0.39 | NA | 0.1 | 0.2
ALL_707 | 21 | 42609582 | c.544G>T | p.E182X | BACE2 | Beta-site APP-cleaving enzyme 2 isoform A | nonsense SNV | NA | NA | 0.41 | 0.00 | NA | NA
ALL_501 | X | 151092993 | c.857A>G | p.K286R | MAGEA4 | Melanoma antigen family A, 4 | nsSNV | T | P | 0.32 | NA | 0.0 | 0.0
ALL_501 | X | 48382171 | c.12C>T | p.N4N | EBP | Emopamil binding protein (sterol isomerase) | sSNV | NA | NA | 0.46 | 0.00 | 30.4 | 16.0
ALL_501 | 1 | 157659674 | c.1724C>T | p.A575V | FCRL3 | Fc receptor-like 3 precursor | nsSNV | T | B | 0.31 | 0.00 | 0.6 | 0.7
ALL_501 | 2 | 61711139 | c.2610G>C | p.Q870H | XPO1 | exportin 1 | nsSNV | T | B | 0.46 | 0.48 | 115.1 | 107.1
ALL_501 | 11 | 112123109 | c.410C>T | p.A137V | PLET1 | Hypothetical protein LOC349633 precursor | nsSNV | T | B | 0.38 | 0.00 | 0.2 | 0.2
ALL_501 | 12 | 49426841 | c.11647_11648insGCTC | p.H3883fs | KMT2D | Lysine (K)-specific methyltransferase 2D | frameshift ins | NA | NA | 0.60 | 0.32 | 5.2 | 15.4

(a) Nucleotide numbering uses +1 as the A of the ATG translation initiation codon in the reference sequence, with the initiation codon as codon 1.

(b) SIFT predictions: D, damaging; T, tolerated.

(c) PolyPhen2 (PP2) predictions: D, probably damaging; P, possibly damaging; B, benign.

(d) AF from deep-sequencing data. The AF for the SNV in TAS2R30 is from WGS data, as this SNV was not covered in the deep-sequencing data.

(e) The shown expression value represents the mean of 27 BCP-ALL (for ALL_458, ALL_707 and ALL_501) or 18 T-ALL (for ALL_559) samples.

Chr, chromosome; nsSNV, nonsynonymous SNV; sSNV, synonymous SNV; ins, insertion; AF, allele fraction; FPKM, fragments per kilobase of transcript per million mapped reads.

Using a combination of the WGS data and high-density genotype data, we identified between two and five somatic CNAs and copy-neutral loss of heterozygosity (LOH) events in the four patients (Fig. 1A and Table 4). The majority of the events (10/15) were not detected by cytogenetic analysis at diagnosis. The portion of the genome affected by these events varied between 428 kb in ALL_501 (normal karyotype) and 176 Mb in ALL_458 (t(12;21)). Several of the aberrations, including deletions of the wild type ETV6 [12p13] and VPREB1 (MIM# 605141) [22q11] in ALL_458, CDKN2A (MIM# 600160) [9p21] in ALL_559 (T-ALL) and ALL_707 (subtype “other”), and IKZF1 (MIM# 603023) [7p12] in ALL_501, and duplication of chromosome 10 in ALL_458, are known to be recurrent in pediatric ALL [Mullighan et al., 2007; Lilljebjorn et al., 2010; Mangum et al., 2014]. Both of the CDKN2A deletions were homozygous and they were flanked by two LOH events in ALL_559 and by two hemizygous deletions in ALL_707.
Other deletions include, for example, the oncogenes MDM2 (MIM# 164785) and RAP1B (MIM# 179530) in ALL_458 and the putative tumor suppressors FOXP1 (MIM# 605515), RYBP (MIM# 607535), and SHQ1 (MIM# 613663) in ALL_559 (Table 4).
Table 4

Somatic Copy Number Alterations in Whole Genome Sequenced ALL Patients

Sample | Subtype | Chr | Start | End | Size (kb) (a) | Type (b) | Cytoband | Cytogenetic prediction (c) | Affected genes | FPKM sample (d) | FPKM control (e)
ALL_458 | t(12;21) | 10 | 1 | 135534747 | 135,535 | Dup* | all chr 10 | NA | 691 genes | 21.9 | 13.3
ALL_458 | t(12;21) | 11 | 96600814 | 134944379 | 38,344 | Del* | q21-q25 | NA | 261 genes | 11 | 10.6
ALL_458 | t(12;21) | 12 | 11253986 | 13357791 | 2,104 | Del | p13.2-p13.1 | p13 | 25 genes including ETV6 | 3.1 | 6.5
ALL_458 | t(12;21) | 12 | 68835364 | 69203100 | 368 | Del | q15 | NA | MDM2, NUP107, RAP1B, SLC35E3 | 85.4 | 87.6
ALL_458 | t(12;21) | 22 | 22569501 | 22600200 | 31 | Del | q11.22 | NA | VPREB1 | 39.3 | 72.3
ALL_559 | T-ALL | 3 | 70233601 | 74494601 | 4,261 | Del | p13-p12.3 | NA | 11 genes including FOXP1, RYBP, SHQ1 | 6 | 9.8
ALL_559 | T-ALL | 9 | 46587 | 21295942 | 21,249 | LOH | p24.3-p21.3 | NA | 74 genes | 46.5 | 28.5
ALL_559 | T-ALL | 9 | 21300000 | 22100401 | 800 | Del | p21.3 | p21×2 | 11 genes including CDKN2A, CDKN2B | 0.1 | 0.9
ALL_559 | T-ALL | 9 | 22110997 | 32243981 | 10,133 | LOH | p21.3-p21.1 | NA | 13 genes | 1.5 | 1.5
ALL_707 | Other | 9 | 197421 | 21980801 | 21,783 | Del | p24.3-p21.3 | p21-p24 | 82 genes | 25.7 | 19.9
ALL_707 | Other | 9 | 21980802 | 22021001 | 40 | Del | p21.3 | p21-p24 | CDKN2A, CDKN2B | 0 | 1.2
ALL_707 | Other | 9 | 22021002 | 36908001 | 14,887 | Del | p21.3-p13.2 | p21-p24 | 92 genes | 28.3 | 34.9
ALL_707 | Other | 19 | 58519186 | 59089786 | 571 | Del | q13.43 | q13 | 23 genes | 18.9 | 16.8
ALL_501 | Normal | 2 | 89165816 | 89554016 | 388 | Del | p11.2 | NA | NA | NA | NA
ALL_501 | Normal | 7 | 50412894 | 50463634 | 51 | Del | p12.2 | NA | IKZF1 | 84.6 | 71.1

(a) The size represents the minimal overlap between different predictions. The breakpoints of the IKZF1 deletion were determined by analysis of softclipped reads in the WGS data.

(b) A * indicates that the CNA is subclonal.

(c) Results of cytogenetic analysis at diagnosis. NA indicates that no aberration was observed in the region by cytogenetic analysis.

(d) Mean expression of genes located within the CNA in the sample harboring the CNA.

(e) Mean expression of genes located within the CNA in the control data set, where each gene is represented by the mean of 27 BCP-ALL (for ALL_458, ALL_707 and ALL_501) or 18 T-ALL (for ALL_559) samples.

Chr, chromosome; Dup, duplication; Del, deletion; LOH, loss of heterozygosity; FPKM, fragments per kilobase of transcript per million mapped reads; CNA, copy number alteration.

Allele Fractions of Somatic Variants Reveal Clonal Heterogeneity

As we have shown before, target capture followed by deep sequencing allows accurate estimation of the AF of somatic SNVs and enables detection of clonal heterogeneity [Berglund et al., 2013]. Analysis of the validated somatic SNVs revealed a large density peak with AF 0.41–0.46 in each of the four whole genome sequenced patients (Fig. 2A), in agreement with the estimation that the samples contained 80%–95% leukemic blasts. In addition, density peaks indicative of subclones, with AF of 0.19 and 0.27, were observed in ALL_458 (t(12;21)) and ALL_559 (T-ALL), respectively. Based on the AFs, we estimate that approximately half of the SNVs in ALL_458 belong to the subclone, and that 40% of the leukemic cells carry these SNVs. For comparison, an SNV present in the original clone is expected to be present in all leukemic cells, including those that belong to the subclone. The subclone in ALL_559 contains fewer (30%) of the SNVs, but they are present in a larger proportion (60%) of the leukemic cells. Comparison of the mutational patterns of the SNVs that were present in the original clones and those arising in the subclones showed no major differences, although the subclones contained a larger proportion of C>T substitutions than the original clone in both patients (Fig. 2B). ALL_707 and ALL_501 did not display additional density peaks; however, both patients harbored a substantial number of SNVs with relatively low AF, which could indicate the presence of minor clones with few SNVs. It should also be noted that because of the limited sequence depth in the WGS data, only subclones present in a relatively large proportion of the cells would have been detected.
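The relation between an AF peak and the proportion of leukemic cells carrying the corresponding SNVs can be sketched with a simple calculation. It assumes heterozygous SNVs in copy-neutral diploid regions, so that a variant present in a fraction c of all cells has an expected AF of c/2, and it takes the blast percentages from Table 1 as the sample purity. This is an illustrative approximation, not necessarily the calculation used by the authors, and the function name is hypothetical.

```python
def leukemic_cell_fraction(af: float, blast_fraction: float) -> float:
    """Estimate the fraction of leukemic cells carrying a heterozygous SNV.

    Assumes a diploid, copy-neutral region: AF = (cell fraction in sample) / 2.
    """
    if not 0.0 <= af <= 1.0:
        raise ValueError("af must be between 0 and 1")
    if not 0.0 < blast_fraction <= 1.0:
        raise ValueError("blast_fraction must be in (0, 1]")
    # Fraction of all cells carrying the SNV, rescaled to leukemic cells only.
    return min(1.0, 2.0 * af / blast_fraction)


# Subclonal AF peaks from the text, blast fractions from Table 1.
all_458_subclone = leukemic_cell_fraction(0.19, 0.90)  # roughly 0.42
all_559_subclone = leukemic_cell_fraction(0.27, 0.95)  # roughly 0.57
```

Under these assumptions the estimates fall close to the approximately 40% and 60% of leukemic cells reported above, and a main peak at AF 0.41–0.46 corresponds to a variant present in essentially all leukemic cells.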
Figure 2

A: Density plot showing the allele fraction (AF) distribution of validated somatic single nucleotide variants (SNVs) in the four whole genome sequenced ALL patients. Each sample displays a density peak with AF between 0.41 and 0.46. In addition, ALL_458 and ALL_559 display density peaks with AF of 0.19 and 0.27, respectively, indicative of subclones. B: Mutational patterns of SNVs belonging to the original clone and the subclone for ALL_458 and ALL_559. C: Subclonal copy number alterations (CNAs) in ALL_458 visualized using Omni2.5 BeadChip data. The top panel shows the log R ratio (LRR) and the bottom panel shows the B-allele frequency (BAF) for the duplication of chromosome 10 (left) and the deletion of chromosome 11q21–25 (right). R corresponds to the total intensity of each probe, and LRR is the log2 of the ratio of the measured normalized R-value in the ALL sample and the normalized R-value of the reference. LRR = 0 indicates no change in copy number. BAF represents the AF, with values of 0 and 1 indicating homozygosity and 0.5 indicating heterozygosity in a diploid genomic region. The solid and dashed lines correspond to the estimated BAF if the CNA is present in 40% and 100% of the leukemic cells, respectively. The red dots show the genomic location of somatic SNVs, with the position on the y-axis corresponding to the AF.

Analysis of the AF of SNPs from genotyping data suggested that the duplication of chromosome 10 and the deletion of chromosome 11q21–25 in ALL_458 were subclonal (Fig. 2C). Interestingly, the somatic SNVs on chromosome 10 fell into three AF clusters, which provide clues to when and where these mutations occurred. The first cluster (n = 3 SNVs) showed low AF (0.10–0.13), suggesting that these SNVs occurred in the subclone, either before duplication on the non-duplicated allele or after duplication on any allele. The second cluster of SNVs (n = 3, AF 0.32–0.36) likely occurred on the non-duplicated allele in the original clone. 
The third set of SNVs (n = 2) had high AF (0.52–0.55), and probably occurred in the original clone, on the allele that was subsequently duplicated in the subclone. The SNVs in the deleted region on chromosome 11 also have varying AFs, however, the low number of SNVs (n = 3) hinders inference of their origin.
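The reasoning behind the three AF clusters on chromosome 10 can be made explicit with a simple mixture model. The sketch below is an illustration under stated assumptions rather than the method used in the study: it assumes a sample purity of 0.9, a subclone comprising 40% of the leukemic cells, three copies of the region in subclone cells and two copies in all other cells, and one mutated copy per carrying cell before duplication. The function name and dictionary keys are hypothetical.

```python
def expected_af_in_duplication(purity, subclone_fraction):
    """Expected AFs of somatic SNVs in a region duplicated only in a subclone.

    purity            -- fraction of cells in the sample that are leukemic
    subclone_fraction -- fraction of leukemic cells belonging to the subclone
    """
    p, s = purity, subclone_fraction
    # Average number of copies of the region per cell in the sample:
    # normal cells (2), original-clone cells (2), subclone cells (3)
    total_copies = 2.0 * (1.0 - p) + p * (2.0 * (1.0 - s) + 3.0 * s)
    return {
        # SNV arose in the subclone and sits on a single copy
        "subclone_single_copy": p * s / total_copies,
        # SNV arose in the original clone on the allele that was not duplicated
        "original_non_duplicated": p / total_copies,
        # SNV arose in the original clone on the allele later duplicated:
        # one copy in original-clone cells, two copies in subclone cells
        "original_duplicated": p * ((1.0 - s) + 2.0 * s) / total_copies,
    }


if __name__ == "__main__":
    for scenario, af in expected_af_in_duplication(0.9, 0.4).items():
        print(f"{scenario}: {af:.2f}")
```

With these assumed parameters the model gives expected AFs near 0.15, 0.38, and 0.53. These fall in the same order as the three observed clusters (0.10–0.13, 0.32–0.36, and 0.52–0.55), although the two lower values sit somewhat above the reported ranges, which may reflect the simplified assumptions about purity and subclone size.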

Expression of Genes Affected by Somatic Variants

Using RNA-seq data, we found that 10 of the 23 genes with validated somatic exonic variants were expressed at ≥ 1 fragment per kilobase of transcript per million mapped reads (FPKM) (Table 3). The variant allele was expressed in 9/10 genes, but never overexpressed in comparison to the AF in DNA (Table 3). To assess the putative effect of the somatic variants on gene expression, we compared to immunophenotype-matched control RNA-seq datasets, consisting of 27 BCP-ALL samples or 18 T-ALL samples (Supp. Table S2). In comparison with the control dataset, LCK (MIM# 153390) was 4.6-fold overexpressed and KMT2D (MIM# 602113, previously named MLL2) was threefold underexpressed in the samples harboring the variants (Table 3). Of the 14 genes with damaging or loss of function mutations, 50% (n = 7) were expressed, in comparison with 33% (3/9) of the genes with predicted benign variants. We also observed at least 1.6-fold decreased expression of the genes in three of the nine hemizygous deletions that affected at least one gene, almost complete loss of expression in the two homozygous deletions, and 1.6-fold increased expression of the genes on the duplicated chromosome 10 (Table 4). The expression of the genes on chromosome 9p24–21 with copy-neutral LOH was 1.6-fold increased in ALL_559 compared with the control dataset, whereas there was no difference in expression of the genes in the other LOH region on the same chromosome (Table 4). Next, we analyzed putative effects of non-coding SNVs on the expression of nearby genes. A total of 125 SNVs that were located in a conserved or DHS region, had a RegulomeDB score ≤ 4, or a score from FunSeq [Khurana et al., 2013] ≥ 3, were included in the analysis. In order to call a gene differentially expressed, we required a fourfold relative difference and an absolute difference of at least 3 standard deviations between the sample and the control. 
Using these criteria, we identified 31 genes that were differentially expressed in comparison to the control dataset, all of which were overexpressed, corresponding to 19 putatively regulatory SNVs (Supp. Table S9). These 19 SNVs were located in introns (n = 8), intergenic regions (n = 8) or the 5′ flank of a gene (n = 3). Three of the intronic SNVs were covered by at least two reads in the RNA-seq data, putatively representing unspliced mRNAs (Supp. Table S9). To further investigate the putative function of these variants, we determined the overlap between SNVs and known histone marks in lymphoid cells (Supp. Table S9). Eight of the 19 putative regulatory SNVs, associated with differential expression of 12 genes, overlapped with histone marks. Notably, an SNV located 53 kb upstream of the growth factor MDK (MIM# 162096), which was overexpressed in ALL_458 (t(12;21)) overlapped with markers for enhancer elements (H3K27ac and H3K4me1 modifications) and active transcription (H3K4me3 modification). We also used the RNA-seq data to identify expressed fusion genes (Supp. Table S10). In addition to the canonical ETV6-RUNX1 fusion in ALL_458, which was detected by RT-PCR at diagnosis, the RNA-seq data revealed expression of the reciprocal fusion gene RUNX1-ETV6 that contains the first exon of RUNX1 and the last three exons of ETV6. Cytogenetic analysis of ALL_559 (T-ALL) detected the two translocations t(7;9) and t(7;11). RNA-seq demonstrated that both translocations result in expressed fusion genes with a common 3′ partner TRBC2 (MIM# 615445), which is part of the T-cell receptor beta locus [7q34]. The 5′ partners were RIC3 (MIM# 610509) [11p15] and a non-annotated gene located 500 bp upstream of TMEM38B (MIM# 611236) [9q31]. We observed overexpression of LMO1 (MIM# 186921) and TUB (MIM# 601197), which flank RIC3, and TAL2 (MIM# 186855), which is located downstream of TMEM38B, in ALL_559 compared with other T-ALL samples (Fig. 3). 
ALL_707 also had a t(7;9) based on karyotype data. RNA-seq demonstrated that this fusion resulted in a highly expressed PAX5-ELN (MIM#s 167414, 130161) fusion gene (Fig. 3). We did not detect any expressed fusion genes apart from those resulting from the translocations detected at diagnosis in the three patients mentioned above and no fusions were detected in ALL_501 with normal karyotype.
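The criterion used above to call a gene differentially expressed (a fourfold relative difference and an absolute difference of at least 3 standard deviations between the sample and the control) can be sketched as follows. This is one possible reading of the rule, not the authors' implementation: it assumes the comparison is made against the mean FPKM of the control samples, that the standard deviation is the sample standard deviation of the controls, and that the fold change may go in either direction. The function name and parameters are hypothetical.

```python
from statistics import mean, stdev


def is_differentially_expressed(sample_fpkm, control_fpkms, fold=4.0, n_sd=3.0):
    """Apply a combined fold-change and standard-deviation threshold.

    sample_fpkm   -- expression of the gene in the sample of interest
    control_fpkms -- expression of the same gene in the control samples
    """
    ctrl_mean = mean(control_fpkms)
    ctrl_sd = stdev(control_fpkms)  # sample SD (n - 1), an assumption

    higher = max(sample_fpkm, ctrl_mean)
    lower = min(sample_fpkm, ctrl_mean)
    # Relative difference: at least `fold` times higher or lower.
    # Written as a product to avoid dividing by zero for unexpressed genes.
    fold_ok = higher > 0 and higher >= fold * lower
    # Absolute difference: at least `n_sd` control standard deviations
    sd_ok = abs(sample_fpkm - ctrl_mean) >= n_sd * ctrl_sd
    return fold_ok and sd_ok


if __name__ == "__main__":
    controls = [1.0, 2.0, 3.0]  # made-up FPKM values for illustration
    print(is_differentially_expressed(10.0, controls))  # passes both tests
    print(is_differentially_expressed(6.0, controls))   # fails the fold test
```

Requiring both conditions guards against two kinds of false calls: large fold changes in genes with very low or highly variable expression in the controls, and statistically distinct but biologically small shifts in stably expressed genes.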
Figure 3

Bar plots showing the expression of five genes up- and downstream of genes involved in chromosomal rearrangements. The gene involved in the fusion, if annotated, is highlighted in bold. Genes exhibiting a fourfold relative difference and an absolute difference of at least 3 standard deviations between the sample and the control are marked with a *. A and B: Expression changes in ALL_559 associated with the fusions of TRBC2 with RIC3 (A) and a non-annotated gene close to TMEM38B (B). LMO1, TUB, and TAL2 are overexpressed in ALL_559 compared with 18 T-ALL samples. C and D: Expression changes in ALL_707 associated with the PAX5-ELN fusion. ELN, which constitutes the major part of the fusion gene, is overexpressed in ALL_707 compared with 27 BCP-ALL samples suggesting that the PAX5-ELN fusion gene is highly expressed.


Recurrently Mutated Genes and Regions Identified by Targeted Sequencing

To ascertain if any of the genes or putative regulatory regions identified by WGS of four patients were recurrently mutated in ALL, we performed target capture and deep sequencing of these regions in a cohort of 145 BCP-ALL and 23 T-ALL samples (Table 2 and Supp. Tables S1 and S4). Matched DNA from remission was sequenced in pools for 159 of the patients, and analysis of the sequence data demonstrated that 139 of the remission samples were well represented in the pools and could be used to filter out germline variants (Supp. Methods). In these 139 patients, we detected on average 0.7 SNVs, whereas in the remaining 29 patients, we detected on average 3.0 SNVs. This result suggests that the majority of the SNVs called in the samples without matched remission sample are germline variants. To avoid false positives we only report variants included in the COSMIC database for these samples. We detected 107 SNVs (Supp. Table S11) and 15 indels (Supp. Table S12) in the validation cohort, including 43 SNVs and 10 indels in exons or UTRs of the 30 resequenced genes (Fig. 4). The most frequently mutated genes were KRAS (18 SNVs affecting 16 patients) and NOTCH1 (9 SNVs and 7 indels, affecting 12 patients). 17/18 KRAS mutations were found in BCP-ALL patients, and all NOTCH1 mutations were found in T-ALL patients. In addition to KRAS and NOTCH1, we identified KMT2D, KIF1B (MIM# 605995) and ME1 (MIM# 154250) as novel putative driver genes using three complementary tools: MutSigCV [Lawrence et al., 2013], Oncodrive-fm [Gonzalez-Perez and Lopez-Bigas, 2012], and OncodriveCLUST [Tamborero et al., 2013]. To find subtype-specific patterns, we analyzed the T-ALL samples and the largest BCP-ALL subtypes (HeH, t(12;21), normal and other) individually. KRAS was highlighted as driver in all BCP-ALL subtypes except t(12;21) and NOTCH1 was highlighted in T-ALL. No subtype-specific pattern was observed for the novel genes.
Figure 4

Recurrent somatic mutations detected in the validation cohort in the genes that were identified by WGS. Each column represents one patient, with the whole genome sequenced samples in the four leftmost columns. In the upper panel, each row represents one gene. Only samples and genes with at least one mutation in an exon, splice site, or untranslated region (UTR) in the validation cohort are shown. Each colored box indicates a mutation. For patients with more than one variant in the same gene, the color is prioritized according to the order shown in the legend. In the lower panel, the genetic subtype of each sample is shown.

Recurrent mutations in UTRs were identified in several genes (Fig. 4), however, the mutations were co-located in either the 3′ or the 5′ UTR only for ASTN1 (MIM# 600904), which is not expressed in ALL according to our RNA-seq dataset. For identification of recurrently mutated non-coding regions in the validation cohort, we defined “super-regions” consisting of all non-coding regions that were selected because of the same original SNV (Supp. Methods). In most cases, the super-regions consisted of a contiguous genomic region, however, in cases where the SNV was located in a conserved region, they contained all conserved regions within 2 kb. Twelve non-coding super-regions were found to harbor recurrent mutations in the validation cohort (Supp. Table S13). The original SNVs detected by WGS were located in a conserved (n = 4) or DHS region (n = 1), had a RegulomeDB hit (n = 1), were located in a DHS region and had a RegulomeDB hit (n = 1), or lacked functional annotation (n = 5). Two SNVs overlapped histone marks for enhancer elements (H3K27ac and H3K4me1) and active transcription (H3K4me3). One specific SNV (chr4:157006979C>T) was identified in two t(12;21) patients, including the whole genome sequenced ALL_458. This SNV did not overlap any of the annotated regions or histone marks, and the possible functional implications are unclear.

Discussion

In this study, we performed a thorough characterization of the genomes of four representative pediatric ALL patients using WGS and RNA-seq. We validated and determined the AFs of somatic variants genome-wide and identified recurrently mutated coding and non-coding regions by targeted sequencing of 168 additional ALL patients. In ALL_458 (t(12;21)), we found deletions of ETV6 and VPREB1, in line with previous observations that patients with the ETV6-RUNX1 translocation often harbor deletions that target genes involved in B-cell development [Mullighan et al., 2007; Papaemmanuil et al., 2014]. We also observed expression of the rarely reported reciprocal RUNX1-ETV6 fusion gene, which has been suggested to be involved in cellular regrowth [Stams et al., 2005; Al-Shehhi et al., 2013], and overexpression of MDK coinciding with a putative regulatory SNV located in an enhancer element and DHS region 53 kb upstream of the gene. MDK is involved in cancer development and has previously been shown to be upregulated in BCP-ALL compared to normal peripheral blood and bone marrow [Hidaka et al., 2007]. ALL_458 also contained a subclone with a large number of SNVs, suggesting that these cells have an increased mutation rate and/or a growth advantage during leukemic progression compared to the cells in the original clone. Clonal heterogeneity has previously been observed in ALL by copy number profiling [Jan and Majeti, 2013] or analysis of AFs from exome sequencing [Papaemmanuil et al., 2014] and is one of the most important challenges for the successful application of targeted therapies [Landau et al., 2014]. Despite the potentially rapidly growing subclone and the presence of two lesions that have been suggested to have a negative impact on clinical outcome, namely the deletion of VPREB1 [Mangum et al., 2014] and the expression of RUNX1-ETV6 [Stams et al., 2005], this patient responded well to treatment and has remained in CCR1. 
The absence of SNVs or indels in putative driver genes in ALL_458, the few KRAS mutations in t(12;21) patients, and the previous failure to identify recurrently mutated genes by exome sequencing of t(12;21) patients [Lilljebjorn et al., 2012] suggest that point mutation in exons of protein-coding genes is not a dominant force for leukemic development in this subtype. We identified several lesions that are characteristic of T-ALL in ALL_559, including a NOTCH1 mutation, two translocations involving the T-cell receptor beta locus [Le Noir et al., 2012] that resulted in overexpression of LMO1 [Atak et al., 2013] and TAL2 [Marculescu et al., 2003], and deletion of the tumor suppressor CDKN2A [Mullighan et al., 2007]. ALL_559 also displayed mutation and overexpression of the proto-oncogene LCK, which is involved in T-cell development. LCK has previously been found to be mutated and overexpressed in fusions with the T-cell receptor region in T-ALL and it was suggested that oncogenic transformation of LCK requires two mutations, one that deregulates gene transcription and one that activates protein function [Wright et al., 1994]. Although we did not identify any regulatory SNV that could cause the aberrant expression of LCK in ALL_559, it is possible that there is a regulatory SNV that is located at a larger distance than 1 Mb from LCK or that another type of genetic lesion is involved. Findings in our study that are novel in T-ALL include deletion of the putative tumor suppressors FOXP1, RYBP and SHQ1, which has frequently been observed in prostate cancer and has been suggested to exert a tumor-promoting effect [Krohn et al., 2013], the novel fusion partner for TRBC2 in a non-annotated gene on 9q31, and the identification of ME1 and KIF1B as putative driver genes. 
Although ME1 is not expressed in any of the T-ALL samples in our RNA-seq dataset and probably does not play a major role in leukemogenesis, KIF1B is expressed and has been suggested to be a tumor suppressor that contributes to cancer development by dosage reduction [Henrich et al., 2012]. Recurrent lesions in ALL_707 (subtype “other”) included the KRAS mutation [Liang et al., 2006], the CDKN2A deletion [Mullighan et al., 2007], and the PAX5-ELN fusion gene [Bousquet et al., 2007]. We did not identify any novel putative driver events in this patient, however, we observed a large number of C>A mutations, in contrast to the other patients where the dominant mutation was C>T. Although C>T mutations are common in most cancer types, and can be caused by UV-light or deamination of 5-methylcytosines [Alexandrov et al., 2013], excessive C>A mutations have mainly been observed in lung cancer, and have been attributed to tobacco exposure [Pleasance et al., 2010]. Future studies will reveal whether this mutational signature, which has not previously been observed in ALL, will be detected in other ALL patients. ALL_501 was selected for this study because its karyotype was completely normal, and we were especially interested in finding the driver events in this patient. In agreement with the cytogenetic results, we found no fusion gene or large CNA. We detected a focal deletion of exons 3–6 in IKZF1, which encodes the transcription factor Ikaros that plays key roles in lymphoid development and tumor suppression. The resulting transcript, which lacks four zinc fingers that are required for DNA binding and therefore is unable to bind transcriptional targets, is known as Ik6 and acts as a dominant negative inhibitor of Ikaros function [Mullighan and Downing, 2008]. The second finding in ALL_501 was a frameshift insertion and reduced expression of KMT2D (MLL2), which was highlighted as a putative driver gene in the validation cohort. 
Frequent loss-of-function mutations in KMT2D, which encodes a histone methyltransferase involved in regulation of gene transcription, have been found in a range of cancers, and this gene has been proposed to be a tumor suppressor and a putative therapeutic target [Guo et al., 2013]. An intriguing question is whether these two mutations are sufficient to induce leukemia. Mouse models have shown that expression of Ik6 can induce T-cell leukemia [Winandy et al., 1995]; however, expression exclusively in B-cells does not result in B-lineage leukemia [Wojcik et al., 2007]. Point mutations in KMT2D have, to our knowledge, not been reported as putative drivers before in ALL, and their role in leukemogenesis is yet to be determined. In summary, we provide a high-resolution map of the genomes of four representative pediatric ALL patients. We found that each patient had a unique genome, with a combination of well-known and previously undetected genomic aberrations, including SNVs, CNAs, and chromosomal rearrangements. Despite the limited size of the discovery cohort we identified KMT2D and KIF1B as novel putative driver genes in ALL, which suggests that analysis of more samples would enable identification of additional genes. The non-annotated fusion partner to TRBC2 is an example of a novel finding enabled by RNA-seq. Our finding that overexpression of MDK coincides with a non-coding putative regulatory SNV suggests that regulatory variants may be more important for the development of ALL and other cancers than recognized to date, and that future WGS and RNA-seq studies in larger cohorts combined with functional experiments will be useful to explore this area. The results from our study as well as earlier sequencing studies [Roberts et al., 2012; Zhang et al., 2012; Holmfeldt et al., 2013; Papaemmanuil et al., 2014] contribute to an increased understanding of the biological mechanisms that lead to ALL. 
The heterogeneity of the genetic aberrations in ALL and the lack of large numbers of recurrent mutations render WGS particularly well suited for diagnosis and stratification of ALL patients into subgroups for new treatment protocols. Analysis of serially collected samples from the 20% of patients that relapse has already revealed mechanisms of clonal evolution that provide clues to the cause of treatment failure [Meyer et al., 2013; Tzoneva et al., 2013], and more extensive studies are likely to further increase the understanding of the biology behind relapse in ALL. Next generation sequencing technology has developed fast in recent years, and today it is a reality to apply WGS at costs and speed that are acceptable for routine clinical genetic diagnostics of ALL.
  60 in total

1.  Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion.

Authors:  Ruibin Xi; Angela G Hadjipanayis; Lovelace J Luquette; Tae-Min Kim; Eunjung Lee; Jianhua Zhang; Mark D Johnson; Donna M Muzny; David A Wheeler; Richard A Gibbs; Raju Kucherlapati; Peter J Park
Journal:  Proc Natl Acad Sci U S A       Date:  2011-11-07       Impact factor: 11.205

Review 2.  Biology, risk stratification, and therapy of pediatric acute leukemias: an update.

Authors:  Ching-Hon Pui; William L Carroll; Soheil Meshinchi; Robert J Arceci
Journal:  J Clin Oncol       Date:  2011-01-10       Impact factor: 44.544

3.  Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks.

Authors:  Cole Trapnell; Adam Roberts; Loyal Goff; Geo Pertea; Daehwan Kim; David R Kelley; Harold Pimentel; Steven L Salzberg; John L Rinn; Lior Pachter
Journal:  Nat Protoc       Date:  2012-03-01       Impact factor: 13.491

4.  SomaticSniper: identification of somatic point mutations in whole genome sequencing data.

Authors:  David E Larson; Christopher C Harris; Ken Chen; Daniel C Koboldt; Travis E Abbott; David J Dooling; Timothy J Ley; Elaine R Mardis; Richard K Wilson; Li Ding
Journal:  Bioinformatics       Date:  2011-12-06       Impact factor: 6.937

5.  Genetic alterations activating kinase and cytokine receptor signaling in high-risk acute lymphoblastic leukemia.

Authors:  Kathryn G Roberts; Ryan D Morin; Jinghui Zhang; Martin Hirst; Yongjun Zhao; Xiaoping Su; Shann-Ching Chen; Debbie Payne-Turner; Michelle L Churchman; Richard C Harvey; Xiang Chen; Corynn Kasap; Chunhua Yan; Jared Becksfort; Richard P Finney; David T Teachey; Shannon L Maude; Kane Tse; Richard Moore; Steven Jones; Karen Mungall; Inanc Birol; Michael N Edmonson; Ying Hu; Kenneth E Buetow; I-Ming Chen; William L Carroll; Lei Wei; Jing Ma; Maria Kleppe; Ross L Levine; Guillermo Garcia-Manero; Eric Larsen; Neil P Shah; Meenakshi Devidas; Gregory Reaman; Malcolm Smith; Steven W Paugh; William E Evans; Stephan A Grupp; Sima Jeha; Ching-Hon Pui; Daniela S Gerhard; James R Downing; Cheryl L Willman; Mignon Loh; Stephen P Hunger; Marco A Marra; Charles G Mullighan
Journal:  Cancer Cell       Date:  2012-08-14       Impact factor: 31.743

6.  The genetic basis of early T-cell precursor acute lymphoblastic leukaemia.

Authors:  Jinghui Zhang; Li Ding; Linda Holmfeldt; Gang Wu; Sue L Heatley; Debbie Payne-Turner; John Easton; Xiang Chen; Jianmin Wang; Michael Rusch; Charles Lu; Shann-Ching Chen; Lei Wei; J Racquel Collins-Underwood; Jing Ma; Kathryn G Roberts; Stanley B Pounds; Anatoly Ulyanov; Jared Becksfort; Pankaj Gupta; Robert Huether; Richard W Kriwacki; Matthew Parker; Daniel J McGoldrick; David Zhao; Daniel Alford; Stephen Espy; Kiran Chand Bobba; Guangchun Song; Deqing Pei; Cheng Cheng; Stefan Roberts; Michael I Barbato; Dario Campana; Elaine Coustan-Smith; Sheila A Shurtleff; Susana C Raimondi; Maria Kleppe; Jan Cools; Kristin A Shimano; Michelle L Hermiston; Sergei Doulatov; Kolja Eppert; Elisa Laurenti; Faiyaz Notta; John E Dick; Giuseppe Basso; Stephen P Hunger; Mignon L Loh; Meenakshi Devidas; Brent Wood; Stuart Winter; Kimberley P Dunsmore; Robert S Fulton; Lucinda L Fulton; Xin Hong; Christopher C Harris; David J Dooling; Kerri Ochoa; Kimberly J Johnson; John C Obenauer; William E Evans; Ching-Hon Pui; Clayton W Naeve; Timothy J Ley; Elaine R Mardis; Richard K Wilson; James R Downing; Charles G Mullighan
Journal:  Nature       Date:  2012-01-11       Impact factor: 49.962

7.  Whole-exome sequencing of pediatric acute lymphoblastic leukemia.

Authors:  H Lilljebjörn; M Rissler; C Lassen; J Heldrup; M Behrendtz; F Mitelman; B Johansson; T Fioretos
Journal:  Leukemia       Date:  2011-11-18       Impact factor: 11.528

8.  Three periods of regulatory innovation during vertebrate evolution.

Authors:  Craig B Lowe; Manolis Kellis; Adam Siepel; Brian J Raney; Michele Clamp; Sofie R Salama; David M Kingsley; Kerstin Lindblad-Toh; David Haussler
Journal:  Science       Date:  2011-08-19       Impact factor: 47.728

9.  Annotation of functional variation in personal genomes using RegulomeDB.

Authors:  Alan P Boyle; Eurie L Hong; Manoj Hariharan; Yong Cheng; Marc A Schaub; Maya Kasowski; Konrad J Karczewski; Julie Park; Benjamin C Hitz; Shuai Weng; J Michael Cherry; Michael Snyder
Journal:  Genome Res       Date:  2012-09       Impact factor: 9.043

10.  A framework for variation discovery and genotyping using next-generation DNA sequencing data.

Authors:  Mark A DePristo; Eric Banks; Ryan Poplin; Kiran V Garimella; Jared R Maguire; Christopher Hartl; Anthony A Philippakis; Guillermo del Angel; Manuel A Rivas; Matt Hanna; Aaron McKenna; Tim J Fennell; Andrew M Kernytsky; Andrey Y Sivachenko; Kristian Cibulskis; Stacey B Gabriel; David Altshuler; Mark J Daly
Journal:  Nat Genet       Date:  2011-04-10       Impact factor: 38.330

