Pediatric high-grade glioma (HGG) is a devastating disease with a less than 20% survival rate 2 years after diagnosis. We analyzed 127 pediatric HGGs, including diffuse intrinsic pontine gliomas (DIPGs) and non-brainstem HGGs (NBS-HGGs), by whole-genome, whole-exome and/or transcriptome sequencing. We identified recurrent somatic mutations in ACVR1 exclusively in DIPGs (32%), in addition to previously reported frequent somatic mutations in histone H3 genes, TP53 and ATRX, in both DIPGs and NBS-HGGs. Structural variants generating fusion genes were found in 47% of DIPGs and NBS-HGGs, with recurrent fusions involving the neurotrophin receptor genes NTRK1, NTRK2 and NTRK3 in 40% of NBS-HGGs in infants. Mutations targeting receptor tyrosine kinase-RAS-PI3K signaling, histone modification or chromatin remodeling, and cell cycle regulation were found in 68%, 73% and 59% of pediatric HGGs, respectively, including in DIPGs and NBS-HGGs. This comprehensive analysis provides insights into the unique and shared pathways driving pediatric HGG within and outside the brainstem.
Although childhood and adult HGG share related histopathological characteristics, adult HGGs arise predominantly in the cerebral cortex, while childhood HGGs more frequently involve a broader spectrum of locations. There are also significant differences in molecular features between pediatric and adult HGG[3,6-16]. Histone H3 (H3F3A and HIST1H3B) pK27M mutations are frequent in DIPGs, which arise in the brainstem almost exclusively in children, and in pediatric HGGs in midline structures such as thalamus and cerebellum, while pG34R/V histone H3 mutations occur in pediatric HGGs of the cerebral cortex[3-5,17]. In contrast, histone H3 mutations are extremely rare in adult HGGs[3]. HGGs arising in infants younger than 3 years of age have a better prognosis and a lower frequency of TP53 mutations, suggesting that there may be age-dependent subgroups of HGG even within the pediatric population[2]. Thus, the selective pressures driving gliomagenesis in children vary with age and anatomical site.
To more comprehensively understand the pathways driving childhood glioma, we analyzed the genomic landscape of HGGs from 118 pediatric patients (127 tumors, 108 matched to germline DNA) consisting of 57 DIPGs and 70 non-brainstem HGGs (NBS-HGG) by whole genome (WGS) (n=42), whole exome (n=80) or transcriptome sequencing (n=75) (Supplementary Tables 1-9). A total of 39,590 sequence mutations, including single nucleotide variations (SNVs) and small insertions or deletions, and 2,039 structural variations (SVs) were found by WGS, while an additional 2,600 sequence mutations and 138 SVs were found by exome sequencing and transcriptome sequencing, respectively. Overall, the cohort showed a median background mutation rate of 9E-07 and a median of 22 SVs per genome (Supplementary Fig. 1). 
All SNVs and SVs found in WGS were verified experimentally by independent sequencing methods (Online Methods).
Among recurrent mutations in pediatric HGG, the most frequently mutated gene not previously identified in cancer was ACVR1 (also known as ALK2), encoding a BMP type I receptor (Fig. 1 and 2, Supplementary Fig. 2-3). Clonal missense ACVR1 mutations were found exclusively in DIPGs (32%), and were significantly associated with younger age, longer survival, and the presence of pK27M HIST1H3B (p<0.0000001), or PIK3CA or PIK3R1 mutations (p<0.005) (Fig. 1 and 2, Supplementary Fig. 3, Supplementary Tables 4 and 5). Four of these somatic ACVR1 mutations were the same as germline mutations previously identified in the autosomal dominant syndrome fibrodysplasia ossificans progressiva (FOP), in which aberrant cellular differentiation drives progressive heterotopic ossifications[18,19]. All residues impacted by mutation in DIPG cluster around either the inhibitory glycine/serine rich (G/S) domain or the ATP binding pocket of the kinase domain, and would be expected to shift the kinase to an active conformation (Fig. 2 and Supplementary Fig. 3c)[20]. Indeed, mutations of these residues induced a weak gain of function[20,21]. A previous study showed that the R206H ACVR1 mutation caused a ventralized phenotype in zebrafish embryos, an indicator of BMP pathway activation[22]. We tested all of the ACVR1 mutations found in DIPG using this assay. Zebrafish embryos injected with wild-type ACVR1 mRNA (WT) displayed a mild dorsalized phenotype consistent with BMP pathway inhibition, while all six ACVR1 mutants, shown in order of severity, exhibited varying degrees of ventralization with partial to complete loss of head and dorsal structures (Fig. 2b,c, Supplementary Fig. 3d,e). 
A moderate dose of LDN-193189 (LDN), a highly selective antagonist of the BMP pathway[22,23], partially reversed the ventralization effects induced by ACVR1 mutants as can be seen by the rescue of dorsal head structures for R258G, G328E, G328W, R206H and the reduced severity of ventralization for G356D and G328V (Fig. 2c). Expression of ACVR1 mutants in mouse primary astrocyte cultures caused increased levels of phospho-SMAD1/5, a downstream indication of active BMP signaling, with varying magnitude (Fig. 2d). LDN also effectively blocked signaling to phospho-SMAD1/5 downstream of the mutant ACVR1 in primary astrocytes (Supplementary Fig. 3f).
Fig. 1
Recurrent genetic alterations in pediatric high-grade glioma
Genetic alterations detected in 19 genes, including ACVR1, and genes most recurrently mutated in the pathways indicated on the left, are displayed according to the color key shown below. Diagonal white line indicates loss of the wild-type allele, or male patient for ATRX, which is X-linked. H3F3A (H3.3) and HIST1H3B (H3.1) mutations are grouped together into the category H3. Structural variants involving NTRK1, NTRK2, or NTRK3, and copy number variants of components of the CyclinD1, D2, D3, or CDK4, CDK6 G1 checkpoint complex are grouped together as NTRK1/2/3 or CCND1/2/3/CDK4/6, respectively. Tumor subgroup (DIPG or NBS-HGG), location of NBS-HGGs (midline versus tumors in cerebral hemispheres), and tumor grade are indicated. White boxes for location or tumor grade indicate information not available. < 3 y.o. denotes less than 3 years of age. Mutations for 112 HGGs are shown. Four hypermutator samples and 11 samples for which only RNA-seq data were available were excluded from this summary. Data are shown in tabular form in Supplementary Table 9.
Fig. 2
ACVR1 mutations in DIPG activate BMP signaling
a. Missense ACVR1 substitutions in DIPG were clustered in the glycine/serine rich domain (G/S) or kinase domain. Each red circle indicates a DIPG carrying the specified mutation, and an * indicates mutations previously found as germline mutations in individuals with FOP. The extracellular domain (EC) and transmembrane domain (TM) did not contain mutations.
b. ACVR1 mutations ventralize zebrafish embryos. Graph shows the percentage of embryos exhibiting a dorsalized or ventralized phenotype. Embryos injected with wild-type ACVR1 mRNA (WT) showed a dorsalized phenotype, while embryos injected with mutant ACVR1 mRNA showed a ventralized phenotype (increasing severity from left to right). R258G had the least severe effect, resulting only in the V3-V4 ventralized phenotype, whereas G328V had the most severe effect with 90% of embryos showing the V5 ventralized phenotype. The number of embryos examined is shown on top.
c. Representative phenotype images of zebrafish embryos injected with the indicated ACVR1 mRNA. Untreated mutants R258G, G328E, G328W, and R206H have little to no dorsal structures, and G356D and G328V are more severely affected. Treatment with LDN-193189 (LDN) reversed the ventralization effects in the ACVR1 mutants, as can be seen by the partial rescue of dorsal structures (i.e. head) for R258G (100%, n=20), G328E (83%, n=23), G328W (100%, n=20), R206H (100%, n=27), and the reduced severity of ventralization without the formation of dorsal structures for G356D and G328V. Scale bar is 200 μm.
d. ACVR1 mutations drive increased levels of phospho-SMAD1/5 in primary astrocyte cultures. Western blots from lysates of primary astrocytes isolated from brainstem of neonatal Tp53 conditional knockout mice, transduced with retroviruses expressing FLAG-tagged ACVR1 wild-type, or indicated mutants, and serum starved for 2 hours. Quantitation of the ratio of phospho-SMAD/Total SMAD normalized to the empty vector control is shown below.
The recurrent and clonal activating mutation of ACVR1 in 32% of DIPGs provides strong evidence that it is an oncogenic driver in this disease. However, germline ACVR1 mutation in the genetic syndrome FOP is not associated with cancer predisposition, indicating that ACVR1 mutation likely provides a selective advantage in the presence of other critical mutations, rather than driving tumor initiation. Consistent with this hypothesis, all 6 of the DIPG-associated ACVR1 mutants failed to render Tp53-null astrocytes tumorigenic when implanted into brain (not shown). BMP signaling is associated with contrasting effects dependent on context, driving astrocytic differentiation, or proliferation of early hindbrain progenitors[24], and inducing differentiation of medulloblastoma[25], but driving either a differentiation or proliferative response in glioblastoma stem cells, related in part to the epigenetic state of the cell[26,27]. These context-dependent consequences likely drive the exclusive association between ACVR1 mutation and DIPG.
Principal Component Analysis using the 1000 most variable probesets shows that HGG samples clustered by tumor location with no segregation of DIPGs by their ACVR1 mutation status (Supplementary Fig. 3g). Genes involved in regulation of immune system processes were significantly different between DIPGs with and without ACVR1 mutation (Fisher’s exact test, p=0.0002, FDR=0.26%) (Supplementary Table 10).
Structural variants generating fusion genes are common, and were identified in 47% of pediatric HGG, in equal proportions of DIPGs and NBS-HGGs, from transcriptome and WGS data. Gene fusions involving the kinase domain of each of the three neurotrophin receptor (NTRK) genes and five different N-terminal fusion partners were identified in 4% of DIPGs and 10% of NBS-HGGs. Notably, 40% (4/10) of NBS-HGGs in children younger than 3 years old contained an NTRK fusion gene (Fig. 1 and 3, Supplementary Tables 7 and 8). 
The NTRK receptors transduce a wide range of developmental signals in the nervous system ranging from induction of neurite outgrowth, differentiation, neuronal survival or death[28,29]. NTRK fusion genes have recently been identified at low frequency in low-grade pediatric astrocytomas as well as adult glioblastomas[30-32]. Two of the five NTRK fusions found in our cohort, TPM3-NTRK1 and ETV6-NTRK3, were previously identified in other cancer types and shown to be oncogenic[33-37].
Fig. 3
a. All fusions included the C-terminal kinase domain from NTRK1, NTRK2 or NTRK3 (blue). N-terminal fusion partners include the tropomyosin domain (yellow) of TPM3, an actin-binding protein fused to NTRK1, the BTB/POZ dimerization domain (gray), and Kelch domain (orange) from the topoisomerase I-interacting protein BTBD1, or the pointed protein-protein interaction domain (purple), of the ETS transcription factor ETV6, fused to NTRK3. The N-terminus of the actin-binding protein Vinculin (light blue, VCL) was fused to NTRK2, and the N-terminus (green) of the ATP/GTP binding protein AGBL4 was fused to NTRK2. The functional carboxypeptidase domain of AGBL4 is not present in the fusion protein. For each fusion protein, the dotted red line shows the fusion point, with the amino acid of the N-terminal and C-terminal fusion partners breakpoint indicated. The full length of the fusion protein is shown on the right end.
b. NTRK fusion proteins induce high-grade astrocytomas. Tp53-null mouse primary astrocytes isolated from neonatal cortex or brainstem were transduced with FLAG-tagged TPM3-NTRK1 (top row) or BTBD1-NTRK3 (lower row) respectively, and implanted into mouse brain. Representative (of 7 independent mice per construct) H&E stains show pleomorphic tumor cells, many with features of astrocytic differentiation and high mitotic activity. Tumors induced by BTBD1-NTRK3 showed the frequent presence of giant cells reminiscent of giant cell glioblastoma. Immunohistochemical analysis showed expression of FLAG-tagged NTRK fusion proteins, and elevated phospho-Akt, and phospho-p42/44 Mapk in tumor relative to surrounding normal tissue. Scale bar=50μm.
To test the ability of NTRK fusion genes to drive glioma formation, we implanted Tp53-null primary mouse astrocytes transduced with retrovirus expressing FLAG-tagged TPM3-NTRK1 or BTBD1-NTRK3 into mouse brain. Both NTRK fusions induced high-grade astrocytomas with very short latency and complete penetrance (Fig. 3b and Supplementary Fig. 4). The resulting tumors showed elevated levels of phospho-Akt and phospho-p42/44 Mapk, downstream indicators of PI3K and MAPK pathway activation.
Although NTRK activating fusions were specifically found at high frequency in infant tumors, activation of RTK/RAS/PI3K signaling through other mutations was frequent across the entire cohort, occurring in 69% of DIPGs and 67% of NBS-HGGs (Fig. 1, Supplementary Fig. 5 and Online Methods). In contrast to previous reports detecting EGFRvIII expression in pediatric HGG[38,39], we only detected EGFRvIII in one out of the 85 tumors analyzed by WGS or RNA-seq.
In addition to recurrent histone H3, ATRX and SETD2 mutations reported previously[3-5,17,40], frequent mutations of other histone writers and erasers, and chromatin remodeling genes, were also detected (Fig. 4 and Supplementary Fig. 5). Interestingly, mutations in histone H3 writers were significantly more frequent in NBS glioblastomas (p=0.007), while mutations in histone erasers were not significantly different between DIPG and NBS-HGG (p=0.3). Although only ATRX mutations were highly recurrent, collectively this group of genes involved in epigenetic regulation was targeted by mutation in 22% of DIPGs and 48% of NBS-HGG, excluding histone H3 mutations (Fig. 4 and Supplementary Fig. 5). These mutations were often concurrent with missense mutations in histone H3. Indeed, including histone H3 mutations, 91% of DIPGs, 70% of midline NBS-HGGs, and 48% of hemispheric NBS-HGGs contain mutations in histone and/or this subgroup of epigenetic regulators. 
pK27M H3.3 or H3.1 mutation results in a dominant loss of H3K27me3 in the entire cellular pool of histone H3[41-44]. Mutations that modulate H3K27me3 by targeting components of the PRC2 complex that methylates H3K27, or the H3K27 demethylases KDM6A and KDM6B, are found in other tumor types[45-50]. However, there were no such mutations across the entire HGG cohort, including DIPGs with wild-type histone H3, supporting the unique selective advantage of pK27M mutation in pediatric HGG (Fig. 4). Mutations in transcriptional regulators that impact the epigenetic landscape were also found, including focal amplifications of MYC and MYCN, transcription factors that act to amplify levels of expressed genes across the genome[51,52], and truncating mutations in the transcriptional co-repressors BCOR and BCORL1 (Fig. 1).
Fig. 4
Pediatric HGG mutations in histone modifiers or chromatin regulators
Mutations identified in NBS-HGG (pink) or DIPG (blue) are shown. Genetic alterations were identified in proteins that attach (writers, shown above) or remove (erasers, shown below) post-translational modifications of lysines (K4, K9, K18, K27, K36) in the tail of histone H3, as well as proteins involved in modifications of other histones, or chromatin remodelers. Notably, there were no mutations in writers or erasers of K27, which is directly mutated at high frequency in DIPG, and to a lesser degree in NBS-HGG.
There was an enormous range in the complexity of somatic mutations driving pediatric HGG. The 10 NBS-HGGs in children under three years old showed significantly lower mutation rates than the rest of the cohort (p<0.0001), suggesting that only a small number of driver mutations is required in tumors from this age group (Supplementary Fig. 7 and Fig. 5). ETV6-NTRK3 was identified as one of only two non-silent alterations in SJHGG082, a glioblastoma from a one month-old patient. Notably, NTRK fusion genes, including two of the fusions found here, were identified in multiple tumor types, including those from very young children such as congenital fibrosarcoma, as well as in papillary thyroid cancer, an adult disease[30,33-36]. The high frequency of NTRK fusion genes in NBS-HGGs from children younger than three, the paucity of additional mutations in these tumors, and the rapid tumor onset in our experimental glioma model, strongly suggest that these fusion genes are potent oncogenic drivers in early postnatal brain tumor development.
Fig. 5
CIRCOS plots showing the range of structural alterations in pediatric HGG
CIRCOS plots display the genome by chromosome in a circular plot, and depict structural genetic variants, including DNA copy number alterations, intra- and inter-chromosomal translocations, and non-sequence mutations. Loss of heterozygosity, orange; amplification, red; deletion, blue. Sequence mutations in RefSeq genes: missense SNVs, brown; indels, red; splice site SNVs, blue. Genes at SV breakpoints: genes involved in in-frame fusions, pink; others, blue.
Patient SJHGG003 carried a germline PMS2 mutation, and developed two independent tumors, first a hemispheric malignant glioneuronal tumor (SJHGG003_D), and 2 years later a DIPG (SJHGG003_A). The SNVs and indels for these two cases were too numerous to include on the plot. The hypermutator tumor with more than 800,000 somatic mutations had an extremely stable genome (SJHGG003_D), while the second tumor, with approximately 100-fold fewer SNVs, carried typical genomic copy number and structural abnormalities as seen in other HGGs (SJHGG003_A), thus demonstrating the broad range and complexity of mutations associated with germline PMS2 mutation. SJHGG016_D is an infant NBS-HGG with a TPM3-NTRK1 fusion and very stable genome. SJHGG027_D is an NBS-HGG from a patient with A-T, showing a relatively stable genome. This tumor sample was collected prior to radiotherapy. SJHGG004_D is a DIPG with chromothripsis driving BTBD1-NTRK3 fusion, shown in more detail in Supplementary Fig. 8a. SJHGG044_D is an NBS-HGG showing dramatic chromothripsis. For the examples of chromothripsis, the names of genes disrupted by structural variants were too numerous to display.
Four tumors, three with matched normal DNA, were scored as hypermutators, with an extremely high background mutation rate, including more than 800,000 somatic mutations in SJHGG111, in which biallelic germline PMS2 mutations were identified (Supplementary Fig. 7). Patient SJHGG003 carried a heterozygous germline PMS2 mutation, and developed a grade IV hemispheric malignant glioneuronal tumor (MGNT) and a separate DIPG. Although both tumors independently acquired different somatic inactivating mutations in the remaining PMS2 allele, the basal mutation rate in the first tumor arising in this patient, the MGNT (SJHGG003_D), was nearly 100-fold higher than in the DIPG that arose 2 years later (SJHGG003_A), demonstrating the potential range in tumor mutation burden associated with inherited mismatch repair mutations (Supplementary Fig. 7, Fig. 5). Hypermutator tumors were excluded from the evaluation of mutation frequency. Seven patients carried germline mutations in known cancer predisposition genes, including TP53, PMS2, MSH6 and NF1 (Supplementary Table 11).
Thirteen tumors (31%) analyzed by WGS had chromothripsis[53], shown by complex re-arrangements with multiple inter-connecting breakpoints corresponding to genomic segments of oscillating copy number states (Supplementary Results, Supplementary Table 12, Fig. 5 and Supplementary Fig. 8-9). Among our cohort, chromothripsis resulted in oncogenic re-arrangements including BTBD1-NTRK3 fusion and re-arrangement/amplification of PDGFRA and EGFR (Supplementary Fig. 8). Nearly half of all samples showing chromothripsis were collected prior to adjuvant therapy, indicating that the mechanism was, at least in some cases, independent of DNA-damaging therapeutics. SJHGG027_D, an NBS-HGG arising in a child with ataxia telangiectasia (A-T), had a relatively stable genome despite a compromised DNA damage checkpoint due to the absence of functional ATM (Fig. 5, and Supplementary Fig. 7). 
Multiple subclones were identified in almost all HGG tumors. A founder clone, or a descendant of a founder clone, in the diagnosis tumor could seed the development of the relapsed or autopsy tumor (Supplementary Results and Supplementary Fig. 10).
We identified TERT promoter mutations in only 2% of DIPGs and 3% of NBS-HGGs, in strong contrast to the frequency of 86% in adult primary glioblastomas[54].
The genomic landscape of pediatric HGGs also includes frequent mutations in common cancer pathways, consistent with previous reports[3,7-9,11,12,15,16]. TP53 mutations occurred in 42% of pediatric HGG and were mutually exclusive with truncating mutations in the TP53-induced phosphatase PPM1D, previously shown to impair the TP53-dependent G1 checkpoint (Fig. 1)[55]. The G1 checkpoint regulators CCND1, 2 and 3, CDK4 and CDK6, were predominantly amplified in DIPG, while CDKN2A homozygous deletion was restricted to NBS-HGGs (Fig. 1). Taken together, mutations impacting cell cycle regulation, including the TP53 and RB pathways, were found in 59% of pediatric HGG (Fig. 1, Supplementary Fig. 5).
This global view of the genetic landscape of pediatric HGG defines critical pathways driving a devastating spectrum of childhood brain tumors, and identifies high frequency mutations in potential therapeutic targets: ACVR1 in DIPGs, and NTRK fusions in infant NBS-HGGs.
Online Methods
Patient cohorts and sample details
High grade gliomas (HGGs) (WHO Grade III and IV) were requested from the St. Jude Children’s Research Hospital tissue resource core facility, and from the Institute of Cancer Research/Royal Marsden Hospital with approval for genome sequence analysis in accordance with St. Jude Institutional Review Board (IRB) approval for the Pediatric Cancer Genome Project (PCGP), and the Clinical Research and Development Board of the Royal Marsden Hospital and the United Kingdom Children’s Cancer and Leukemia Group research ethics approval. Detailed clinicopathologic and sequencing information is in Supplementary Table 1. There was a significant association between gender and tumor subtype, with 63% female DIPG patients and 63% male NBS-HGG patients (p=0.004).
The study cohort comprised 127 tumors (57 DIPGs and 70 NBS-HGGs; 54 DIPGs and 54 NBS-HGGs with matching germline samples) from 118 patients in two cohorts: a cohort for whole genome sequencing (WGS, n=42, 20 DIPGs and 22 NBS-HGGs) and a cohort for evaluating the frequency of abnormalities using whole exome sequencing (WES, n=80, 36 DIPGs and 44 NBS-HGGs). Six tumors and their matched germline samples, comprising two hypermutator tumors (SJHGG003_D and SJHGG111_D) and four non-hypermutator tumors (SJHGG003_A, SJHGG008_A, SJHGG019_E and SJHGG022_D), were sequenced by both whole genome and whole exome sequencing. Among these tumors, 75 (31 DIPGs and 44 NBS-HGGs) were characterized by RNA-seq. In addition, 12 tumors (3 DIPGs and 9 NBS-HGGs) were characterized by RNA-seq only for structural variant discovery.
Tumor was available from diagnosis and relapse in 5 cases (SJHGG019_E/S, SJHGG024_D/R, SJHGG033_D/R, SJHGG112_D/R, SJHGG115_D/R), or diagnosis and autopsy in 3 cases (SJHGG001_D/A, SJHGG002_D/A, SJHGG093_D/A). 
One patient developed two independent tumors, a hemispheric malignant glioneuronal tumor (SJHGG003_D), and then a subsequent independent DIPG (SJHGG003_A).
Histopathology was centrally reviewed by DWE, an experienced neuropathologist, and MRI images of DIPG cases were reviewed by a pediatric neuro-oncologist (AB). DNA and RNA were extracted as previously described[56].
Whole-genome, whole exome and transcriptome sequencing and analysis
WGS, WES, and RNA-seq were performed as previously described[46,57]. For WGS, WES and RNA-seq, paired-end sequencing was performed using the Illumina GAIIx or HiSeq platform with 100bp read length. The WGS data are deposited at the European Bioinformatics Institute (EBI) with accession number: WGS mapping, coverage and quality assessment, single nucleotide variations (SNVs)/indel detection, tier annotation for sequence mutations, prediction of deleterious effects of missense mutations, and identification of loss of heterozygosity (LOH) were described previously[46,57]. The reference human genome assembly GRCh37-lite was used for mapping all samples. The mapping statistics and coverage of each tumor on different sequencing platforms are summarized in Supplementary Table 2 and Supplementary Figure 1.
SNVs were classified into the following four tiers, as previously described[46,57]: a) Tier 1: coding synonymous, nonsynonymous, splice-site, and non-coding RNA variants; b) Tier 2: conserved variants (cutoff: conservation score ≥ 500, based on either the phastConsElements28way table or the phastConsElements17way table from the UCSC genome browser) and variants in regulatory regions annotated by UCSC (regulatory annotations included are targetScanS, ORegAnno, tfbsConsSites, vistaEnhancers, eponine, firstEF, L1 TAF1 Valid, Poly(A), switchDbTss, encodeUViennaRnaz, laminB1, cpgIslandExt); c) Tier 3: variants in non-repeat masked regions; d) Tier 4: the remaining SNVs.
All tier 1, tier 2 and tier 3 sequence mutations (including SNVs, indels and SVs) discovered in non-hypermutator WGS samples were validated on a custom capture platform. The overall validation rate was 93%, with a median validation rate of 95% per sample. All tier 1 SNVs in the WES cohort were also validated by custom capture. All gene coding indels found in WES samples were validated on the MiSeq platform, with a validation rate of 92% (167/182). 
Four non-hypermutator WGS samples (SJHGG003_A, SJHGG008_A, SJHGG019_E and SJHGG022_D) were also subjected to exome sequencing, and the overlapping SNVs/indels were regarded as validated. The validated and high-quality variations for tier 1-3 mutations in non-hypermutator tumors are summarized in Supplementary Tables 3, 4 and 5.
For two hypermutator WGS samples (SJHGG003_D and SJHGG111_D), the exome sequencing served as validation. SNVs found in both WGS and WES were regarded as validated. The validated and high-quality variations for tier 1 mutations in hypermutator tumors have been deposited at the PCGP Explorer website (http://pcgpexplore.org/).
CNVs were identified by evaluating the difference in read depth between each tumor and its matching normal using a novel algorithm, CONSERTING (COpy Number SEgmentation by Regression Tree In Next-Gen sequencing; Chen et al, manuscript in preparation). The results are reported in Supplementary Table 6.
Structural variations in WGS were analyzed using CREST and annotated as previously described[46,58]. Custom capture was used to validate somatic SVs found in WGS samples. The results are reported in Supplementary Table 7. Paired-end reads from RNA-seq were aligned to the following 4 database files using the BWA (0.5.5) aligner[59]: (1) the human GRCh37-lite reference sequence, (2) RefSeq, (3) a sequence file representing all possible combinations of non-sequential pairs in RefSeq exons, (4) the AceView database flat file downloaded from UCSC, representing transcripts constructed from human ESTs. The mapping results from (2) to (4) were aligned to human reference genome coordinates. The final BAM file was constructed by selecting the best alignment in the four databases. SV detection was carried out using CICERO, a novel algorithm that uses de novo assembly to identify structural variation in RNA-seq (Li et al, manuscript in preparation). The structural variants detected in RNA-seq were validated with MiSeq sequencing. 
Primer pairs were designed (with Primer3) to bracket the genomic regions containing putative SVs. The SVs found by RNA-seq are reported in Supplementary Table 8.
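The four-tier SNV classification described in this section can be illustrated with a minimal sketch. It assumes that each variant has already been annotated with boolean flags and a conservation score; the `AnnotatedSNV` record and its field names are hypothetical and are not taken from the published pipeline. It also assumes that the tiers are evaluated in order, so that a variant receives the first tier whose criterion it meets.

```python
from dataclasses import dataclass

CONSERVATION_CUTOFF = 500  # cutoff quoted in the text (conservation score >= 500)


@dataclass
class AnnotatedSNV:
    """Hypothetical per-variant annotation flags; field names are illustrative."""
    is_coding_or_ncrna: bool       # coding synonymous/nonsynonymous, splice-site, or non-coding RNA
    conservation_score: float      # score from a phastCons elements table
    in_regulatory_region: bool     # overlaps one of the listed UCSC regulatory annotations
    in_repeat_masked_region: bool  # falls inside a repeat-masked region


def assign_tier(snv: AnnotatedSNV) -> int:
    """Return the first tier (1-4) whose criterion the variant satisfies."""
    if snv.is_coding_or_ncrna:
        return 1
    if snv.conservation_score >= CONSERVATION_CUTOFF or snv.in_regulatory_region:
        return 2
    if not snv.in_repeat_masked_region:
        return 3
    return 4
```

Under these assumptions, an intergenic variant with a low conservation score and no regulatory annotation falls into tier 3 if it lies outside repeat-masked sequence, and into tier 4 otherwise.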
Microarray copy number and expression analysis
Affymetrix SNP6.0 arrays were analyzed as described[8]. Candidate targets of focal amplification or deletion were identified from minimum common regions with copy number greater than 5 or less than 0.8. For DIPG samples, we also identified candidate targets of focal gain or loss as described[8]. Briefly, we derived minimum common regions for recurrent focal gains (copy number > 2.3) or recurrent focal deletions (copy number < 1.7) that were found in at least two tumors or were classified as a single focal gain or deletion. Regions associated with known CNVs were removed. All remaining regions with fewer than 60 genes were manually inspected for cancer/glioma-related genes, and candidate targets of focal gain or deletion were selected. Focal amplifications of MYC, MYCN, PDGFRA, MET, CDK4, CDK6, CCND1, CCND2, and CCND3, as well as focal deletions of CDKN2A and CDKN2B from SNP data, were included in Fig. 1 because many samples sequenced only by exome lacked this copy number data.
Affymetrix U133v2 expression array data were available for 71 HGG, including 32 DIPG samples, nine with ACVR1 mutation[8]. Principal Component Analysis was done using GeneMaths with the top 1000 most variable probes selected based on median absolute deviation (MAD) score. The differentially expressed genes between DIPGs with and without ACVR1 mutation were selected with linear models (limma/R, p-value < 0.001), and enrichment of GO terms among the selected genes was evaluated using the DAVID Bioinformatics Resource (http://david.abcc.ncifcrf.gov/).
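The MAD-based probe selection that precedes Principal Component Analysis can be sketched as follows. This is an illustration only: the analysis in the paper was done in GeneMaths, and the function names and the dictionary layout of the expression data here are assumptions made for the example.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a list of expression values."""
    center = median(values)
    return median([abs(v - center) for v in values])


def top_variable_probes(expression, n):
    """Return the n probe IDs with the highest MAD across samples.

    `expression` maps a probe ID to its expression values, one per sample
    (a hypothetical layout chosen for this sketch).
    """
    ranked = sorted(expression, key=lambda probe: mad(expression[probe]), reverse=True)
    return ranked[:n]
```

With `n=1000` this would reproduce the selection step described above; MAD is used instead of variance because it is less sensitive to a few extreme samples.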
Frequency of mutations in pathways
To summarize samples altered in RTK/RAS/PI3K pathways, we collected the genes from KEGG, Ingenuity, and NCI-Nature Protein Interaction Database, and limited to the following genes that have at least one somatic mutation in our study cohort: 1) RTK: PDGFRA, KIT, EGFR, MET, CSF1R, FGFR1, FGFR3, FLT1, IGF1R, INSR, NTRK1, NTRK2, NTRK3; 2) PIP3 regulation: PTEN, PIK3CA, PIK3R1; 3) Ligand: FGF3, FGF5, FIGF, PDGFA, VEGFC; and 4) downstream effector: PLCG2, PLEKHA2, YES1, SGK1, G6PC, GNB1, GNGT1, MLST8, PHLPP1, PKN2, PPP2R2D, PRKAA2, PRKCA, PRKCZ, RAC1, RPS6, RPTOR, SGK3, STK11, INPP5D, SOS1, AKT3, JAK1, KRAS, MAP2K1, TSC2, GAB2, PPM1L, NF1, BRAF, RASGRF2, RASSFS. To summarize the samples altered in cell cycle regulation, we included the following genes: TP53, TP73, CCND1, CCND2, CDK1, CDK6, CDKN1B, CDKN2A, CDKN2B, CDKN2C, RB1, CDK4, CCND3, RBL1, CDC27. For DNA repair genes, we included the following: ATR, ATM, BRCA1, RBBP8, RAD51, BRCA2, ERCC2, ERCC3, ERCC8, LIG4, RAD23A, XAB2, MSH6, LIG1, LIG3, POLD1, POLE, RAD50, RAD54B, RUVBL2, SETX, PMS2, UVRAG. To summarize samples altered in histone modifications and chromatin regulators, we limited our analysis to the following genes: 1) Histone writer: MLL, MLL2, MLL3, MLL4, PRDM9, SMYD3, SETD1A, SETD2, SETD3, ASH1L; 2) Histone eraser: JMJD1C, KDM2A, KDM3A, KDM3B, KDM4B, KDM5B, KDM5C, SIRT7; 3) Chromatin remodeler: ATRX, SMARCA4, BRWD1, CBX4, CHD2, CHD4, CHD6, CHD7, CHD8, RAD54B; 4) Other writers: UHRF1, NCOA1, STK4, UBR2, UBR5. Because the analysis was limited to these gene lists, our estimates of the numbers of mutations impacting these pathways are conservative.
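The per-pathway frequencies reported in the main text amount to counting the fraction of tumors that carry at least one alteration in a fixed gene list. A minimal sketch of that tally is shown below; the function name, the input layout, and the sample identifiers in the usage example are hypothetical and serve only to illustrate the counting rule.

```python
def pathway_frequency(altered_genes_by_sample, pathway_genes):
    """Fraction of samples with at least one altered gene in the pathway list.

    `altered_genes_by_sample` maps a sample ID to the collection of genes
    altered in that sample (a hypothetical layout for this sketch).
    """
    if not altered_genes_by_sample:
        raise ValueError("no samples provided")
    pathway = set(pathway_genes)
    hits = sum(1 for genes in altered_genes_by_sample.values() if pathway & set(genes))
    return hits / len(altered_genes_by_sample)


# Usage with made-up sample IDs and a subset of the cell cycle gene list.
toy_cohort = {
    "tumor_1": {"TP53", "ACVR1"},
    "tumor_2": {"H3F3A"},
    "tumor_3": {"CDKN2A"},
    "tumor_4": set(),
}
cell_cycle_subset = ["TP53", "CDKN2A", "RB1"]
toy_fraction = pathway_frequency(toy_cohort, cell_cycle_subset)
```

A sample counts once regardless of how many genes in the list it alters, which is why restricting the gene lists can only lower the estimated frequency.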
TERT promoter mutation analysis
Due to the high GC content of the TERT promoter region, WGS coverage was poor (3x on average) at the two recurrently mutated sites (chr5:1295228 and chr5:1295250). Therefore, a portion of the TERT promoter (hg19 coordinates, chr5:1295151-1295347) was amplified by PCR from tumors and matched normal samples and analyzed by Sanger sequencing, using the primers in Supplementary Table 13, to check for the two previously described TERT promoter SNVs. Sequences were analyzed using SNPDetector [60], and manual review was performed using Consed [61].
Statistical analysis
The association between ACVR1 mutation and age at diagnosis was analyzed by the Kruskal-Wallis test (H = 6.62, DF = 1, P = 0.010, adjusted for ties). The association of ACVR1 mutation with survival of DIPG patients was quantified by the log-rank test (chi-square = 10.1 on 1 degree of freedom, P = 0.00149). In a two-variable Cox model for DIPG patients that included age at diagnosis (as a continuous variable) and ACVR1 mutation status simultaneously, age at diagnosis was not significantly associated with survival within this cohort (P = 0.55), whereas ACVR1 status was (P = 0.0046). The co-occurrence of ACVR1 mutations with other mutations was quantified by Fisher’s exact test: ACVR1 co-occurrence with HIST1H3B p.K27M, P = 0.0000001; ACVR1 co-occurrence with PIK3CA and PIK3R1 mutations, P = 0.0049702. The association between gender and tumor subtype (DIPG vs NBS-HGG) was quantified by Fisher’s exact test. The association between tier 1 mutation rate and tumors in children less than 3 years of age was quantified by the Kruskal-Wallis test (H = 13.42, DF = 1, P < 0.0001). The association between mutations in H3 writers and NBS-HGGs was quantified by the Pearson chi-square test (by Monte Carlo), P = 0.0007.
To identify samples with extremely high and low mutation rates, we used least median of squares (LMS) as a method for robust estimation of the center of the data and for outlier detection [62]. Briefly, LMS identifies the shortest interval that covers at least 50% of the data. This interval represents the densest “bulk” of the data, and outliers are detected by comparing the data values to a normal distribution with the same IQR. Because the LMS interval is not influenced by outliers, it can be used to detect them.
Germline mutation analysis
We identified germline variants as previously described [63]. In this study, we implemented two additional filters: 1) we kept only genes listed in the Cancer Gene Census or genes involved in DNA damage/repair; and 2) we kept only nonsense or splice SNVs and indels, with the exception of a few missense mutations in TP53.
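The two additional filters can be expressed as a simple predicate, sketched here in Python. The record layout, the class labels, the gene subset and the placeholder TP53 changes are all hypothetical; in particular, the text does not say which TP53 missense mutations were retained, so the exception set is left for the caller to supply. The sketch also assumes that all indels in an allowed gene pass the second filter.

```python
KEPT_CLASSES = {"nonsense", "splice", "indel"}


def keep_germline_variant(variant, allowed_genes, tp53_missense_exceptions=frozenset()):
    """Apply the gene filter, then the variant class filter, to one variant record."""
    if variant["gene"] not in allowed_genes:
        return False
    if variant["class"] in KEPT_CLASSES:
        return True
    # Exception: selected TP53 missense changes (the set is supplied by the caller).
    return (
        variant["gene"] == "TP53"
        and variant["class"] == "missense"
        and variant.get("change") in tp53_missense_exceptions
    )


# Hypothetical gene list and variant records for illustration only.
allowed = {"TP53", "ATM", "BRCA2"}
exceptions = frozenset({"p.EXAMPLE1"})
variants = [
    {"gene": "ATM", "class": "nonsense", "change": "p.EXAMPLE2"},
    {"gene": "ATM", "class": "missense", "change": "p.EXAMPLE3"},
    {"gene": "TP53", "class": "missense", "change": "p.EXAMPLE1"},
    {"gene": "GENE_X", "class": "indel", "change": "p.EXAMPLE4"},
]
kept = [v for v in variants if keep_germline_variant(v, allowed, exceptions)]
print([v["change"] for v in kept])
```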
Zebrafish embryo injections
ACVR1 wild-type or mutant cDNA was cloned into pCS2 and mRNA was generated using the mMESSAGE mMACHINE Kit (Life Technologies). Zebrafish embryos of the AB strain were microinjected with approximately 20 pg of ACVR1 wild-type or mutant mRNA at the one-cell stage. Embryos were inspected at 24 hours post fertilization (hpf) and scored as ventralized (Classes V1-V5) or dorsalized (Classes C1-C4) based upon published criteria [23, 22]. For chemical rescue, 2.5 μM LDN-193189 (Stemgent) was added to embryos at about 3 hpf.
Primary astrocyte transductions and in vivo tumorigenesis assays
A FLAG tag was incorporated by PCR immediately before the termination codon in cDNA encoding the full open reading frame of wild-type or mutant ACVR1, TPM3-NTRK1 and BTBD1-NTRK2, and the cDNAs were cloned into the retroviral vector MSCV-IRES-mCherry (MIC) [64], which was modified by inserting an adapter into a blunted EcoRI site to generate a Gateway cloning site, or into MSCV-IRES-Puro (MIP) [65]. The sequences of all constructs were verified. Retrovirus was produced by co-transfecting 293T cells along with helper plasmids as previously described [66].
Mouse experiments were approved by the Institutional Animal Care and Use Committee and are in compliance with national and institutional guidelines. Tp53-null primary mouse astrocytes (PMAs) were isolated from the cortex or brainstem of postnatal day 2 GFAP-cre;Trp53 mice (Tp53 conditional knockout mice) [67,68] and transduced with retroviruses as previously described [69]. PMA cultures isolated from multiple mice were pooled for retroviral transductions to control for potential variation among primary cultures. For tumorigenesis studies, Tp53-null PMAs were transduced at passage one with retroviruses generated from MIC vectors expressing wild-type or mutant ACVR1, TPM3-NTRK1 or BTBD1-NTRK2 and implanted into immunodeficient mice as described [70]. When mice displayed brain tumor symptoms, tumors were dissected, fixed in 4% paraformaldehyde in PBS at 4°C overnight, then processed, embedded in paraffin and cut into 5 μm sections. Hematoxylin and eosin stained sections from all collected tumors were evaluated by a clinical neuropathologist (DWE).
Immunohistochemistry was performed using microwave antigen retrieval; primary antibodies against p-Akt Ser473 (Cell Signaling #9271, 1:50), p-p42/44 Mapk Thr202/Tyr204 (Cell Signaling #4270, 1:400) or FLAG (Sigma-Aldrich, F1804, 1:100); biotinylated secondary antibodies; and horseradish peroxidase-conjugated streptavidin (Elite ABC, Vector Labs). Signal was detected with NovaRED substrate (Vector Labs), and sections were counterstained with hematoxylin (Vector Labs). For TPM3-NTRK1, BTBD1-NTRK2 and empty vector controls, 7 mice were implanted for each construct. For ACVR1, cells expressing empty vector control, wild-type, G328E, R258G or R206H ACVR1 (7 mice per construct) or G356D ACVR1 (6 mice) were implanted. A few of the immunodeficient mice were euthanized when they became ill without showing brain tumor symptoms (one G328E at 97 days post-implantation, one R258G at 217 days, two G356D at 139 and 190 days, and one WT at 212 days). The remaining mice were euthanized between 219 and 222 days post-implantation. Brains were formalin fixed and paraffin embedded, sectioned, stained with H&E and evaluated for ACVR1-driven brain tumor growth both by analysis of H&E sections in the area surrounding the implantation site and by immunohistochemistry for the mCherry marker expressed as part of the bicistronic message with ACVR1. None of the mice showed brain tumor symptoms, and no histological evidence of ACVR1-driven tumor formation was detected in the entire cohort.
Western Blotting
For Western blotting, Tp53-null PMAs were isolated from neonatal brainstem and transduced at passage one with MIP retroviruses expressing wild-type or mutant ACVR1. Forty-eight hours later, cells were selected with 2.5 μg/ml puromycin for 48 hours. For serum starvation, cells were washed twice with PBS, then incubated for 2 hours in media without serum or growth factors. For treatment with LDN-193189, cells were incubated in standard astrocyte growth media [69] with vehicle (DMSO) or LDN-193189 (1 μM). Cells were lysed in RIPA buffer with protease and phosphatase inhibitors (Roche). Protein (20 μg) was resolved on 4-12% NuPAGE Bis-Tris gels, transferred to nitrocellulose membranes and detected with antibodies targeting p-SMAD1/5 (Cell Signaling, 9516, 1:1000), total SMAD1/5/8 (Santa Cruz, sc-6031, 1:500), p-p38MAPK (Invitrogen, 44-684G, 1:1000), total p38MAPK (Cell Signaling, 9212, 1:1000), FLAG (Sigma-Aldrich, F1804, 1:1000) and tubulin (Santa Cruz, sc-23948, 1:1000). Following incubation with the appropriate HRP-conjugated secondary antibody, membranes were exposed to chemiluminescent substrate (SuperSignal West Dura, 34076, Thermo Scientific). Images were obtained with the Odyssey Imaging System (LI-COR Biosciences).
RT-PCR Validation of TRK Fusions
Random-primed cDNA was generated by reverse transcriptase from tumor RNA and used for PCR to identify the fusion gene of interest. PCR products were analyzed by Sanger sequencing. Primers are listed in Supplementary Table 13.
Statistical evaluation of chromothripsis in HGG tumors analyzed by WGS
Chromothripsis has been described as localized chromosome shattering and repair occurring in a single event. The initial criterion is oscillation between two main copy number states [53], which was found in 15 HGG tumors in this study. More recently, Korbel and Campbell [71] proposed four potential criteria for assessing chromothripsis: 1) clustering of breakpoints; 2) randomness of DNA fragment joins; 3) randomness of DNA fragment order; and 4) ability to walk the derivative chromosome. Since randomness of DNA fragment order (Criterion 3) was not entirely valid, even in Korbel and Campbell’s own analysis, we decided not to evaluate this feature. For the 13 tumors in Supplementary Table 12, we performed Bartlett’s goodness-of-fit test for exponential distribution to assess whether the distribution of SV breakpoints in each tumor departs from the null hypothesis of random distribution. A significant departure from random distribution supports clustering of SV breakpoints. To evaluate whether there is any bias in the DNA fragment joins categorized by SV type (i.e., deletion, tandem duplication, head-to-head rearrangement and tail-to-tail rearrangement), we applied a goodness-of-fit test separately to inter- and intra-chromosomal events with a minimum of 5 SVs. A significant P value suggests biased fragment joins, which would not support chromothripsis. When both inter- and intra-chromosomal data were available, we reported the lower P value to represent a more conservative assessment of the random distribution of DNA fragment joins.
Details of Tumor Purity and Tumor Heterogeneity Estimations, and Tumor Evolution Analysis are included in the Supplementary Note.
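The breakpoint clustering test can be sketched in Python as follows. This is an illustration, not the code used in the study: it assumes the test is applied to the distances between adjacent sorted breakpoints on a chromosome, uses the standard form of Bartlett's statistic for exponentiality (compared against a chi-square distribution with r - 1 degrees of freedom for r distances), and computes the chi-square tail probability with a simple series that is adequate only for moderate statistic values. The breakpoint positions are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-16 and n < 100000:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))


def breakpoint_gaps(positions):
    """Distances between adjacent breakpoints after sorting by position."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def bartlett_exponential_test(gaps):
    """Return (statistic, p_value) for Bartlett's test of exponentially distributed gaps."""
    r = len(gaps)
    if r < 2 or min(gaps) <= 0:
        raise ValueError("need at least two strictly positive gaps")
    mean_gap = sum(gaps) / r
    mean_log = sum(math.log(g) for g in gaps) / r
    statistic = 2 * r * (math.log(mean_gap) - mean_log) / (1 + (r + 1) / (6 * r))
    return statistic, chi2_sf(statistic, r - 1)


# Hypothetical breakpoints: two tight clusters plus one distant event.
positions = [1000, 1010, 1025, 1030, 1042, 1050, 1061, 1070,
             50000, 50012, 50020, 50031, 900000]
statistic, p_value = bartlett_exponential_test(breakpoint_gaps(positions))
print(statistic, p_value)
```

A small P value here indicates that the distances are more variable than expected under an exponential distribution, which is the pattern produced by clustered breakpoints. Coincident breakpoints give zero distances and would need to be merged or handled separately before applying this sketch.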