A Burns1,2, R Alsolami1,3,4, J Becq5, B Stamatopoulos1, A Timbs1, D Bruce2,6, P Robbe1,2, D Vavoulis1,2, R Clifford2, M Cabes1, H Dreau1, J Taylor7, S J L Knight7, R Mansson8, D Bentley5, R Beekman9, J I Martín-Subero9, E Campo10, R S Houlston11, K E Ridout1,2, A Schuh1,2,6.
Abstract
Chronic lymphocytic leukaemia (CLL) consists of two biologically and clinically distinct subtypes defined by the abundance of somatic hypermutation (SHM) affecting the Ig variable heavy-chain locus (IgHV). The molecular mechanisms underlying these subtypes are incompletely understood. Here, we present a comprehensive whole-genome sequencing analysis of somatically acquired genetic events from 46 CLL patients, including a systematic comparison of coding and non-coding single-nucleotide variants, copy number variants and structural variants, regions of kataegis and mutation signatures between IgHVmut and IgHVunmut subtypes. We demonstrate that one-quarter of non-coding mutations in regions of kataegis outside the Ig loci are located in genes relevant to CLL. We show that non-coding mutations in ATM may negatively impact on ATM expression and find non-coding and regulatory region mutations in TCL1A, and in IgHVunmut CLL in IKZF3, SAMHD1, PAX5 and BIRC3. Finally, we show that IgHVunmut CLL is dominated by coding mutations in driver genes and an aging signature, whereas IgHVmut CLL has a high incidence of promoter and enhancer mutations caused by aberrant activation-induced cytidine deaminase activity. Taken together, our data support the hypothesis that differences in clinical outcome and biological characteristics between the two subgroups might reflect differences in mutation distribution, incidence and distinct underlying mutagenic mechanisms.
Year: 2017 PMID: 28584254 PMCID: PMC5808074 DOI: 10.1038/leu.2017.177
Source DB: PubMed Journal: Leukemia ISSN: 0887-6924 Impact factor: 11.528
Figure 1. Overview of the mutational landscape in IgHVmut and IgHVunmut CLL. (a) Total number of SNVs and indels per sample and by IgHV mutation status. Asterisks denote cases with IgHV 3-21 rearrangement. (b) Dot plot of total somatic mutation count between IgHV subgroups. Patients with IgHVmut CLL harbour a higher total mutational load, but lower driver coding variant counts (c) and higher promoter variant counts (d and e), than patients with IgHVunmut CLL. Error bars show ± s.e.m.
Figure 2. Structural rearrangements in IgHVunmut and IgHVmut CLL. Circos plot depicting the frequency and locations of CNAs and translocations detected in 46 CLL genomes. The two outermost tracks represent CNAs: CN loss (red) and CN gain (green). Copy number data are displayed as the percentage of each IgHV subgroup affected. The centre plot shows translocations identified in the cohort. Red links indicate tier 1 events (gene:gene), green links indicate tier 2 events (gene:intergenic) and grey links indicate tier 3 events (intergenic:intergenic). The widths of the bands indicate the number of patients affected by the translocation.
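Regions of kataegis affecting gene regulatory and exonic regions in individual CLL cases

| Sample | IgHV status | Chromosome | Start | End | No. of mutations | |
| --- | --- | --- | --- | --- | --- | --- |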
| CLL156 | Unmutated | 2 | 140 885 911 | 141 045 820 | 8 | |
| CLL154 | Mutated | 3 | 157 290 339 | 157 295 704 | 6 | |
| CLL348 | Mutated | 3 | 183 273 058 | 183 273 364 | 6 | |
| CLL063 | Unmutated | 5 | 21 810 369 | 21 843 504 | 9 | |
| CLL301 | Mutated | 7 | 122 433 699 | 122 517 296 | 12 | |
| CLL301 | Mutated | 7 | 122 622 586 | 122 638 447 | 6 | |
| CLL351 | Mutated | 9 | 123 416 699 | 123 479 606 | 43 | |
| CLL144 | Mutated | 11 | 108 121 624 | 108 129 499 | 9 | |
| CLL307 | Unmutated | 13 | 51 664 141 | 51 665 954 | 6 | |
| CLL156 | Unmutated | 14 | 21 835 327 | 21 835 546 | 6 | |
| CLL252 | Unmutated | 18 | 9 284 149 | 9 374 417 | 6 | |
| CLL301 | Mutated | 18 | 60 873 525 | 60 988 029 | 12 | |
| CLL348 | Mutated | 18 | 60 906 440 | 60 988 117 | 10 | |
| CLL301 | Mutated | X | 98 765 633 | 98 769 284 | 10 | |
Abbreviations: CLL, chronic lymphocytic leukaemia; IgHV, Ig variable heavy-chain locus.
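The coordinates in the table can be turned into a region span and a mutation density with simple arithmetic. The sketch below is illustrative only: it assumes the last numeric column is the number of mutations in the region and that start and end coordinates are inclusive, and the function name `region_summary` is hypothetical rather than taken from the study.

```python
# Rows copied from the table: (sample, chromosome, start, end, count).
# Assumption: the last numeric column is the number of mutations in the region.
REGIONS = [
    ("CLL156", "2", 140_885_911, 141_045_820, 8),
    ("CLL348", "3", 183_273_058, 183_273_364, 6),
    ("CLL351", "9", 123_416_699, 123_479_606, 43),
    ("CLL144", "11", 108_121_624, 108_129_499, 9),
]


def region_summary(start, end, count):
    """Return (span in bp, mutations per kb) for one region.

    Coordinates are treated as inclusive, which is an assumption.
    """
    span = end - start + 1
    per_kb = count * 1000 / span
    return span, per_kb


if __name__ == "__main__":
    for sample, chrom, start, end, count in REGIONS:
        span, per_kb = region_summary(start, end, count)
        print(f"{sample} chr{chrom}: {span} bp, {per_kb:.2f} mutations/kb")
```

Spans differ by orders of magnitude between rows (307 bp for the CLL348 chromosome 3 region versus 159 910 bp for the CLL156 chromosome 2 region), so a per-kb density is easier to compare across regions than the raw count.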
Figure 3. Kataegic ATM mutations in CLL144. (a) Graphical representation of ATM showing the distribution of kataegic mutations in CLL144. Blue boxes represent exonic regions of the ATM transcript. (b) Dot plot comparing the ATM transcript expression levels of CLL cases with no 11q disruption (wild type), those with del(11q), mutations in ATM (ATMmut) and CLL144. Expression levels are shown as reads per million aligned reads. Error bars show ± s.e.m.
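Expression in panel (b) is reported as reads per million aligned reads. A minimal sketch of that normalisation follows; the function name and the example counts are hypothetical and are not values from the study.

```python
def reads_per_million(transcript_reads, total_aligned_reads):
    """Scale a raw read count to reads per million aligned reads."""
    if total_aligned_reads <= 0:
        raise ValueError("total_aligned_reads must be positive")
    # Multiply before dividing to keep the arithmetic exact for integer inputs.
    return transcript_reads * 1_000_000 / total_aligned_reads


if __name__ == "__main__":
    # Hypothetical library: 50 reads on the transcript, 10 million aligned reads.
    print(reads_per_million(50, 10_000_000))
```

Dividing by the number of aligned reads makes samples sequenced to different depths comparable, which is what allows a single case such as CLL144 to be plotted against the other groups.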
Figure 4. Non-coding mutations confirmed experimentally with ChIP-seq and ATAC-seq. Diagrams of non-coding mutations in three genes and the corresponding annotation data. All lower panels: correlation of mutation loci with H3K27Ac and ATAC-seq signal of CLL110 and ChromHMM annotation (all three layers obtained from experiments with primary CLL cells) and layered H3K27Ac data from the ENCODE database (GM12878). (a) IKZF3 locus and surrounding region on chromosome 17, with the position of all mutations detected in our cohort, (b) SAMHD1 locus and surrounding region on chromosome 20, with the position of all mutations detected in our cohort and (c) BIRC3 locus and surrounding region on chromosome 11, with the position of all mutations detected in our cohort.
Figure 5. Overview of the mutational landscape in IgHVmut and IgHVunmut CLL. Top panel: type and number of IgHV subclones present in each patient. Where several clones are present, IgHV status is determined by the identity of the dominant clone. Samples where only Sanger sequencing data were available are labelled as ‘not known’. Middle panels: presence of copy number aberrations within the cohort. Lower panel: incidence of both coding and non-coding mutations within the cohort. Colour spectrum (light to dark blue or light to dark pink) corresponds with mutational or CNA load per patient, respectively. *Genes containing kataegic mutations. ‡Genes that have multiple mutations in regulatory regions and are involved in important B-cell pathways. §Genes with smaller mutational hotspots. Left panel: number of mutations per gene across the entire cohort.
Figure 6. Mutation signature analysis. (a) Three mutation signatures identified across 46 patients using non-negative matrix factorisation (NMF). (b) Correlations between signatures identified in 46 patients and the corresponding age, or proportion of mutated canonical AID or APOBEC sites per subtracted somatic sample. Blue lines are regression lines; P-values are from the glm. (c) Proportions of mutation signatures from Alexandrov (2013) (refs 29 and 43), where Sig.1B corresponds to Alexandrov Signature 1B, and so on, and signatures found across 46 patients.
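The legend names NMF but does not say which implementation was used. As a rough illustration of the idea only, the sketch below factorises a small matrix of mutation counts V (mutation categories by samples) into non-negative signatures W and exposures H using Lee-Seung multiplicative updates for squared error. The update rule is a substituted assumption, and the toy matrix and all function names are hypothetical; real analyses use many more mutation categories and samples.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Sum of squared differences between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorise V (categories x samples) into W (categories x rank)
    and H (rank x samples) with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical counts: 4 mutation categories (rows) x 3 samples (columns).
TOY_COUNTS = [
    [10, 2, 6],
    [8, 1, 5],
    [1, 9, 5],
    [0, 7, 4],
]

if __name__ == "__main__":
    w, h = nmf(TOY_COUNTS, rank=2)
    print("reconstruction error:", squared_error(TOY_COUNTS, w, h))
```

Each column of W plays the role of a signature (a profile over mutation categories) and each column of H gives a sample's exposure to those signatures. Because every update multiplies by a non-negative ratio, both factors stay non-negative throughout.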