Sandro Morganella, Ludmil B Alexandrov, Dominik Glodzik, Xueqing Zou, Helen Davies, Johan Staaf, Anieta M Sieuwerts, Arie B Brinkman, Sancha Martin, Manasa Ramakrishna, Adam Butler, Hyung-Yong Kim, Åke Borg, Christos Sotiriou, P Andrew Futreal, Peter J Campbell, Paul N Span, Steven Van Laere, Sunil R Lakhani, Jorunn E Eyfjord, Alastair M Thompson, Hendrik G Stunnenberg, Marc J van de Vijver, John W M Martens, Anne-Lise Børresen-Dale, Andrea L Richardson, Gu Kong, Gilles Thomas, Julian Sale, Cristina Rada, Michael R Stratton, Ewan Birney, Serena Nik-Zainal.
Abstract
Somatic mutations in human cancers show unevenness in genomic distribution that correlates with aspects of genome structure and function. These mutations are, however, generated by multiple mutational processes operating through the cellular lineage between the fertilized egg and the cancer cell, each composed of specific DNA damage and repair components and leaving its own characteristic mutational signature on the genome. Using somatic mutation catalogues from 560 breast cancer whole-genome sequences, here we show that each of 12 base substitution, 2 insertion/deletion (indel) and 6 rearrangement mutational signatures present in breast tissue exhibits distinct relationships with genomic features relating to transcription, DNA replication and chromatin organization. This signature-based approach permits visualization of the genomic distribution of mutational processes associated with APOBEC enzymes, mismatch repair deficiency and homologous recombinational repair deficiency, as well as mutational processes of unknown aetiology. Furthermore, it highlights mechanistic insights including a putative replication-dependent mechanism of APOBEC-related mutagenesis.
Year: 2016 PMID: 27136393 PMCID: PMC5001788 DOI: 10.1038/ncomms11383
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Summary of relationships between each mutational signature and various genomic features.
| Mutational signature | Mutation type | Predominant features of signature | Associated mutational process | Transcriptional strand | Replicative strand | Replication time | Chromatin organization |
|---|---|---|---|---|---|---|---|
| 1 | Sub | C>T at | Deamination of methyl-cytosine (age associated) | Some bias | Enriched late | ||
| 5 | Sub | T>C | Uncertain (age associated) | Some bias | Some bias | Enriched late | Slight enrichment at linker |
| 2 | Sub | C>T at TpC | APOBEC related | Some bias | Strong lagging strand bias | Enriched late | |
| 13 | Sub | C>G at TpC | APOBEC related | Some bias | Strong lagging strand bias | Flat | |
| 6 | Sub | C>T (and C>A and T>C) | MMR deficient | Some bias | Flat | ||
| 20 | Sub | C>A (and C>T and T>C) | MMR deficient | Some bias | Enriched late | ||
| 26 | Sub | T>C | MMR deficient | Some bias | Strong bias | Enriched late | Enriched at linker |
| 3 | Sub | | HR deficient | Some bias | Some bias | Enriched late | |
| 8 | Sub | C>A | amplified by HR deficiency? | Some bias | Enriched late | ||
| 18 | Sub | C>A | Uncertain | Some bias | Some bias | Enriched late | Enriched at nucleosomes and periodic |
| 17 | Sub | T>G | Uncertain | Some bias | Enriched late | Enriched at nucleosomes and periodic | |
| 30 | Sub | C>T | Uncertain | | | Flat | |
| RS1 | Rearr | Large tandem duplications (>100 kb) | Uncertain type of HR deficiency? | NA | NA | Enriched early | |
| RS2 | Rearr | Dispersed translocations | | NA | NA | Enriched early | |
| RS3 | Rearr | Small tandem duplications (<10 kb) | HR deficiency (BRCA1) | NA | NA | Enriched early | |
| RS4 | Rearr | Clustered translocations | | NA | NA | Enriched early | |
| RS5 | Rearr | Deletions | HR deficient | NA | NA | Enriched early | |
| RS6 | Rearr | Other clustered rearrangements | | NA | NA | Enriched early | |
| Repeat-med | Indel | <3 bp indel at polynucleotide repeat tract | amplified when MMR deficient | NA | NA | Enriched late | Enriched at linker and periodic |
| Microhom | Indel | ≥3 bp indel with microhomology at breakpoint junction | HR deficient | NA | NA | Enriched late | |
HR, homologous recombination; indel, insertions/deletions; rearr, rearrangement; RS, rearrangement signature; sub, substitution.
The 20 mutational signatures are listed in the leftmost column, followed by the mutation class, the features that predominantly characterize each signature and the associated aetiology, if known. Relationships with transcriptional strand, replicative strand, replication time and chromatin organization are also noted.
Figure 1. Distribution of all mutations across the cell cycle.
Replication domains were identified by using conservatively defined transition zones in DNA replication time data. Data were separated into deciles, with each segment containing exactly 10% of the observed replication time signal. Normalized mutation density per decile is presented for early (left) to late (right) replication domains. (a) Aggregated distribution of mutations (green), rearrangements (purple) and indels (orange) across the cell cycle. (b) Distribution of the 12 base substitution signatures across the cell cycle. Dashed grey lines represent the predicted distribution of mutations for each signature based on simulations that take into account mutation burden and sequence characteristics of individual mutations and of the signatures that were estimated to be present in each patient (Methods section). (c) Distribution of the six rearrangement signatures across the cell cycle. Dashed grey lines represent the predicted distribution of mutations for each signature based on simulations. (d) Distribution of the indel signatures across the cell cycle.
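The decile analysis described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the equal-length genomic windows, the convention that a higher score means earlier replication, and the normalization (densities scaled to sum to 1) are all assumptions made for illustration, and the simulation-based expected curves are not reproduced.

```python
from typing import List, Sequence


def decile_mutation_density(
    rep_time: Sequence[float],
    mutation_counts: Sequence[int],
    n_bins: int = 10,
) -> List[float]:
    """Return normalized mutation density per replication-time decile.

    rep_time[i] is a replication-time score for genomic window i
    (assumption: higher score = earlier replication) and
    mutation_counts[i] is the number of mutations in that window.
    Windows are assumed to be of equal length.
    """
    if len(rep_time) != len(mutation_counts):
        raise ValueError("inputs must have the same length")
    if len(rep_time) < n_bins:
        raise ValueError("need at least one window per bin")

    # Order windows from early (high score) to late (low score).
    order = sorted(range(len(rep_time)), key=lambda i: rep_time[i], reverse=True)

    n = len(order)
    densities = []
    for b in range(n_bins):
        # Integer arithmetic yields near-equal groups that cover every window.
        start, stop = b * n // n_bins, (b + 1) * n // n_bins
        group = order[start:stop]
        total = sum(mutation_counts[i] for i in group)
        densities.append(total / len(group))  # mutations per window

    # Assumed normalization: scale the per-decile densities to sum to 1.
    grand = sum(densities)
    return [d / grand for d in densities] if grand > 0 else densities
```

A signature that is enriched in late-replicating regions would give values that rise from the first element (earliest decile) to the last (latest decile); a flat profile would give values near 0.1 throughout.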
Figure 2. Replication and transcriptional strand bias and strand-coordinated mutagenesis of mutational signatures.
Forest plots showing replication (blue) and transcription (orange) strand bias for the 6 base substitution classes (a) and for the 12 base substitution signatures (b). Mutations were oriented in the pyrimidine context (the current convention for characterizing mutational signatures). Observed distribution between strands is shown as a diamond for replication and circle for transcriptional strands with 95% confidence intervals, against an expected probability of 0.5 (Supplementary Table 1 for values). (c) Relationship between processive group lengths (columns) and mutational signatures (rows). Processive groups were defined as sets of adjacent substitutions of the same mutational signature sharing the same reference allele, and the group length indicates the number of adjacent substitutions within each group. The size of each circle represents the number of groups (log10) observed for the specified group length (column) for each signature (row). The intensity of colour of each circle indicates significance of the likelihood of detection of a processive group of a defined length (−log10 of the P value obtained by comparing observed data to simulations, further details in Methods section).
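Two quantities in this caption lend themselves to a short sketch: the observed strand proportion with a 95% confidence interval, compared against an expected 0.5, and the tally of processive group lengths. The caption does not state how the intervals were computed, so the Wilson score interval below is an assumption, as are the function names and the input layout (mutations given as `(signature, reference_allele)` pairs already ordered by genomic position within one sample and chromosome). The simulation-based P values are not reproduced.

```python
import math
from collections import Counter
from itertools import groupby
from typing import Iterable, Tuple


def strand_bias(k_strand: int, n_total: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Observed proportion on one strand with a Wilson score interval.

    Returns (proportion, lower, upper). The bias is suggestive when the
    interval excludes the expected probability of 0.5.
    """
    if n_total <= 0 or not 0 <= k_strand <= n_total:
        raise ValueError("require 0 <= k_strand <= n_total and n_total > 0")
    p = k_strand / n_total
    denom = 1 + z * z / n_total
    centre = (p + z * z / (2 * n_total)) / denom
    half = z * math.sqrt(p * (1 - p) / n_total + z * z / (4 * n_total ** 2)) / denom
    return p, centre - half, centre + half


def processive_group_lengths(mutations: Iterable[Tuple[str, str]]) -> Counter:
    """Count runs of adjacent substitutions sharing signature and reference allele.

    Keys of the returned Counter are (signature, group_length); values are
    the number of groups of that length. Runs of length 1 are included.
    """
    counts: Counter = Counter()
    # groupby only merges consecutive items, which matches "adjacent".
    for (signature, _ref), run in groupby(mutations, key=lambda m: (m[0], m[1])):
        counts[(signature, sum(1 for _ in run))] += 1
    return counts
```

For example, `strand_bias(700, 1000)` returns a proportion of 0.7 with an interval lying entirely above 0.5, whereas `strand_bias(50, 100)` returns an interval that spans 0.5.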
Figure 3. Relationship between mutational signatures and nucleosome occupancy.
The distribution of the nucleosome density signal (y axis) is shown in a 2 kb window centred on each mutation (position 0 on the x axis), for each signature. The averaged signal was calculated as the total amount of signal observed at each point divided by the total number of mutations contributing to that signal. (a) Nucleosome density for aggregated substitutions (green), and for deletions observed in MMR-proficient (blue) and MMR-deficient (orange) samples. (b) Nucleosome density for the 12 base substitution signatures (note the degree of variation between substitution signatures relative to aggregated substitutions in a). The grey line shows the distribution predicted by simulations if mutations from each signature were randomly distributed. Most of the observed distributions show trends similar to those expected from simulations, apart from signatures 17, 18 and 26 and, to a lesser extent, signatures 5 and 8.
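The averaging rule stated in this caption (total signal at each offset divided by the number of mutations contributing at that offset) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function name is hypothetical, the signal is taken to be a per-base list for a single chromosome, and positions are 0-based.

```python
from typing import List, Sequence


def average_signal_profile(
    signal: Sequence[float],
    mutation_positions: Sequence[int],
    flank: int = 1000,
) -> List[float]:
    """Average a per-base signal in a window centred on each mutation.

    With flank=1000 the window spans roughly 2 kb. Index `flank` of the
    result corresponds to position 0 (the mutation itself). Offsets that
    fall outside the chromosome do not contribute; an offset with no
    contributing mutation is reported as NaN.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    counts = [0] * width
    for pos in mutation_positions:
        for offset in range(-flank, flank + 1):
            idx = pos + offset
            if 0 <= idx < len(signal):
                totals[offset + flank] += signal[idx]
                counts[offset + flank] += 1
    # Total signal at each offset divided by the number of contributing mutations.
    return [t / c if c else float("nan") for t, c in zip(totals, counts)]
```

A signature enriched at nucleosomes would show a peak at the centre of the returned profile, and one enriched at linker DNA a trough, when compared with the same profile computed on simulated, randomly placed mutations.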
Figure 4. A replication-related model of mutagenesis for putative APOBEC-related signatures 2 and 13.
1. During replication, transient increases in the availability of single-stranded DNA (ssDNA) could occur (for example, through uncoupling between leading and lagging replicative strands or delays in elongation of the nascent lagging strand by Okazaki fragments), exposing ssDNA to APOBEC deamination, potentially over long genomic tracts.
2. Uracil-N-glycosylase (UNG) acts to remove undesirable uracils, leaving a trail of abasic sites in its wake. The mutational processes diverge from this point.
3A. Earlier in replication, error-prone translesion polymerases such as REV1 have been postulated to insert cytosines opposite abasic sites to avoid detrimental replication fork stalling or collapse.
4A. The final outcome is stretches of successive C>G transversions at a TpC sequence context, characteristic of signature 13.
3B. Alternatively, uracils and abasic sites that are not fixed via REV1 undergo contingency processing, for example, the 'A' rule.
4B. The final outcome is C>T mutations at a TpC sequence context.