Monika Zavodna1, Andrew Bagshaw2, Rudiger Brauning3, Neil J Gemmell1,4.
Abstract
Short tandem repeats (STR) are ubiquitous components of the genomic architecture of most living organisms. Recent work has highlighted the widespread functional significance of such repeats, particularly around gene regulation, but the mutational processes underlying the evolution of these highly abundant and highly variable sequences are not fully understood. Traditional models assume that strand misalignment during replication is the predominant mechanism, but empirical data suggest the involvement of other processes including recombination and transcription. Despite this evidence, the relative influences of these processes have not previously been tested experimentally on a genome-wide scale. Using deep sequencing, we identify mutations at >200 microsatellites, across 700 generations in replicated populations of two otherwise identical sexual and asexual Saccharomyces cerevisiae strains. Using generalized linear models, we investigate correlates of STR mutability including the nature of the mutation, STR composition and contextual factors including recombination, transcription and replication origins. Sexual capability was not a significant predictor of microsatellite mutability, but, intriguingly, we identify transcription as a significant positive predictor. We also find that STR density is substantially increased in regions neighboring, but not within, recombination hotspots.
Year: 2018 PMID: 29300948 PMCID: PMC5814968 DOI: 10.1093/nar/gkx1253
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Outline of the workflow for this study.
Table 1. Distribution of microsatellite mutations between sexual and asexual strains
| Mutated | Asexual | Sexual |
|---|---|---|
| No | 943 | 850 |
| Yes | 346 | 330 |
Numbers refer to potential sites of mutation (microsatellites × populations).
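As a quick arithmetic check on the counts above, the sketch below computes the proportion of potential sites that mutated in each strain, plus a Pearson chi-square statistic for the 2 × 2 table. The chi-square test is an illustration only; it is not the analysis reported in the study, which used generalized linear models. The function and variable names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Counts copied from the table above (microsatellites x populations).
not_mutated = {"asexual": 943, "sexual": 850}
mutated = {"asexual": 346, "sexual": 330}

# Fraction of potential sites that carried a mutation, per strain.
proportion_mutated = {
    strain: mutated[strain] / (mutated[strain] + not_mutated[strain])
    for strain in mutated
}

# Rows: not mutated / mutated; columns: asexual / sexual.
chi2 = chi_square_2x2(943, 850, 346, 330)

print(proportion_mutated)
print(round(chi2, 3))
```

With these counts the proportions come out at roughly 0.27 for the asexual and 0.28 for the sexual populations, and the statistic is about 0.39, well below the 3.84 threshold for one degree of freedom at the 0.05 level.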
Table 2. Results from a generalized linear model predicting frequency of microsatellite mutation, defined as number of variant reads per total number of reads
| Predictor | Test statistic | P-value |
|---|---|---|
| Asexual/sexual | 1.4 | 0.16 |
| Copy number | 15 | <2 × 10⁻¹⁶ |
| Uniformity | 10.8 | <2 × 10⁻¹⁶ |
| Motif length | −9.6 | <2 × 10⁻¹⁶ |
| Motif GC-content | −4.9 | 9.9 × 10⁻⁷ |
| # Promoters | −3.5 | 0.00053 |
| Transcript abundance | 5.0 | 8.0 × 10⁻⁷ |
Transcript abundance was from tiling array data by Xu et al. (59). Null deviance from the model was 179 on 2168 degrees of freedom and residual deviance was 128 on 2161 degrees of freedom. The variance inflation factors were <1.3 for all predictors.
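The deviance figures quoted above can be turned into a few summary quantities with simple arithmetic. The sketch below is not from the study: it assumes the model included an intercept when inferring the number of observations, and it uses the standard relation VIF = 1 / (1 − R²) to translate the variance inflation bound into a bound on how well any one predictor is explained by the others.

```python
# Values quoted in the table note above.
null_deviance, null_df = 179, 2168
residual_deviance, residual_df = 128, 2161

# Fraction of the null deviance accounted for by the fitted predictors.
deviance_explained = 1 - residual_deviance / null_deviance

# Each fitted predictor uses one degree of freedom.
n_predictors = null_df - residual_df

# Assumption: the null model fits only an intercept, so n = null_df + 1.
n_observations = null_df + 1

# VIF = 1 / (1 - R^2), so VIF < 1.3 implies R^2 < 1 - 1/1.3 for every predictor
# regressed on the remaining predictors.
max_vif = 1.3
max_r_squared = 1 - 1 / max_vif

print(round(deviance_explained, 3), n_predictors, n_observations)
print(round(max_r_squared, 3))
```

The seven degrees of freedom match the seven predictors listed in the table, and the model accounts for a little under 30% of the null deviance.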
Table 3. Microsatellite mutability by motif
| Motif | # Loci | Proportion of loci mutated | Mean frequency |
|---|---|---|---|
| AAAT | 155 | 0 | 0 |
| AAT | 149 | 0.25 | 0.02 |
| AT | 1607 | 0.36 | 0.04 |
| AC | 153 | 0.24 | 0.04 |
| AG | 76 | 0.22 | 0.03 |
| Other | 329 | 0.012 | 0.0007 |
Motifs were grouped: for example, AC, CA, TG and GT were all called AC, and any 4-bp motif with three A's and one T, or three T's and one A, was called AAAT. 'Loci' refers to potential sites of mutation (microsatellites × populations) as in Table 1. Frequency was the number of variant reads per total number of reads.
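The grouping of AC, CA, TG and GT under one label suggests that motifs were treated as equivalent under rotation and reverse complementation. The sketch below shows one common way to implement that kind of grouping, by picking the alphabetically smallest string among all rotations of a motif and of its reverse complement. This is an assumption about the grouping rule, not the authors' code, and the function names are hypothetical.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.translate(_COMPLEMENT)[::-1]


def canonical_motif(motif):
    """Return the lexicographically smallest rotation of a motif or its reverse complement."""
    motif = motif.upper()
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        for i in range(len(strand)):
            # Rotate the repeat unit by i positions.
            candidates.append(strand[i:] + strand[:i])
    return min(candidates)


for raw in ("AC", "CA", "TG", "GT", "TTTA", "TAA", "TC"):
    print(raw, "->", canonical_motif(raw))
```

Under this rule AC, CA, TG and GT all map to AC, and TTTA maps to AAAT. Motifs whose canonical form is not one of the labels in the table would fall into the 'Other' row.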
Table 4. Factors influencing whether microsatellite mutations were insertions
| Predictor | Test statistic | P-value |
|---|---|---|
| Asexual/sexual | −2 | 0.042 |
| Copy number | 0.62 | 0.53 |
| Purity | 2.07 | 0.038 |
| Motif length | −0.82 | 0.41 |
| Motif GC-content | −2.8 | 0.004 |
| # Promoters | 0.8 | 0.43 |
| Transcript abundance | 4.1 | 4.6 × 10⁻⁵ |
Predictors were the same as for Table 2.
Figure 2. Enrichment of microsatellites in regions neighboring (A) DSB hotspots (n = 3599) and (B) recombination hotspots (n = 248). Microsatellite locations were permuted 100 times. Standard errors of the mean for this permutation per 50 bp bin were all <1.24 for the DSB hotspots and <0.19 for the recombination hotspots. Telomeric, compound and ORF repeats were excluded from the analysis.
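The caption describes a permutation baseline summarized per 50 bp bin. The sketch below illustrates the general idea under loudly stated assumptions: a single linear chromosome, hotspots reduced to midpoint coordinates, and permutation by uniform random placement. The study's actual permutation scheme and window size are not given here, so the function names, the `max_dist` window and the toy coordinates are all hypothetical.

```python
import random
from statistics import mean, stdev


def bin_counts(positions, hotspots, bin_size=50, max_dist=1000):
    """Count positions per distance bin, measured to the nearest hotspot midpoint."""
    counts = [0] * (max_dist // bin_size)
    for p in positions:
        d = min(abs(p - h) for h in hotspots)
        if d < max_dist:
            counts[d // bin_size] += 1
    return counts


def permuted_bin_stats(n_sites, chrom_length, hotspots, n_perm=100,
                       bin_size=50, max_dist=1000, seed=0):
    """Mean and standard error of per-bin counts over random placements of sites."""
    rng = random.Random(seed)
    per_perm = []
    for _ in range(n_perm):
        # Assumption: sites are placed uniformly at random along the chromosome.
        shuffled = [rng.randrange(chrom_length) for _ in range(n_sites)]
        per_perm.append(bin_counts(shuffled, hotspots, bin_size, max_dist))
    columns = list(zip(*per_perm))
    means = [mean(col) for col in columns]
    sems = [stdev(col) / n_perm ** 0.5 for col in columns]
    return means, sems


# Toy coordinates, invented for illustration only.
observed = bin_counts([100, 130, 480, 5000], hotspots=[100], max_dist=500)
means, sems = permuted_bin_stats(200, 100_000, hotspots=[20_000, 60_000], max_dist=500)
print(observed)
print(len(means), len(sems))
```

Comparing the observed per-bin counts against the permuted means, with the standard errors as a measure of the baseline's precision, is what allows enrichment near hotspots to be distinguished from what random placement would produce.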