Matthew Hartfield, Thomas Bataillon.
Abstract
A major research goal in evolutionary genetics is to uncover loci experiencing positive selection. One approach involves finding 'selective sweep' patterns, which can be either 'hard sweeps' formed by de novo mutation or 'soft sweeps' arising from recurrent mutation or existing standing variation. Existing theory generally assumes outcrossing populations, and it is unclear how dominance affects soft sweeps. We consider how arbitrary dominance and inbreeding via self-fertilization affect hard and soft sweep signatures. With increased self-fertilization, these signatures are maintained over longer map distances due to reduced effective recombination and faster beneficial allele fixation times. Dominance can affect sweep patterns in outcrossers if the derived variant originates from either a single novel allele or from recurrent mutation. These models highlight the challenges in distinguishing hard and soft sweeps, and propose methods to differentiate between scenarios.
Keywords: Adaptation; Dominance; Population Genetics; Selective Sweeps; Self-fertilisation
Year: 2020 PMID: 31974096 PMCID: PMC7056974 DOI: 10.1534/g3.119.400919
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Glossary of Notation
| Symbol | Usage |
|---|---|
| N | Population size (with |
| σ | Proportion of matings that are self-fertilizing |
| F | Wright’s inbreeding coefficient, probability of identity-by-descent at a single gene, equal to σ/(2 − σ) |
| | Joint probability of identity-by-descent at two loci (Equation 1) |
| | Effective population size, equal to |
| | Recombination rate between loci |
| | ‘Effective’ recombination rate, approximately equal to |
| | Frequency at which the derived allele at |
| | Accelerated (effective) starting frequency of |
| | Selective advantage of derived allele at |
| h | Dominance coefficient of derived allele at |
| | Number of generations in the past from the present day |
| | Time in the past when derived locus became beneficial |
| | Frequency of beneficial allele at time |
| | Probability of coalescence at time |
| | Probability of recombination at time |
| | Probability of mutation at time |
| | Probability that neutral marker does not coalesce or recombine during sweep phase |
| | Probability that neutral marker recombines during sweep phase |
| | Probability that neutral marker recombines during standing phase |
| | Probability that a lineage mutates during sweep phase |
| | Probability that a lineage mutates during standing phase |
| | ‘Effective’ dominance coefficient for allele at low, high frequency |
| | Pairwise diversity at site ( |
| | Pairwise diversity following sweep from standing variation |
| | Pairwise diversity following sweep from recurrent mutation |
| | Probability of neutral mutation occurring per site per generation |
| | Probability of beneficial mutation occurring at target locus per generation |
| | Population level neutral mutation rate |
| | Population level beneficial mutation rate |
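The glossary entries for the inbreeding coefficient, the effective population size and the 'effective' recombination rate all depend on the self-fertilization rate. The following is a minimal Python sketch of how these quantities relate, under assumptions that are not stated in the text above: it uses the standard results F = σ/(2 − σ) and N_e = N/(1 + F) for partially selfing populations, and the common first-order approximation r(1 − F) for the effective recombination rate, which may differ from the exact expression used in the paper. The function names are hypothetical.

```python
def inbreeding_coefficient(sigma):
    """Wright's inbreeding coefficient F for selfing rate sigma (0 <= sigma <= 1)."""
    return sigma / (2.0 - sigma)


def effective_population_size(n, sigma):
    """Effective population size, assuming the standard form N / (1 + F)."""
    return n / (1.0 + inbreeding_coefficient(sigma))


def effective_recombination_rate(r, sigma):
    """Effective recombination rate, ASSUMING the first-order form r * (1 - F).

    This ignores the joint two-locus identity-by-descent term listed in the
    glossary, so it is only a rough guide.
    """
    return r * (1.0 - inbreeding_coefficient(sigma))


if __name__ == "__main__":
    for sigma in (0.0, 0.5, 0.95):
        print(
            sigma,
            inbreeding_coefficient(sigma),
            effective_population_size(5000, sigma),
            effective_recombination_rate(1e-4, sigma),
        )
```

Under these assumptions, complete selfing (σ = 1) halves the effective population size and removes effective recombination entirely, which is consistent with the abstract's statement that sweep signatures extend over longer map distances as selfing increases.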
Figure 1. A schematic of the model. The history of the derived variant is separated into two phases: the ‘standing phase’ (shown in light gray) and the ‘sweep phase’ (shown in dark gray). The axis on the left-hand side shows allele frequency on a log scale. Dots on the right-hand side represent a sample of haplotypes taken at the present day, with lines representing their genetic histories. Solid lines represent coalescent histories for the derived genetic background; dotted lines represent coalescent histories for the ancestral, neutral background. Note that the allele trajectory is an idealised version, as assumed in the model.
Figure 2. Examples of the effective starting frequency. Equation 8 is plotted as a function of F for different dominance values, as shown in the legend. Other parameters are , . The dashed line shows the actual starting frequency, .
Figure 3. Expected relative pairwise diversity following a selective sweep. Plots of as a function of the recombination rate scaled to population size . Lines are analytical solutions (Equation 9); points are forward-in-time simulation results. , , (note μ is scaled by N, not ), and dominance coefficient h = 0.1 (red lines, points), 0.5 (black lines, points), or 0.9 (blue lines, points). Values of and self-fertilization rates σ used are shown for the relevant row and column; note that the axis range changes with the self-fertilization rate. For we use in our model, as given by Equation 8. Further results are plotted in Section C of Supplementary File S1.
Figure 4. Beneficial allele trajectories. These were obtained by numerically evaluating the negative of Equation 4 forward in time. , , and h equals either 0.1 (red lines), 0.5 (black lines), or 0.9 (blue lines). Values of and self-fertilization rates σ used are shown for the relevant row and column. Note the different axis scales used in each panel. Further results are plotted in Section C of Supplementary File S1.
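As a rough illustration of how trajectories like those in Figure 4 can be generated, the Python sketch below iterates a deterministic allele-frequency recursion forward in time. Equation 4 is not reproduced in this section, so the recursion used here is an assumption: the standard weak-selection approximation Δp ≈ s p (1 − p) [F + h − Fh + (1 − F)(1 − 2h) p] for genotype fitnesses 1, 1 + hs and 1 + s under inbreeding coefficient F. The function names, the stopping frequency and the parameter values in the usage lines are all hypothetical.

```python
def inbreeding_coefficient(sigma):
    """Wright's inbreeding coefficient F = sigma / (2 - sigma)."""
    return sigma / (2.0 - sigma)


def allele_trajectory(p0, s, h, sigma, p_end=0.999, max_gens=10_000_000):
    """Deterministic beneficial-allele trajectory, one entry per generation.

    Uses the ASSUMED weak-selection recursion
        dp = s * p * (1 - p) * (F + h - F*h + (1 - F) * (1 - 2h) * p),
    which is a stand-in for the paper's Equation 4, not a copy of it.
    Iteration stops once the frequency reaches p_end or max_gens is exceeded.
    """
    f = inbreeding_coefficient(sigma)
    p = p0
    trajectory = [p]
    while p < p_end and len(trajectory) <= max_gens:
        dp = s * p * (1.0 - p) * (f + h - f * h + (1.0 - f) * (1.0 - 2.0 * h) * p)
        p = min(p + dp, 1.0)
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the figure.
    for sigma in (0.0, 0.95):
        for h in (0.1, 0.5, 0.9):
            gens = len(allele_trajectory(1e-4, 0.05, h, sigma)) - 1
            print(f"sigma={sigma}, h={h}: {gens} generations to reach 0.999")
```

Under this assumed recursion, a recessive allele (h = 0.1) spreads slowly at low frequency in outcrossers but much faster under high selfing, because homozygotes that expose the allele to selection form earlier.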
Figure 5. Expected site frequency spectrum, in regions flanking the adaptive mutation, following a selective sweep. Lines are analytical solutions (Equation A12 in Supplementary File S2); points are simulation results. , , , and dominance coefficient h = 0.1 (red lines, points), 0.5 (black lines, points), or 0.9 (blue lines, points). The neutral SFS is also included for comparison (gray dashed line). Values of , self-fertilization rates σ and recombination distances R are shown for the relevant row and column. Results for other recombination distances are in Section E of Supplementary File S1.
Figure 6. Comparing sweeps from recurrent mutation to those from standing variation. Top row: comparing relative diversity following a soft sweep, from either standing variation (Equation 9 with , solid lines) or recurrent mutation (using Equation 11 with , dashed lines). , , and dominance coefficient h = 0.1 (red lines), 0.5 (black lines), or 0.9 (blue lines). Bottom row: the ratio of the diversity following a sweep from standing variation to one from recurrent mutation. Parameters for each panel are as in the respective plot for the top row. The vertical dashed black line indicates (the approximate form of Equation 16); the horizontal dashed line in the bottom-row plots shows where the ratio equals 1. Note the different axis scales between left- and right-hand panels. Results are also plotted in Section F of Supplementary File S1.