Tom R Booker, Benjamin C Jackson, Peter D Keightley.
Abstract
Population geneticists have long sought to understand the contribution of natural selection to molecular evolution. A variety of approaches have been proposed that use population genetics theory to quantify the rate and strength of positive selection acting in a species' genome. In this review we discuss methods that use patterns of between-species nucleotide divergence and within-species diversity to estimate positive selection parameters from population genomic data. We also discuss recently proposed methods to detect positive selection from a population's haplotype structure. The application of these tests has resulted in the detection of pervasive adaptive molecular evolution in multiple species.
Year: 2017 PMID: 29084517 PMCID: PMC5662103 DOI: 10.1186/s12915-017-0434-y
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Fig. 1. Selective sweeps and background selection. Maynard Smith and Haigh [79] showed that as an advantageous mutation rises in frequency it drags with it linked neutral polymorphisms. With increasing genetic distance from the selected site, the effect is reduced, resulting in troughs in genetic diversity in surrounding regions. a Hard/classic sweeps - the best-studied model of sweeps. A new advantageous mutation rapidly increases in frequency to eventual fixation. As it sweeps, the adaptive allele carries with it a portion of the haplotype on which it arose, reducing levels of neutral diversity in the surrounding area [27, 79]. b-c Soft sweeps - a neutral allele segregating in a population may become favoured (due, for example, to a change in the environment). b The segregating allele may be associated with multiple haplotypes, and as it rises in frequency, so do the multiple haplotypes. c A similar process, also termed a soft sweep, can occur if an advantageous mutation arises by multiple, distinct mutation events. See [66] for a thorough review of soft sweep models. d Incomplete/partial sweeps - if an advantageous allele increases in frequency, but does not reach fixation, there will still be some loss of linked neutral diversity. In this review, we use the term incomplete sweeps to describe sweeps that are polymorphic at the time of sampling, but may (or may not) eventually reach fixation (panel a). The term partial sweep describes the situation wherein a sweeping allele becomes effectively neutral at a certain frequency in its trajectory (panel d). The magnitude of both processes’ effects on linked neutral diversity depends on the frequency reached by the sweeping allele when selection is ‘turned off’ or on the time of sampling [33]. Partial sweeps may be common in cases of adaptation involving selection on quantitative traits [67]. e Background selection - as natural selection purges deleterious mutations, neutral alleles linked to the selected locus are also lost.
The process of background selection is qualitatively similar to recurrent selective sweeps, since both processes reduce local genetic diversity [80] and skew the SFS towards rare variants [81, 82]. Models of background selection envisage a neutral site linked to many functional sites at different distances, such that the effects of selection at many sites accumulate to reduce diversity [83, 84]. Blue circles represent neutral alleles; red, yellow and orange circles represent advantageous alleles; and red squares represent deleterious alleles.
MK table for the Adh gene [3] showing numbers of fixed differences and polymorphic sites between and within D. melanogaster, D. simulans and D. yakuba
| | Differences | Polymorphism |
|---|---|---|
| Nonsynonymous | 7 | 2 |
| Synonymous | 17 | 42 |
Note that the ratio of fixed nonsynonymous to synonymous differences (7/17) is substantially higher than the ratio of nonsynonymous to synonymous polymorphisms (2/42), suggesting that some of the amino acid differences are adaptive.
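The comparison in the note can be made concrete with a short calculation. The sketch below is illustrative only and is not taken from the review: it runs a two-sided Fisher's exact test on the 2x2 table and computes two commonly used summaries, the neutrality index and the simple estimator of the proportion of adaptive substitutions, alpha = 1 - (Ds x Pn)/(Dn x Ps). The function names `fisher_exact_two_sided` and `mk_summary` are hypothetical, and the choice of Fisher's exact test (rather than, say, a G-test) is an assumption made for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # given the fixed row and column totals.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one.
    return sum(
        prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9)
    )


def mk_summary(dn, ds, pn, ps):
    """Summaries of an MK table (hypothetical helper for illustration)."""
    neutrality_index = (pn / ps) / (dn / ds)
    alpha = 1 - (ds * pn) / (dn * ps)  # equals 1 - neutrality_index
    p_value = fisher_exact_two_sided(dn, pn, ds, ps)
    return {"NI": neutrality_index, "alpha": alpha, "p": p_value}


# Counts from the Adh table above.
adh = mk_summary(dn=7, ds=17, pn=2, ps=42)
print(adh)
```

A neutrality index below 1 (equivalently, a positive alpha) corresponds to the excess of nonsynonymous divergence described in the note. This simple alpha estimator ignores slightly deleterious polymorphisms, so it is best read as a rough summary of the table rather than a refined estimate.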
Fig. 2. The site frequency spectrum. The numbers of variants segregating at different frequencies in a population can be summarized as the site frequency spectrum (SFS). Consider the ten chromosome samples shown in a. Observations of a particular minor allele frequency are used to populate the folded SFS b. ‘Unfolding’ the SFS requires knowledge of whether alleles are ancestral or derived. Aligning sequenced data to an outgroup (the blue nucleotides in a) allows the inference of ancestral and derived states for polymorphic and diverged sites, by maximum parsimony. However, the parsimony approach makes a number of biologically unrealistic assumptions; for example, that there have been no mutations in the lineage leading to the outgroup. For this reason, a number of alternative approaches have been proposed that have been shown to be more accurate than parsimony (e.g. [15]). Various evolutionary processes can alter the SFS, including directional and balancing selection, gene conversion, population size change and migration. For example, purifying selection prevents harmful variants from rising in frequency, resulting in a skew in the SFS towards rare variants. Multiple statistics have been proposed to summarize both the folded and unfolded SFS, and these can shed light on the evolutionary process (reviewed in [4]).
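The construction described in this caption can be sketched in a few lines. The example below is a minimal illustration, not the figure's actual data: the six haplotypes and the outgroup sequence are made up, the function names `folded_sfs` and `unfolded_sfs` are hypothetical, and ancestral states are assigned by the simple parsimony rule (the outgroup allele is taken as ancestral), with all the caveats about parsimony noted in the caption.

```python
from collections import Counter


def folded_sfs(haplotypes):
    """Folded SFS: entry i counts biallelic sites whose minor allele count is i."""
    n = len(haplotypes)
    sfs = [0] * (n // 2 + 1)
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) != 2:
            continue  # skip monomorphic and multiallelic sites
        sfs[min(counts.values())] += 1
    return sfs


def unfolded_sfs(haplotypes, outgroup):
    """Unfolded SFS: entry i counts sites whose derived allele count is i.

    Assumes the outgroup allele is ancestral (maximum parsimony with a
    single outgroup); sites where the outgroup matches neither allele
    are skipped.
    """
    n = len(haplotypes)
    sfs = [0] * n
    for column, ancestral in zip(zip(*haplotypes), outgroup):
        counts = Counter(column)
        if len(counts) != 2 or ancestral not in counts:
            continue
        sfs[n - counts[ancestral]] += 1
    return sfs


# Hypothetical toy alignment: six sampled chromosomes and one outgroup.
sample = ["AACGT", "AACGT", "ATCGT", "ATCGA", "ATGGA", "TTGGA"]
outgroup = "AACGT"

print(folded_sfs(sample))              # index = minor allele count
print(unfolded_sfs(sample, outgroup))  # index = derived allele count
```

In this toy sample the second site has a minor allele count of 2 but a derived allele count of 4, which shows how folding discards the distinction between low-frequency and high-frequency derived variants.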