Jessica Hedge, Daniel J Wilson.
Abstract
Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers through the fundamentals underpinning popular methods for measuring selection in pathogens. These methods are transferable to a wide variety of organisms, and the exercises provided are designed for researchers with any level of programming experience.
Year: 2016 PMID: 26867134 PMCID: PMC4750996 DOI: 10.1371/journal.pcbi.1004739
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Phylogenetic tree reconstruction and evolutionary rate estimation.
A phylogenetic tree comprises a collection of branches that connect sampled sequences at the tips (called taxa) with the most recent common ancestor of the sample. The point where a pair of branches joins is called a node. The lengths of these branches represent the evolutionary distance between the sequences at either end, usually measured in substitutions per site, which can be read off using the scale bar. The length of the vertical branches and the rotation of branches around each node are arbitrary. The tree can be rooted using a divergent sequence (called an outgroup) (a), in which case the direction of substitutions can be inferred and each node represents the common ancestor of all descendant nodes and taxa. The node furthest from the tips is called the root. The tree can also be left unrooted and displayed radially (b) (tip labels have been omitted for visual clarity). Assuming the phylogeny has been rooted correctly, linear regression analysis can be used to test for a signal of a molecular clock by plotting the sampling time of each sequence against its evolutionary distance from the root of the tree. If the test is significant (c), the slope of the regression line (red) can provide an estimate of the evolutionary rate. The lack of any temporal signal (d) may occur if insufficient time has passed for substitutions to accumulate or if the molecular clock has been violated (for example, due to selection, recombination, or hypermutation).
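The root-to-tip regression described in panels (c) and (d) can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's own exercise code: the function name `root_to_tip_regression` and its inputs are hypothetical, and it assumes the root-to-tip distances have already been extracted from a correctly rooted tree. It fits an ordinary least-squares line and reports the slope (the rate estimate) and R², but omits the significance test.

```python
def root_to_tip_regression(sampling_times, root_to_tip_distances):
    """Least-squares fit of root-to-tip distance on sampling time.

    sampling_times: sampling date of each tip (e.g., in years).
    root_to_tip_distances: distance of each tip from the root,
        in substitutions per site (assumed precomputed from a rooted tree).
    Returns the slope (substitutions per site per unit time),
    the intercept, and R^2.
    """
    n = len(sampling_times)
    if n != len(root_to_tip_distances) or n < 3:
        raise ValueError("need at least three paired observations")

    mean_t = sum(sampling_times) / n
    mean_d = sum(root_to_tip_distances) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((t - mean_t) ** 2 for t in sampling_times)
    syy = sum((d - mean_d) ** 2 for d in root_to_tip_distances)
    sxy = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(sampling_times, root_to_tip_distances))
    if sxx == 0:
        raise ValueError("all sequences share one sampling time")

    slope = sxy / sxx                      # estimated evolutionary rate
    intercept = mean_d - slope * mean_t
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return {"slope": slope, "intercept": intercept, "r_squared": r_squared}


# Made-up values for illustration only (not data from the tutorial).
fit = root_to_tip_regression([2000, 2002, 2004, 2006],
                             [0.001, 0.003, 0.005, 0.007])
print(fit["slope"], fit["r_squared"])
```

A slope near zero or a low R² would correspond to the situation in panel (d), where no temporal signal is detected.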
Fig 2. Detecting selection from microbial sequence data.
The phylogeny shows the evolutionary history of 20 sequences sampled evenly from four divergent populations. dN/dS methods test for selection by comparing the rates of non-synonymous and synonymous substitution occurring between divergent lineages (i.e., only substitutions that have occurred on the black branches) with those expected under neutrality. In contrast, the McDonald-Kreitman test for selection compares the ratio of non-synonymous to synonymous polymorphisms present within populations (due to substitutions occurring on red branches) with the ratio of non-synonymous to synonymous fixed differences present between populations (due to substitutions occurring on black branches). The phylogeny can also be used to detect selection by identifying parallel evolution, whereby recurrent mutations occur at a site or across a gene during the evolutionary history of a sample (for example, substitution X on the phylogeny).
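The dN/dS comparison is conventionally summarised as a single ratio. The notation below is an assumption added for illustration and does not appear in the caption: \(d_N\) denotes non-synonymous substitutions per non-synonymous site, \(d_S\) denotes synonymous substitutions per synonymous site, and \(\omega\) is their ratio.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying (negative) selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive (diversifying) selection}
\end{cases}
```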
Two-way contingency table used in the McDonald-Kreitman test.
| | Fixed differences | Polymorphisms |
|---|---|---|
| Synonymous mutations | | |
| Non-synonymous mutations | | |
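A test on this two-way table can be sketched as follows. The record does not say which statistical test to apply to the table, so the use of a two-sided Fisher's exact test here is an assumption, as are the function name `mk_test` and the count names `ds`, `ps`, `dn`, and `pn` (synonymous and non-synonymous fixed differences and polymorphisms). The sketch uses only the Python standard library.

```python
from math import comb


def mk_test(ds, ps, dn, pn):
    """McDonald-Kreitman style comparison on a 2x2 table of counts.

    ds, dn: synonymous / non-synonymous fixed differences.
    ps, pn: synonymous / non-synonymous polymorphisms.
    Returns the two non-synonymous:synonymous ratios and a two-sided
    Fisher's exact p-value (the choice of test is an assumption here).
    """
    if min(ds, ps, dn, pn) < 0:
        raise ValueError("counts must be non-negative")

    syn_total = ds + ps          # row total: synonymous
    nonsyn_total = dn + pn       # row total: non-synonymous
    fixed_total = ds + dn        # column total: fixed differences
    n = syn_total + nonsyn_total

    def prob(a):
        # Hypergeometric probability of `a` synonymous fixed differences
        # given the observed row and column totals.
        return (comb(syn_total, a) * comb(nonsyn_total, fixed_total - a)
                / comb(n, fixed_total))

    lowest = max(0, fixed_total - nonsyn_total)
    highest = min(syn_total, fixed_total)
    p_observed = prob(ds)
    # Two-sided p-value: sum over all tables no more probable than the
    # observed one (small tolerance guards against rounding error).
    p_value = sum(p for p in (prob(a) for a in range(lowest, highest + 1))
                  if p <= p_observed * (1 + 1e-7))

    return {
        "fixed_ratio": dn / ds if ds else None,        # between populations
        "polymorphic_ratio": pn / ps if ps else None,  # within populations
        "p_value": min(p_value, 1.0),
    }


# Made-up counts for illustration only (not data from the tutorial).
print(mk_test(ds=3, ps=1, dn=1, pn=3))
```

Under neutrality the two ratios are expected to be similar, so a small p-value together with a marked difference between them would be read as evidence of selection.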