Jessica Hedge, Daniel J Wilson.
Abstract
Phylogenetic inference in bacterial genomics is fundamental to understanding problems such as population history, antimicrobial resistance, and transmission dynamics. The field has been plagued by an apparent state of contradiction since the distorting effects of recombination on phylogeny were discovered more than a decade ago. Researchers persist with detailed phylogenetic analyses while simultaneously acknowledging that recombination seriously misleads inference of population dynamics and selection. Here we resolve this paradox by showing that phylogenetic tree topologies based on whole genomes robustly reconstruct the clonal frame topology but that branch lengths are badly skewed. Surprisingly, removing recombining sites can exacerbate branch length distortion caused by recombination.

IMPORTANCE: Phylogenetic tree reconstruction is a popular approach for understanding the relatedness of bacteria in a population from differences in their genome sequences. However, bacteria frequently exchange regions of their genomes by a process called homologous recombination, which violates a fundamental assumption of phylogenetic methods. Since many researchers continue to use phylogenetics for recombining bacteria, it is important to understand how recombination affects the conclusions drawn from these analyses. We find that whole-genome sequences afford great accuracy in reconstructing evolutionary relationships despite concerns surrounding the presence of recombination, but the branch lengths of the phylogenetic tree are indeed badly distorted. Surprisingly, methods to reduce the impact of recombination on branch lengths can exacerbate the problem.
Year: 2014 PMID: 25425237 PMCID: PMC4251999 DOI: 10.1128/mBio.02158-14
Source DB: PubMed Journal: mBio Impact factor: 7.867
FIG 1 Effects of recombination in bacteria on phylogenetic tree topology and growth rate estimates. (a) The true clonal frame (left) and ML phylogenies constructed from all sites (center) and only nonhomoplastic sites (right) representing the evolutionary history of a population of 100 bacterial genomes of 1 million base pairs. The recombination rate (ρ) and substitution rate (θ) were fixed at 1%. The number of homoplasies per branch is shown for the center tree. (b) Estimates of branch accuracy for trees reconstructed using ML, BEAST, NJ, and UPGMA at three different values of ρ. The means and standard errors are based on 1,000 simulations of a demographic model of constant population size. (c) Mean posterior estimates of the exponential growth rate parameter (g) from BEAST, averaged over analyses of 1,000 simulated data sets. Data were simulated under a demographic model of constant population size (gray), low exponential growth (blue), and high exponential growth (red) and at three different values of ρ. Error bars represent the mean 95% confidence intervals. Estimates from analyses using either all sites in the sequence alignment (filled triangles) or only those sites without homoplasies (open circles) are plotted. Black dashed horizontal lines represent the true value of the exponential growth rate parameter used in the simulations.
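The caption distinguishes trees built from all sites from trees built from only nonhomoplastic sites. As a rough illustration of what such a filter could look like, the Python sketch below flags a site as homoplastic when the minimum number of changes it requires on a given tree exceeds the number of distinct states minus one. It is a minimal sketch under stated assumptions: the use of Fitch parsimony, the nested-tuple tree representation, and the function names `fitch_changes`, `is_homoplastic`, and `nonhomoplastic_columns` are all hypothetical choices made for this example and are not taken from the article, which may identify homoplasies differently.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one site (Fitch parsimony).

    Assumed inputs (hypothetical, for illustration only):
      tree   - rooted binary tree as nested 2-tuples of leaf names,
               e.g. (("A", "B"), ("C", "D"))
      states - dict mapping each leaf name to its nucleotide at this site
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left = visit(node[0])
        right = visit(node[1])
        common = left & right
        if common:                         # children agree: no change needed
            return common
        changes += 1                       # children disagree: one change
        return left | right

    visit(tree)
    return changes


def is_homoplastic(tree, states):
    """A site with k distinct states needs at least k - 1 changes;
    any extra change implies the same state arose more than once."""
    distinct = len(set(states.values()))
    return fitch_changes(tree, states) > distinct - 1


def nonhomoplastic_columns(tree, alignment):
    """Return indices of alignment columns that are not homoplastic.

    alignment - dict mapping leaf name to an aligned sequence string.
    """
    length = len(next(iter(alignment.values())))
    keep = []
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        if not is_homoplastic(tree, column):
            keep.append(i)
    return keep


# Toy example: the last column (A, C, A, C) needs two changes on this tree
# for two states, so it is treated as homoplastic and dropped.
toy_tree = (("A", "B"), ("C", "D"))
toy_alignment = {"A": "AAGA", "B": "AAGC", "C": "ATAA", "D": "ATGC"}
kept = nonhomoplastic_columns(toy_tree, toy_alignment)
```

Note that a filter of this kind depends on the tree used to score each site, and, as the abstract reports, removing such sites does not necessarily improve branch length estimates.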