David Williams, Gregory P Fournier, Pascal Lapierre, Kristen S Swithers, Anna G Green, Cheryl P Andam, J Peter Gogarten.
Abstract
Phylogenetic reconstruction using DNA and protein sequences has allowed the reconstruction of evolutionary histories encompassing all life. We present and discuss a means to incorporate much of this rich narrative into a single model that acknowledges the discrete evolutionary units that constitute the organism. Briefly, this Rooted Net of Life genome phylogeny is constructed around an initial, well-resolved and rooted tree scaffold inferred from a supermatrix of combined ribosomal genes. Extant sampled ribosomes form the leaves of the tree scaffold. These leaves, but not necessarily the deeper parts of the scaffold, can be considered to represent a genome or pan-genome, and to be associated with members of other gene families within that sequenced (pan)genome. Unrooted phylogenies of gene families containing four or more members are reconstructed and superimposed over the scaffold. Initially, reticulations are formed where incongruities between topologies exist. Given sufficient evidence, edges may then be differentiated as those representing vertical lines of inheritance within lineages and those representing horizontal genetic transfers or endosymbioses between lineages.
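The superposition step can be illustrated with a small split-compatibility check. This is a minimal sketch, not the authors' implementation: it assumes each tree is summarized as a collection of bipartitions (one side of each internal edge, stored as a `frozenset` of leaf names), and the function names `is_compatible` and `candidate_reticulations` are hypothetical. A gene-family split that cannot coexist with some scaffold split marks a place where a reticulation would initially be drawn.

```python
def is_compatible(split_a, split_b, taxa):
    """Two bipartitions fit on one tree iff some pair of their sides is disjoint."""
    a = frozenset(split_a) & taxa
    b = frozenset(split_b) & taxa
    return any(not (x & y) for x in (a, taxa - a) for y in (b, taxa - b))


def candidate_reticulations(scaffold_splits, gene_splits, family_taxa):
    """Return gene-family splits that conflict with at least one scaffold split.

    Scaffold splits are restricted to the leaves present in the gene family.
    Families with fewer than four members are skipped, because an unrooted
    tree on three or fewer leaves carries no topological information.
    """
    taxa = frozenset(family_taxa)
    if len(taxa) < 4:
        return []
    return [
        frozenset(g)
        for g in gene_splits
        if any(not is_compatible(g, s, taxa) for s in scaffold_splits)
    ]


# Toy data (assumed for illustration): scaffold ((A,B),C,(D,E)) and a gene
# family tree ((A,B),E,(C,D)).
scaffold = [frozenset("AB"), frozenset("ABC")]
gene_family = [frozenset("AB"), frozenset("CD")]
conflicts = candidate_reticulations(scaffold, gene_family, "ABCDE")
```

Here the split separating C and D from the rest conflicts with the scaffold and would be flagged, while the A and B grouping agrees with it. Deciding whether a flagged edge reflects horizontal transfer rather than reconstruction error requires the additional evidence the abstract refers to.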
Year: 2011 PMID: 21936906 PMCID: PMC3189188 DOI: 10.1186/1745-6150-6-45
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Figure 1. Phylogenetic tree used to simulate genome evolution including a directed highway of gene sharing. Two different trees were tested, one having a slightly longer internal branch of 0.05 substitutions per site compared to the other tree with only 0.01 substitutions per site. Genome B' was used as a donor for genes transferred into the lineage leading to genome F. Genome B' was not included in the phylogenetic reconstruction, and genes from genome B' were used as replacements for their orthologs in genome F. The simulations were repeated with increasing numbers of transfers from genome B' to F. The genome sequences were generated using Evolver from the PAML package [113]. Each simulated genome contained a total of 100 genes, each 300 amino acids long.
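The replacement step described in this caption can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: genomes are modeled as ordered lists of gene sequences with orthologs at matching positions, the function name `transfer_genes` is hypothetical, and the placeholder strings stand in for sequences that would in practice come from Evolver.

```python
import random


def transfer_genes(recipient, donor, n_transfers, seed=None):
    """Replace n_transfers randomly chosen recipient genes with donor orthologs.

    Both genomes are lists of sequences in the same gene-family order.
    Returns the mosaic genome and the sorted indices that were replaced.
    """
    if len(recipient) != len(donor):
        raise ValueError("genomes must contain the same ordered gene families")
    if not 0 <= n_transfers <= len(recipient):
        raise ValueError("n_transfers must be between 0 and the genome size")
    rng = random.Random(seed)
    replaced = sorted(rng.sample(range(len(recipient)), n_transfers))
    mosaic = list(recipient)  # copy so the original genome is left untouched
    for idx in replaced:
        mosaic[idx] = donor[idx]
    return mosaic, replaced


# Placeholder genomes (assumed): 100 genes of 300 residues each, as in the
# caption, but filled with dummy characters instead of simulated sequences.
genome_f = ["F" * 300 for _ in range(100)]
genome_b_prime = ["B" * 300 for _ in range(100)]
mosaic_f, replaced = transfer_genes(genome_f, genome_b_prime, n_transfers=20, seed=1)
```

Sweeping `n_transfers` from 0 upward would reproduce the "increasing numbers of transfers" axis of the experiment; the donor genome itself is then left out of tree reconstruction.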
Figure 2. Comparison of supermatrix and supertree approaches for recovering the correct tree following horizontal genetic transfer. Horizontal genetic transfer was simulated between lineage B' and F (Figure 1) with an internal branch of 0.05 (A) or 0.01 substitutions per site (B). The frequency with which the correct tree is recovered by the supermatrix and supertree approaches was tested on data that include increasing numbers of genes transferred along a single highway of gene sharing. Each simulated genome contained a total of 100 genes, each 300 amino acids long. Genes were concatenated into a single sequence from each simulated genome for the supermatrix tree calculation; alternatively, gene trees were calculated individually from each gene for the supertree approach. The sequences were not realigned, to avoid any additional artifacts potentially introduced by alignment algorithms. Neighbor-joining trees were calculated with Kimura correction in ClustalW version 2.0.12 [114]. Maximum likelihood trees were calculated with PhyML V.3.0 [115] with the JTT model, Pinvar, and an estimated gamma distribution with 4 categories. The embedded quartet trees [116] as well as the resulting plurality trees (supertree) were calculated from the individual gene family trees using Quartet Suite v.1.0 [117]. The simulations were repeated 100 times to measure the reproducibility of the different tree reconstruction methods in recovering the original tree topology.
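The two data treatments compared in this caption can be contrasted in a short sketch. It is a simplified stand-in and does not reproduce Quartet Suite or its interface: gene trees are assumed to be given as sets of bipartitions (one side per internal edge), alignments as dictionaries mapping genome names to sequences, and the names `concatenate_genes`, `quartet_topology` and `plurality_quartets` are hypothetical. The supermatrix route joins all genes per genome before tree inference, whereas the quartet route lets each gene tree vote on every four-taxon topology.

```python
from collections import Counter
from itertools import combinations


def concatenate_genes(gene_alignments, genomes):
    """Supermatrix input: join each genome's aligned genes into one sequence."""
    return {g: "".join(aln[g] for aln in gene_alignments) for g in genomes}


def quartet_topology(splits, quartet):
    """Return the 2|2 pairing a tree induces on four taxa, or None if unresolved."""
    q = frozenset(quartet)
    for side in splits:
        inside = side & q
        if len(inside) == 2:
            return frozenset([inside, q - inside])
    return None


def plurality_quartets(gene_trees, taxa):
    """For every quartet, keep the topology supported by the most gene trees."""
    winners = {}
    for quartet in combinations(sorted(taxa), 4):
        votes = Counter()
        for splits in gene_trees:
            topology = quartet_topology(splits, quartet)
            if topology is not None:
                votes[topology] += 1
        if votes:
            winners[quartet] = votes.most_common(1)[0]  # (topology, vote count)
    return winners


# Toy data (assumed): two gene trees follow ((A,B),C,(D,E)); one gene tree
# groups A with C instead, as a transferred gene might.
tree_vertical = {frozenset("AB"), frozenset("DE")}
tree_transfer = {frozenset("AC"), frozenset("DE")}
winners = plurality_quartets([tree_vertical, tree_vertical, tree_transfer], "ABCDE")
supermatrix = concatenate_genes(
    [{"A": "MK", "B": "MR"}, {"A": "LL", "B": "LV"}], ["A", "B"]
)
```

In this toy case the quartet on A, B, C and D is resolved as A with B against C with D by two votes to one, so a minority of conflicting gene trees does not change the plurality signal. Assembling the winning quartets into a full supertree is a further step that this sketch leaves out.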