Angela McCann, James A Cotton, James O McInerney.
Abstract
BACKGROUND: In the past decade or more, the emphasis for reconstructing species phylogenies has moved from the analysis of a single gene to the analysis of multiple genes and even completed genomes. The simplest method of scaling up is to use familiar analysis methods on a larger scale, and this is the most popular approach. However, duplications and losses of genes, along with horizontal gene transfer (HGT), can lead to a situation where there is only an indirect relationship between gene and genome phylogenies. In this study we examine five widely used approaches and their variants to see whether they are indeed more or less saying the same thing. In particular, we focus on Conditioned Reconstruction (CR), as it is a method that is designed to work well even if HGT is present.
Year: 2008 PMID: 19014489 PMCID: PMC2592249 DOI: 10.1186/1471-2148-8-312
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. (a) A PCA of the variation between the distance matrices. Each point is named after either the phylogenetic method used to create the matrix or, in the case of CR, the conditioning genome used. The size of each point is proportional to the size of the conditioning genome used. (b) A PCA of the 100 bootstrap replicates from each of the gene-content methods and the 22 variants of CR using one genome. (c) A matrix produced using Methanosarcina acetivorans as the conditioning genome. The columns representing Methanococcus jannaschii and Methanosarcina mazei in the distance matrix have been plotted. The dots on the plot represent the distance between the labeled genome and Mco. jannaschii (plotted on the x-axis) and Methanosarcina mazei (plotted on the y-axis).
Figure 2. (a) Phylogenetic tree constructed using the CR algorithm with only one conditioning genome chosen, in this case The tree is rooted on four members of the Crenarchaeota. (b) Phylogenetic tree constructed using Avg CR, i.e. every genome in the analysis acts as the conditioning genome.
Figure 3. Robinson-Foulds distances between each of the 100 bootstrap replicates for a method and the sets of 100 bootstrap replicates for all the methods. The sequence-based versus sequence-based comparisons are coloured black, the gene-content versus gene-content comparisons grey, and the gene-content versus sequence-based comparisons are left blank. Each asterisk represents the Robinson-Foulds distance between the unpermuted datasets.
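For readers unfamiliar with the metric in Figure 3, the following is a minimal sketch of how a Robinson-Foulds distance between two trees on the same set of genomes can be computed. It is illustrative only and is not the implementation used in the study: the nested-tuple tree representation and the function names `leaf_set`, `splits`, and `robinson_foulds` are assumptions made for this example. Trees are treated as unrooted, and the distance is the number of non-trivial bipartitions present in one tree but not the other.

```python
def leaf_set(tree):
    """Return the set of leaf names under a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions of a tree, treated as unrooted.

    Each bipartition is stored as the side that does NOT contain a fixed
    anchor leaf, so the same split is always represented the same way
    regardless of where the tree happens to be rooted.
    """
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if anchor in clade else clade
        # Skip trivial splits (a single leaf against everything else).
        if 2 <= len(side) <= len(all_leaves) - 2:
            found.add(side)
        return clade

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Count bipartitions found in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon trees written as nested tuples.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
rf = robinson_foulds(tree_1, tree_2)
```

Applying such a function to every pair of bootstrap replicate trees from two methods would give the kind of distance distribution summarised in the figure; established phylogenetics libraries provide tested implementations that should be preferred for real analyses.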