Eliran Avni, Sagi Snir.
Abstract
BACKGROUND: Deciphering the history of life on Earth has long been regarded as one of the most central tasks in biology. In past years, widespread discordance between the evolutionary histories of different groups of orthologous genes of prokaryotes has been revealed, primarily due to horizontal gene transfers (HGTs). Nonetheless, evidence that supports a strong tree-like signal of evolution has been uncovered, despite the presence of HGT events. Therefore, a challenging task is to distill this tree-like signal from the noise induced by all sources of non-tree-like events.
Keywords: Horizontal gene transfer; Phylogenetic reconstruction; Prokaryotic evolution; Quartet plurality; Supertree reconstruction
Year: 2018 PMID: 30367577 PMCID: PMC6101080 DOI: 10.1186/s12864-018-4921-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1. Plurality Qfit and average (single tree) Qfit scores, for three values of λ and a constant n=100. We note three important features: (a) the plurality Qfit may be much greater than the average Qfit; (b) the plurality Qfit is an increasing function of the number of gene trees m; (c) the plurality Qfit is a decreasing function of the HGT rate λ.
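The relationship between plurality and single-tree scores can be illustrated with a minimal sketch. It assumes that Qfit is the fraction of four-taxon subsets on which two sources induce the same quartet topology, and that the plurality quartet for a subset is the topology supported by the most gene trees. The tree encoding (nested tuples) and all function names (`clades`, `quartet_topology`, `plurality_quartets`, `qfit`) are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def clades(tree):
    """Return (leaf set of tree, leaf sets of all its subtrees).

    A tree is a leaf label or a tuple of subtrees (assumed encoding).
    """
    if not isinstance(tree, tuple):
        leaves = frozenset([tree])
        return leaves, [leaves]
    all_leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        all_leaves |= child_leaves
        found.extend(child_clades)
    found.append(all_leaves)
    return all_leaves, found


def quartet_topology(tree, four):
    """Return the split {pair, other pair} that tree induces on four taxa.

    Returns None if the tree lacks one of the taxa or leaves the
    quartet unresolved.
    """
    four = frozenset(four)
    all_leaves, subtree_sets = clades(tree)
    if not four <= all_leaves:
        return None
    for leaf_set in subtree_sets:
        pair = leaf_set & four
        if len(pair) == 2:
            return frozenset([pair, four - pair])
    return None


def plurality_quartets(gene_trees, taxa):
    """For each four-taxon subset, keep the most frequent topology."""
    result = {}
    for four in combinations(sorted(taxa), 4):
        votes = Counter(quartet_topology(t, four) for t in gene_trees)
        votes.pop(None, None)  # ignore trees with no resolved topology
        if votes:
            result[four] = votes.most_common(1)[0][0]
    return result


def qfit(tree, quartets):
    """Fraction of the given quartets whose topology tree also induces."""
    agree = sum(
        1 for four, topo in quartets.items()
        if quartet_topology(tree, four) == topo
    )
    return agree / len(quartets)


# Toy data: two gene trees follow the species tree, one does not.
species = ((("a", "b"), "c"), ("d", "e"))
transferred = ((("a", "c"), "b"), ("d", "e"))
gene_trees = [species, species, transferred]
plurality = plurality_quartets(gene_trees, "abcde")
print(qfit(species, plurality))
```

In this toy setting, a single-tree score can be obtained by calling `plurality_quartets` on a one-element list, which makes it easy to compare the average of single-tree scores with the score of the plurality quartets.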
Theoretical and empirical values of maximum “good” λ’s
| Number of leaves | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|
| Theoretical maximum “good” λ | 0.07 | 0.06 | 0.06 | 0.04 | 0.05 | 0.05 | 0.04 | 0.05 | 0.06 | 0.04 |
| Empirical maximum “good” λ | 1.0 | 0.6 | 0.7 | 0.5 | 0.6 | 0.5 | 0.5 | 0.4 | 0.7 | 0.4 |
We define the maximum “good” λ as the maximum value of λ that enables us to reconstruct the species tree accurately. Of course, the exact values of the “good” λ’s may vary from one species tree to another. As one can see, the empirical values of the maximum “good” λ’s are much higher than the theoretical ones dictated by (5), by approximately one order of magnitude.
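The "one order of magnitude" gap can be checked directly from the table. The sketch below assumes, following the caption, that the row with the smaller values holds the theoretical bounds and the row with the larger values holds the empirical ones; the variable names are illustrative only.

```python
# Values copied from the table; row assignment is an assumption
# based on the caption (empirical values are the larger ones).
leaves = list(range(10, 101, 10))
theoretical = [0.07, 0.06, 0.06, 0.04, 0.05, 0.05, 0.04, 0.05, 0.06, 0.04]
empirical = [1.0, 0.6, 0.7, 0.5, 0.6, 0.5, 0.5, 0.4, 0.7, 0.4]

# Ratio of empirical to theoretical maximum "good" lambda per tree size.
ratios = [e / t for e, t in zip(empirical, theoretical)]

for n, r in zip(leaves, ratios):
    print(f"n={n:3d}: empirical/theoretical = {r:.1f}")

print(f"smallest ratio: {min(ratios):.1f}, largest ratio: {max(ratios):.1f}")
```

Under that row assignment, every ratio falls between roughly 8 and 14, which is consistent with the caption's statement.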
A summary of the properties of the different characters of the suggested phylogenies
| | Classification by phylum | Classification by order |
|---|---|---|
| QP1 tree | Tree is not perfect. One leaf misplaced | Tree is perfect |
| COG tree | Tree is not perfect. Eight leaves misplaced | Tree is not perfect. One leaf misplaced |
| Ribosomal protein tree | Tree is perfect | Tree is perfect |
A tree can either be perfect (i.e., induce a perfect separation with respect to the relevant character) or not, in which case the number of taxa that need to be replaced in order to make the tree perfect is indicated. We see that the ribosomal protein tree is perfect with respect to both characters, and that the QP1 tree is perfect apart from one misplaced taxon.
A summary of the average Qfit and RF similarity scores between the three suggested phylogenies and the underlying gene pool
| | Average Qfit | Average RF |
|---|---|---|
| QP1 tree | 0.48 | 0.42 |
| COG tree | 0.47 | 0.39 |
| Ribosomal protein tree | 0.48 | 0.42 |
For each suggested phylogeny, the Qfit and RF similarity scores between it and each gene tree in the gene pool were calculated, and the averages of these scores are presented. We see that the scores of the QP1 tree and the ribosomal protein tree are the highest.
A summary of the properties of the different characters of the suggested phylogenies
| | Classification by phylum | Classification by order |
|---|---|---|
| QP2 tree | Tree is perfect | Tree is not perfect. Three leaves misplaced |
| 16S tree | Tree is perfect | Tree is not perfect. Two leaves misplaced |
| Synteny tree | Tree is not perfect. One leaf misplaced | Tree is not perfect. Three leaves misplaced |
A tree can either be perfect (i.e., induce a perfect separation with respect to the relevant character) or not, in which case the number of taxa that need to be replaced in order to make the tree perfect is indicated. We see that no single tree is perfect with respect to order, but the QP2 tree and the 16S tree are perfect with respect to phylum.
A summary of the average Qfit and RF similarity scores between the three suggested phylogenies and the underlying gene pool
| | Average Qfit | Average RF |
|---|---|---|
| QP2 tree | 0.69 | 0.47 |
| 16S tree | 0.66 | 0.39 |
| Synteny tree | 0.67 | 0.42 |
For each suggested phylogeny, the Qfit and RF similarity scores between it and each gene tree in the gene pool were calculated, and the averages of these scores are presented. We see that the QP2 tree receives the highest average similarity scores.