Abstract
Several projects investigating genetic function and evolution through sequencing and comparison of multiple genomes are now underway. These projects consume many resources, and appropriate planning should be devoted to choosing which species to sequence, potentially involving cooperation among different sequencing centres. A widely discussed criterion for species choice is the maximisation of evolutionary divergence. Our mathematical formalization of this problem surprisingly shows that the best long-term cooperative strategy coincides with the seemingly short-term "greedy" strategy of always choosing the next best single species. Other criteria influencing species choice, such as medical relevance or sequencing costs, can also be accommodated in our approach, suggesting our results' broad relevance in scientific policy decisions.
Year: 2005 PMID: 16327885 PMCID: PMC1298936 DOI: 10.1371/journal.pgen.0010071
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Phylogenetic Scopes and Divergence of Sets of Species
(A) Phylogenetic scope comprising hypothetical species A, B, C, D, and E. Numbers are branch lengths indicating evolutionary distances (not necessarily reflecting temporal distances). The subtree connecting species B, C, and E is shown in red and has divergence 1 + 3 + 1 + 5 + 2 + 4 = 16. Applying the greedy algorithm always produces maximally divergent extensions of the original set. For example, the subsets constructed starting with B—BE (divergence 11), BCE (16), BCDE (19)—have maximum divergence among those obtainable by adding one, two, and three additional species, respectively. The series AE (12), ACE (17), ACDE (20) is optimal among all possible subsets of two, three, and four species.
(B) Phylogenetic scope comprising placental mammals that have been or are being sequenced (in red) and candidates for future sequencing (derived from [17]). If five groups choose the next five targets for sequencing using the greedy strategy described in the text, the following species (in blue) will be selected (in order): (1) tenrec, (2) hedgehog, (3) rock hyrax, (4) tree shrew, (5) dog-faced fruit bat (a megabat). Within the phylogenetic scope shown, this is guaranteed to be the choice of five species that maximises the total resulting divergence. These species have recently been announced amongst targets for future sequencing [9].
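The divergence measure and greedy strategy described in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustrative sketch: the tree below is a hypothetical five-species example, not the tree drawn in Figure 1, and the names `PARENT`, `divergence`, `greedy_extend`, and `best_subset` are assumptions introduced here. Divergence is taken to be the total branch length of the subtree connecting the chosen species, as in panel (A), and the exhaustive search is included only to check the greedy result on this small tree.

```python
from collections import Counter
from itertools import combinations

# Hypothetical rooted tree (NOT the tree of Figure 1): child -> (parent, branch length).
PARENT = {
    "p": ("n1", 2.0),
    "q": ("n1", 1.0),
    "n1": ("root", 3.0),
    "r": ("n2", 1.0),
    "s": ("n2", 3.0),
    "n2": ("root", 2.0),
    "t": ("root", 4.0),
}
LEAVES = {"p", "q", "r", "s", "t"}


def path_to_root(node):
    """Return the edges (named by their child endpoint) from node up to the root."""
    edges = []
    while node in PARENT:
        edges.append(node)
        node = PARENT[node][0]
    return edges


def divergence(species):
    """Total branch length of the minimal subtree connecting the given leaves."""
    species = set(species)
    counts = Counter(e for s in species for e in path_to_root(s))
    # An edge lies in the connecting subtree only if it separates the chosen
    # leaves, i.e. some but not all of them sit below it.
    return sum(PARENT[e][1] for e, c in counts.items() if c < len(species))


def greedy_extend(chosen, k):
    """Add k species one at a time, each time taking the largest divergence gain."""
    chosen = list(chosen)
    for _ in range(k):
        candidates = sorted(LEAVES - set(chosen))
        best = max(candidates, key=lambda s: divergence(chosen + [s]))
        chosen.append(best)
    return chosen


def best_subset(k):
    """Exhaustive search over all k-subsets (feasible only for small trees)."""
    return max(combinations(sorted(LEAVES), k), key=divergence)


if __name__ == "__main__":
    start = best_subset(2)             # most divergent pair
    picked = greedy_extend(start, 1)   # extend greedily by one species
    print(picked, divergence(picked))  # ['p', 's', 't'] 14.0
```

On this toy tree the greedy extensions reach the same divergence as the exhaustive optimum for every subset size, which is the behaviour the caption describes; the exhaustive search grows combinatorially with the number of species, whereas the greedy step only evaluates each remaining candidate once per round.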
Figure 2. Representative Example for the Scenario Described in the Proof of Theorem 1.
TX is depicted in blue and TY-{ in green. Species (leaves) in Y are represented by filled circles.
Figure 3. Representative Examples for Two Scenarios in the Proof of the Lemma
In both examples, TX is in blue and TY-{ in green, and species (leaves) in Y are represented by filled circles. The two scenarios in the proof of the lemma, A and B, are illustrated in (A) and (B), respectively.
Figure 4. Topologically Distinct Phylogenetic Trees for Two Sets of Species, X and Y, such that |X| = 2 and |Y| = 3.
X and TX are depicted in dark blue, leaves in Y are denoted with circles, and a possible choice for x (satisfying the requirements in the lemma), with the path from TX to x, in light blue.