A basic limitation on inferring phylogenies by pairwise sequence comparisons.

Mike Steel.

Abstract

Distance-based methods in phylogenetics, such as Neighbor-Joining, are a fast and popular way to build trees. These methods take pairs of sequences and construct from them a value that, in expectation, is additive under a stochastic model of site substitution. Most models assume a distribution of rates across sites, often a gamma distribution. Provided the shape parameter of this distribution is known, the method can correctly reconstruct the tree. However, if the shape parameter is not known, we show that topologically different trees, with different shape parameters and associated positive branch lengths, can lead to exactly matching distributions on pairwise site patterns between all pairs of taxa. Thus, one cannot distinguish between the two trees using pairs of sequences without prior knowledge of the shape parameter. More surprisingly, this can happen for any choice of distinct shape parameters on the two trees, so the result is not peculiar to a particular or contrived selection of shape parameters. On a positive note, we point out known conditions under which identifiability is restored: when the branch lengths are clocklike, or when methods such as maximum likelihood are used.
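To make the role of the shape parameter concrete, here is a minimal Python sketch. It assumes the Jukes-Cantor substitution model with gamma-distributed site rates of mean 1; the abstract does not name a specific model, so that choice and the function names (`expected_p`, `gamma_jc_distance`, `additivity_gap`) are illustrative assumptions. The sketch shows only that the corrected distance is additive along a path when the assumed shape parameter matches the true one, and not otherwise. It does not reproduce the paper's construction of two distinct trees with matching pairwise distributions.

```python
def expected_p(d, alpha):
    """Expected proportion of differing sites between two sequences at
    evolutionary distance d, under Jukes-Cantor with gamma-distributed
    site rates (mean 1, shape alpha). The model choice is an assumption."""
    return 0.75 * (1.0 - (1.0 + 4.0 * d / (3.0 * alpha)) ** (-alpha))


def gamma_jc_distance(p, alpha):
    """Invert expected_p: map an observed proportion of differences p
    to a corrected distance, given an assumed shape parameter alpha."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for this correction")
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


def additivity_gap(d_ab, d_bc, true_alpha, assumed_alpha):
    """For taxa a and c joined by a path through node b, return
    est(a, c) - (est(a, b) + est(b, c)), where each estimate uses the
    expected proportion of differences under true_alpha but is corrected
    with assumed_alpha. A gap of zero means the distances are additive."""
    def est(d):
        return gamma_jc_distance(expected_p(d, true_alpha), assumed_alpha)
    return est(d_ab + d_bc) - (est(d_ab) + est(d_bc))


if __name__ == "__main__":
    # Correct shape parameter: the gap vanishes up to rounding error.
    print(additivity_gap(0.3, 0.5, true_alpha=0.5, assumed_alpha=0.5))
    # Wrong shape parameter: the corrected distances are no longer additive.
    print(additivity_gap(0.3, 0.5, true_alpha=0.5, assumed_alpha=2.0))
```

The sketch works with expected values rather than sampled sequences, so it isolates the effect of a misspecified shape parameter from sampling noise.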

Year: 2008    PMID: 19000697    DOI: 10.1016/j.jtbi.2008.10.010

Source DB: PubMed    Journal: J Theor Biol    ISSN: 0022-5193    Impact factor: 2.691


  5 in total

1.  Combinatorics of distance-based tree inference.

Authors:  Fabio Pardi; Olivier Gascuel
Journal:  Proc Natl Acad Sci U S A       Date:  2012-09-25       Impact factor: 11.205

2.  Identifiability and inference of non-parametric rates-across-sites models on large-scale phylogenies.

Authors:  Elchanan Mossel; Sebastien Roch
Journal:  J Math Biol       Date:  2012-08-09       Impact factor: 2.259

3.  A stochastic Farris transform for genetic data under the multispecies coalescent with applications to data requirements.

Authors:  Gautam Dasarathy; Elchanan Mossel; Robert Nowak; Sebastien Roch
Journal:  J Math Biol       Date:  2022-04-08       Impact factor: 2.164

4.  APPLES: Scalable Distance-Based Phylogenetic Placement with or without Alignments.

Authors:  Metin Balaban; Shahab Sarmashghi; Siavash Mirarab
Journal:  Syst Biol       Date:  2020-05-01       Impact factor: 15.683

5.  How Fitch-Margoliash Algorithm can Benefit from Multi Dimensional Scaling.

Authors:  Sylvain Lespinats; Delphine Grando; Eric Maréchal; Mohamed-Ali Hakimi; Olivier Tenaillon; Olivier Bastien
Journal:  Evol Bioinform Online       Date:  2011-06-07       Impact factor: 1.625
