
A Linear Time Solution to the Labeled Robinson-Foulds Distance Problem.

Samuel Briand, Christophe Dessimoz, Nadia El-Mabrouk, Yannis Nevers.

Abstract

A large variety of pairwise measures of similarity or dissimilarity have been developed for comparing phylogenetic trees, for example, species trees or gene trees. Owing to its intuitive definition in terms of tree clades and bipartitions and its computational efficiency, the Robinson-Foulds (RF) distance is the most widely used for trees with unweighted edges and labels restricted to leaves (representing the genetic elements being compared). However, in the case of gene trees, an important piece of information revealing the nature of the homologous relation between gene pairs (orthologs, paralogs, and xenologs) is the type of event associated with each internal node of the tree. These events are typically speciations or duplications, but other types, such as horizontal gene transfers, may also be considered. This labeling of internal nodes is usually inferred by a gene tree/species tree reconciliation method. Here, we address the problem of comparing such event-labeled trees. The problem differs from the classical problem of comparing uniformly labeled trees (all labels belonging to the same alphabet), which can be done using the Tree Edit Distance (TED), mainly because in our case two different alphabets are considered for the leaves and internal nodes of the tree, and leaves are not affected by edit operations. We propose an extension of the RF distance to event-labeled trees, based on edit operations comparable to those considered for TED: node insertion, node deletion, and label substitution. We show that this new Labeled Robinson-Foulds (LRF) distance can be computed in linear time, in addition to maintaining other desirable properties: being a metric, reducing to RF for trees with no labels on internal nodes, and maintaining an intuitive interpretation. The algorithm for computing the LRF distance enables novel analyses on event-labeled trees such as reconciled gene trees. Here, we use it to study the impact of taxon sampling on labeled gene tree inference and conclude that denser taxon sampling yields trees with better topology but worse labeling.

[Algorithms; combinatorics; gene trees; phylogenetics; Robinson-Foulds; tree distance.]
© The Author(s) 2022. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
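
As a point of reference for the clade-based definition mentioned in the abstract, the following is a minimal Python sketch of the classical (unlabeled) RF distance on rooted trees. It is not the LRF algorithm of the paper, which the abstract does not spell out. The sketch assumes trees are written as nested tuples with leaf names as strings and that both trees share the same leaf set; the function names `clades` and `rf_distance` are illustrative. It builds explicit leaf sets for every internal node, so it is a naive version and does not run in linear time.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names.

    A tree is a nested tuple; a leaf is a string (assumed representation).
    """
    found = set()

    def leaves_below(node):
        # A leaf contributes only itself and defines no clade of interest.
        if isinstance(node, str):
            return frozenset([node])
        # An internal node's clade is the union of its children's leaf sets.
        leaf_set = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    all_leaves = leaves_below(tree)
    # The root clade is common to every tree on the same leaves, so drop it.
    found.discard(all_leaves)
    return found


def rf_distance(tree_a, tree_b):
    """Classical RF distance: number of clades present in exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))


balanced = (("a", "b"), ("c", "d"))
swapped = (("a", "c"), ("b", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(rf_distance(balanced, balanced))
print(rf_distance(balanced, swapped))
print(rf_distance(balanced, caterpillar))
```

In this representation, two trees are at distance zero exactly when they have the same set of clades; internal node labels such as speciation or duplication are ignored here, which is the gap the LRF distance is designed to address.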

Year: 2022; PMID: 35426933; PMCID: PMC9557742; DOI: 10.1093/sysbio/syac028

Source DB: PubMed; Journal: Syst Biol; ISSN: 1063-5157; Impact factor: 9.160

