
From pairs of most similar sequences to phylogenetic best matches.

Peter F Stadler, Manuela Geiß, David Schaller, Alitzel López Sánchez, Marcos González Laffitte, Dulce I Valdivia, Marc Hellmuth, Maribel Hernández Rosales.

Abstract

BACKGROUND: Many commonly used methods for orthology detection start from mutually most similar pairs of genes (reciprocal best hits) as an approximation of the evolutionarily most closely related pairs of genes (reciprocal best matches). This approximation of best matches by best hits becomes exact for ultrametric dissimilarities, i.e., under the Molecular Clock Hypothesis. It fails, however, whenever there are large lineage-specific rate variations among paralogous genes. In practice, this introduces a high level of noise into the input data for best-hit-based orthology detection methods.
RESULTS: If additive distances between genes are known, then the evolutionarily most closely related pairs can be identified by considering certain quartets of genes, provided that in each quartet the outgroup relative to the remaining three genes is known. A priori knowledge of the underlying species phylogeny greatly facilitates the identification of the required outgroup. Although the workflow remains a heuristic, since the correct outgroup cannot be determined reliably in all cases, simulations with lineage-specific biases and rate asymmetries show that nearly perfect results can be achieved. In a realistic setting, where distance data have to be estimated from sequence data and hence are noisy, it is still possible to obtain highly accurate sets of best matches.
CONCLUSION: Improvements of tree-free orthology assessment methods can be expected from a combination of the accurate inference of best matches reported here and recent mathematical advances in the understanding of (reciprocal) best match graphs and orthology relations.
AVAILABILITY: Accompanying software is available at https://github.com/david-schaller/AsymmeTree.
© The Author(s) 2020.
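
The difference between a best hit and a best match described in the abstract can be illustrated with a small sketch. It is not taken from AsymmeTree and is not the authors' workflow; the function names, the toy gene labels (`x`, `y1`, `y2`, `z`) and the branch lengths are assumptions made for illustration. The sketch assumes exactly additive distances and a known outgroup `z`, and uses the standard four-point condition: of the three pairwise sums in a quartet, the smallest one identifies the pairing that is separated by the inner edge.

```python
# Toy gene tree (hypothetical): (((x, y1), y2), z), with y1 evolving fast.
# Assumed branch lengths: x=1, y1=5, inner edge=1, y2=1, path to outgroup z=3.
# The resulting distances are additive but not ultrametric.
D = {
    frozenset(("x", "y1")): 6.0,
    frozenset(("x", "y2")): 3.0,
    frozenset(("y1", "y2")): 7.0,
    frozenset(("x", "z")): 5.0,
    frozenset(("y1", "z")): 9.0,
    frozenset(("y2", "z")): 4.0,
}


def dist(d, a, b):
    """Look up the symmetric distance between genes a and b."""
    return d[frozenset((a, b))]


def best_hit(d, x, candidates):
    """Return the candidate with the smallest distance to x (most similar)."""
    return min(candidates, key=lambda y: dist(d, x, y))


def closer_relative(d, x, y1, y2, z, tol=1e-9):
    """Decide which of y1, y2 is evolutionarily closer to x, given outgroup z.

    Uses the four-point condition on additive distances: the smallest of the
    three pair sums marks the two pairs separated by the inner edge.
    Returns y1 or y2, or None if both are equally closely related to x.
    """
    s1 = dist(d, x, y1) + dist(d, y2, z)   # topology x y1 | y2 z
    s2 = dist(d, x, y2) + dist(d, y1, z)   # topology x y2 | y1 z
    s3 = dist(d, x, z) + dist(d, y1, y2)   # topology x z  | y1 y2
    if s1 < min(s2, s3) - tol:
        return y1
    if s2 < min(s1, s3) - tol:
        return y2
    return None  # y1 and y2 share the same last common ancestor with x


if __name__ == "__main__":
    print("best hit of x:  ", best_hit(D, "x", ["y1", "y2"]))          # y2
    print("best match of x:", closer_relative(D, "x", "y1", "y2", "z"))  # y1
```

In this toy example the most similar gene (`y2`) is not the most closely related one (`y1`), because `y1` sits on a long branch. With noisy, estimated distances the equalities of the four-point condition hold only approximately, so a real implementation would need a tolerance or a support measure rather than the exact comparison used here.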

Keywords:  Best matches; Gene tree; Orthology; Reconciliation; Species tree

Year:  2020        PMID: 32308731      PMCID: PMC7147060          DOI: 10.1186/s13015-020-00165-2

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


