Assessing approaches for inferring species trees from multi-copy genes.

Ruchi Chaudhary, Bastien Boussau, J Gordon Burleigh, David Fernández-Baca.

Abstract

With the availability of genomic sequence data, there is increasing interest in using genes with a possible history of duplication and loss for species tree inference. Here we assess the performance of both nonprobabilistic and probabilistic species tree inference approaches using gene duplication and loss and coalescence simulations. We evaluated the performance of gene tree parsimony (GTP) based on duplication (Only-dup), duplication and loss (Dup-loss), and deep coalescence (Deep-c) costs, the NJst distance method, the MulRF supertree method, and PHYLDOG, which jointly estimates gene trees and species tree using a hierarchical probabilistic model. We examined the effects of gene tree and species sampling, gene tree error, and duplication and loss rates on the accuracy of phylogenetic estimates. In the 10-taxon duplication and loss simulation experiments, MulRF is more accurate than the other methods when the duplication and loss rates are low, and Dup-loss is generally the most accurate when the duplication and loss rates are high. PHYLDOG performs well in 10-taxon duplication and loss simulations, but its run time is prohibitively long on larger data sets. In the larger duplication and loss simulation experiments, MulRF outperforms all other methods in experiments with at most 100 taxa; however, in the larger simulation, Dup-loss generally performs best. In all duplication and loss simulation experiments with more than 10 taxa, all methods perform better with more gene trees and fewer missing sequences, and they are all affected by gene tree error. Our results also highlight high levels of error in estimates of duplications and losses from GTP methods and demonstrate the usefulness of methods based on generic tree distances for large analyses.
© The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
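
As a rough illustration of the duplication (Only-dup) cost named in the abstract, the sketch below scores one multi-copy gene tree against one fixed species tree using the standard LCA mapping: an internal gene-tree node is counted as a duplication when it maps to the same species-tree node as one of its children. This is not the authors' implementation, and it omits the species-tree search that gene tree parsimony performs over many gene trees. The nested-tuple tree encoding and all function names are assumptions made for this example, and every gene-tree leaf is assumed to be labeled with a species present in the species tree.

```python
# Illustrative sketch only. Trees are nested tuples; leaves are species names.
# Gene-tree leaves carry the species they were sampled from, so a species
# may appear more than once (multi-copy genes). All names are hypothetical.

def leaf_set(tree):
    """Return the set of species labels found under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def species_clades(species_tree):
    """Collect the leaf set (clade) of every node in the species tree."""
    clades = [leaf_set(species_tree)]
    if not isinstance(species_tree, str):
        for child in species_tree:
            clades.extend(species_clades(child))
    return clades


def lca_clade(clades, species):
    """Smallest species-tree clade containing `species` (the LCA mapping)."""
    return min((c for c in clades if species <= c), key=len)


def duplication_cost(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same species node as a child."""
    clades = species_clades(species_tree)

    def visit(node):
        if isinstance(node, str):
            return 0
        here = lca_clade(clades, leaf_set(node))
        count = sum(visit(child) for child in node)
        if any(lca_clade(clades, leaf_set(child)) == here for child in node):
            count += 1
        return count

    return visit(gene_tree)


species = (("A", "B"), "C")
congruent = (("A", "B"), "C")
two_copies = ((("A", "B"), "C"), ("A", "B"))
cost_congruent = duplication_cost(congruent, species)
cost_two_copies = duplication_cost(two_copies, species)
```

Under this scoring a gene tree identical to the species tree costs nothing, while the tree with a second (A, B) copy needs one duplication at its root. A gene tree parsimony analysis would sum such costs over all input gene trees and look for the species tree that minimizes the total.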

Keywords:  Deep coalescence; MulRF; NJst; PHYLDOG; gene duplication; gene loss; gene tree parsimony

Year:  2014        PMID: 25540456     DOI: 10.1093/sysbio/syu128

Source DB:  PubMed          Journal:  Syst Biol        ISSN: 1063-5157            Impact factor:   15.683


  4 in total

1.  DISCO: Species Tree Inference using Multicopy Gene Family Tree Decomposition.

Authors:  James Willson; Mrinmoy Saha Roddur; Baqiao Liu; Paul Zaharias; Tandy Warnow
Journal:  Syst Biol       Date:  2022-04-19       Impact factor: 9.160

2.  ASTRAL-Pro: Quartet-Based Species-Tree Inference despite Paralogy.

Authors:  Chao Zhang; Celine Scornavacca; Erin K Molloy; Siavash Mirarab
Journal:  Mol Biol Evol       Date:  2020-11-01       Impact factor: 16.240

3.  MIPhy: identify and quantify rapidly evolving members of large gene families.

Authors:  David M Curran; John S Gilleard; James D Wasmuth
Journal:  PeerJ       Date:  2018-05-29       Impact factor: 2.984

4.  FastMulRFS: fast and accurate species tree estimation under generic gene duplication and loss models.

Authors:  Erin K Molloy; Tandy Warnow
Journal:  Bioinformatics       Date:  2020-07-01       Impact factor: 6.937

