Species Tree Inference Methods Intended to Deal with Incomplete Lineage Sorting Are Robust to the Presence of Paralogs.

Zhi Yan, Megan L Smith, Peng Du, Matthew W Hahn, Luay Nakhleh.

Abstract

Many recent phylogenetic methods have focused on accurately inferring species trees when there is gene tree discordance due to incomplete lineage sorting (ILS). For almost all of these methods, and for phylogenetic methods in general, the data for each locus are assumed to consist of orthologous, single-copy sequences. Loci that are present in more than a single copy in any of the studied genomes are excluded from the data. These steps greatly reduce the number of loci available for analysis. The question we seek to answer in this study is: what happens if one runs such species tree inference methods on data where paralogy is present, either alongside ILS or without it? Through simulation studies and analyses of two large biological data sets, we show that running such methods on data with paralogs can still provide accurate results. We use multiple different methods, some of which are based directly on the multispecies coalescent model, and some of which have been proven to be statistically consistent under it. We also treat the paralogous loci in multiple ways, ranging from explicitly denoting them as paralogs to randomly selecting one copy per species. In all cases, the inferred species trees are as accurate as equivalent analyses using single-copy orthologs. Our results have significant implications for the use of ILS-aware phylogenomic analyses, demonstrating that they do not have to be restricted to single-copy loci. This will greatly increase the amount of data that can be used for phylogenetic inference. [Gene duplication and loss; incomplete lineage sorting; multispecies coalescent; orthology; paralogy.]
© The Author(s) 2021. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
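
One of the treatments mentioned in the abstract, randomly selecting one copy per species from a multi-copy locus, can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `sample_one_copy_per_species`, the `species|copy` leaf-label convention, and the toy gene family are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_one_copy_per_species(leaf_labels, seed=None, sep="|"):
    """Pick one gene copy per species at random from a gene family.

    Assumption (hypothetical convention): each leaf label has the form
    "<species><sep><copy id>", e.g. "human|g2".
    """
    rng = random.Random(seed)

    # Group the copies of this gene family by the species they come from.
    copies_by_species = defaultdict(list)
    for label in leaf_labels:
        species = label.split(sep, 1)[0]
        copies_by_species[species].append(label)

    # Sort before sampling so that a fixed seed gives a reproducible choice.
    return {
        species: rng.choice(sorted(copies))
        for species, copies in sorted(copies_by_species.items())
    }


# Toy gene family: two human copies, one chimp copy, three gorilla copies.
family = ["human|g1", "human|g2", "chimp|g1",
          "gorilla|g1", "gorilla|g2", "gorilla|g3"]
kept = sample_one_copy_per_species(family, seed=0)
```

In a workflow of this kind, the gene tree or alignment for the locus would then be restricted to the leaves in `kept` before being passed to a species tree method; that downstream step is likewise an assumption here, not a description of the study's pipeline.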

Year:  2022        PMID: 34245291      PMCID: PMC8978208          DOI: 10.1093/sysbio/syab056

Source DB:  PubMed          Journal:  Syst Biol        ISSN: 1063-5157            Impact factor:   15.683


