Do Alignment and Trimming Methods Matter for Phylogenomic (UCE) Analyses?

Daniel M Portik, John J Wiens.

Abstract

Alignment is a crucial issue in molecular phylogenetics because different alignment methods can potentially yield very different topologies for individual genes. But it is unclear if the choice of alignment methods remains important in phylogenomic analyses, which incorporate data from hundreds or thousands of genes. For example, problematic biases in alignment might be multiplied across many loci, whereas alignment errors in individual genes might become irrelevant. The issue of alignment trimming (i.e., removing poorly aligned regions or missing data from individual genes) is also poorly explored. Here, we test the impact of 12 different combinations of alignment and trimming methods on phylogenomic analyses. We compare these methods using published phylogenomic data from ultraconserved elements (UCEs) from squamate reptiles (lizards and snakes), birds, and tetrapods. We compare the properties of alignments generated by different alignment and trimming methods (e.g., length, informative sites, missing data). We also test whether these data sets can recover well-established clades when analyzed with concatenated (RAxML) and species-tree methods (ASTRAL-III), using the full data (~5000 loci) and subsampled data sets (10% and 1% of loci). We show that different alignment and trimming methods can significantly impact various aspects of phylogenomic data sets (e.g., length, informative sites). However, these different methods generally had little impact on the recovery and support values for well-established clades, even across very different numbers of loci. Nevertheless, our results suggest several "best practices" for alignment and trimming. Intriguingly, the choice of phylogenetic methods impacted the phylogenetic results most strongly, with concatenated analyses recovering significantly more well-established clades (with stronger support) than the species-tree analyses.
[Alignment; concatenated analysis; phylogenomics; sequence length heterogeneity; species-tree analysis; trimming].
© The Author(s) 2020. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.

Year:  2021        PMID: 32797207     DOI: 10.1093/sysbio/syaa064

Source DB:  PubMed          Journal:  Syst Biol        ISSN: 1063-5157            Impact factor:   15.683


