Automated Removal of Non-homologous Sequence Stretches with PREQUAL.

Iker Irisarri, Fabien Burki, Simon Whelan.

Abstract

Large-scale multigene datasets used in phylogenomics and comparative genomics often contain sequence errors inherited from source genomes and transcriptomes. These errors typically manifest as stretches of non-homologous characters and derive from sequencing, assembly, and/or annotation errors. The lack of automatic tools to detect and remove sequence errors leads to the propagation of these errors in large-scale datasets. PREQUAL is a command line tool that identifies and masks regions with non-homologous adjacent characters in sets of unaligned homologous sequences. PREQUAL uses a full probabilistic approach based on pair hidden Markov models. On the front end, PREQUAL is user-friendly and simple to use while also allowing full customization to adjust filtering sensitivity. It is primarily aimed at amino acid sequences but can handle protein-coding nucleotide sequences. PREQUAL is computationally efficient and shows high sensitivity and accuracy. In this chapter, we briefly introduce the motivation for PREQUAL and its underlying methodology, followed by a description of basic and advanced usage, and conclude with some notes and recommendations. PREQUAL fills an important gap in the current bioinformatics tool kit for phylogenomics, contributing toward increased accuracy and reproducibility in future studies.

Keywords:  Filtering; Genomics; HMM; Homology; Phylogenomics; Sequence analysis

Year: 2021   PMID: 33289892   DOI: 10.1007/978-1-0716-1036-7_10

Source DB: PubMed   Journal: Methods Mol Biol   ISSN: 1064-3745


  2 in total

1.  PREQUAL: detecting non-homologous characters in sets of unaligned homologous sequences.

Authors:  Simon Whelan; Iker Irisarri; Fabien Burki
Journal: Bioinformatics   Date: 2018-11-15   Impact factor: 6.937

2.  Phylogenetic Tree Estimation With and Without Alignment: New Distance Methods and Benchmarking.

Authors:  Marcin Bogusz; Simon Whelan
Journal: Syst Biol   Date: 2017-03-01   Impact factor: 15.683
