Multiple Sequence Alignment for Large Heterogeneous Datasets Using SATé, PASTA, and UPP.

Tandy Warnow, Siavash Mirarab.

Abstract

Estimating very large multiple sequence alignments is a challenging problem that requires special techniques to achieve high accuracy. Here we describe two software packages, PASTA and UPP, for constructing alignments on large and ultra-large datasets. Both methods have produced highly accurate alignments on datasets of 1,000,000 sequences, and trees computed on these alignments are also highly accurate. PASTA provides the best tree accuracy when the input sequences are all full-length, whereas UPP is more accurate than PASTA and other methods when the input contains a large number of fragmentary sequences. Both methods are available as open-source software on GitHub.

Keywords:  Ensembles of Hidden Markov Models; Multiple sequence alignment; PASTA; SATé; UPP

Year: 2021; PMID: 33289889; DOI: 10.1007/978-1-0716-1036-7_7

Source DB: PubMed; Journal: Methods Mol Biol; ISSN: 1064-3745

