

Multiple Sequence Alignment Computation Using the T-Coffee Regressive Algorithm Implementation.

Edgar Garriga¹, Paolo Di Tommaso¹, Cedrik Magis¹, Ionas Erb¹, Leila Mansouri¹, Athanasios Baltzis¹, Evan Floden¹, Cedric Notredame²,³.

Abstract

Many fields of biology rely on the inference of accurate multiple sequence alignments (MSAs) of biological sequences. Unfortunately, the problem of assembling an MSA is NP-complete, which limits computation to approximate solutions obtained with heuristics. The progressive algorithm is one of the most popular frameworks for the computation of MSAs. It involves pre-clustering the sequences and aligning them starting with the most similar ones. The scalability of this framework is limited, especially with respect to accuracy. We present here an alternative approach named the regressive algorithm. In this framework, sequences are first clustered and then aligned starting with the most distantly related ones. This approach has been shown to greatly improve accuracy during scale-up, especially on datasets featuring 10,000 sequences or more. Another benefit is the possibility of integrating third-party clustering methods and third-party MSA aligners. The regressive algorithm has been tested on up to 1.5 million sequences, and its implementation is available in the T-Coffee package.
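The ordering described in the abstract (cluster first, then align the most distantly related sequences first) can be illustrated with a small Python sketch. This is not the T-Coffee implementation: the nested-tuple guide tree, the function names, the rule of expanding the largest subtree first, and the choice of the longest sequence as the representative of a subtree are all assumptions made for illustration. The sketch only computes which sequences would be grouped into each sub-alignment; the alignment step itself is left to whichever aligner is plugged in.

```python
def leaves(tree):
    """Return the leaf names of a nested-tuple binary guide tree."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def representative(tree, lengths):
    """Pick one leaf to stand in for a subtree (assumption: the longest sequence)."""
    return max(leaves(tree), key=lambda name: lengths[name])


def split_top(tree, n):
    """Split a tree into at most n subtrees, starting from the root.

    Assumption: the subtree with the most leaves is expanded first.
    """
    nodes = [tree]
    while len(nodes) < n:
        internal = [x for x in nodes if not isinstance(x, str)]
        if not internal:
            break  # only single sequences remain
        biggest = max(internal, key=lambda x: len(leaves(x)))
        nodes.remove(biggest)
        nodes.extend(biggest)  # replace the node by its two children
    return nodes


def regressive_subsets(tree, lengths, n):
    """List the groups of sequences to align, root-level group first.

    Each parent group shares one representative with each child group,
    which is what would let the sub-alignments be merged afterwards.
    """
    nodes = split_top(tree, n)
    groups = [[representative(x, lengths) for x in nodes]]
    for x in nodes:
        if not isinstance(x, str):
            groups.extend(regressive_subsets(x, lengths, n))
    return groups


# Hypothetical guide tree and sequence lengths, for illustration only.
tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
lengths = {"A": 10, "B": 12, "C": 9, "D": 11, "E": 15, "F": 8, "G": 7, "H": 14}
groups = regressive_subsets(tree, lengths, n=2)
for group in groups:
    print(group)
```

In this toy run the first group pairs the representatives of the two root-level clusters, that is, the most distantly related sequences, and later groups descend toward closely related ones. A progressive scheme would visit the same tree in the opposite direction, from the leaves up to the root.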

Keywords: Guide tree; MSA; Progressive alignment; Sequence alignment


Year: 2021 PMID: 33289888 DOI: 10.1007/978-1-0716-1036-7_6

Source DB: PubMed Journal: Methods Mol Biol ISSN: 1064-3745

