Improvement in the accuracy of multiple sequence alignment program MAFFT.

Kazutaka Katoh, Kei-ichi Kuma, Takashi Miyata, Hiroyuki Toh.

Abstract

In 2002, we developed and released MAFFT, a rapid multiple sequence alignment program designed to handle large data sets (up to approximately 5,000 sequences) and long sequences (approximately 2,000 aa or approximately 5,000 nt) in a reasonable time on a standard desktop PC. In accuracy, however, the previous versions of MAFFT (v.4 and lower) were outperformed in several benchmark tests by ProbCons and TCoffee v.2, both of which were released in 2004. Here we report a recent extension of MAFFT that aims to improve accuracy at as little cost in calculation time as possible. The extended version of MAFFT (v.5) has new iterative refinement options, G-INS-i and L-INS-i (collectively denoted [GL]-INS-i in this report). These options use a new objective function that combines the weighted sum-of-pairs (WSP) score with a COFFEE-like score derived from all pairwise alignments. We discuss the improvement in accuracy brought by this extension, mainly using two very recently released benchmark tests, BAliBASE v.3 (for protein alignments) and BRAliBASE (for RNA alignments). According to BAliBASE v.3, the overall average accuracy of L-INS-i was higher than that of the other methods successively released in 2004, although the differences among the most accurate methods (ProbCons, TCoffee v.2 and the new options of MAFFT) were small. The accuracy advantage of [GL]-INS-i became greater for alignments consisting of approximately 50-100 sequences. Exploiting this feature of MAFFT, we also examined another possible approach to improving accuracy: incorporating homolog information collected from databases. The [GL]-INS-i options are applicable to aligning up to approximately 200 sequences, but not to thousands of sequences, because of their time and space complexities.
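
As a rough illustration of the kind of objective function the abstract describes, the sketch below scores a toy alignment with a sum-of-pairs term plus a COFFEE-like consistency term computed from a library of pairwise alignments. It is not MAFFT's implementation: the sequence weights are omitted (every pair counts equally), and the match/mismatch/gap values, the `consistency_weight` parameter, the simple additive combination, and all function names are assumptions made for this example.

```python
from itertools import combinations


def sum_of_pairs(msa, match=1.0, mismatch=-1.0, gap=-2.0):
    """Unweighted sum-of-pairs score over every column and sequence pair.

    Assumption: toy scoring values; a real WSP score would use a
    substitution matrix, affine gaps, and per-pair sequence weights.
    """
    total = 0.0
    for column in zip(*msa):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap-gap pairs contribute nothing
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


def aligned_pairs(row_a, row_b):
    """Residue-index pairs (i, j) that two gapped rows place in one column."""
    pairs = set()
    i = j = 0
    for a, b in zip(row_a, row_b):
        if a != "-" and b != "-":
            pairs.add((i, j))
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return pairs


def consistency_score(msa, library):
    """COFFEE-like term: count residue pairs in the multiple alignment
    that also occur in the corresponding pairwise alignment."""
    total = 0
    for x, y in combinations(range(len(msa)), 2):
        total += len(aligned_pairs(msa[x], msa[y]) & library[(x, y)])
    return total


def combined_objective(msa, library, consistency_weight=1.0):
    """Hypothetical combination: plain sum of the two terms."""
    return sum_of_pairs(msa) + consistency_weight * consistency_score(msa, library)


# Toy data: three short sequences and hand-written pairwise alignments.
msa = ["AC-GT", "ACAGT", "A--GT"]
pairwise = {
    (0, 1): ("AC-GT", "ACAGT"),
    (0, 2): ("ACGT", "A-GT"),
    (1, 2): ("ACAGT", "A--GT"),
}
library = {key: aligned_pairs(*rows) for key, rows in pairwise.items()}

print(combined_objective(msa, library))
```

An iterative refinement procedure would repeatedly split the alignment into two groups, realign them, and keep the result only when a score of this kind improves; that loop is left out here to keep the sketch short.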

Year:  2005        PMID: 16362903

Source DB:  PubMed          Journal:  Genome Inform        ISSN: 0919-9454

