MTRAP: pairwise sequence alignment algorithm by a new measure based on transition probability between two consecutive pairs of residues.

Toshihide Hara, Keiko Sato, Masanori Ohya.

Abstract

BACKGROUND: Sequence alignment is one of the most important techniques for analyzing biological systems. Existing alignment methods are nevertheless imperfect, and more accurate methods are still needed. In particular, alignments of homologous sequences with low sequence similarity are not yet satisfactory. Most recent methods for aligning protein sequences use an empirically determined measure. Typically, the measure combines two quantities: (1) the sum of substitution scores between two residue segments, and (2) the sum of gap penalties in insertion/deletion regions. Such a measure rests on the assumption that there is no intersite correlation along the sequences. In this paper, we improve the alignment by taking the correlation between consecutive residues into account.
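The conventional measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the score values, the linear gap penalty, and the function names `conventional_score` and `consecutive_pair_counts` are hypothetical and are not taken from the paper. The first function shows that each alignment column contributes to the score independently of its neighbours, which is the "no intersite correlation" assumption. The second function merely counts how often one aligned pair is followed by another, which is the kind of raw observation a transition-probability measure would start from. It is not the MTRAP measure itself.

```python
from collections import Counter

# Hypothetical toy scores for illustration only (not a real substitution matrix).
MATCH_SCORE = 2
MISMATCH_SCORE = -1
GAP_PENALTY = 2  # linear penalty charged for every gap column


def conventional_score(row_a: str, row_b: str) -> int:
    """Score an existing pairwise alignment given as two equal-length rows.

    Gaps are written as '-'. Every column is scored on its own, so the
    total is a plain sum with no dependence between neighbouring columns.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            score -= GAP_PENALTY          # quantity (2): gap penalties
        elif a == b:
            score += MATCH_SCORE          # quantity (1): substitution scores
        else:
            score += MISMATCH_SCORE
    return score


def consecutive_pair_counts(row_a: str, row_b: str) -> Counter:
    """Count transitions from each aligned column to the column after it."""
    columns = list(zip(row_a, row_b))
    return Counter(zip(columns, columns[1:]))


if __name__ == "__main__":
    print(conventional_score("ACD-F", "ACEGF"))
    print(consecutive_pair_counts("AAC", "AAC"))
```

Because `conventional_score` sums column scores, reordering the columns of an alignment leaves the total unchanged, whereas the counts returned by `consecutive_pair_counts` depend on column order.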
RESULTS: We introduce a new alignment method, called MTRAP, based on a metric defined on compound systems of two sequences. In benchmark tests on PREFAB 4.0 and HOMSTRAD, our pairwise alignment method achieves higher accuracy than other methods such as ClustalW2, TCoffee, and MAFFT. For sequences with less than 15% sequence identity in particular, our method improves alignment accuracy significantly. We also show that our algorithm works well with consistency-based progressive multiple alignment, by modifying TCoffee to use our measure.
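To make the 15% identity threshold concrete, the following is a minimal sketch of one common way to compute percent sequence identity from an aligned pair. It assumes that identity is measured over columns in which both sequences have a residue. The benchmarks named above may use a different denominator, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent identity over columns where neither row has a gap ('-').

    Assumption: gap columns are excluded from the denominator; other
    definitions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


if __name__ == "__main__":
    identity = percent_identity("ACD-F", "ACEGF")
    # Under this definition, a pair would fall in the low-identity range
    # discussed above when the value is below 15.
    print(identity, identity < 15.0)
```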
CONCLUSIONS: Our method leads to a significant increase in alignment accuracy compared with other methods. The improvement is especially clear in the low-identity range. The source code is available from our web page, whose address is given in the section "Availability and requirements".

Year:  2010        PMID: 20459682      PMCID: PMC2875243          DOI: 10.1186/1471-2105-11-235

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


