
Randomized and parallel algorithms for distance matrix calculations in multiple sequence alignment.

Sanguthevar Rajasekaran, Vishal Thapar, Hardik Dave, Chun-Hsi Huang.

Abstract

Multiple sequence alignment (MSA) is a vital problem in biology. Optimal alignment of multiple sequences becomes impractical even for a modest number of sequences, since the general version of the problem is NP-hard. Because of the high time complexity of traditional MSA algorithms, even today's fast computers cannot solve the problem for a large number of sequences. In this paper we present a randomized algorithm to calculate distance matrices, which is a major step in many multiple sequence alignment algorithms. The basic idea employed is sampling. We also illustrate how to parallelize this algorithm. We first introduce the problem of multiple sequence alignment. We then discuss various methods that have been employed in the literature for multiple sequence alignment and introduce our new sampling approach, which we extend to the case of non-uniform length sequences as well. Next we show that our algorithms are amenable to parallelism, and we back up our claim of speedup and accuracy with empirical data and examples. We close with some concluding remarks.
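The abstract does not spell out the sampling scheme, so the following is only a minimal sketch of the general idea, under stated assumptions: the sequences have equal length, the distance between two sequences is taken to be the fraction of mismatching positions (a simplification; the paper's distances may be alignment-based), and the estimate is computed from one shared random sample of positions instead of the full length. The function names `sampled_distance` and `sampled_distance_matrix` and the parameters `sample_size` and `seed` are hypothetical, introduced here for illustration.

```python
import random
from itertools import combinations


def sampled_distance(a, b, positions):
    """Fraction of the sampled positions at which sequences a and b differ."""
    mismatches = sum(1 for i in positions if a[i] != b[i])
    return mismatches / len(positions)


def sampled_distance_matrix(seqs, sample_size, seed=0):
    """Estimate all pairwise distances from one random sample of positions.

    Assumption (not from the paper): equal-length sequences compared
    position by position, with mismatch fraction as the distance.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("this sketch handles equal-length sequences only")

    rng = random.Random(seed)
    # Sample positions without replacement; cap at the sequence length.
    k = min(sample_size, length)
    positions = rng.sample(range(length), k)

    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    # Each pair is independent of every other pair, so this loop is the
    # natural unit of work to distribute across processors.
    for i, j in combinations(range(n), 2):
        d = sampled_distance(seqs[i], seqs[j], positions)
        dist[i][j] = dist[j][i] = d
    return dist


if __name__ == "__main__":
    seqs = ["ACGTACGT", "ACGTACGT", "TGCATGCA"]
    for row in sampled_distance_matrix(seqs, sample_size=4):
        print(row)
```

In this sketch the cost per pair depends on `sample_size` rather than on the full sequence length, which is where a speedup would come from; accuracy then depends on how representative the sampled positions are.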

Year:  2005        PMID: 16328949     DOI: 10.1007/s10877-005-0680-3

Source DB:  PubMed          Journal:  J Clin Monit Comput        ISSN: 1387-1307            Impact factor:   1.977


