Sanguthevar Rajasekaran, Vishal Thapar, Hardik Dave, Chun-Hsi Huang.
Abstract
Multiple sequence alignment (MSA) is a vital problem in biology. Optimal alignment of multiple sequences becomes impractical even for a modest number of sequences, since the general version of the problem is NP-hard. Because of the high time complexity of traditional MSA algorithms, even today's fast computers cannot solve the problem for a large number of sequences. In this paper we present a randomized algorithm to calculate distance matrices, which is a major step in many multiple sequence alignment algorithms. The basic idea employed is sampling, along the lines of earlier work. We also illustrate how to parallelize this algorithm. We first introduce the problem of multiple sequence alignment, then discuss the various methods that have been employed in the literature for it and introduce our new sampling approach. We extend our randomized algorithm to the case of non-uniform length sequences as well, and show that our algorithms are amenable to parallelism. We back up our claim of speedup and accuracy with empirical data and examples, and close with some concluding remarks.
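The sampling idea can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes equal-length sequences and estimates each pairwise distance as the fraction of mismatches over a random sample of positions rather than over the full length. The function name `sampled_distance_matrix` and the parameters `sample_size` and `seed` are hypothetical.

```python
import random
from itertools import combinations


def sampled_distance_matrix(seqs, sample_size, seed=0):
    """Estimate pairwise mismatch fractions from a random sample of positions.

    Illustrative assumption: all sequences have the same non-zero length and
    the distance is a simple mismatch fraction, not an alignment score.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("this sketch assumes equal-length, non-empty sequences")

    rng = random.Random(seed)
    k = min(sample_size, length)
    # Draw one shared set of positions so every pair is compared on the same sample.
    positions = rng.sample(range(length), k)

    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        mismatches = sum(seqs[i][p] != seqs[j][p] for p in positions)
        dist[i][j] = dist[j][i] = mismatches / k
    return dist
```

Each pair costs time proportional to the sample size rather than the sequence length. The pairs are independent of one another once the positions are fixed, so in this sketch the loop over pairs could be split across workers.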
Year: 2005 PMID: 16328949 DOI: 10.1007/s10877-005-0680-3
Source DB: PubMed Journal: J Clin Monit Comput ISSN: 1387-1307 Impact factor: 1.977