Statistical evaluation and comparison of a pairwise alignment algorithm that a priori assigns the number of gaps rather than employing gap penalties.

Yasuyuki Nozaki, Matthew Bellgard.

Abstract

MOTIVATION: Although pairwise sequence alignment is essential in comparative genomic sequence analysis, it has proven difficult to precisely determine the gap penalties for a given pair of sequences. A common practice is to employ default penalty values. However, there are a number of problems associated with using gap penalties. First, alignment results can vary depending on the gap penalties, making it difficult to explore appropriate parameters. Second, the statistical significance of an alignment score is typically based on a theoretical model of non-gapped alignments, which may be misleading. Finally, there is no way to control the number of gaps for a given pair of sequences, even if the number of gaps is known in advance.
RESULTS: In this paper, we develop and evaluate an alignment technique that allows the researcher to set the number of allowable gaps a priori, rather than using gap penalties. We compare this approach with the Smith-Waterman and Needleman-Wunsch techniques on a set of structurally aligned protein sequences. We demonstrate that this approach outperforms the other techniques, especially for short sequences (56-133 residues) with low similarity (<25%). Further, by employing a statistical measure, we show that it can be used to assess the quality of the alignment in relation to the true alignment with the associated optimal number of gaps.
AVAILABILITY: The implementation of the described methods, SANK_AL, is available at http://cbbc.murdoch.edu.au/
CONTACT: matthew@cbbc.murdoch.edu.au
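
The idea of bounding the number of gaps instead of penalizing them can be illustrated with a small dynamic program. The following is a minimal sketch and not the SANK_AL implementation, whose details the abstract does not give. It assumes a global alignment and treats a "gap" as one contiguous run of gap characters. It also assumes gaps cost nothing and uses a simple match/mismatch score in place of a protein substitution matrix. The function name `align_with_gap_budget` and all parameter names are hypothetical.

```python
def align_with_gap_budget(a, b, max_gaps, match=1, mismatch=-1):
    """Best global alignment score of a and b using at most max_gaps gap runs.

    Assumption: gaps carry no penalty; only the number of gap runs is bounded.
    Returns float("-inf") when no alignment fits within the gap budget.
    """
    NEG = float("-inf")
    n, m = len(a), len(b)
    # Alignment states for the last column:
    ALIGNED, GAP_IN_B, GAP_IN_A = 0, 1, 2  # residue pair / a[i-1] vs '-' / '-' vs b[j-1]

    # dp[k][i][j][s]: best score for a[:i] and b[:j] with exactly k gap runs,
    # where the last alignment column is in state s.
    dp = [[[[NEG] * 3 for _ in range(m + 1)] for _ in range(n + 1)]
          for _ in range(max_gaps + 1)]
    dp[0][0][0][ALIGNED] = 0  # empty alignment

    for k in range(max_gaps + 1):
        for i in range(n + 1):
            for j in range(m + 1):
                cell = dp[k][i][j]
                if i > 0 and j > 0:
                    # Pair a[i-1] with b[j-1]; the gap count is unchanged.
                    score = match if a[i - 1] == b[j - 1] else mismatch
                    cell[ALIGNED] = max(dp[k][i - 1][j - 1]) + score
                if i > 0:
                    # Extend a run of gaps in b, or open a new run (uses one gap).
                    extend = dp[k][i - 1][j][GAP_IN_B]
                    opened = NEG
                    if k > 0:
                        opened = max(dp[k - 1][i - 1][j][ALIGNED],
                                     dp[k - 1][i - 1][j][GAP_IN_A])
                    cell[GAP_IN_B] = max(extend, opened)
                if j > 0:
                    # Extend a run of gaps in a, or open a new run (uses one gap).
                    extend = dp[k][i][j - 1][GAP_IN_A]
                    opened = NEG
                    if k > 0:
                        opened = max(dp[k - 1][i][j - 1][ALIGNED],
                                     dp[k - 1][i][j - 1][GAP_IN_B])
                    cell[GAP_IN_A] = max(extend, opened)

    # "At most max_gaps": take the best over every gap count and end state.
    return max(max(dp[k][n][m]) for k in range(max_gaps + 1))
```

Under these assumptions, `align_with_gap_budget("ACGT", "AGT", 1)` aligns the three shared residues around a single gap, while a budget of 0 leaves no feasible global alignment because the sequences differ in length. Comparing the best score across budgets is one way to explore how many gaps a pair of sequences appears to need.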

Year:  2004        PMID: 15591359     DOI: 10.1093/bioinformatics/bti198

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  3 in total

1.  Aligning sequences by minimum description length.

Authors:  John S Conery
Journal:  EURASIP J Bioinform Syst Biol       Date:  2007

2.  Mapping sequences by parts.

Authors:  Gilles Didier; Carito Guziolowski
Journal:  Algorithms Mol Biol       Date:  2007-09-19       Impact factor: 1.405

3.  Bacterial Foraging Optimization-Genetic Algorithm for Multiple Sequence Alignment with Multi-Objectives.

Authors:  P Manikandan; D Ramyachitra
Journal:  Sci Rep       Date:  2017-08-18       Impact factor: 4.379

