
A basic analysis toolkit for biological sequences.

Raffaele Giancarlo, Alessandro Siragusa, Enrico Siragusa, Filippo Utro.

Abstract

This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks: local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. It also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new; although they are generally regarded as fundamental for sequence analysis, they have not previously been implemented in a single, consistent software package, as we do here. Our main contribution is therefore to fill this gap between algorithmic theory and practice by providing an extensible and easy-to-use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at http://www.math.unipa.it/~raffaele/BATS/ under the GNU GPL.
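To make two of the tasks named in the abstract concrete, the following is a minimal Python sketch of a longest common subsequence computation and a z-score. It is not the BATS API: the function names `lcs` and `z_score` are hypothetical, the LCS uses the textbook quadratic dynamic program, and the z-score assumes significance is measured against an empirical background sample, which the abstract does not specify.

```python
from statistics import mean, pstdev
from typing import Sequence


def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (hypothetical helper)."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Trace back from the bottom-right corner to recover one optimal subsequence.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def z_score(observed: float, background: Sequence[float]) -> float:
    """Standard score of `observed` against a background sample.

    Assumption: the background is summarized by its empirical mean and
    population standard deviation; other statistical models are possible.
    """
    mu = mean(background)
    sigma = pstdev(background)
    if sigma == 0:
        raise ValueError("background has zero variance; z-score is undefined")
    return (observed - mu) / sigma
```

For example, `lcs("AGGTAB", "GXTXAYB")` recovers the common subsequence of two short strings, and `z_score(count, shuffled_counts)` could rank a string's observed count against counts from shuffled sequences; both the shuffling scheme and the variable names here are illustrative only.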


Year:  2007        PMID: 17877802      PMCID: PMC2147010          DOI: 10.1186/1748-7188-2-10

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


  1 in total

1.  A software pipeline for processing and identification of fungal ITS sequences.

Authors:  R Henrik Nilsson; Gunilla Bok; Martin Ryberg; Erik Kristiansson; Nils Hallenberg
Journal:  Source Code Biol Med       Date:  2009-01-15
