Sequence alignment: an approximation law for the Z-value with applications to databank scanning.

J N Bacro, J P Comet.

Abstract

The Z-value is an attempt to estimate the statistical significance of a Smith and Waterman dynamic programming alignment score (H-score) through the use of a Monte-Carlo procedure. In this paper, we give an approximation for the law of the Z-value, deduced from the Poisson clumping heuristic developed by Waterman and Vingron (Stat. Sci. 9 (1994) 367), for the comparison of independent and identically distributed sequences. As for non-gapped alignment scores, our approximation is of Gumbel type, but with parameters that are sequence independent. This result clarifies the related experimental results reported by Comet et al. (Comput. Chem. 23 (1999) 317). Using 'quasi-real' sequences (i.e. randomly shuffled sequences of the same length and amino acid composition as the real ones), we investigate the relevance of our approximation. Since the Monte-Carlo approach we use biases the estimate of the Gumbel decay parameter, a correction procedure is proposed. Applications to real sequences are considered, and we show how our results can be used to detect potential biological relationships between real sequences.
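As a rough illustration of the Monte-Carlo procedure the abstract refers to, the sketch below computes a Z-value in the usual way: the H-score of the real pair is compared with the mean and standard deviation of H-scores obtained after shuffling one of the sequences. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`sw_score`, `z_value`, `gumbel_tail`), the match/mismatch/gap scores, the choice to shuffle only the second sequence, and the number of shuffles. The tail function uses a Gumbel law standardised to mean 0 and variance 1, which is only a generic example of a "Gumbel type" law and not necessarily the parameters or the bias correction derived by the authors.

```python
import math
import random
import statistics


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score (H-score), linear gap penalty.

    The scoring values are illustrative assumptions, not the paper's.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def z_value(a, b, n_shuffles=100, seed=0):
    """Monte-Carlo Z-value: (H - mean) / std over shuffled versions of b.

    Shuffling preserves the length and composition of b (assumed design).
    """
    rng = random.Random(seed)
    h = sw_score(a, b)
    letters = list(b)
    scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        scores.append(sw_score(a, "".join(letters)))
    mu = statistics.mean(scores)
    sigma = statistics.stdev(scores)
    if sigma == 0:
        return float("nan")  # Z is undefined when all shuffled scores agree
    return (h - mu) / sigma


def gumbel_tail(z):
    """P(Z > z) under a Gumbel law standardised to mean 0 and variance 1.

    Assumption for illustration only, not the paper's fitted parameters.
    """
    euler_gamma = 0.5772156649015329
    return 1.0 - math.exp(-math.exp(-math.pi / math.sqrt(6.0) * z - euler_gamma))
```

For example, `z_value(seq_a, seq_b)` on two related protein sequences would be expected to give a large positive Z, and `gumbel_tail` of that Z gives a rough significance estimate under the stated assumption. Because the mean and standard deviation are themselves estimated from a finite number of shuffles, the resulting Z is noisy, which is consistent with the abstract's remark that the Monte-Carlo approach biases the estimate of the Gumbel decay parameter.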

Year:  2001        PMID: 11459354     DOI: 10.1016/s0097-8485(01)00074-2

Source DB:  PubMed          Journal:  Comput Chem        ISSN: 0097-8485


  8 in total

1.  QuRe: software for viral quasispecies reconstruction from next-generation sequencing data.

Authors:  Mattia C F Prosperi; Marco Salemi
Journal:  Bioinformatics       Date:  2011-11-15       Impact factor: 6.937

2.  Combinatorial analysis and algorithms for quasispecies reconstruction using next-generation sequencing.

Authors:  Mattia C F Prosperi; Luciano Prosperi; Alessandro Bruselles; Isabella Abbate; Gabriella Rozera; Donatella Vincenti; Maria Carmela Solmone; Maria Rosaria Capobianchi; Giovanni Ulivi
Journal:  BMC Bioinformatics       Date:  2011-01-05       Impact factor: 3.169

3.  A simple derivation of the distribution of pairwise local protein sequence alignment scores.

Authors:  Olivier Bastien
Journal:  Evol Bioinform Online       Date:  2008-02-14       Impact factor: 1.625

4.  Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space? (Review)

Authors:  Lyn-Marie Birkholtz; Olivier Bastien; Gordon Wells; Delphine Grando; Fourie Joubert; Vinod Kasam; Marc Zimmermann; Philippe Ortet; Nicolas Jacq; Nadia Saïdani; Sylvaine Roy; Martin Hofmann-Apitius; Vincent Breton; Abraham I Louw; Eric Maréchal
Journal:  Malar J       Date:  2006-11-17       Impact factor: 2.979

5.  No wisdom in the crowd: genome annotation in the era of big data - current status and future prospects. (Review)

Authors:  Antoine Danchin; Christos Ouzounis; Taku Tokuyasu; Jean-Daniel Zucker
Journal:  Microb Biotechnol       Date:  2018-05-28       Impact factor: 5.813

6.  A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities.

Authors:  Olivier Bastien; Philippe Ortet; Sylvaine Roy; Eric Maréchal
Journal:  BMC Bioinformatics       Date:  2005-03-10       Impact factor: 3.169

7.  Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores.

Authors:  Olivier Bastien; Eric Maréchal
Journal:  BMC Bioinformatics       Date:  2008-08-07       Impact factor: 3.169

8.  Empirical validation of viral quasispecies assembly algorithms: state-of-the-art and challenges.

Authors:  Mattia C F Prosperi; Li Yin; David J Nolan; Amanda D Lowe; Maureen M Goodenow; Marco Salemi
Journal:  Sci Rep       Date:  2013-10-03       Impact factor: 4.379
