Hybrid alignment: high-performance with universal statistics.

Yi-Kuo Yu, Ralf Bundschuh, Terence Hwa.

Abstract

The score statistics of a recently introduced 'hybrid alignment' algorithm are studied numerically in detail. An extensive survey across the 2216 models of protein domains contained in the Pfam v5.4 database (Bateman et al., Nucleic Acids Res., 28, 263-266, 2000) verifies the theoretical predictions: for the position-specific scoring functions used in the Pfam models, the score statistics of hybrid alignment obey the Gumbel distribution, with the key Gumbel parameter lambda taking on the asymptotic value 1 universally for all models. Thus, the use of hybrid alignment eliminates the time-consuming computer simulations normally needed to assign p-values to alignment scores, freeing users to experiment with different scoring parameters and functions. The performance of the hybrid algorithm in detecting sequence homology is also studied. For protein sequences from the SCOP database (Murzin et al., J. Mol. Biol., 247, 536-540, 1995) using uniform scoring functions, the performance is found to be comparable to the best of the existing methods. Preliminary results using the PfamA database suggest that the hybrid algorithm also performs comparably to existing methods for position-specific scoring systems. Hybrid alignment is thereby established as a high-performance alignment algorithm with well-characterized, universal statistics.
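
To illustrate why a universal lambda removes the need for simulations, the following is a minimal sketch of turning an alignment score into a p-value under the standard Gumbel (extreme-value) form P(S >= x) = 1 - exp(-K N e^{-lambda x}). It is not the authors' code. The function name `gumbel_p_value`, the prefactor `k`, the search-space size `search_space`, and all numeric values are assumptions made for illustration; only the default lambda = 1 comes from the abstract.

```python
import math


def gumbel_p_value(score, k, search_space, lam=1.0):
    """Return P(S >= score) under a Gumbel null distribution.

    lam defaults to 1, the asymptotic value the abstract reports for
    hybrid alignment. k and search_space are hypothetical inputs that
    would have to be supplied for a particular model and database.
    """
    # Expected number of chance hits scoring at least `score`.
    expected_hits = k * search_space * math.exp(-lam * score)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-expected_hits)


# Illustrative numbers only; none of these values come from the paper.
p = gumbel_p_value(score=20.0, k=0.1, search_space=1e5)
print(f"p-value: {p:.3e}")
```

With lambda known in advance, no per-model fit of lambda is needed in this sketch, which is the practical benefit the abstract describes.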

Year: 2002 PMID: 12075022 DOI: 10.1093/bioinformatics/18.6.864

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

7 in total

1. Finding functional sequence elements by multiple local alignment.

Authors: Martin C Frith; Ulla Hansen; John L Spouge; Zhiping Weng
Journal: Nucleic Acids Res Date: 2004-01-02 Impact factor: 16.971

2. A cell-based simulation software for multi-cellular systems.

Authors: Stefan Hoehme; Dirk Drasdo
Journal: Bioinformatics Date: 2010-08-13 Impact factor: 6.937

3. HMM-ModE--improved classification using profile hidden Markov models by optimising the discrimination threshold and modifying emission probabilities with negative training sequences.

Authors: Prashant K Srivastava; Dhwani K Desai; Soumyadeep Nandi; Andrew M Lynn
Journal: BMC Bioinformatics Date: 2007-03-27 Impact factor: 3.169

4. Optimizing amino acid substitution matrices with a local alignment kernel.

5. The effectiveness of position- and composition-specific gap costs for protein similarity searches.

6. The identification of complete domains within protein sequences using accurate E-values for semi-global alignment.

7. A probabilistic model of local sequence alignment that simplifies statistical significance estimation.