A test for the statistical significance of DNA sequence similarities for application in databank searches.

Abstract

A method is developed, based on word-searching, which provides a rapid test for the statistical significance of DNA sequence similarities for use in databank searching. The method makes allowance for the lengths and dinucleotide compositions of the sequences being compared. A way is also described to calculate the power of the test, i.e. the probability of detecting a given similarity as being statistically significant. The effects on the power of the test of the scoring method, word length, sequence length, and sequence composition are examined. A novel scoring method is shown to be superior to the method currently used in most word-searching algorithms.
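
As a rough illustration of word-based similarity scoring and significance testing, the sketch below counts shared k-letter words between two DNA sequences and estimates a p-value by shuffling one of them. This is a simple permutation test, not the method described in the abstract: shuffling preserves length and single-nucleotide composition only, whereas the paper's test allows for dinucleotide composition and uses its own scoring method. The function names `shared_word_score` and `permutation_p_value`, the word length, and the trial count are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def shared_word_score(a: str, b: str, k: int) -> int:
    """Count matching k-word pairs: one point per (position in a, position in b) match."""
    words_a = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    words_b = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    # A word seen n times in a and m times in b contributes n * m matches.
    return sum(n * words_b[w] for w, n in words_a.items())


def permutation_p_value(a: str, b: str, k: int = 4,
                        trials: int = 200, seed: int = 0) -> float:
    """Estimate how often a shuffled copy of b scores at least as high as b itself.

    Assumption: shuffling keeps the length and base composition of b,
    but NOT its dinucleotide composition.
    """
    rng = random.Random(seed)
    observed = shared_word_score(a, b, k)
    letters = list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(letters)
        if shared_word_score(a, "".join(letters), k) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)
```

A small p-value suggests that the two sequences share more words than their length and base composition alone would explain. In a databank search, this kind of check would be repeated for each candidate sequence, which is why a fast analytic test is preferable to resampling.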

Entities: Chemical

Substances: Globins

Year: 1989 PMID: 2720462 DOI: 10.1093/bioinformatics/5.2.123

Source DB: PubMed Journal: Comput Appl Biosci ISSN: 0266-7061

Cited by

4 in total

1. An accurate approximation to the distribution of the length of the longest matching word between two random DNA sequences.

Authors: R F Mott; T B Kirkwood; R N Curnow
Journal: Bull Math Biol Date: 1990 Impact factor: 1.758

2. Poisson, compound Poisson and process approximations for testing statistical significance in sequence comparisons.

3. Maximum-likelihood estimation of the statistical distribution of Smith-Waterman local sequence similarity scores.

4. Conditioning on the number of bands in interpreting matches of multilocus DNA profiles.