An accurate approximation to the distribution of the length of the longest matching word between two random DNA sequences.


Abstract

An accurate approximation is derived to the distribution of the length of the longest matching word present between two random DNA sequences of finite length, using only elementary probability arguments. The distribution is shown to be consistent with previous asymptotic results for the mean and variance of longest common words. The application of the distribution to assessing the statistical significance of sequence similarities is considered. It is shown how the distribution can be modified to take account of non-independence of neighbouring bases in real sequences.
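The quantity discussed in the abstract can be explored numerically. The following is a minimal sketch, not the paper's derivation: it simulates the longest matching word between two random DNA sequences with independent, equiprobable bases, and compares the simulated tail probability with a standard Poisson-type heuristic, P(L ≥ k) ≈ 1 − exp(−nm(1 − p)p^k). That heuristic is a textbook-style assumption used here for illustration only; the approximation derived in the paper is not reproduced on this page. All function names (`longest_common_word`, `random_dna`, `simulate_tail`, `poisson_tail`) are hypothetical.

```python
import math
import random


def longest_common_word(a: str, b: str) -> int:
    """Length of the longest contiguous word occurring in both a and b.

    Dynamic programming over all pairs of end positions, O(len(a) * len(b)).
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the match that ended one position earlier in both sequences.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def random_dna(n: int, rng: random.Random) -> str:
    """Random sequence with independent, equiprobable bases (an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(n))


def simulate_tail(n: int, m: int, k: int, trials: int, rng: random.Random) -> float:
    """Monte Carlo estimate of P(longest matching word >= k)."""
    hits = 0
    for _ in range(trials):
        if longest_common_word(random_dna(n, rng), random_dna(m, rng)) >= k:
            hits += 1
    return hits / trials


def poisson_tail(n: int, m: int, k: int, p: float = 0.25) -> float:
    """Heuristic P(L >= k) ~ 1 - exp(-n*m*(1-p)*p**k).

    p is the per-position match probability (0.25 for equiprobable bases).
    This is an illustrative heuristic, not the paper's approximation.
    """
    return 1.0 - math.exp(-n * m * (1.0 - p) * p ** k)


if __name__ == "__main__":
    rng = random.Random(0)
    n = m = 200
    for k in range(6, 11):
        sim = simulate_tail(n, m, k, trials=200, rng=rng)
        print(f"k={k}: simulated {sim:.3f}, heuristic {poisson_tail(n, m, k):.3f}")
```

For significance testing, an observed longest match of length k between two real sequences would be judged against such a tail probability. Real sequences show dependence between neighbouring bases, which the abstract says the paper's distribution can be modified to handle; this sketch ignores that effect.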

Substances: DNA

Year: 1990 PMID: 2279194 DOI: 10.1007/bf02460808

Source DB: PubMed Journal: Bull Math Biol ISSN: 0092-8240 Impact factor: 1.758

1. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes.

2. A test for the statistical significance of DNA sequence similarities for application in databank searches.

3. Significance levels for biological sequence comparison using non-linear similarity functions.

4. The statistical distribution of nucleic acid similarities.

5. A comprehensive set of sequence analysis programs for the VAX.

6. Statistical characterization of nucleic acid sequence functional domains.

7. Identification of common molecular subsequences.

8. New approaches for computer analysis of nucleic acid sequences.

1. Poisson, compound Poisson and process approximations for testing statistical significance in sequence comparisons.

2. Maximum-likelihood estimation of the statistical distribution of Smith-Waterman local sequence similarity scores.

3. Pattern matching between two non-aligned random sequences.