A fast word search algorithm for the representation of sequence similarity in genomic DNA.

Abstract

Representation of sequence similarity by dot matrix plots is a method widely used for comparing biological sequences. The user is presented with an overall view of similarity between two sequences. Computation of this plot has been reconsidered here. An improvement is proposed through the preprocessing of the data into an automaton recognizing the word structure of a sequence. The main advantage of this approach is to systematically eliminate the repetitions during word comparison. Simple heuristics are also considered to greatly speed up pattern matching. As a result, large sequences are handled very efficiently. This is illustrated by a comparison of large genomic DNA. The algorithm has been implemented in an interactive application on a microcomputer.
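
To make the idea of a word-based dot plot concrete, here is a minimal Python sketch. It is not the automaton described in the abstract: it swaps in a plain dictionary index of fixed-length words, which shares only the property that each distinct word is compared once regardless of how often it repeats. The function names `index_words` and `dot_plot` and the word length `k` are assumptions made for illustration.

```python
from collections import defaultdict


def index_words(seq, k):
    """Map each distinct k-letter word in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def dot_plot(seq_a, seq_b, k):
    """Return sorted (i, j) pairs where the k-word at seq_a[i] equals the k-word at seq_b[j].

    Illustrative stand-in for the paper's method: a hash index replaces
    the word-recognizing automaton (assumption, not the published algorithm).
    """
    index_a = index_words(seq_a, k)
    index_b = index_words(seq_b, k)
    dots = []
    # Each distinct word of seq_a is looked up once, however often it repeats.
    for word, positions_a in index_a.items():
        positions_b = index_b.get(word)
        if positions_b is None:
            continue
        for i in positions_a:
            for j in positions_b:
                dots.append((i, j))
    dots.sort()
    return dots


if __name__ == "__main__":
    # Each pair is one dot of the plot: row i in the first sequence, column j in the second.
    print(dot_plot("ACGTACGT", "TACG", 3))
```

Each returned pair corresponds to one dot in the matrix. Runs of dots along a diagonal (constant `j - i`) indicate a shared region longer than `k`.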

Substances: DNA, Fungal

Year: 1994 PMID: 8127677 PMCID: PMC523596 DOI: 10.1093/nar/22.3.404

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

18 in total

1. Locating gaps in amino acid sequences to optimize the homology between two proteins.

2. The diagram, a method for comparing sequences. Its use with amino acid and nucleotide sequences.

3. Locating well-conserved regions within a pairwise alignment.

4. Matrix program to analyze primary structure homology.

5. Enhanced graphic matrix analysis of nucleic acid and protein sequences.

6. Two-dimensional graphic analysis of DNA sequence homologies.

7. An interactive graphics program for comparing and aligning nucleic acid and amino acid sequences.

8. Rapid similarity searches of nucleic acid and protein data banks.

9. Tests for comparing related amino-acid sequences. Cytochrome c and cytochrome c551.

10. Three cDNA clones encoding mouse transplantation antigens: homology to immunoglobulin genes.

1. Fast analysis of genomic homologies: primate immunodeficiency virus.

2. Support vector machine (SVM) based multiclass prediction with basic statistical analysis of plasminogen activators.