| Literature DB >> 6572363 |
Abstract
With the development of large data banks of protein and nucleic acid sequences, the need for efficient methods of searching such banks for sequences similar to a given sequence has become evident. We present an algorithm for the global comparison of sequences based on matching k-tuples of sequence elements for a fixed k. The method results in substantial reduction in the time required to search a data bank when compared with prior techniques of similarity analysis, with minimal loss in sensitivity. The algorithm has also been adapted, in a separate implementation, to produce rigorous sequence alignments. Currently, using the DEC KL-10 system, we can compare all sequences in the entire Protein Data Bank of the National Biomedical Research Foundation with a 350-residue query sequence in less than 3 min and carry out a similar analysis with a 500-base query sequence against all eukaryotic sequences in the Los Alamos Nucleic Acid Data Base in less than 2 min.Mesh:
Substances:
Year: 1983 PMID: 6572363 PMCID: PMC393452 DOI: 10.1073/pnas.80.3.726
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205