
A finite state machine algorithm for finding restriction sites and other pattern matching applications.

R Smith.

Abstract

Existing algorithms for finding restriction endonuclease recognition sites use brute-force algorithms which run in time O(NM), where N is the number of nucleotides in the sequence under analysis and M is the total number of nucleotides in all the different sites being searched for. This paper presents a deterministic finite state machine algorithm which runs in time O(N). Memory use can be as high as O(M^4), but a slight modification to the basic algorithm can impose a theoretical upper bound of O(M) at the cost of some added complexity in the execution of the state machine. The algorithm can operate with a single pass through the sequence under analysis, with no need to back up or (for non-circular sequences) store more than a single input character at a time. This type of algorithm can be adapted to many pattern-matching tasks and is simple enough to implement in hardware that it could, for example, be built into a disk controller as part of a specialized database machine.
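The abstract does not give the construction itself, so the following is only an illustrative sketch of the general idea, not the author's algorithm. It uses the well-known Aho-Corasick construction to build a deterministic state table for several recognition sites and then scans the sequence once, reading one character at a time and never backing up. The function names `build_dfa` and `find_sites`, the dictionary of site names, and the restriction to unambiguous A/C/G/T sites on a linear sequence are all assumptions made for the example.

```python
from collections import deque

ALPHABET = "ACGT"  # assumption: sites contain only unambiguous nucleotides


def build_dfa(sites):
    """Build an Aho-Corasick style deterministic automaton.

    `sites` maps a (hypothetical) site name to its recognition sequence.
    Returns (goto, out): goto[state][ch] is the next state, and out[state]
    lists (name, length) for every site that ends at that state.
    """
    goto = [{}]   # state 0 is the start state
    out = [[]]
    # Phase 1: insert every site into a trie.
    for name, site in sites.items():
        state = 0
        for ch in site:
            if ch not in goto[state]:
                goto.append({})
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append((name, len(site)))
    # Phase 2: breadth-first pass that fills in every missing transition,
    # so the scan never has to follow failure links at run time.
    fail = [0] * len(goto)
    queue = deque()
    for ch in ALPHABET:
        if ch in goto[0]:
            queue.append(goto[0][ch])
        else:
            goto[0][ch] = 0
    while queue:
        state = queue.popleft()
        # Sites ending at the failure state also end here (overlapping hits).
        out[state] = out[state] + out[fail[state]]
        for ch in ALPHABET:
            if ch in goto[state]:
                nxt = goto[state][ch]
                fail[nxt] = goto[fail[state]][ch]
                queue.append(nxt)
            else:
                goto[state][ch] = goto[fail[state]][ch]
    return goto, out


def find_sites(sequence, sites):
    """Scan `sequence` once and return (start_index, site_name) pairs."""
    goto, out = build_dfa(sites)
    state = 0
    hits = []
    for i, ch in enumerate(sequence):
        # One table lookup per input character; characters outside the
        # alphabet (e.g. N) send the machine back to the start state.
        state = goto[state].get(ch, 0)
        for name, length in out[state]:
            hits.append((i - length + 1, name))
    return hits


sites = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "AluI": "AGCT"}
print(find_sites("TTGAATTCAGCTGGATCC", sites))
# prints [(2, 'EcoRI'), (8, 'AluI'), (12, 'BamHI')]
```

In this sketch the table has at most M + 1 states with four transitions each, and the scan does a constant amount of table work per nucleotide plus the reporting of hits. Ambiguity codes in recognition sites and circular sequences, both relevant to restriction mapping, are deliberately left out.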


Year:  1988        PMID: 3208180     DOI: 10.1093/bioinformatics/4.4.459

Source DB:  PubMed          Journal:  Comput Appl Biosci        ISSN: 0266-7061


  2 in total

1.  Identification of a DNA sequence motif required for expression of iron-regulated genes in pseudomonads.

Authors:  I T Rombel; B J McMorran; I L Lamont
Journal:  Mol Gen Genet       Date:  1995-02-20

2.  PatMaN: rapid alignment of short sequences to large databases.

Authors:  Kay Prüfer; Udo Stenzel; Michael Dannemann; Richard E Green; Michael Lachmann; Janet Kelso
Journal:  Bioinformatics       Date:  2008-05-08       Impact factor: 6.937

