WORDUP: an efficient algorithm for discovering statistically significant patterns in DNA sequences.

G Pesole, N Prunella, S Liuni, M Attimonelli, C Saccone.

Abstract

We present a fast and sensitive method for isolating short nucleotide sequences that have non-random statistical properties and may thus be biologically active. It is based on a first-order Markov analysis and detects sequence motifs six to ten nucleotides long that are significantly shared (or avoided) in the sequences under investigation. The method was tested on a set of 521 sequences extracted from the Eukaryotic Promoter Database (2). The results demonstrate its accuracy and efficiency: sequence motifs known to act as eukaryotic promoter elements, such as the TATA-box and the CAAT-box, were clearly identified. In addition, we found other statistically significant motifs whose biological roles are yet to be clarified.
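
The idea of scoring words against a first-order Markov expectation can be illustrated with a minimal Python sketch. This is not the WORDUP implementation: the abstract does not give the statistic used, so the expected-count estimator (product of dinucleotide counts divided by the counts of the interior nucleotides) and the z-like score `(observed - expected) / sqrt(expected)` are assumptions chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def count_words(seqs, k):
    """Count every overlapping k-mer across all sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def expected_first_order(word, di, mono):
    """Approximate expected count of `word` under a first-order Markov model.

    Assumed estimator: N(w1w2) * N(w2w3) * ... / (N(w2) * ... * N(w_{k-1})),
    using pooled dinucleotide (di) and mononucleotide (mono) counts.
    """
    num = 1.0
    for i in range(len(word) - 1):
        num *= di[word[i:i + 2]]
    den = 1.0
    for c in word[1:-1]:
        den *= mono[c]
    return num / den if den else 0.0


def score_words(seqs, k):
    """Return {word: (observed, expected, z_like)} for every observed k-mer."""
    mono = count_words(seqs, 1)
    di = count_words(seqs, 2)
    obs = count_words(seqs, k)
    scores = {}
    for w, o in obs.items():
        e = expected_first_order(w, di, mono)
        if e > 0:
            # Illustrative score only; not the statistic from the paper.
            scores[w] = (o, e, (o - e) / sqrt(e))
    return scores
```

Calling `score_words(sequences, 6)` on a list of promoter sequences and sorting by the third tuple element would rank hexamers from most over-represented to most under-represented under these assumptions. Words that never occur are not scored here, so detecting fully avoided motifs would require enumerating all possible k-mers as well.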

Year:  1992        PMID: 1614873      PMCID: PMC336935          DOI: 10.1093/nar/20.11.2871

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


  31 in total

1.  Statistical analysis of nucleotide sequences.

Authors:  E E Stückle; C Emmrich; U Grob; P J Nielsen
Journal:  Nucleic Acids Res       Date:  1990-11-25       Impact factor: 16.971

2.  SQUIRREL: Sequence QUery, Information Retrieval and REporting Library. A program package for analyzing signals in nucleic acid sequences for the VAX.

Authors:  C J Gartmann; U Grob
Journal:  Nucleic Acids Res       Date:  1991-11-11       Impact factor: 16.971

3.  Discrete high molecular weight RNA transcribed from the long interspersed repetitive element L1Md.

Authors:  J P Dudley
Journal:  Nucleic Acids Res       Date:  1987-03-25       Impact factor: 16.971

4.  Recognition of characteristic patterns in sets of functionally equivalent DNA sequences.

Authors:  G Mengeritsky; T F Smith
Journal:  Comput Appl Biosci       Date:  1987-09

5.  Linguistics of nucleotide sequences: morphology and comparison of vocabularies.

Authors:  V Brendel; J S Beckmann; E N Trifonov
Journal:  J Biomol Struct Dyn       Date:  1986-08

6.  Intervening sequences exhibit distinct vocabulary.

Authors:  J S Beckmann; V Brendel; E N Trifonov
Journal:  J Biomol Struct Dyn       Date:  1986-12

7.  Evolution of the genome and the genetic code: selection at the dinucleotide level by methylation and polyribonucleotide cleavage.

Authors:  E Beutler; T Gelbart; J H Han; J A Koziol; B Beutler
Journal:  Proc Natl Acad Sci U S A       Date:  1989-01       Impact factor: 11.205

8.  Computer analysis of nucleic acid regulatory sequences.

Authors:  L J Korn; C L Queen; M N Wegman
Journal:  Proc Natl Acad Sci U S A       Date:  1977-10       Impact factor: 11.205

9.  Deletional analysis of the promoter region of the human transferrin receptor gene.

Authors:  J L Casey; B Di Jeso; K K Rao; T A Rouault; R D Klausner; J B Harford
Journal:  Nucleic Acids Res       Date:  1988-01-25       Impact factor: 16.971

10.  Cell-type specific protein binding to the enhancer of simian virus 40 in nuclear extracts.

Authors:  I Davidson; C Fromental; P Augereau; A Wildeman; M Zenke; P Chambon
Journal:  Nature       Date:  1986 Oct 9-15       Impact factor: 49.962

  10 in total

1.  Remarkable sequence signatures in archaeal genomes.

Authors:  Ahmed Fadiel; Stuart Lithwick; Gopi Ganji; Stephen W Scherer
Journal:  Archaea       Date:  2003-10       Impact factor: 3.273

2.  Analysis of eukaryotic promoter sequences reveals a systematically occurring CT-signal.

Authors:  N I Larsen; J Engelbrecht; S Brunak
Journal:  Nucleic Acids Res       Date:  1995-04-11       Impact factor: 16.971

3.  Extraction of functional binding sites from unique regulatory regions: the Drosophila early developmental enhancers.

Authors:  Dmitri A Papatsenko; Vsevolod J Makeev; Alex P Lifanov; Mireille Régnier; Anna G Nazina; Claude Desplan
Journal:  Genome Res       Date:  2002-03       Impact factor: 9.043

4.  Atypical regions in large genomic DNA sequences.

Authors:  S Scherer; M S McPeek; T P Speed
Journal:  Proc Natl Acad Sci U S A       Date:  1994-07-19       Impact factor: 11.205

Review 5.  Computational identification of transcriptional regulatory elements in DNA sequence.

Authors:  Debraj GuhaThakurta
Journal:  Nucleic Acids Res       Date:  2006-07-19       Impact factor: 16.971

6.  Unsupervised statistical discovery of spaced motifs in prokaryotic genomes.

Authors:  Hao Tong; Paul Schliekelman; Jan Mrázek
Journal:  BMC Genomics       Date:  2017-01-05       Impact factor: 3.969

7.  An algorithm for identifying novel targets of transcription factor families: application to hypoxia-inducible factor 1 targets.

Authors:  Yue Jiang; Bojan Cukic; Donald A Adjeroh; Heath D Skinner; Jie Lin; Qingxi J Shen; Bing-Hua Jiang
Journal:  Cancer Inform       Date:  2009-03-04

Review 8.  A survey of DNA motif finding algorithms.

Authors:  Modan K Das; Ho-Kwok Dai
Journal:  BMC Bioinformatics       Date:  2007-11-01       Impact factor: 3.169

9.  Kangaroo--a pattern-matching program for biological sequences.

Authors:  Doron Betel; Christopher W V Hogue
Journal:  BMC Bioinformatics       Date:  2002-07-31       Impact factor: 3.169

10.  The RHNumtS compilation: features and bioinformatics approaches to locate and quantify Human NumtS.

Authors:  Daniela Lascaro; Stefano Castellana; Giuseppe Gasparre; Giovanni Romeo; Cecilia Saccone; Marcella Attimonelli
Journal:  BMC Genomics       Date:  2008-06-03       Impact factor: 3.969
