Discovering simple DNA sequences by the algorithmic significance method.

A Milosavljević, J Jurka.

Abstract

A new method, 'algorithmic significance', is proposed as a tool for discovering patterns in DNA sequences. The main idea is that patterns can be discovered by finding ways to encode the observed data concisely. In this sense, the method can be viewed as a formal version of Occam's Razor. In this paper the method is applied to discover significantly simple DNA sequences. We define DNA sequences to be simple if they contain repeated occurrences of certain 'words' and thus can be encoded in a small number of bits. This definition includes minisatellites and microsatellites. A standard dynamic programming algorithm for data compression is applied to compute the minimal encoding lengths of sequences in linear time. An electronic mail server for identification of simple sequences based on the proposed method has been installed at the Internet address pythia/anl.gov.
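
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The cost model is an assumption made only for illustration: each position is either a literal symbol costing log2(5) bits (four bases plus one "pointer follows" marker) or a pointer to an earlier occurrence of a word, costing the marker plus 2*log2(n) bits for start and length. The function names `min_encoding_bits` and `significance` are hypothetical. The sketch uses a naive search for repeats, so it runs in polynomial rather than linear time. The significance bound used here is the usual form of the algorithmic significance argument: if a sequence can be encoded d bits shorter than under the null model, the probability of that happening by chance is at most 2^-d. The null model is assumed to be uniform, at 2 bits per base.

```python
import math


def min_encoding_bits(seq: str) -> float:
    """Minimal encoding length of seq under an assumed toy cost model."""
    n = len(seq)
    if n == 0:
        return 0.0
    # Assumed costs: 4 bases + 1 pointer marker share a 5-symbol alphabet.
    literal_cost = math.log2(5)
    # A pointer is the marker plus a (start, length) pair of log2(n) bits each.
    pointer_cost = literal_cost + 2 * math.log2(n)

    # best[i] = minimal number of bits needed to encode seq[i:]
    best = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        # Option 1: emit seq[i] as a literal.
        best[i] = literal_cost + best[i + 1]
        # Option 2: copy a word that already started at some j < i.
        # Overlapping copies are allowed, as in LZ77-style schemes.
        longest = 0
        for j in range(i):
            k = 0
            while i + k < n and seq[j + k] == seq[i + k]:
                k += 1
            longest = max(longest, k)
        # Every prefix of the longest match is also a valid copy.
        for k in range(1, longest + 1):
            best[i] = min(best[i], pointer_cost + best[i + k])
    return best[0]


def significance(seq: str) -> tuple[float, float]:
    """Return (d, bound): bits saved versus a uniform null, and 2**-d."""
    null_bits = 2.0 * len(seq)  # assumed null: 2 bits per base
    d = null_bits - min_encoding_bits(seq)
    bound = 1.0 if d <= 0 else 2.0 ** (-d)
    return d, bound


if __name__ == "__main__":
    for name, s in [("microsatellite", "CA" * 30), ("short", "ACGT")]:
        d, bound = significance(s)
        print(f"{name}: saved {d:.1f} bits, chance bound {bound:.3g}")
```

Under these assumptions a microsatellite such as a long run of CA repeats is encoded as two literals followed by a single overlapping pointer, so it saves many bits relative to the null model, whereas a short non-repetitive sequence saves none and receives the trivial bound of 1.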

Year:  1993        PMID: 8402207     DOI: 10.1093/bioinformatics/9.4.407

Source DB:  PubMed          Journal:  Comput Appl Biosci        ISSN: 0266-7061


  9 in total

1.  Discovering functional modules by identifying recurrent and mutually exclusive mutational patterns in tumors.

Authors:  Christopher A Miller; Stephen H Settle; Erik P Sulman; Kenneth D Aldape; Aleksandar Milosavljevic
Journal:  BMC Med Genomics       Date:  2011-04-14       Impact factor: 3.063

2.  Pash: efficient genome-scale sequence anchoring by Positional Hashing.

Authors:  Ken J Kalafus; Andrew R Jackson; Aleksandar Milosavljevic
Journal:  Genome Res       Date:  2004-04       Impact factor: 9.043

3.  A method for fast database search for all k-nucleotide repeats.

Authors:  G Benson; M S Waterman
Journal:  Nucleic Acids Res       Date:  1994-11-11       Impact factor: 16.971

4.  Sixty-nine kilobases of contiguous human genomic sequence containing the alpha-galactosidase A and Bruton's tyrosine kinase loci.

Authors:  J C Oeltjen; X Liu; J Lu; R C Allen; D Muzny; J W Belmont; R A Gibbs
Journal:  Mamm Genome       Date:  1995-05       Impact factor: 2.957

5.  On the representability of complete genomes by multiple competing finite-context (Markov) models.

Authors:  Armando J Pinho; Paulo J S G Ferreira; António J R Neves; Carlos A C Bastos
Journal:  PLoS One       Date:  2011-06-30       Impact factor: 3.240

6.  Accelerated Profile HMM Searches.

Authors:  Sean R Eddy
Journal:  PLoS Comput Biol       Date:  2011-10-20       Impact factor: 4.475

7.  CTD: An information-theoretic algorithm to interpret sets of metabolomic and transcriptomic perturbations in the context of graphical models.

Authors:  Lillian R Thistlethwaite; Varduhi Petrosyan; Xiqi Li; Marcus J Miller; Sarah H Elsea; Aleksandar Milosavljevic
Journal:  PLoS Comput Biol       Date:  2021-01-29       Impact factor: 4.475

8.  Complete DNA sequence of yeast chromosome II.

Authors:  H Feldmann; M Aigle; G Aljinovic; B André; M C Baclet; C Barthe; A Baur; A M Bécam; N Biteau; E Boles; T Brandt; M Brendel; M Brückner; F Bussereau; C Christiansen; R Contreras; M Crouzet; C Cziepluch; N Démolis; T Delaveau; F Doignon; H Domdey; S Düsterhus; E Dubois; B Dujon; M El Bakkoury; K D Entian; M Feurmann; W Fiers; G M Fobo; C Fritz; H Gassenhuber; N Glandsdorff; A Goffeau; L A Grivell; M de Haan; C Hein; C J Herbert; C P Hollenberg; K Holmstrøm; C Jacq; M Jacquet; J C Jauniaux; J L Jonniaux; T Kallesøe; P Kiesau; L Kirchrath; P Kötter; S Korol; S Liebl; M Logghe; A J Lohan; E J Louis; Z Y Li; M J Maat; L Mallet; G Mannhaupt; F Messenguy; T Miosga; F Molemans; S Müller; F Nasr; B Obermaier; J Perea; A Piérard; E Piravandi; F M Pohl; T M Pohl; S Potier; M Proft; B Purnelle; M Ramezani Rad; M Rieger; M Rose; I Schaaff-Gerstenschläger; B Scherens; C Schwarzlose; J Skala; P P Slonimski; P H Smits; J L Souciet; H Y Steensma; R Stucka; A Urrestarazu; Q J van der Aart; L van Dyck; A Vassarotti; I Vetter; F Vierendeels; S Vissers; G Wagner; P de Wergifosse; K H Wolfe; M Zagulski; F K Zimmermann; H W Mewes; K Kleine
Journal:  EMBO J       Date:  1994-12-15       Impact factor: 11.598

9.  A probabilistic model of local sequence alignment that simplifies statistical significance estimation.

Authors:  Sean R Eddy
Journal:  PLoS Comput Biol       Date:  2008-05-30       Impact factor: 4.475

