Detection of significant patterns by compression algorithms: the case of approximate tandem repeats in DNA sequences.

E Rivals, O Delgrange, J P Delahaye, M Dauchet, M O Delorme, A Hénaut, E Ollivier.

Abstract

MOTIVATION: Compression algorithms can be used to analyse genetic sequences. A compression algorithm tests a given property on the sequence and uses it to encode the sequence: if the property holds, it reveals structure in the sequence that can be described briefly, yielding a description shorter than the sequence of nucleotides given in extenso. The more the algorithm compresses a sequence, the more significant the property is for that sequence.
RESULTS: We present a compression algorithm that tests for the presence of a particular type of dosDNA (defined ordered sequence-DNA): approximate tandem repeats of small motifs (i.e. of lengths < 4). The algorithm was tested on four yeast chromosomes. The presence of approximate tandem repeats seems to be a uniform structural property of yeast chromosomes.
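
The principle in the abstract (a property is significant when encoding with it saves bits) can be illustrated with a minimal sketch. This is not the authors' algorithm: it handles only exact tandem repeats, whereas the paper treats approximate ones, and the cost model (2 bits per nucleotide, 2 bits for the motif length, an Elias gamma code for the repeat count) is an assumption made for illustration. All function names are hypothetical.

```python
BITS_PER_BASE = 2  # assumed plain encoding: A, C, G, T at 2 bits each


def exact_run(seq: str, start: int, k: int) -> int:
    """Count consecutive exact copies of the k-mer beginning at `start`."""
    motif = seq[start:start + k]
    if len(motif) < k:
        return 0  # not enough sequence left to form a motif
    count = 1
    pos = start + k
    while seq[pos:pos + k] == motif:
        count += 1
        pos += k
    return count


def run_cost_bits(k: int, count: int) -> int:
    """Assumed cost of describing a run as (motif length, motif, count)."""
    length_bits = 2                         # enough for k in {1, 2, 3}
    motif_bits = BITS_PER_BASE * k          # the motif itself, spelled out
    count_bits = 2 * count.bit_length() - 1  # Elias gamma code length
    return length_bits + motif_bits + count_bits


def compression_gain(seq: str, start: int, k: int) -> int:
    """Bits saved by coding the run at `start` as a repeat instead of raw."""
    count = exact_run(seq, start, k)
    if count == 0:
        return 0
    raw_bits = BITS_PER_BASE * k * count
    return raw_bits - run_cost_bits(k, count)
```

Under these assumptions, `compression_gain("ACG" + "TA" * 10 + "GGC", 3, 2)` gives 27 bits saved for the ten copies of `TA`, while `compression_gain("ACGT", 0, 2)` gives -3: with no repetition, the repeat encoding costs more than the raw nucleotides, so the property is not significant there.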

Year:  1997        PMID: 9146959     DOI: 10.1093/bioinformatics/13.2.131

Source DB:  PubMed          Journal:  Comput Appl Biosci        ISSN: 0266-7061


  7 in total

1.  Mapping the bacterial cell architecture into the chromosome. (Review)

Authors:  A Danchin; P Guerdoux-Jamet; I Moszer; P Nitschké
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2000-02-29       Impact factor: 6.237

2.  An appraisal of the potential for illegitimate recombination in bacterial genomes and its consequences: from duplications to genome reduction.

Authors:  Eduardo P C Rocha
Journal:  Genome Res       Date:  2003-05-12       Impact factor: 9.043

3.  mreps: Efficient and flexible detection of tandem repeats in DNA.

Authors:  Roman Kolpakov; Ghizlane Bana; Gregory Kucherov
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

4.  Primer design for large scale sequencing.

Authors:  S Haas; M Vingron; A Poustka; S Wiemann
Journal:  Nucleic Acids Res       Date:  1998-06-15       Impact factor: 16.971

5.  TRStalker: an efficient heuristic for finding fuzzy tandem repeats.

Authors:  Marco Pellegrini; M Elena Renda; Alessio Vecchio
Journal:  Bioinformatics       Date:  2010-06-15       Impact factor: 6.937

6.  DNA sequences at a glance.

Authors:  Armando J Pinho; Sara P Garcia; Diogo Pratas; Paulo J S G Ferreira
Journal:  PLoS One       Date:  2013-11-21       Impact factor: 3.240

7.  Input-output maps are strongly biased towards simple outputs.

Authors:  Kamaludin Dingle; Chico Q Camargo; Ard A Louis
Journal:  Nat Commun       Date:  2018-02-22       Impact factor: 14.919
