
Statistical measures of the structure of genomic sequences: entropy, complexity, and position information.

Yuriy L Orlov, Rene Te Boekhorst, Irina I Abnizova.

Abstract

Identifying regions of DNA with extreme statistical characteristics is an important aspect of the structural analysis of complete genomes. Linguistic methods, mainly based on estimating word frequency, can be used for this purpose because they allow regions of low complexity to be delineated. Low complexity may be caused by biased nucleotide composition, by tandem or dispersed repeats, by palindrome-hairpin structures, or by a combination of these features. We developed software tools that implement various numerical measures of text complexity, including combinatorial and linguistic ones. We also added a Hurst exponent estimate to the software to measure dependencies in DNA sequences. By applying these tools to various functional genomic regions, we demonstrate that the complexity of introns and regulatory regions is lower than that of coding regions, whilst their Hurst exponent is larger. Further analysis of promoter sequences revealed that the lower complexity of these regions is associated with long-range correlations caused by transcription factor binding sites.
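
As an illustration of the word-frequency idea described in the abstract, the following minimal sketch computes the Shannon entropy of k-mer ("word") frequencies in sliding windows along a sequence. It is not the authors' software: the function names `kmer_entropy` and `windowed_entropy`, the word length `k`, and the window and step sizes are all assumptions chosen for the example. Windows with entropy near zero would be candidates for low-complexity regions.

```python
from collections import Counter
from math import log2


def kmer_entropy(seq: str, k: int = 3) -> float:
    """Shannon entropy (in bits) of the k-mer frequency distribution in seq.

    Hypothetical helper for illustration; overlapping k-mers are counted.
    """
    n = len(seq) - k + 1  # number of overlapping k-mers
    if n <= 0:
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(n))
    return sum(-(c / n) * log2(c / n) for c in counts.values())


def windowed_entropy(seq: str, window: int = 50, step: int = 10, k: int = 3):
    """Return a list of (start, entropy) pairs, one per sliding window.

    The window and step sizes are illustrative assumptions only.
    """
    return [
        (start, kmer_entropy(seq[start:start + window], k))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # A homopolymer run followed by a more varied stretch (made-up sequence).
    example = "A" * 60 + "ACGTTGCAGTCA" * 5
    for start, h in windowed_entropy(example):
        print(start, round(h, 3))
```

In this sketch a lower entropy value corresponds to a more repetitive or compositionally biased window. The other measures named in the abstract (combinatorial complexity, the Hurst exponent) are not covered here.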

Year:  2006        PMID: 16819800     DOI: 10.1142/s0219720006001801

Source DB:  PubMed          Journal:  J Bioinform Comput Biol        ISSN: 0219-7200            Impact factor:   1.122


  3 in total

1.  Visualization of the protein-coding regions with a self adaptive spectral rotation approach.

Authors:  Bo Chen; Ping Ji
Journal:  Nucleic Acids Res       Date:  2010-10-14       Impact factor: 16.971

2.  Novel read density distribution score shows possible aligner artefacts, when mapping a single chromosome.

Authors:  Fedor M Naumenko; Irina I Abnizova; Nathan Beka; Mikhail A Genaev; Yuriy L Orlov
Journal:  BMC Genomics       Date:  2018-02-09       Impact factor: 3.969

3.  Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics.

Authors:  Suping Deng; Yixiang Shi; Liyun Yuan; Yixue Li; Guohui Ding
Journal:  BMC Genomics       Date:  2012-12-17       Impact factor: 3.969
