Francisco Fernandes, Ana T Freitas, Jonas S Almeida, Susana Vinga.
Abstract
BACKGROUND: In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighted relative abundance of motifs for each position in genomes. Their study is very relevant because under- or over-represented segments are often associated with significant biological meaning.
Year: 2009 PMID: 19416538 PMCID: PMC2686720 DOI: 10.1186/1756-0500-2-72
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Figure 1. Suffix tree and side links. Example of a suffix tree and side links for the word "ATTACAC", showing the suffix counts limited to depth 3. All substrings of length 3 are represented in the tree ("ATT", "TTA", "TAC", "ACA", "CAC"). Also shown are the side links connecting nodes of the same depth.
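The structure described in the Figure 1 caption can be approximated with a small depth-limited trie that counts every substring of length up to 3 in "ATTACAC". This is only a minimal sketch, not the Entropic Profiler implementation: the names `Node`, `build_trie` and `words_by_depth` are hypothetical, and the per-depth lists merely stand in for the side links, whose exact wiring is not specified in the caption.

```python
class Node:
    """One trie node: how often its word occurs, plus its children."""

    def __init__(self):
        self.count = 0
        self.children = {}


def build_trie(seq, max_depth):
    """Insert every suffix of seq, truncated to max_depth symbols."""
    root = Node()
    for start in range(len(seq)):
        node = root
        for symbol in seq[start:start + max_depth]:
            if symbol not in node.children:
                node.children[symbol] = Node()
            node = node.children[symbol]
            node.count += 1  # one more occurrence of the word ending here
    return root


def words_by_depth(root, max_depth):
    """Group (word, count) pairs by depth.

    Assumption: these per-depth groups play the role of the side links,
    which connect nodes of the same depth.
    """
    levels = {depth: [] for depth in range(1, max_depth + 1)}
    stack = [("", root)]
    while stack:
        word, node = stack.pop()
        for symbol, child in node.children.items():
            longer = word + symbol
            levels[len(longer)].append((longer, child.count))
            stack.append((longer, child))
    for entries in levels.values():
        entries.sort()
    return levels


trie = build_trie("ATTACAC", 3)
levels = words_by_depth(trie, 3)
for depth, entries in levels.items():
    print(depth, entries)
```

The depth-3 level holds exactly the five words listed in the caption, each occurring once, while shorter words such as "A" and "AC" accumulate higher counts because they are prefixes of several suffixes.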
Figure 2. Study sequence by position. Example of the input and output screens of the Entropic Profiler application while studying around a position in the Escherichia coli K12 genome.
Figure 3. Study sequence by motif. Example of the input and output screens of the Entropic Profiler application while searching for motif "AAGTGCGGT" on H. influenzae.
Performance of Entropic Profiler for larger genomes.
| Sequence length (10^6 bases) | Running time (s) | Memory usage (MB) |
| --- | --- | --- |
| 97.6 | 75 | 171 |
| 239.2 | 164 | 290 |
| 382.7 | 263 | 411 |
| 940.1 | 611 | 1016 |
| 1491.0 | 960 | 1567 |
For each species name, the length of each sequence tested (in 10^6 bases), the total running time (in seconds) and the memory usage (in megabytes) are reported.
Note: The sequences can be accessed at the following links: Caenorhabditis elegans; Homo sapiens (CHR1); Takifugu rubripes; Gallus gallus; Danio rerio.
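As a rough reading aid for the performance table, the short script below derives throughput and memory per base from the reported figures. The numbers in `rows` are copied from the table; the derived quantities and the variable names are illustrative additions, not values reported by the authors.

```python
# (sequence length in 10^6 bases, running time in s, memory in MB),
# copied from the performance table
rows = [
    (97.6, 75, 171),
    (239.2, 164, 290),
    (382.7, 263, 411),
    (940.1, 611, 1016),
    (1491.0, 960, 1567),
]

# Derived quantities (not part of the original table)
rates = [length / seconds for length, seconds, _ in rows]        # 10^6 bases per second
mem_per_base = [memory / length for length, _, memory in rows]   # MB per 10^6 bases

for (length, _, _), rate, mem in zip(rows, rates, mem_per_base):
    print(f"{length:7.1f} Mbases: {rate:.2f} Mbases/s, {mem:.2f} MB per Mbase")
```

On these figures the throughput stays between about 1.3 and 1.6 million bases per second, and memory settles near 1 MB per million bases for the larger sequences, which is consistent with roughly linear scaling in sequence length.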