
Identifying statistical dependence in genomic sequences via mutual information estimates.

Hasan Metin Aktulga, Ioannis Kontoyiannis, L Alex Lyznik, Lukasz Szpankowski, Ananth Y Grama, Wojciech Szpankowski.

Abstract

Understanding and quantifying how information is represented in organisms, and how much of it they carry, has become a central part of biological research, as such questions potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, they are used to identify correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5' untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's Combined DNA Index System (CODIS), we demonstrate that our approach is particularly well suited to the problem of discovering short tandem repeats, an application of importance in genetic profiling.
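As a rough illustration of the kind of estimate the abstract refers to, the sketch below computes a plug-in (empirical) mutual information between two equal-length symbol sequences, pairing position i of one segment with position i of the other. The abstract does not spell out the paper's estimator or its threshold function, so everything here is an assumption: the function names are hypothetical, and the G statistic is the textbook likelihood-ratio quantity for testing independence, used only as a stand-in for a significance measure.

```python
from collections import Counter
from math import log, log2


def empirical_mutual_information(xs: str, ys: str) -> float:
    """Plug-in estimate, in bits, of I(X;Y) from the pairs (xs[i], ys[i])."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts for the first segment
    count_y = Counter(ys)         # marginal counts for the second segment
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written in terms of counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def g_statistic(xs: str, ys: str) -> float:
    """Likelihood-ratio (G) statistic: 2 * n * ln(2) * estimated MI in bits.

    Assumption: used here as a generic significance measure, not as the
    threshold function defined in the paper.
    """
    return 2.0 * len(xs) * log(2.0) * empirical_mutual_information(xs, ys)


if __name__ == "__main__":
    print(empirical_mutual_information("ACGTACGT", "ACGTACGT"))
    print(empirical_mutual_information("AACC", "ACAC"))
    print(g_statistic("ACGTACGT", "ACGTACGT"))
```

Under independence, the G statistic is approximately chi-squared distributed for long sequences, so comparing it with a chi-squared quantile gives one simple cutoff. For short segments the plug-in estimate is biased upward, so small values should not be read as evidence of dependence.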


Year:  2007        PMID: 18301721      PMCID: PMC3171327          DOI: 10.1155/2007/14741

Source DB:  PubMed          Journal:  EURASIP J Bioinform Syst Biol        ISSN: 1687-4145


