
The Information Content of Discrete Functions and Their Application in Genetic Data Analysis.

Nikita A Sakhanenko, James Kunert-Graf, David J Galas.

Abstract

The central problems in data analysis comprise three components: (1) detecting the dependence of variables using quantitative measures, (2) defining the significance of these dependence measures, and (3) inferring the functional relationships among dependent variables. We have argued previously that an information theory approach allows the detection problem to be separated from the problem of inferring functional form. Here we address the third component: inferring functional forms from the information encoded in the functions. We present a direct method for classifying the functional forms of discrete functions of three variables represented in data sets. Discrete variables are frequently encountered in data analysis, both as inherently categorical variables and as the result of binning continuous numerical variables into discrete alphabets of values. The fundamental question of how much information is contained in a given function is answered for these discrete functions, and their surprisingly complex relationships are illustrated. The effect of noise on the inference of function classes is found to be highly heterogeneous and reveals some unexpected patterns. We apply this classification approach to an important area of biological data analysis: the inference of genetic interactions. Genetic analysis provides a rich source of real and complex biological data analysis problems, and our general methods provide an analytical basis and tools for characterizing genetic problems and for analyzing genetic data. We illustrate the functional description and the classes of a number of common genetic interaction modes, and we show that different modes vary widely in their sensitivity to noise.
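To make the notion of "information contained in a function" concrete, the following is a minimal sketch and not the paper's classification procedure. It assumes a three-letter alphabet, uniformly distributed inputs X and Y, and a third variable Z = f(X, Y); the helper names (`entropy`, `mutual_information`, `dependence_profile`) and the two example functions are hypothetical choices made for illustration. It computes standard Shannon quantities to show how two functions with the same joint information about Z can differ sharply in their pairwise dependences.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Shannon entropy (bits) of the empirical distribution of hashable items."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), computed from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def dependence_profile(f, alphabet):
    """Tabulate Z = f(X, Y) over every input pair (uniform inputs assumed)."""
    pairs = list(product(alphabet, repeat=2))
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    zs = [f(x, y) for x, y in pairs]
    return {
        "H(Z)": entropy(zs),
        "I(X;Z)": mutual_information(xs, zs),
        "I(Y;Z)": mutual_information(ys, zs),
        "I(X,Y;Z)": mutual_information(pairs, zs),
    }


alphabet = (0, 1, 2)  # assumed alphabet, e.g. three genotype classes

# Illustrative functions only: modular addition and a function ignoring Y.
additive = dependence_profile(lambda x, y: (x + y) % 3, alphabet)
copy_x = dependence_profile(lambda x, y: x, alphabet)

for name, profile in (("(x + y) mod 3", additive), ("z = x", copy_x)):
    print(name, {k: round(v, 3) for k, v in profile.items()})
```

Under these assumptions, modular addition shows no pairwise dependence between Z and either input, yet Z is fully determined by the pair, whereas the function that copies X carries all of its information in a single pairwise dependence. Pairwise measures alone would therefore miss the first kind of relationship entirely.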

Keywords:  discrete functions; function classes; genetic interactions; information theory; multivariable dependence

Year: 2017  PMID: 29028175  PMCID: PMC5729883  DOI: 10.1089/cmb.2017.0143

Source DB: PubMed  Journal: J Comput Biol  ISSN: 1066-5277  Impact factor: 1.479



1.  Multivariate Analysis of Data Sets with Missing Values: An Information Theory-Based Reliability Function.

Authors:  Lisa Uechi; David J Galas; Nikita A Sakhanenko
Journal:  J Comput Biol       Date:  2018-11-29       Impact factor: 1.479

2.  Complex genetic dependencies among growth and neurological phenotypes in healthy children: Towards deciphering developmental mechanisms.

Authors:  Lisa Uechi; Mahjoubeh Jalali; Jayson D Wilbur; Jonathan L French; N L Jumbe; Michael J Meaney; Peter D Gluckman; Neerja Karnani; Nikita A Sakhanenko; David J Galas
Journal:  PLoS One       Date:  2020-12-03       Impact factor: 3.240

3.  Toward an Information Theory of Quantitative Genetics.

Authors:  David J Galas; James Kunert-Graf; Lisa Uechi; Nikita A Sakhanenko
Journal:  J Comput Biol       Date:  2020-12-31       Impact factor: 1.479

4.  Cerebrospinal Fluid MicroRNA Changes in Cognitively Normal Veterans With a History of Deployment-Associated Mild Traumatic Brain Injury.

Authors:  Theresa A Lusardi; Ursula S Sandau; Nikita A Sakhanenko; Sarah Catherine B Baker; Jack T Wiedrick; Jodi A Lapidus; Murray A Raskind; Ge Li; Elaine R Peskind; David J Galas; Joseph F Quinn; Julie A Saugstad
Journal:  Front Neurosci       Date:  2021-09-09       Impact factor: 4.677

