Biological data analysis as an information theory problem: multivariable dependence measures and the shadows algorithm.

Nikita A Sakhanenko, David J Galas.

Abstract

Information theory is valuable in multiple-variable analysis because it is model-free and nonparametric, and because it is only modestly sensitive to undersampling. We previously introduced a general approach to finding multiple dependencies that provides accurate measures of the level of dependency for subsets of variables in a data set; each measure is significantly nonzero only if the subset of variables is collectively dependent. This is useful, however, only if we can avoid a combinatorial explosion of calculations as the number of variables increases. The proposed dependence measure for a subset of variables τ, the differential interaction information Δ(τ), has the property that, for subsets of τ, some of the factors of Δ(τ) are significantly nonzero when the full dependence includes more variables. We use this property to suppress the combinatorial explosion by following the "shadows" of multivariable dependency on smaller subsets. Rather than calculating the marginal entropies of all subsets at each degree level, we need to consider only calculations for subsets of variables with appropriate "shadows." The number of calculations for n variables at a degree level of d therefore grows at a much smaller rate than the binomial coefficient C(n, d), but depends on the parameters of the "shadows" calculation. By avoiding a combinatorial explosion, this approach enables the use of our multivariable measures on very large data sets. We demonstrate the method on simulated data sets and characterize the effects of noise and sample number. In addition, we analyze a data set of a few thousand mutant yeast strains interacting with a few thousand chemical compounds.
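
To make the cost argument concrete, the following is a minimal Python sketch. It is not the authors' shadows algorithm, and it does not compute Δ(τ). It only illustrates two ingredients the abstract mentions: marginal (joint) entropies of variable subsets, combined here into the standard interaction information by an alternating sum, and the binomial count of subsets that an exhaustive search would have to visit. The function names, the plug-in entropy estimator, the sign convention, and the choice of n = 1000, d = 3 are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb, log2


def joint_entropy(rows, cols):
    """Plug-in (maximum-likelihood) joint entropy, in bits, of the chosen columns."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    n = len(rows)
    return -sum((k / n) * log2(k / n) for k in counts.values())


def interaction_information(rows, cols):
    """Alternating sum of joint entropies over all nonempty subsets of cols.

    Sign convention (an assumption): odd-sized subsets enter with +, even-sized
    with -, so that for two variables the result is the mutual information.
    """
    total = 0.0
    for size in range(1, len(cols) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(cols, size):
            total += sign * joint_entropy(rows, subset)
    return total


# Toy data: the full truth table of Z = X XOR Y, a purely three-way dependence.
rows = [(x, y, x ^ y) for x in (0, 1) for y in (0, 1)]

# Every pair of variables looks independent when examined on its own ...
pairwise = {pair: interaction_information(rows, pair)
            for pair in combinations(range(3), 2)}

# ... while the triple carries one bit of collective dependence.
triple = interaction_information(rows, (0, 1, 2))

# Number of degree-3 subsets an exhaustive search over 1000 variables would score.
exhaustive_subsets = comb(1000, 3)
```

On this toy table each pairwise value is 0 and the triple value is -1 bit under the sign convention chosen here, while `exhaustive_subsets` is 166,167,000. That count is the growth the abstract describes as a combinatorial explosion, and restricting attention to subsets with appropriate "shadows" is how the authors propose to avoid it.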

Keywords:  discovery; entropy; gene network; interaction information; multivariable dependency

Year:  2015        PMID: 26335709      PMCID: PMC4642827          DOI: 10.1089/cmb.2015.0051

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


