Sjoerd Viktor Beentjes, Ava Khamseh.
Abstract
The problem of inferring pairwise and higher-order interactions from observational data, in complex systems involving large numbers of interacting variables, is fundamental to many fields. Known to the statistical physics community as the inverse problem, it has become accessible in recent years owing to the generation of real and simulated big data. Current approaches to the inverse problem rely on parametric assumptions, physical approximations (e.g., mean-field theory), and ignoring higher-order interactions, which may lead to biased or incorrect estimates. We bypass these shortcomings using a cross-disciplinary approach and demonstrate that none of these assumptions and approximations are necessary: we introduce a universal, model-independent, and fundamentally unbiased estimator of all-order symmetric interactions, via the nonparametric framework of targeted learning, a subfield of mathematical statistics. Due to its universality, our definition is readily applicable to any system at equilibrium with binary and categorical variables, be it magnetic spins, nodes in a neural network, or protein networks in biology. Our approach is targeted: it does not require fitting unnecessary parameters. Instead, it expends all data on estimating interactions, hence substantially increasing accuracy. We demonstrate the generality of our technique both analytically and numerically on (i) the two-dimensional Ising model, (ii) an Ising-type model with four-point interactions, (iii) the restricted Boltzmann machine, and (iv) simulated individual-level human DNA variants and representative traits. The latter demonstrates the applicability of this approach to discovering epistatic interactions that are causal for disease in population biomedicine.
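As a minimal illustration of what a model-independent pairwise interaction can look like for binary variables, the sketch below uses a standard identity rather than the estimator described in the abstract: for a {0,1}-valued Ising-type model with Boltzmann weight exp(sum_i h_i x_i + sum_{i<j} J_ij x_i x_j), the log odds ratio of two variables, with all remaining variables held at 0, equals the coupling J_ij. The sign convention, the helper names `boltzmann` and `pairwise_interaction`, and the numerical values of `h` and `J` are assumptions made for this example. The distribution is enumerated exactly, so no sampling or targeted-learning step is involved.

```python
import itertools
import math


def boltzmann(h, J):
    """Exact distribution of a tiny {0,1}-valued Ising-type model.

    Assumed convention: p(x) is proportional to
    exp(sum_i h[i]*x[i] + sum_{i<j} J[i][j]*x[i]*x[j]).
    """
    n = len(h)
    weights = {}
    for x in itertools.product((0, 1), repeat=n):
        exponent = sum(h[i] * x[i] for i in range(n))
        exponent += sum(
            J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n)
        )
        weights[x] = math.exp(exponent)
    z = sum(weights.values())  # partition function
    return {x: w / z for x, w in weights.items()}


def pairwise_interaction(p, i, j, n):
    """Log odds ratio of x_i and x_j with every other variable held at 0."""

    def prob(a, b):
        x = [0] * n
        x[i], x[j] = a, b
        return p[tuple(x)]

    return math.log(prob(1, 1) * prob(0, 0) / (prob(1, 0) * prob(0, 1)))


# Hypothetical three-variable system; only the upper triangle of J is used.
h = [0.2, -0.1, 0.3]
J = [
    [0.0, 0.7, -0.4],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0],
]
p = boltzmann(h, J)

for i, j in [(0, 1), (0, 2), (1, 2)]:
    print(i, j, round(pairwise_interaction(p, i, j, 3), 6), J[i][j])
```

The log odds ratio is computed from probabilities alone, so it never refers to `h` or `J`; the fields and the partition function cancel in the ratio. With finite observational data the probabilities would have to be estimated, which is the step the abstract addresses.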
Year: 2020 PMID: 33327095 DOI: 10.1103/PhysRevE.102.053314
Source DB: PubMed Journal: Phys Rev E ISSN: 2470-0045 Impact factor: 2.529