Large pseudocounts and L2-norm penalties are necessary for the mean-field inference of Ising and Potts models.

J P Barton, S Cocco, E De Leonardis, R Monasson.

Abstract

The mean-field (MF) approximation offers a simple, fast way to infer direct interactions between elements in a network of correlated variables, a common, computationally challenging problem with practical applications in fields ranging from physics and biology to the social sciences. However, MF methods achieve their best performance with strong regularization, well beyond Bayesian expectations, an empirical fact that is poorly understood. In this work, we study the influence of pseudocount and L2-norm regularization schemes on the quality of inferred Ising or Potts interaction networks from correlation data within the MF approximation. We argue, based on the analysis of small systems, that the optimal value of the regularization strength remains finite even if the sampling noise tends to zero, in order to correct for systematic biases introduced by the MF approximation. Our claim is corroborated by extensive numerical studies of diverse model systems and by the analytical study of the m-component spin model for large but finite m. Additionally, we find that pseudocount regularization is robust against sampling noise and often outperforms L2-norm regularization, particularly when the underlying network of interactions is strongly heterogeneous. Much better performances are generally obtained for the Ising model than for the Potts model, for which only couplings incoming onto medium-frequency symbols are reliably inferred.
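As an illustration of the kind of procedure the abstract refers to, the following is a minimal sketch of mean-field inference for an Ising model with a pseudocount, written in Python with NumPy. It is not taken from the paper. The function name `mean_field_couplings`, the parameter `alpha`, and the specific convention used (mixing the empirical one- and two-spin statistics with those of a uniform distribution, with weight `alpha`) are assumptions made for this example; the paper's exact definitions may differ. The couplings are estimated with the standard mean-field formula, J = -C^{-1} off the diagonal, where C is the connected correlation matrix.

```python
import numpy as np

def mean_field_couplings(samples, alpha=0.0):
    """Mean-field estimate of Ising couplings from +/-1 spin samples.

    samples: array of shape (B, N), entries +1 or -1.
    alpha:   pseudocount weight in [0, 1); an assumed convention that mixes
             the empirical statistics with those of a uniform distribution.
    """
    s = np.asarray(samples, dtype=float)
    m = s.mean(axis=0)              # magnetizations <s_i>
    chi = s.T @ s / s.shape[0]      # pair averages <s_i s_j>

    # Uniform distribution has <s_i> = 0 and <s_i s_j> = 0 for i != j,
    # so mixing with weight alpha simply shrinks both quantities.
    m_reg = (1.0 - alpha) * m
    chi_reg = (1.0 - alpha) * chi
    np.fill_diagonal(chi_reg, 1.0)  # <s_i s_i> = 1 for +/-1 spins

    c = chi_reg - np.outer(m_reg, m_reg)  # connected correlations
    J = -np.linalg.inv(c)                 # mean-field couplings
    np.fill_diagonal(J, 0.0)              # self-couplings are not defined
    return J
```

With `alpha = 0` the correlation matrix is singular whenever there are fewer samples than spins, so the inversion is ill-defined; under this convention any `alpha > 0` adds a multiple of the identity to the matrix and makes it invertible. The sketch does not implement the L2-norm penalty discussed in the abstract.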

Year: 2014   PMID: 25122276   DOI: 10.1103/PhysRevE.90.012132

Source DB: PubMed   Journal: Phys Rev E Stat Nonlin Soft Matter Phys   ISSN: 1539-3755


  6 in total

1.  Influence of multiple-sequence-alignment depth on Potts statistical models of protein covariation.

Authors:  Allan Haldane; Ronald M Levy
Journal:  Phys Rev E       Date:  2019-03       Impact factor: 2.529

2.  Maximum Entropy Framework for Predictive Inference of Cell Population Heterogeneity and Responses in Signaling Networks.

Authors:  Purushottam D Dixit; Eugenia Lyashenko; Mario Niepel; Dennis Vitkup
Journal:  Cell Syst       Date:  2019-12-18       Impact factor: 10.304

3.  Mi3-GPU: MCMC-based Inverse Ising Inference on GPUs for protein covariation analysis.

Authors:  Allan Haldane; Ronald M Levy
Journal:  Comput Phys Commun       Date:  2020-04-17       Impact factor: 4.390

4.  GEMME: a simple and fast global epistatic model predicting mutational effects.

Authors:  Elodie Laine; Yasaman Karami; Alessandra Carbone
Journal:  Mol Biol Evol       Date:  2019-08-12       Impact factor: 16.240

5.  Coevolutionary Landscape Inference and the Context-Dependence of Mutations in Beta-Lactamase TEM-1.

Authors:  Matteo Figliuzzi; Hervé Jacquier; Alexander Schug; Oliver Tenaillon; Martin Weigt
Journal:  Mol Biol Evol       Date:  2015-10-06       Impact factor: 16.240

6.  Direct coupling analysis of epistasis in allosteric materials.

Authors:  Barbara Bravi; Riccardo Ravasio; Carolina Brito; Matthieu Wyart
Journal:  PLoS Comput Biol       Date:  2020-03-02       Impact factor: 4.475
