Unreasonable effectiveness of learning neural networks: From accessible states and robust ensembles to basic algorithmic schemes.

Carlo Baldassi^1,2, Christian Borgs^3, Jennifer T Chayes^3, Alessandro Ingrosso^4,2, Carlo Lucibello^4,2, Luca Saglietti^4,2, Riccardo Zecchina^4,2,5.

Abstract

In artificial neural networks, learning from data is a computationally demanding task in which a large number of connection weights are iteratively tuned through stochastic-gradient-based heuristic processes over a cost function. It is not well understood how learning occurs in these systems, in particular how they avoid getting trapped in configurations with poor computational performance. Here, we study the difficult case of networks with discrete weights, where the optimization landscape is very rough even for simple architectures, and provide theoretical and numerical evidence of the existence of rare but extremely dense and accessible regions of configurations in the network weight space. We define a measure, the robust ensemble (RE), which suppresses trapping by isolated configurations and amplifies the role of these dense regions. We analytically compute the RE in some exactly solvable models and also provide a general algorithmic scheme that is straightforward to implement: define a cost function given by a sum of a finite number of replicas of the original cost function, with a constraint centering the replicas around a driving assignment. To illustrate this, we derive several powerful algorithms, ranging from Markov chains to message passing to gradient descent processes, where the algorithms target the robust dense states, resulting in substantial improvements in performance. The weak dependence on the number of precision bits of the weights leads us to conjecture that very similar reasoning applies to more conventional neural networks. Analogous algorithmic schemes can also be applied to other optimization problems.
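The replicated cost function described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy binary perceptron, the Hamming-distance coupling, the coupling strength `gamma`, and all function names are assumptions chosen for illustration. The sketch only shows the general shape of the scheme, a sum of copies of the original cost plus a term that keeps every replica close to a shared center (the "driving assignment").

```python
def perceptron_errors(weights, patterns, labels):
    """Original cost (assumed for illustration): number of misclassified
    patterns for a perceptron with +1/-1 weights."""
    errors = 0
    for x, y in zip(patterns, labels):
        field = sum(w * xi for w, xi in zip(weights, x))
        prediction = 1 if field > 0 else -1
        if prediction != y:
            errors += 1
    return errors


def hamming_distance(a, b):
    """Number of positions where two weight vectors differ."""
    return sum(1 for ai, bi in zip(a, b) if ai != bi)


def replicated_cost(replicas, center, patterns, labels, gamma):
    """Sum of the original cost over all replicas, plus a coupling term
    (strength gamma, a hypothetical parameter name) that penalizes each
    replica's distance from the center configuration."""
    return sum(
        perceptron_errors(w, patterns, labels)
        + gamma * hamming_distance(w, center)
        for w in replicas
    )


if __name__ == "__main__":
    # Tiny hand-made example: 5 binary weights, 3 patterns.
    center = [1, -1, 1, 1, -1]
    patterns = [[1, 1, 1, 1, 1], [1, -1, -1, 1, 1], [-1, -1, 1, -1, 1]]
    labels = [1, 1, -1]
    replicas = [center[:], [-1] + center[1:], center[:]]
    print(replicated_cost(replicas, center, patterns, labels, gamma=0.5))
```

Any local search (for example a Markov chain or gradient-style update) could then be run on `replicated_cost` instead of the original cost; how the center is chosen or updated is left open here.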

Keywords: machine learning; neural networks; optimization; statistical physics

Year: 2016 PMID: 27856745 PMCID: PMC5137727 DOI: 10.1073/pnas.1608103113

Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
