
Flat minima.

S Hochreiter, J Schmidhuber.

Abstract

We present a new algorithm for finding low-complexity neural networks with high generalization capability. The algorithm searches for a "flat" minimum of the error function. A flat minimum is a large connected region in weight space where the error remains approximately constant. An MDL-based, Bayesian argument suggests that flat minima correspond to "simple" networks and low expected overfitting. The argument is based on a Gibbs algorithm variant and a novel way of splitting generalization error into underfitting and overfitting error. Unlike many previous approaches, ours does not require Gaussian assumptions and does not depend on a "good" weight prior. Instead we have a prior over input-output functions, thus taking into account net architecture and training set. Although our algorithm requires the computation of second-order derivatives, it has backpropagation's order of complexity. Automatically, it effectively prunes units, weights, and input lines. Various experiments with feedforward and recurrent nets are described. In an application to stock market prediction, flat minimum search outperforms conventional backprop, weight decay, and "optimal brain surgeon/optimal brain damage".
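The sketch below is a minimal, hypothetical illustration of the general idea of preferring flat minima during training, not the paper's flat-minimum-search objective: it adds a squared-gradient-norm penalty (a crude flatness proxy) to a standard loss and trains with double backpropagation in PyTorch. The model, data, and the name `flatness_weight` are illustrative assumptions.

```python
# Hypothetical sketch of a flatness-style penalty (NOT the authors' exact
# flat-minimum-search algorithm): penalize the squared gradient norm of the
# training loss so that regions where the error changes slowly are preferred.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 16), nn.Tanh(), nn.Linear(16, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
flatness_weight = 1e-3  # illustrative strength of the flatness penalty

x = torch.randn(64, 10)   # toy data, stand-in for a real training set
y = torch.randn(64, 1)

for step in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    # First-order gradients with create_graph=True so we can differentiate
    # through them again (second-order information, backprop-like cost).
    grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
    # Near a flat minimum the loss surface changes slowly, so this term is small.
    flatness_penalty = sum(g.pow(2).sum() for g in grads)
    total = loss + flatness_weight * flatness_penalty
    total.backward()
    optimizer.step()
```

The penalty term is only a rough stand-in for the box-based flatness measure described in the paper; it is meant to show where such a regularizer enters a training loop, not to reproduce the published results.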

MeSH:

Year: 1997        PMID: 9117894        DOI: 10.1162/neco.1997.9.1.1

Source DB: PubMed        Journal: Neural Comput        ISSN: 0899-7667        Impact factor: 2.026


Related articles: 20 in total

1.  Initialization and self-organized optimization of recurrent neural network connectivity.

Authors:  Joschka Boedecker; Oliver Obst; N Michael Mayer; Minoru Asada
Journal:  HFSP J       Date:  2009-10-26

2.  Provenance of correlations in psychological data.

Authors:  Thomas L Thornton; David L Gilden
Journal:  Psychon Bull Rev       Date:  2005-06

3.  [Review] REBUS and the Anarchic Brain: Toward a Unified Model of the Brain Action of Psychedelics.

Authors:  R L Carhart-Harris; K J Friston
Journal:  Pharmacol Rev       Date:  2019-07       Impact factor: 25.468

4.  The inverse variance-flatness relation in stochastic gradient descent is critical for finding flat minima.

Authors:  Yu Feng; Yuhai Tu
Journal:  Proc Natl Acad Sci U S A       Date:  2021-03-02       Impact factor: 11.205

5.  Archetypal landscapes for deep neural networks.

Authors:  Philipp C Verpoort; Alpha A Lee; David J Wales
Journal:  Proc Natl Acad Sci U S A       Date:  2020-08-25       Impact factor: 11.205

6.  Global Model Analysis of Cognitive Variability.

Authors:  David L Gilden
Journal:  Cogn Sci       Date:  2009-08-10

7.  The Iterated Classification Game: A New Model of the Cultural Transmission of Language.

Authors:  Samarth Swarup; Les Gasser
Journal:  Adapt Behav       Date:  2009       Impact factor: 1.942

8.  Discrimination of smoking status by MRI based on deep learning method.

Authors:  Shuangkun Wang; Rongguo Zhang; Yufeng Deng; Kuan Chen; Dan Xiao; Peng Peng; Tao Jiang
Journal:  Quant Imaging Med Surg       Date:  2018-12

9.  PAC Bayesian Performance Guarantees for Deep (Stochastic) Networks in Medical Imaging.

Authors:  Anthony Sicilia; Xingchen Zhao; Anastasia Sosnovskikh; Seong Jae Hwang
Journal:  Med Image Comput Comput Assist Interv       Date:  2021-09-21

10.  Degeneracy and Redundancy in Active Inference.

Authors:  Noor Sajid; Thomas Parr; Thomas M Hope; Cathy J Price; Karl J Friston
Journal:  Cereb Cortex       Date:  2020-10-01       Impact factor: 5.357

