Literature DB >> 32599541

Noise can speed backpropagation learning and deep bidirectional pretraining.

Bart Kosko1, Kartik Audhkhasi2, Osonde Osoba3.   

Abstract

We show that the backpropagation algorithm is a special case of the generalized Expectation-Maximization (EM) algorithm for iterative maximum likelihood estimation. We then apply the recent result that carefully chosen noise can speed the average convergence of the EM algorithm as it climbs a hill of probability or log-likelihood. Then injecting such noise can speed the average convergence of the backpropagation algorithm for both the training and pretraining of multilayer neural networks. The beneficial noise adds to the hidden and visible neurons and related parameters. The noise also applies to regularized regression networks. This beneficial noise is just that noise that makes the current signal more probable. We show that such noise also tends to improve classification accuracy. The geometry of the noise-benefit region depends on the probability structure of the neurons in a given layer. The noise-benefit region in noise space lies above the noisy-EM (NEM) hyperplane for classification and involves a hypersphere for regression. Simulations demonstrate these noise benefits using MNIST digit classification. The NEM noise benefits substantially exceed those of simply adding blind noise to the neural network. We further prove that the noise speed-up applies to the deep bidirectional pretraining of neural-network bidirectional associative memories (BAMs) or their functionally equivalent restricted Boltzmann machines. We then show that learning with basic contrastive divergence also reduces to generalized EM for an energy-based network probability. The optimal noise adds to the input visible neurons of a BAM in stacked layers of trained BAMs. Global stability of generalized BAMs guarantees rapid convergence in pretraining where neural signals feed back between contiguous layers. Bipolar coding of inputs further improves pretraining performance.
Copyright © 2020 Elsevier Ltd. All rights reserved.

Keywords:  Backpropagation; Bidirectional associative memory; Contrastive divergence; Expectation–Maximization algorithm; Noise benefit; Stochastic resonance

Mesh:

Year:  2020        PMID: 32599541     DOI: 10.1016/j.neunet.2020.04.004

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


  1 in total

1.  Noise Eliminated Ensemble Empirical Mode Decomposition Scalogram Analysis for Rotating Machinery Fault Diagnosis.

Authors:  Atik Faysal; Wai Keng Ngui; Meng Hee Lim; Mohd Salman Leong
Journal:  Sensors (Basel)       Date:  2021-12-04       Impact factor: 3.576

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.