
Non-convex Learning via Replica Exchange Stochastic Gradient MCMC.

Wei Deng, Qi Feng, Liyao Gao, Faming Liang, Guang Lin.

Abstract

Replica exchange Monte Carlo (reMC), also known as parallel tempering, is an important technique for accelerating the convergence of conventional Markov chain Monte Carlo (MCMC) algorithms. However, the method requires evaluating the energy function on the full dataset and therefore does not scale to big data. A naïve implementation of reMC in mini-batch settings introduces large biases, so it cannot be directly extended to stochastic gradient MCMC (SGMCMC), the standard sampling method for simulating from deep neural networks (DNNs). In this paper, we propose an adaptive replica exchange SGMCMC (reSGMCMC) that automatically corrects the bias, and we study its properties. The analysis implies an acceleration-accuracy trade-off in the numerical discretization of a Markov jump process in a stochastic environment. Empirically, we test the algorithm through extensive experiments on various setups and obtain state-of-the-art results on CIFAR10, CIFAR100, and SVHN in both supervised and semi-supervised learning tasks.
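To make the replica exchange idea in the abstract concrete, the following is a minimal sketch of classical two-replica parallel tempering with Langevin updates on a toy problem. It is not the reSGMCMC algorithm from the paper: it uses exact energies and gradients, so it needs no bias correction for mini-batch noise. The double-well energy, the temperatures, the step size, and all function names are assumptions chosen for illustration.

```python
import math
import random


def energy(x):
    """Toy double-well energy U(x) = (x^2 - 1)^2 (an assumed example target)."""
    return (x * x - 1.0) ** 2


def grad_energy(x):
    """Exact derivative U'(x) = 4 x (x^2 - 1)."""
    return 4.0 * x * (x * x - 1.0)


def langevin_step(x, step, temperature, rng):
    """One Euler step of Langevin dynamics at the given temperature."""
    noise = rng.gauss(0.0, 1.0)
    return x - step * grad_energy(x) + math.sqrt(2.0 * step * temperature) * noise


def swap_probability(x_low, x_high, tau_low, tau_high):
    """Classical swap acceptance probability computed from exact energies."""
    log_ratio = (1.0 / tau_low - 1.0 / tau_high) * (energy(x_low) - energy(x_high))
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def replica_exchange(n_steps=20000, step=0.01, tau_low=0.05, tau_high=1.0, seed=0):
    """Run a low- and a high-temperature chain, attempting a swap after every step."""
    rng = random.Random(seed)
    x_low, x_high = -1.0, -1.0  # both chains start in the left well
    samples, n_swaps = [], 0
    for _ in range(n_steps):
        x_low = langevin_step(x_low, step, tau_low, rng)
        x_high = langevin_step(x_high, step, tau_high, rng)
        if rng.random() < swap_probability(x_low, x_high, tau_low, tau_high):
            x_low, x_high = x_high, x_low  # exchange the two chain positions
            n_swaps += 1
        samples.append(x_low)  # keep only the low-temperature chain
    return samples, n_swaps


if __name__ == "__main__":
    samples, n_swaps = replica_exchange()
    right = sum(1 for s in samples if s > 0.0) / len(samples)
    print(f"swaps accepted: {n_swaps}, fraction of samples in right well: {right:.3f}")
```

The high-temperature chain explores across the energy barrier, and accepted swaps hand those positions to the low-temperature chain, which is the acceleration mechanism the abstract refers to. When the energies are replaced by noisy mini-batch estimates, this swap rule becomes biased, which is the problem the paper addresses.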


Year:  2020        PMID: 34557675      PMCID: PMC8457534     

Source DB:  PubMed          Journal:  Proc Mach Learn Res


  5 in total

1.  Replica Monte Carlo simulation of spin glasses.

Authors: 
Journal:  Phys Rev Lett       Date:  1986-11-24       Impact factor: 9.161

2.  Optimization by simulated annealing.

Authors:  S Kirkpatrick; C D Gelatt; M P Vecchi
Journal:  Science       Date:  1983-05-13       Impact factor: 47.728

3.  Parallel tempering: theory, applications, and new perspectives. (Review)

Authors:  David J Earl; Michael W Deem
Journal:  Phys Chem Chem Phys       Date:  2005-12-07       Impact factor: 3.676

4.  Dynamic weighting in Monte Carlo and optimization.

Authors:  W H Wong; F Liang
Journal:  Proc Natl Acad Sci U S A       Date:  1997-12-23       Impact factor: 11.205

  1 in total

1.  A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions.

Authors:  Wei Deng; Guang Lin; Faming Liang
Journal:  Adv Neural Inf Process Syst       Date:  2020-12
