
Scalable estimation strategies based on stochastic approximations: Classical results and new insights.

Edoardo M Airoldi, Panos Toulis.

Abstract

Estimation with large amounts of data can be facilitated by stochastic gradient methods, in which model parameters are updated sequentially using small batches of data at each step. Here, we review early work and modern results that illustrate the statistical properties of these methods, including convergence rates, stability, and asymptotic bias and variance. We then overview modern applications where these methods are useful, ranging from an online version of the EM algorithm to deep learning. In light of these results, we argue that stochastic gradient methods are poised to become benchmark principled estimation procedures for large data sets, especially those in the family of stable proximal methods, such as implicit stochastic gradient descent.
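The contrast the abstract draws between standard and implicit stochastic gradient descent can be made concrete with a toy example: estimating a normal mean. In the explicit update, the gradient of the log-likelihood is evaluated at the current iterate; in the implicit update, it is evaluated at the next iterate, which for this Gaussian model admits a closed-form solution. This is a minimal sketch under assumed settings (learning rate a_n = 1/n, unit variance, synthetic data), not the paper's own code.

```python
import numpy as np

# Synthetic data: 10,000 draws from N(theta_true, 1)
rng = np.random.default_rng(0)
theta_true = 2.0
y = rng.normal(theta_true, 1.0, size=10_000)

theta_sgd, theta_isgd = 0.0, 0.0
for n, yn in enumerate(y, start=1):
    a_n = 1.0 / n  # decaying learning rate

    # Explicit SGD: theta_{n} = theta_{n-1} + a_n * grad(theta_{n-1}),
    # where grad(theta) = y_n - theta for the Gaussian log-likelihood.
    theta_sgd += a_n * (yn - theta_sgd)

    # Implicit SGD: theta_{n} = theta_{n-1} + a_n * grad(theta_{n}).
    # Solving theta_n = theta_{n-1} + a_n * (y_n - theta_n) gives:
    theta_isgd = (theta_isgd + a_n * yn) / (1.0 + a_n)

print(f"explicit SGD: {theta_sgd:.3f}, implicit SGD: {theta_isgd:.3f}")
```

The implicit update shrinks each step by a factor 1/(1 + a_n), which is the source of the stability the abstract attributes to stable proximal methods: a too-large learning rate cannot cause the iterate to overshoot.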


Keywords:  asymptotic analysis; big data; efficient estimation; exponential family; implicit stochastic gradient descent; maximum likelihood; optimal learning rate; recursive estimation; stochastic gradient descent methods

Year:  2015        PMID: 26139959      PMCID: PMC4484776          DOI: 10.1007/s11222-015-9560-y

Source DB:  PubMed          Journal:  Stat Comput        ISSN: 0960-3174            Impact factor:   2.559


References:  5 in total

1.  On-line EM algorithm for the normalized Gaussian network.

Authors:  M Sato; S Ishii
Journal:  Neural Comput       Date:  2000-02       Impact factor: 2.026

2.  Adaptive method of realizing natural gradient learning for multilayer perceptrons.

Authors:  S Amari; H Park; K Fukumizu
Journal:  Neural Comput       Date:  2000-06       Impact factor: 2.026

3.  Training products of experts by minimizing contrastive divergence.

Authors:  Geoffrey E Hinton
Journal:  Neural Comput       Date:  2002-08       Impact factor: 2.026

4.  Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images.

Authors:  S Geman; D Geman
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  1984-06       Impact factor: 6.226

5.  Justifying and generalizing contrastive divergence.

Authors:  Yoshua Bengio; Olivier Delalleau
Journal:  Neural Comput       Date:  2009-06       Impact factor: 2.026

Cited by:  1 in total

1.  Distributed Simultaneous Inference in Generalized Linear Models via Confidence Distribution.

Authors:  Lu Tang; Ling Zhou; Peter X-K Song
Journal:  J Multivar Anal       Date:  2019-11-28       Impact factor: 1.473

