The general inefficiency of batch training for gradient descent learning.

Abstract

Gradient descent training of neural networks can be done in either a batch or an on-line manner. A widely held myth in the neural network community is that batch training is as fast as or faster than on-line training, and/or more 'correct', because it supposedly uses a better approximation of the true gradient for its weight updates. This paper explains why batch training is almost always slower than on-line training (often orders of magnitude slower), especially on large training sets. The main reason is the ability of on-line training to follow curves in the error surface throughout each epoch, which allows it to safely use a larger learning rate and thus converge with fewer iterations through the training data. Empirical results on a large (20,000-instance) speech recognition task and on 26 other learning tasks demonstrate that convergence can be reached significantly faster using on-line training than batch training, with no apparent difference in accuracy.
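
The difference between the two update schedules can be illustrated with a minimal sketch. It is not the paper's networks or experiments: it assumes a toy one-dimensional linear model, noise-free synthetic data, and a batch update that sums per-instance gradients over the epoch. The function names, data, and learning rates are all hypothetical choices made for illustration.

```python
import random

def make_data(n, seed=0):
    # Hypothetical noise-free target: y = 2x + 0.5
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 0.5) for x in xs]

def mse(data, w, b):
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train_online(data, lr, epochs):
    # On-line: update the weights after every training instance.
    w = b = 0.0
    for _ in range(epochs):
        for x, y in data:
            err = w * x + b - y
            w -= lr * err * x
            b -= lr * err
    return w, b

def train_batch(data, lr, epochs):
    # Batch: accumulate the gradient over the whole epoch, then update once.
    w = b = 0.0
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in data:
            err = w * x + b - y
            gw += err * x
            gb += err
        w -= lr * gw
        b -= lr * gb
    return w, b

data = make_data(200)
lr, epochs = 0.05, 20

online_mse = mse(data, *train_online(data, lr, epochs))
batch_same_lr_mse = mse(data, *train_batch(data, lr, epochs))
batch_small_lr_mse = mse(data, *train_batch(data, lr / len(data), epochs))

print("on-line, lr=0.05:        ", online_mse)
print("batch,   lr=0.05:        ", batch_same_lr_mse)
print("batch,   lr=0.05 / 200:  ", batch_small_lr_mse)
```

In this toy setting the summed batch gradient grows with the training set size, so the batch run with the same learning rate overshoots and diverges, while the batch run with a rate scaled down by the set size stays stable but is still far from converged after the same number of epochs. This only illustrates the learning-rate constraint mentioned in the abstract; it says nothing about the magnitude of the effect on real tasks.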

Year: 2003 PMID: 14622875 DOI: 10.1016/S0893-6080(03)00138-2

Source DB: PubMed Journal: Neural Netw ISSN: 0893-6080

Cited

13 in total

1. Improving the prediction accuracy of residue solvent accessibility and real-value backbone torsion angles of proteins by guided-learning through a two-layer neural network.

Authors: Eshel Faraggi; Bin Xue; Yaoqi Zhou
Journal: Proteins Date: 2009-03

2. Comparative assessment of glucose prediction models for patients with type 1 diabetes mellitus applying sensors for glucose and physical activity monitoring.

Authors: K Zarkogianni; K Mitsis; E Litsa; M-T Arredondo; G Fico; A Fioravanti; K S Nikita
Journal: Med Biol Eng Comput Date: 2015-06-07 Impact factor: 2.602

3. Optimizing artificial neural network models for metabolomics and systems biology: an example using HPLC retention index data.

Authors: L Mark Hall; Dennis W Hill; Lochana C Menikarachchi; Ming-Hui Chen; Lowell H Hall; David F Grant
Journal: Bioanalysis Date: 2015 Impact factor: 2.681

4. Improving quantitative structure-activity relationship models using Artificial Neural Networks trained with dropout.

Authors: Jeffrey Mendenhall; Jens Meiler
Journal: J Comput Aided Mol Des Date: 2016-02-01 Impact factor: 3.686

5. Deep Unsupervised Learning on a Desktop PC: A Primer for Cognitive Scientists.

Authors: Alberto Testolin; Ivilin Stoianov; Michele De Filippo De Grazia; Marco Zorzi
Journal: Front Psychol Date: 2013-05-06

6. Operant conditioning: a minimal components requirement in artificial spiking neurons designed for bio-inspired robot's controller.

Authors: André Cyr; Mounir Boukadoum; Frédéric Thériault
Journal: Front Neurorobot Date: 2014-07-25 Impact factor: 2.650

7. Dual Temporal Scale Convolutional Neural Network for Micro-Expression Recognition.

Authors: Min Peng; Chongyang Wang; Tong Chen; Guangyuan Liu; Xiaolan Fu
Journal: Front Psychol Date: 2017-10-13

8. Diagnosis of Malignancy in Thyroid Tumors by Multi-Layer Perceptron Neural Networks With Different Batch Learning Algorithms.

Authors: Saeedeh Pourahmad; Mohsen Azad; Shahram Paydar
Journal: Glob J Health Sci Date: 2015-03-30

9. Convergence of batch gradient learning with smoothing regularization and adaptive momentum for neural networks.

Authors: Qinwei Fan; Wei Wu; Jacek M Zurada
Journal: Springerplus Date: 2016-03-08

10. Chromatin accessibility prediction via convolutional long short-term memory networks with k-mer embedding.

Authors: Xu Min; Wanwen Zeng; Ning Chen; Ting Chen; Rui Jiang
Journal: Bioinformatics Date: 2017-07-15 Impact factor: 6.937