
A generalized LSTM-like training algorithm for second-order recurrent neural networks.

Derek Monner, James A. Reggia

Abstract

The long short-term memory (LSTM) is a second-order recurrent neural network architecture that excels at storing sequential short-term memories and retrieving them many time-steps later. LSTM's original training algorithm provides the important properties of spatial and temporal locality, which are missing from other training approaches, at the cost of limiting its applicability to a small set of network architectures. Here we introduce the generalized long short-term memory (LSTM-g) training algorithm, which provides LSTM-like locality while being applicable without modification to a much wider range of second-order network architectures. With LSTM-g, all units have an identical set of operating instructions for both activation and learning, subject only to the configuration of their local environment in the network; this is in contrast to the original LSTM training algorithm, where each type of unit has its own activation and training instructions. When applied to LSTM architectures with peephole connections, LSTM-g takes advantage of an additional source of back-propagated error which can enable better performance than the original algorithm. Enabled by the broad architectural applicability of LSTM-g, we demonstrate that training recurrent networks engineered for specific tasks can produce better results than single-layer networks. We conclude that LSTM-g has the potential to both improve the performance and broaden the applicability of spatially and temporally local gradient-based training algorithms for recurrent neural networks.
Copyright © 2011 Elsevier Ltd. All rights reserved.
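The abstract refers to "second-order" units, gating, and peephole connections. As a concrete illustration, below is a minimal sketch (Python/NumPy, not the authors' code) of a single LSTM memory cell with a peephole connection; all weight names are invented for this example, and the forget gate introduced by Gers et al. (reference 1 below) is omitted for brevity. The point is the multiplicative, second-order interaction: a gate's activation multiplies another signal, so the network computes products of unit activations rather than weighted sums alone.

    # Minimal sketch (not the paper's implementation) of a second-order
    # gated unit: one LSTM memory cell with a peephole connection.
    # All weight names are illustrative assumptions.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_cell_step(x, state, params):
        """One step of a single LSTM memory cell with a peephole connection.

        x      : scalar input at this time step
        state  : the cell's internal state, carried across time steps
        params : dict of scalar weights (assumed names, for illustration)
        """
        # Gates see the external input and, via peephole weights, the
        # internal state itself.
        in_gate  = sigmoid(params["w_ix"] * x + params["w_ip"] * state)
        out_gate = sigmoid(params["w_ox"] * x + params["w_op"] * state)

        # Second-order interactions: gate activation * another activation.
        candidate = np.tanh(params["w_cx"] * x)
        state = state + in_gate * candidate    # gated write to memory
        output = out_gate * np.tanh(state)     # gated read from memory
        return output, state

    # Toy usage: feed a short sequence through one cell.
    rng = np.random.default_rng(0)
    params = {k: rng.normal() for k in ("w_ix", "w_ip", "w_ox", "w_op", "w_cx")}
    state = 0.0
    for t, x in enumerate([1.0, 0.0, 0.0, 0.5]):
        y, state = lstm_cell_step(x, state, params)
        print(f"t={t}  output={y:+.4f}  state={state:+.4f}")

In the original LSTM algorithm, gates, memory cells, and ordinary units each follow distinct update and learning rules; the contribution described in the abstract is a single rule that covers all of them, determined only by each unit's local connectivity.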


Year:  2011        PMID: 21803542      PMCID: PMC3217173          DOI: 10.1016/j.neunet.2011.07.003

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


References (8 in total)

1.  Learning to forget: continual prediction with LSTM.

Authors:  F A Gers; J Schmidhuber; F Cummins
Journal:  Neural Comput       Date:  2000-10       Impact factor: 2.026

2.  Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets.

Authors:  Juan Antonio Pérez-Ortiz; Felix A Gers; Douglas Eck; Jürgen Schmidhuber
Journal:  Neural Netw       Date:  2003-03

3.  Learning, invariance, and generalization in high-order neural networks.

Authors:  C L Giles; T Maxwell
Journal:  Appl Opt       Date:  1987-12-01       Impact factor: 1.980

4.  [Review] Simulating single word processing in the classic aphasia syndromes based on the Wernicke-Lichtheim-Geschwind theory.

Authors:  Scott A Weems; James A Reggia
Journal:  Brain Lang       Date:  2006-07-07       Impact factor: 2.381

5.  Training recurrent networks by Evolino.

Authors:  Jürgen Schmidhuber; Daan Wierstra; Matteo Gagliolo; Faustino Gomez
Journal:  Neural Comput       Date:  2007-03       Impact factor: 2.026

6.  LSTM recurrent networks learn simple context-free and context-sensitive languages.

Authors:  F A Gers; J Schmidhuber
Journal:  IEEE Trans Neural Netw       Date:  2001

7.  Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks.

Authors:  G V Puskorius; L A Feldkamp
Journal:  IEEE Trans Neural Netw       Date:  1994

8.  Long short-term memory.

Authors:  S Hochreiter; J Schmidhuber
Journal:  Neural Comput       Date:  1997-11-15       Impact factor: 2.026

Cited by (2 in total)

1.  Using LSTM and PSO techniques for predicting moisture content of poplar fibers by Impulse-cyclone Drying.

Authors:  Feng Chen; Xun Gao; Xinghua Xia; Jing Xu
Journal:  PLoS One       Date:  2022-04-11       Impact factor: 3.240

2.  Device-Free Human Activity Recognition with Low-Resolution Infrared Array Sensor Using Long Short-Term Memory Neural Network.

Authors:  Cunyi Yin; Jing Chen; Xiren Miao; Hao Jiang; Deying Chen
Journal:  Sensors (Basel)       Date:  2021-05-20       Impact factor: 3.576

