
Learning to forget: continual prediction with LSTM.

F. A. Gers, J. Schmidhuber, F. Cummins.

Abstract

Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review illustrative benchmark problems on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve continual versions of these problems. LSTM with forget gates, however, easily solves them, and in an elegant way.
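The forget-gate idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration of a standard LSTM step in the now-common formulation (the paper's original gating uses the cell's own state rather than this exact packing); the weight layout, function names, and dimensions are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step with a forget gate.

    Assumed (hypothetical) packing: W has shape (4*H, D+H) and b has
    shape (4*H,), with rows ordered [forget, input, candidate, output].
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0:H])       # forget gate: learns when to reset the cell
    i = sigmoid(z[H:2*H])     # input gate
    g = np.tanh(z[2*H:3*H])   # candidate cell update
    o = sigmoid(z[3*H:4*H])   # output gate
    c = f * c_prev + i * g    # f -> 0 discards old state, bounding its growth
    h = o * np.tanh(c)
    return h, c
```

When the forget gate saturates near zero, the old cell state `c_prev` is erased, which is exactly the adaptive reset the paper argues is needed for unsegmented continual input streams.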


Year:  2000        PMID: 11032042     DOI: 10.1162/089976600300015015

Source DB:  PubMed          Journal:  Neural Comput        ISSN: 0899-7667            Impact factor:   2.026


Citing articles: 188 in total

1.  Alternative time representation in dopamine models.

Authors:  François Rivest; John F Kalaska; Yoshua Bengio
Journal:  J Comput Neurosci       Date:  2009-10-22       Impact factor: 1.621

2.  Machine Learning on Sequential Data Using a Recurrent Weighted Average.

Authors:  Jared Ostmeyer; Lindsay Cowell
Journal:  Neurocomputing       Date:  2018-11-29       Impact factor: 5.719

3.  Short-term memory based on activated long-term memory: A review in response to Norris (2017).

Authors:  Nelson Cowan
Journal:  Psychol Bull       Date:  2019-08       Impact factor: 17.737

4.  DeepPASTA: deep neural network based polyadenylation site analysis.

Authors:  Ashraful Arefeen; Xinshu Xiao; Tao Jiang
Journal:  Bioinformatics       Date:  2019-11-01       Impact factor: 6.937

5.  Segmenting and classifying activities in robot-assisted surgery with recurrent neural networks.

Authors:  Robert DiPietro; Narges Ahmidi; Anand Malpani; Madeleine Waldram; Gyusung I Lee; Mija R Lee; S Swaroop Vedula; Gregory D Hager
Journal:  Int J Comput Assist Radiol Surg       Date:  2019-04-29       Impact factor: 2.924

Review 6.  Technical and clinical overview of deep learning in radiology.

Authors:  Daiju Ueda; Akitoshi Shimazaki; Yukio Miki
Journal:  Jpn J Radiol       Date:  2018-12-01       Impact factor: 2.374

Review 7.  A Survey of Data Mining and Deep Learning in Bioinformatics.

Authors:  Kun Lan; Dan-Tong Wang; Simon Fong; Lian-Sheng Liu; Kelvin K L Wong; Nilanjan Dey
Journal:  J Med Syst       Date:  2018-06-28       Impact factor: 4.460

Review 8.  Toward Electrophysiology-Based Intelligent Adaptive Deep Brain Stimulation for Movement Disorders.

Authors:  Andrea A Kühn; R Mark Richardson; Wolf-Julian Neumann; Robert S Turner; Benjamin Blankertz; Tom Mitchell
Journal:  Neurotherapeutics       Date:  2019-01       Impact factor: 7.620

Review 9.  Overview of the First Natural Language Processing Challenge for Extracting Medication, Indication, and Adverse Drug Events from Electronic Health Record Notes (MADE 1.0).

Authors:  Abhyuday Jagannatha; Feifan Liu; Weisong Liu; Hong Yu
Journal:  Drug Saf       Date:  2019-01       Impact factor: 5.606

10.  Improving Prediction of Low-Prior Clinical Events with Simultaneous General Patient-State Representation Learning.

Authors:  Matthew Barren; Milos Hauskrecht
Journal:  Artif Intell Med Conf Artif Intell Med (2005-)       Date:  2021-06-08
