
Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting.

Martin Wöllmer, Erik Marchi, Stefano Squartini, Björn Schuller.   

Abstract

Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database, a corpus containing emotionally colored conversations with a cognitive system for "Sensitive Artificial Listening".
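
The idea behind histogram equalization of feature vector components can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `histogram_equalize`, the rank-based estimate of the cumulative distribution, and the choice of a standard normal reference distribution are all assumptions made for illustration. Each feature dimension is mapped through its empirical CDF onto the reference distribution, so any monotonic distortion of that dimension (for example a noise-induced shift and scaling) is removed.

```python
import numpy as np
from statistics import NormalDist


def histogram_equalize(features):
    """Map each column of `features` (frames x dims) onto a standard normal.

    Hypothetical helper: uses a rank-based empirical CDF per dimension and
    a Gaussian reference distribution (both are illustrative assumptions).
    """
    features = np.asarray(features, dtype=float)
    n_frames, n_dims = features.shape
    inv_cdf = np.vectorize(NormalDist().inv_cdf)  # inverse reference CDF
    equalized = np.empty_like(features)
    for d in range(n_dims):
        # Rank of every frame within this dimension (0 .. n_frames - 1).
        order = np.argsort(features[:, d], kind="stable")
        ranks = np.argsort(order, kind="stable")
        # Empirical CDF values, kept strictly inside (0, 1).
        cdf = (ranks + 0.5) / n_frames
        equalized[:, d] = inv_cdf(cdf)
    return equalized


# Toy data: "noisy" features are a shifted and scaled copy of the clean ones.
rng = np.random.default_rng(0)
clean = rng.normal(size=(500, 3))
noisy = 0.5 * clean + 3.0

clean_eq = histogram_equalize(clean)
noisy_eq = histogram_equalize(noisy)
```

In this toy case the equalized clean and noisy features coincide, because a rank-based mapping is invariant to monotonically increasing transformations. Real noise is not a purely monotonic distortion, so in practice the mismatch is reduced rather than eliminated.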

Keywords:  Cognitive agents; Histogram equalization; Keyword spotting; Long short-term memory; Neural networks

Year:  2011        PMID: 22942915      PMCID: PMC3179540          DOI: 10.1007/s11571-011-9166-9

Source DB:  PubMed          Journal:  Cogn Neurodyn        ISSN: 1871-4080            Impact factor:   5.082


  6 in total

1.  Learning to forget: continual prediction with LSTM.

Authors:  F A Gers; J Schmidhuber; F Cummins
Journal:  Neural Comput       Date:  2000-10       Impact factor: 2.026

2.  Framewise phoneme classification with bidirectional LSTM and other neural network architectures.

Authors:  Alex Graves; Jürgen Schmidhuber
Journal:  Neural Netw       Date:  2005 Jun-Jul

3.  Learning long-term dependencies with gradient descent is difficult.

Authors:  Y Bengio; P Simard; P Frasconi
Journal:  IEEE Trans Neural Netw       Date:  1994

4.  Learning long-term dependencies in NARX recurrent neural networks.

Authors:  T Lin; B G Horne; P Tino; C L Giles
Journal:  IEEE Trans Neural Netw       Date:  1996

5.  Long short-term memory.

Authors:  S Hochreiter; J Schmidhuber
Journal:  Neural Comput       Date:  1997-11-15       Impact factor: 2.026

6.  On the reciprocal interaction between believing and feeling: an adaptive agent modelling perspective.

Authors:  Zulfiqar A Memon; Jan Treur
Journal:  Cogn Neurodyn       Date:  2010-10-06       Impact factor: 5.082

  1 in total

1.  Noise effects on robust synchronization of a small pacemaker neuronal ensemble via nonlinear controller: electronic circuit design.

Authors:  Elie Bertrand Megam Ngouonkadi; Hilaire Bertrand Fotsin; Martial Kabong Nono; Patrick Herve Louodop Fotso
Journal:  Cogn Neurodyn       Date:  2016-06-11       Impact factor: 5.082

