Literature DB >> 30233453

Commentary on "Interaction in Spoken Word Recognition Models".

Dennis Norris, James M. McQueen, Anne Cutler.

Abstract


Keywords:  cognitive science; computer simulation; feedback; speech perception; word recognition

Year:  2018        PMID: 30233453      PMCID: PMC6129619          DOI: 10.3389/fpsyg.2018.01568

Source DB:  PubMed          Journal:  Front Psychol        ISSN: 1664-1078


Magnuson et al. (2018: MMLSH), responding to Norris et al. (2016: NMC16), postulate that feedback of activation from words to pre-lexical representations is helpful in spoken-word recognition. Their argument (1) is flawed by being bound to a particular class of model, (2) misses the central point about parsimony in recognition models, and (3) ignores crucial data.

MMLSH describe simulations with the interactive-activation model TRACE (McClelland and Elman, 1986). Activation feedback is a key feature of TRACE: activation feeds back from word-form representations to influence the activation of pre-lexical phoneme representations. The simulations show that (for most though not all words) feedback improves word recognition when noise is added to the input. As we will argue, however, this demonstration has no bearing on the larger theoretical question of whether activation feedback is necessary, or even helpful, in speech recognition (Norris et al., 2000: NMC00; NMC16). The MMLSH simulations do not show that activation feedback necessarily improves word recognition because showing that it helps TRACE does not entail that it will help other models.

If the frequency of all words is assumed to be the same, then the best that any speech recognition system can do is compute the match between input features and lexical representations and select the best-matching word (more specifically, pick the word with the maximum likelihood). Since words differ in frequency, however, priors are available. The task is then to compute the posterior probability of the words as the product of the likelihood and prior (i.e., use Bayes' rule). This is how Shortlist B (Norris and McQueen, 2008: NM08) works. Shortlist B is feedforward and, by virtue of implementing Bayesian inference, performs optimally; its use of Bayes' rule guarantees that the best-matching word must be recognized. Why then can TRACE benefit from feedback?
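The feedforward Bayesian decision rule described here can be sketched in a few lines. This is a minimal illustration of the principle, not the Shortlist B implementation; the candidate words, likelihood values, and priors below are invented for the example.

```python
def recognize(likelihoods, priors):
    """Pick the word with the highest posterior P(word | input).

    likelihoods: dict mapping word -> P(input | word)
    priors:      dict mapping word -> P(word), e.g. from frequency counts
    """
    # Bayes' rule up to the constant P(input): posterior ∝ likelihood × prior
    posteriors = {w: likelihoods[w] * priors[w] for w in likelihoods}
    # Normalize so posteriors sum to 1 (not needed for the argmax decision itself)
    total = sum(posteriors.values())
    posteriors = {w: p / total for w, p in posteriors.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors

# Two candidates matched equally well by the (noisy) input; the prior
# (word frequency) resolves the tie in favour of the more frequent word.
likelihoods = {"cat": 0.40, "cad": 0.40, "cap": 0.20}
priors = {"cat": 0.70, "cad": 0.05, "cap": 0.25}
word, post = recognize(likelihoods, priors)
# → word == "cat", with posterior 0.8
```

Because the decision is a pure function of likelihood and prior, there is no step at which feeding activation back to the input representations could change (let alone improve) which word wins.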
The inescapable conclusion is that TRACE does not perform optimally, as just defined. This is not surprising. TRACE's internal currency is not probability, but activation. As one of the developers of TRACE explained (McClelland, 1991, 2013), interactive-activation models do not compute posterior probabilities. Instead, the decision about which word is present depends on a response threshold set on the output of the Luce choice rule. Reaching this threshold depends on differences among the activations of different candidate words. Crucially, because there is no internal noise, feedback has free rein to amplify these differences in arbitrary ways. These activation values therefore do not reflect the posterior probabilities of words. Contrary to MMLSH's claim, TRACE's behavior is thus neither optimal nor Bayesian. In an optimal system operating on noisy input without the Luce choice rule, feedback will amplify both signal and noise, and hence will achieve nothing. Indeed, as MMLSH's simulations show, adding feedback to TRACE has little effect when there is no noise in the input. Rather, what feedback does is protect the model's speed and accuracy against the negative effects of increasing noise: feedback from word to phoneme nodes amplifies initial differences in phoneme-node activations and this in turn amplifies differences in word-node activations, counteracting the reductions in those differences that increasing noise has caused. This helps TRACE because its initial behavior is suboptimal, but says nothing about the need to include feedback in other models.

MMLSH's discussion about whether activation feedback causes “hallucinations” is also model-specific. Activation feedback does not cause listeners to hallucinate indiscriminately, but it does run the risk of creating hallucinations (NMC00, NMC16). Parameters in TRACE can be adjusted to avoid these negative effects, but, as McClelland et al.
(2014) showed, it takes a very different kind of interactive-activation model to behave in a fully Bayesian way. A model built from the start on Bayesian principles would need no such parameter tweaking and would always behave optimally anyway.

MMLSH argue that, on a count of nodes and connections, models with activation feedback are simpler than those without it. TRACE actually performs very badly in such a count because of massive reduplication of nodes over time slices (Norris, 1994); this is why MMLSH had to exclude many activated nodes to keep their simulations within bounds (p. 5). If number of parameters is the metric used, Bayesian models (because of their strong principles) need far fewer free parameters than interactive-activation models (7 as opposed to 16, comparing the Bayes-based Merge B with the activation-based Merge A; NM08). The divergent performance of different metrics only emphasizes the pointlessness of making claims about the relative complexity of different models in an informal and arbitrary manner; such comparisons should be formal (cf. Vandekerckhove et al., 2015) and use fully specified models, as in the Merge A/B case.

Also on parsimony, MMLSH misinterpret this statement from NMC00: “Information flow from word processing to these earlier stages is not required by the logic of speech recognition and cannot replace the necessary flow of information from sounds to words. Thus it could only be included […] as an additional component” (NMC00, p. 299). MMLSH curiously read “not required by logic” as “illogical” (Is loving your spouse required by logic? Certainly not, but that does not make it illogical). An accurate reading of “not required by logic” is, of course, “not necessary”, and this is the central point about parsimony: additional components should be added only if it is strictly necessary to do so. MMLSH do not address this point.

Crucial behavioral evidence is inconsistent with activation feedback (McQueen et al., 2009; Kingston et al., 2016). MMLSH fail to discuss this evidence. MMLSH note neuroscientific findings, but such evidence is inconclusive, as it could arise from other types of feedback (e.g., for learning or binding; NMC16). These other types of feedback are helpful, may indeed be necessary in speech recognition, and, in some cases, are supported by evidence (e.g., feedback for learning; Norris et al., 2003). Activation feedback is the only type whose function is not self-evident and which is confuted by existing evidence.

Theoretical arguments and the available empirical data thus indicate that activation feedback is not necessary in on-line speech recognition. Indeed, activation feedback is unable to improve the already optimal performance of any Bayesian feedforward model.
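The mechanism discussed above, in which word-to-phoneme feedback amplifies activation differences that noise has compressed, so that the Luce choice rule reaches its response threshold sooner, can be illustrated with a toy sketch. This is not TRACE itself: the exponent k, the activation values, and the threshold are all invented for the example.

```python
import math

def luce_choice(activations, k=10.0):
    """Luce choice rule as used in interactive-activation models:
    response strength is exp(k * activation); each item's choice
    probability is its strength divided by the sum of all strengths.
    """
    strengths = {w: math.exp(k * a) for w, a in activations.items()}
    total = sum(strengths.values())
    return {w: s / total for w, s in strengths.items()}

# Noise compresses the activation difference between two word nodes;
# activation feedback re-amplifies it (values invented for illustration).
noisy = {"cat": 0.30, "cad": 0.25}        # small difference under noise
amplified = {"cat": 0.55, "cad": 0.30}    # after word-to-phoneme feedback

p_noisy = luce_choice(noisy)
p_amplified = luce_choice(amplified)

# The amplified difference pushes the winner's choice probability past a
# hypothetical response threshold (say 0.9) that the noisy activations
# would not reach, so the model responds sooner and more accurately.
```

Note what the sketch makes concrete: the benefit of feedback here is entirely a property of the activation-plus-threshold decision machinery, not of probabilistic inference, which is why it does not transfer to a Bayesian feedforward model.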

Author contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  10 in total

1.  Merging information in speech recognition: feedback is never necessary.

Authors:  D Norris; J M McQueen; A Cutler
Journal:  Behav Brain Sci       Date:  2000-06       Impact factor: 12.579

2.  Stochastic interactive processes and the effect of context on perception.

Authors:  J L McClelland
Journal:  Cogn Psychol       Date:  1991-01       Impact factor: 3.468

3.  Perceptual learning in speech.

Authors:  Dennis Norris; James M McQueen; Anne Cutler
Journal:  Cogn Psychol       Date:  2003-09       Impact factor: 3.468

Review 4.  Shortlist B: a Bayesian model of continuous speech recognition.

Authors:  Dennis Norris; James M McQueen
Journal:  Psychol Rev       Date:  2008-04       Impact factor: 8.934

5.  Eye movement evidence for an immediate Ganong effect.

Authors:  John Kingston; Joshua Levy; Amanda Rysling; Adrian Staub
Journal:  J Exp Psychol Hum Percept Perform       Date:  2016-08-15       Impact factor: 3.332

6.  Interactive activation and mutual constraint satisfaction in perception and cognition.

Authors:  James L McClelland; Daniel Mirman; Donald J Bolger; Pranav Khaitan
Journal:  Cogn Sci       Date:  2014-08-07

7.  The TRACE model of speech perception.

Authors:  J L McClelland; J L Elman
Journal:  Cogn Psychol       Date:  1986-01       Impact factor: 3.468

8.  Integrating probabilistic models of perception and interactive neural networks: a historical and tutorial review.

Authors:  James L McClelland
Journal:  Front Psychol       Date:  2013-08-20

9.  Interaction in Spoken Word Recognition Models: Feedback Helps.

Authors:  James S Magnuson; Daniel Mirman; Sahil Luthra; Ted Strauss; Harlan D Harris
Journal:  Front Psychol       Date:  2018-04-03

10.  Prediction, Bayesian inference and feedback in speech recognition.

Authors:  Dennis Norris; James M McQueen; Anne Cutler
Journal:  Lang Cogn Neurosci       Date:  2015-09-04       Impact factor: 2.331

