Simon Garrod, Martin J Pickering.
Abstract
For addressees to respond in a timely fashion, they cannot simply process the speaker's utterance as it occurs and wait till it finishes. Instead, they predict both when the speaker will conclude and what linguistic forms will be used. While doing this, they must also prepare their own response. To explain this, we draw on the account proposed by Pickering and Garrod (2013a), in which addressees covertly imitate the speaker's utterance and use this to determine the intention that underlies their upcoming utterance. They use this intention to predict when and how the utterance will end, and also to drive their own production mechanisms for preparing their response. Following Arnal and Giraud (2012), we distinguish between mechanisms that predict timing and content. In particular, we propose that the timing mechanism relies on entrainment of low-frequency oscillations between speech envelope and brain. This constrains the context that feeds into the determination of the speaker's intention and hence the timing and form of the upcoming utterance. This approach typically leads to well-timed contributions, but also provides a mechanism for resolving conflicts, for example when there is unintended speaker overlap.
Keywords: content; dialog; prediction; timing; turn-taking
Year: 2015 PMID: 26124728 PMCID: PMC4463931 DOI: 10.3389/fpsyg.2015.00751
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Figure 1. A schematic illustration of the turn-ending prediction mechanism. Above the line, B's unfolding utterance content is shown as two terms that refer to the semantic, syntactic, and phonological representations of the current utterance (at time t) and the upcoming utterance (at time t + 1), with the underlining indicating that they are B's representations (see Pickering and Garrod, 2013a). The timing of B's speech is represented in terms of the entrained theta oscillations in B's speech envelope. Below the line, A's prediction of the content of B's unfolding utterance is shown as ĉ[sem, syn, phon](t + 1), and A's prediction of B's speech timing is shown in terms of theta oscillations in A's auditory cortex. The predicted content comes from A covertly imitating B's utterance at time t, deriving B's putative production command at time t + 1, and then feeding this production command into forward models to generate the predictions for time t + 1. The predicted timing comes from entrainment of A's cortical theta oscillations with theta oscillations in B's speech envelope.