Violet A. Brown, Maryam Hedayati, Annie Zanger, Sasha Mayn, Lucia Ray, Naseem Dillman-Hasso, Julia F. Strand.
Abstract
The McGurk effect is a classic audiovisual speech illusion in which discrepant auditory and visual syllables can lead to a fused percept (e.g., an auditory /bɑ/ paired with a visual /gɑ/ often leads to the perception of /dɑ/). The McGurk effect is robust and easily replicated in pooled group data, but there is tremendous variability in the extent to which individual participants are susceptible to it. In some studies, the rate at which individuals report fusion responses ranges from 0% to 100%. Despite its widespread use in the audiovisual speech perception literature, the roots of the wide variability in McGurk susceptibility are largely unknown. This study evaluated whether several perceptual and cognitive traits are related to McGurk susceptibility through correlational analyses and mixed effects modeling. We found that an individual's susceptibility to the McGurk effect was related to their ability to extract place of articulation information from the visual signal (i.e., a more fine-grained analysis of lipreading ability), but not to scores on tasks measuring attentional control, processing speed, working memory capacity, or auditory perceptual gradiency. These results provide support for the claim that a small amount of the variability in susceptibility to the McGurk effect is attributable to lipreading skill. In contrast, cognitive and perceptual abilities that are commonly used predictors in individual differences studies do not appear to underlie susceptibility to the McGurk effect.
Year: 2018 PMID: 30418995 PMCID: PMC6231656 DOI: 10.1371/journal.pone.0207160
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
McGurk stimuli and expected fusions.
| Auditory Stimuli | Visual Stimuli | Expected Fusions |
|---|---|---|
| bɑ | gɑ | dɑ, ðɑ, θɑ |
| bɑ | fɑ | vɑ |
| mɑ | gɑ | nɑ |
| pɑ | kɑ | tɑ, ðɑ, θɑ |
Summary statistics for all tasks.
| Task | N | Mean (SD) | Range |
|---|---|---|---|
| MGS | 175 | 0.54 (0.29) | 0–0.99 |
| Lipreading | 180 | 0.32 (0.07) | 0.08–0.60 |
| Lipreading POA | 180 | 0.75 (0.09) | 0.18–0.92 |
| VAS | 181 | 0.40 (0.58) | -0.73–1.69 |
| Flanker | 179 | 36 (30) | -17–158 |
| LDT | 178 | 632 (82) | 484–871 |
| Ospan | 163 | 21.24 (10.92) | 0–50 |
Note. MGS is measured as the proportion of responses to incongruent stimuli that were scored as fusion responses. Lipreading scores are measured in proportion correct. VAS scores are scaled for ease of interpretation. Flanker and LDT are measured in reaction time (RT); RTs are in milliseconds. Ospan is measured on a scale from 0 to 50. MGS = McGurk susceptibility; POA = place of articulation; VAS = visual analogue scale score; Flanker = flanker test (mean incongruent RT minus mean congruent RT); LDT = lexical decision task.
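The flanker difference score defined in the note is simple arithmetic; a minimal Python sketch (using hypothetical RTs, not the study's data) illustrates how it is computed:

```python
# Illustrative sketch, not the authors' code: the flanker effect is
# mean incongruent RT minus mean congruent RT, as in the table note.
# All RT values below are hypothetical, in milliseconds.
congruent_rts = [520, 545, 530, 510, 555]    # hypothetical
incongruent_rts = [560, 580, 575, 550, 585]  # hypothetical

def flanker_effect(congruent, incongruent):
    """Difference score: larger values indicate poorer attentional control."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(incongruent) - mean(congruent)

print(flanker_effect(congruent_rts, incongruent_rts))  # → 38.0
```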
Fig 1. Mean by-participant McGurk fusion rate in ascending order.
Shaded region represents two standard errors from each participant’s mean fusion rate. N = 175.
Fig 2. Distribution of VAS responses for a representative gradient (A) and categorical (B) listener.
VAS = visual analogue scale.
Fig 3. Scatterplot and correlations (r values; *** p < .001; * p < .05) showing the relationship between MGS and each of the predictor variables: lipreading, lipreading place of articulation (POA), perceptual gradiency (visual analogue scale task; VAS), attentional control (flanker), processing speed (lexical decision task; LDT), and working memory capacity (operation span; Ospan).
Line represents regression line of best fit. Raw VAS scores are shown here whereas centered and scaled scores are shown in Table 2 for ease of interpretation. Note that one participant had a particularly low lipreading POA score and also had a relatively low MGS fusion rate (top row, middle panel). To ensure that this participant’s data were not driving the observed correlation between fusion rate and POA score, we performed an exploratory analysis computing this correlation without that single participant. Results were very similar to those reported in the text (r = 0.27; p < .001).
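The leave-one-out check described in the caption can be sketched as follows; the data here are fabricated for illustration, and the function is a plain-Python Pearson correlation rather than the authors' analysis code:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores; the last pair mimics the one participant with
# both a very low POA score and a low fusion rate.
poa = [0.70, 0.75, 0.80, 0.72, 0.78, 0.18]
mgs = [0.50, 0.55, 0.65, 0.48, 0.60, 0.10]

r_all = pearson_r(poa, mgs)              # correlation with all participants
r_without = pearson_r(poa[:-1], mgs[:-1])  # correlation excluding the outlier
```

Comparing `r_all` against `r_without` is the logic of the exploratory analysis: if the correlation survives removal of the extreme point, it is not driven by that single participant.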
Akaike Information Criterion (AIC) and Bayesian information criterion (BIC) for each of the mixed effects models compared.
AIC and BIC values shown here are relative to the intercept-only model; negative values therefore indicate that a model is a better fit for the data than the intercept-only model.
| Model | AIC | BIC |
|---|---|---|
| Flanker + LDT + Ospan + VAS + lipreading POA | -3.75 | 32.79 |
| LDT + VAS + lipreading POA | -7.58 | 14.35 |
| Lipreading POA | | |
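A minimal sketch of how relative criterion values like those in the table can be derived, assuming hypothetical raw AICs (chosen here so the differences happen to match the table's −3.75 and −7.58; this is not the authors' code):

```python
# Express each candidate model's information criterion relative to an
# intercept-only baseline: negative = better fit than the baseline.
def relative_ic(candidates, baseline):
    """Map model name -> (criterion - baseline), rounded to 2 decimals."""
    return {name: round(ic - baseline, 2) for name, ic in candidates.items()}

raw_aic = {"full": 996.25, "reduced": 992.42}  # hypothetical raw AICs
baseline_aic = 1000.0                          # hypothetical intercept-only AIC
print(relative_ic(raw_aic, baseline_aic))      # → {'full': -3.75, 'reduced': -7.58}
```

The same subtraction applies to BIC; the two criteria can disagree on the preferred model because BIC penalizes additional parameters more heavily.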