Marzieh Sorati, Dawn Marie Behne.
Abstract
In audiovisual speech perception, visual information from a talker's face during mouth articulation is available before the onset of the corresponding audio speech, and thereby allows the perceiver to use visual information to predict the upcoming audio. This prediction from phonetically congruent visual information modulates audiovisual speech perception and leads to a decrease in N1 and P2 amplitudes and latencies compared to the perception of audio speech alone. Whether audiovisual experience, such as musical training, influences this prediction is unclear, but if it does, this may explain some of the variations observed in previous research. The current study addresses whether audiovisual speech perception is affected by musical training, by assessing N1 and P2 event-related potentials (ERPs) and, in addition, inter-trial phase coherence (ITPC). Musicians and non-musicians were presented with the syllable /ba/ in audio only (AO), video only (VO), and audiovisual (AV) conditions. With the predictive effect of mouth movement isolated from the AV speech (AV−VO), results showed that, compared to audio speech, both groups had a lower N1 latency and a lower P2 amplitude and latency. Moreover, both groups showed lower ITPCs in the delta, theta, and beta bands in audiovisual speech perception. However, musicians showed significant suppression of N1 amplitude and desynchronization in the alpha band in audiovisual speech, which were not present for non-musicians. Collectively, the current findings indicate that early sensory processing can be modified by musical experience, which in turn can explain some of the variations in previous AV speech perception research.
Keywords: audiovisual; event-related potential (ERP); inter-trial phase coherence (ITPC); musical training; musicians; non-musicians; prediction; speech perception
Year: 2019 PMID: 31803107 PMCID: PMC6874039 DOI: 10.3389/fpsyg.2019.02562
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Means and standard deviations (in parentheses) for musicians and non-musicians, based on a questionnaire.
| Musicians | 23 years (3 years) | 9 females, 9 males | 9 (1)/10 | 19 hr (13 hr) | 8 years (2 years) | 14 years (3 years) |
| Non-musicians | 23 years (3 years) | 10 females, 11 males | 5 (2)/10 | 5 hr (5 hr) | - | Less than a year |
Figure 1. The trial timeline for three conditions: audio only (AO), video only (VO), and audiovisual (AV). All three conditions start with a 500 ms fixation cross and finish with an 800 ms still image as the last frame. Each frame in the figure represents two frames in the actual stimuli.
Percentage of correct responses, with standard deviations in parentheses, for musicians and non-musicians in response to the target in AO, VO, and AV trials.
| Musicians | 98% (0) | 92% (1) | 99% (1) | 96% (2) |
| Non-musicians | 96% (1) | 92% (1) | 97% (1) | 95% (3) |
Figure 2. (A) Grand averaged waveforms at Cz and topographical maps in the N1 and P2 windows for audio only (AO) in blue, video only (VO) as a red dashed line, audiovisual (AV) as a red dotted line, and audiovisual minus video only (AV−VO) as a solid red line. (B,C) Shift functions for N1 amplitudes and latencies for musicians and non-musicians. Each gray dashed line connects one participant's AO and AV−VO data points.
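The AV−VO subtraction and the N1/P2 peak measures referred to above can be illustrated with a minimal sketch. It assumes each ERP is already an averaged waveform sampled at known time points. The function names, the toy values, and the search windows are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def difference_wave(av: Sequence[float], vo: Sequence[float]) -> List[float]:
    """Subtract the video-only ERP from the audiovisual ERP, sample by sample."""
    if len(av) != len(vo):
        raise ValueError("AV and VO waveforms must have the same length")
    return [a - v for a, v in zip(av, vo)]


def peak_in_window(
    wave: Sequence[float],
    times_ms: Sequence[float],
    start_ms: float,
    end_ms: float,
    polarity: int,
) -> Tuple[float, float]:
    """Return (amplitude, latency_ms) of the most extreme sample in a window.

    polarity=-1 picks the most negative sample (an N1-like peak),
    polarity=+1 picks the most positive sample (a P2-like peak).
    """
    candidates = [
        (amp, t) for amp, t in zip(wave, times_ms) if start_ms <= t <= end_ms
    ]
    if not candidates:
        raise ValueError("no samples fall inside the requested window")
    return max(candidates, key=lambda c: polarity * c[0])


# Toy waveforms (microvolts) at six time points; values are illustrative only.
times_ms = [0, 50, 100, 150, 200, 250]
av = [0.0, -0.2, -1.0, 0.1, 1.2, 0.3]
vo = [0.0, -0.1, -0.2, 0.0, 0.3, 0.1]

av_minus_vo = difference_wave(av, vo)
# Hypothetical search windows; the study's actual windows are not given here.
n1_amp, n1_lat = peak_in_window(av_minus_vo, times_ms, 50, 150, polarity=-1)
p2_amp, p2_lat = peak_in_window(av_minus_vo, times_ms, 150, 250, polarity=+1)
```

The same `peak_in_window` call applied to the AO waveform would give the values that AV−VO is compared against.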
Mean and standard deviation (SD) of N1 and P2 amplitude (μV), latency (ms) and ITPC for delta, theta, alpha, and beta activity, for musicians and non-musicians.
| Group | Condition | N1 amplitude (μV) | N1 latency (ms) | P2 amplitude (μV) | P2 latency (ms) | Delta ITPC | Theta ITPC | Alpha ITPC | Beta ITPC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Musicians | AO | −1.19 (0.53) | 108 (10) | 1.33 (0.53) | 222 (31) | 0.23 (0.09) | 0.24 (0.07) | 0.16 (0.05) | 0.12 (0.03) |
| Musicians | AV−VO | −0.78 (0.36) | 98 (13) | 0.85 (0.4) | 206 (28) | 0.21 (0.08) | 0.21 (0.08) | 0.09 (0.05) | 0.11 (0.02) |
| Non-musicians | AO | −1.01 (0.67) | 103 (8) | 1.65 (1.02) | 213 (31) | 0.25 (0.09) | 0.25 (0.1) | 0.18 (0.08) | 0.15 (0.05) |
| Non-musicians | AV−VO | −1.04 (0.56) | 93 (11) | 1.06 (0.81) | 209 (27) | 0.21 (0.09) | 0.22 (0.07) | 0.17 (0.04) | 0.11 (0.03) |
Summary of F-statistics of main effects and interactions.
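| Effect | N1 amplitude | N1 latency | P2 amplitude | P2 latency | Delta ITPC | Theta ITPC | Alpha ITPC | Beta ITPC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Condition (AO vs. AV−VO) | 3.94 | 24.46 | 18 | 3.95 | 6.12 | 5.75 | 7.25 | 5.46 |
| Background (musicians vs. non-musicians) | 0.08 | 3.26 | 2.04 | 0.21 | 0.32 | 0.57 | 10.58 | 1.99 |
| Condition × background | 4.99 | 0.039 | 0.19 | 1.66 | 0.46 | 0 | 4.65 | 2.23 |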
Significance levels: p ≤ 0.05; p < 0.001; p < 0.0001.
Figure 3. Correlation between N1 magnitude in AV−VO and hours of instrumental practice per week for musicians and non-musicians.
Figure 4. Point-by-point two-tailed t-test for event-related potentials in AO and AV−VO at C3 and C4 for musicians and non-musicians.
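A point-by-point comparison of this kind can be sketched as a t statistic computed separately at every time sample. The sketch below assumes a paired (within-participant) design, which the caption does not state explicitly, and it returns only t-values. Converting them to two-tailed p-values needs a t-distribution with n − 1 degrees of freedom, which is not shown. The function name and the toy data are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def paired_t_per_timepoint(
    cond_a: Sequence[Sequence[float]],
    cond_b: Sequence[Sequence[float]],
) -> List[float]:
    """Paired t statistic at each time sample.

    cond_a and cond_b hold one waveform per participant, in the same
    participant order, with equal numbers of samples per waveform.
    """
    n = len(cond_a)
    if n < 2 or n != len(cond_b):
        raise ValueError("need at least two participants, matched across conditions")
    t_values = []
    for i in range(len(cond_a[0])):
        diffs = [a[i] - b[i] for a, b in zip(cond_a, cond_b)]
        sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
        if sd == 0:
            # Every participant shows the same difference: t is undefined.
            t_values.append(math.nan)
        else:
            t_values.append(mean(diffs) / (sd / math.sqrt(n)))
    return t_values


# Toy data: three participants, two time samples each (illustrative only).
ao = [[1.0, 2.0], [2.0, 2.0], [3.0, 5.0]]
av_minus_vo = [[0.0, 2.0], [0.0, 3.0], [0.0, 4.0]]
t_values = paired_t_per_timepoint(ao, av_minus_vo)
```

Because one test is run per sample, an analysis like this usually needs some correction for multiple comparisons; the sketch leaves that step out.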
Figure 5. Trial-to-trial phase-locking measured by ITPC for audio only (AO) and audiovisual minus video only (AV−VO).
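ITPC is commonly defined as the length of the mean unit phase vector across trials at a given time-frequency point: values near 1 indicate consistent phase across trials, values near 0 indicate no consistent phase. The sketch below illustrates that standard definition and assumes the per-trial phase angles (in radians) have already been extracted, for example by a wavelet or Hilbert transform that is not shown. The function name `itpc` is hypothetical, and this is not necessarily the exact pipeline used in the study.

```python
import cmath
import math
from typing import Sequence


def itpc(phases: Sequence[float]) -> float:
    """Inter-trial phase coherence for one time-frequency point.

    phases holds one phase angle (radians) per trial. Each phase is mapped
    to a unit vector on the complex plane; ITPC is the magnitude of the
    average of those vectors.
    """
    if not phases:
        raise ValueError("at least one trial is required")
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


# Identical phases across trials give perfect coherence.
aligned = itpc([0.5] * 10)
# Phases spread evenly around the circle cancel out.
spread = itpc([0.0, math.pi / 2, math.pi, 3 * math.pi / 2])
```

Applying `itpc` to every time-frequency point and averaging within a band (delta, theta, alpha, or beta) would yield band-level values comparable in kind to those reported in the table.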