Abstract
In well-controlled laboratory experiments, researchers have found that humans can perceive delays between auditory and visual signals as short as 20 ms. Conversely, other experiments have shown that humans can tolerate audiovisual asynchrony that exceeds 200 ms. This seeming contradiction in human temporal sensitivity can be attributed to a number of factors such as experimental approaches and precedence of the asynchronous signals, along with the nature, duration, location, complexity and repetitiveness of the audiovisual stimuli, and even individual differences. In order to better understand how temporal integration of audiovisual events occurs in the real world, we need to close the gap between the experimental setting and the complex setting of everyday life. With this work, we aimed to contribute one brick to the bridge that will close this gap. We compared perceived synchrony for long-running and eventful audiovisual sequences to shorter sequences that contain a single audiovisual event, for three types of content: action, music, and speech. The resulting windows of temporal integration showed that participants were better at detecting asynchrony for the longer stimuli, possibly because the long-running sequences contain multiple corresponding events that offer audiovisual timing cues. Moreover, the points of subjective simultaneity differ between content types, suggesting that the nature of a visual scene could influence the temporal perception of events. An expected outcome from this type of experiment was the rich variation among participants' distributions and the derived points of subjective simultaneity. Hence, the designs of similar experiments call for more participants than traditional psychophysical studies. Heeding this caution, we conclude that existing theories on multisensory perception are ready to be tested on more natural and representative stimuli.
Keywords: audiovisual synchrony; complex stimuli; multisensory perception; temporal integration; visual distortion
Year: 2015 PMID: 26082738 PMCID: PMC4451240 DOI: 10.3389/fpsyg.2015.00736
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Figure 1. The four screenshots represent the timeline of the long-running chess sequence; the first frame also illustrates the single event contained within the 1-s stimulus version.
Figure 2. The four screenshots represent the timeline of the long-running drums sequence; the first frame also illustrates the single event contained within the 1-s stimulus version.
Figure 3. The four screenshots represent the timeline of the long-running speech sequence.
Figure 4. Screenshots of single frames included in the 1-s presentations of the /BA/ syllable, at the original 1024×576 pixel resolution.
Detailed descriptions of the long and short audiovisual sequences presented to participants.
| Chess | Drums | Speech |
| The video portrays a game of chess played by two young men in a Renaissance setting. In the opening scene of the long version, the chessboard and the players' hands are seen from above. The camera slowly zooms out and pans down to gradually include the two players and the surrounding room. Five pieces are picked up, moved and put down during the 13-s presentation. The short sequence includes a single chess move, seen from an overhead view. The content was sampled from the movie Assassin's Creed: Lineage (Part 1), with permission from Ubisoft. | A young man introduces the 13-s music sequence by hitting his drumsticks together three times, using wide and rapid, but visible, movements. He then begins playing the drums, while the camera slowly zooms out to include the alley where he sits. The sequence concludes with the appearance of first one, then two, of the drummer's clones, both with bass guitars. In the 1-s excerpt, only one instance of the drumsticks hitting together is presented. The video was produced by Freddie Wong and Brandon Laatsch for the freddiew channel on YouTube. | The long sequence shows a female news anchor in a studio; she presents a story about the return of two injured football. (Norwegian transcript: |
ANOVA results for asynchrony direction, stimulus duration, content and blur levels; significant results are marked with asterisks.
| Blur | |
| Duration | |
| Content | |
| Asynchrony direction | |
| Blur*Duration | |
| Blur*Content | |
| Blur*Asynchrony direction | |
| Duration*Content | |
| Duration*Asynchrony direction | |
| Content*Asynchrony direction | |
| Blur*Duration*Content | |
| Blur*Duration*Asynchrony direction | |
| Blur*Content*Asynchrony direction | |
| Duration*Content*Asynchrony direction | |
| Blur*Duration*Content*Asynchrony direction |
Figure 5. Overall rates of perceived synchrony averaged across all asynchronous presentations, separated by direction of asynchrony (left side) and stimulus duration (right side). Error bars represent the 95% confidence intervals, and results from the post-hoc paired-comparison t-tests are overlaid, with asterisks marking significant contrasts.
Results from ANOVAs exploring content and duration for audio lead and audio lag conditions; significant results are marked with asterisks.
| Audio lead | Duration | |
| Content | ||
| Asynchrony | F(4, 72) = 234.38, p < 0.001*, ηp² = 0.93 | |
| Duration*Content | ||
| Duration*Asynchrony | ||
| Content*Asynchrony | ||
| Duration*Content*Asynchrony | ||
| Audio lag | Duration | |
| Content | ||
| Asynchrony | ||
| Duration*Content | ||
| Duration*Asynchrony | ||
| Content*Asynchrony | ||
| Duration*Content*Asynchrony |
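As a consistency check on the one effect that is fully reported in the table above, partial eta squared can be recovered from an F statistic and its degrees of freedom with the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The function name below is an illustrative assumption; this is a sketch, not the authors' analysis script.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error),
    which follows from eta_p^2 = SS_effect / (SS_effect + SS_error).
    """
    weighted_f = f_value * df_effect
    return weighted_f / (weighted_f + df_error)


# Values taken from the reported asynchrony effect for audio lead: F(4, 72) = 234.38.
eta = partial_eta_squared(234.38, 4, 72)
print(round(eta, 2))  # prints 0.93
```

The result agrees with the reported effect size of 0.93.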
Goodness of fit of individual Gaussian curves, with the lowest, highest, average and standard deviation (in parentheses) of participants' scores, represented by the coefficient of determination, R².
| Chess (13 s) | 0.79 | 0.98 | 0.89 (0.05) | 0.87 | 0.98 | 0.93 (0.03) |
| Chess (1 s) | 0.74 | 0.98 | 0.89 (0.07) | 0.89 | 0.98 | 0.93 (0.03) |
| Drums (13 s) | 0.61 | 0.94 | 0.85 (0.07) | 0.75 | 0.97 | 0.92 (0.05) |
| Drums (1 s) | 0.60 | 0.95 | 0.82 (0.12) | 0.76 | 0.97 | 0.91 (0.06) |
| Speech (13 s) | 0.78 | 0.97 | 0.88 (0.05) | 0.88 | 0.97 | 0.93 (0.03) |
| Speech (1 s) | 0.70 | 0.95 | 0.83 (0.06) | 0.87 | 0.99 | 0.94 (0.04) |
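The goodness-of-fit scores above come from fitting a Gaussian curve to each participant's rates of perceived synchrony. The sketch below illustrates one way such a fit and its R² could be computed with the standard library only: a grid search over the mean and standard deviation, with the amplitude solved in closed form. The fitting procedure, the asynchrony levels, the response rates, and the sign convention (negative for audio lead, positive for audio lag) are all assumptions made for illustration; the excerpt does not state how the authors fitted their curves.

```python
import math


def gaussian(x, amplitude, mu, sigma):
    """Gaussian curve evaluated at asynchrony x (ms)."""
    return amplitude * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def fit_gaussian(xs, ys, mu_grid, sigma_grid):
    """Least-squares fit: grid search over mu and sigma, closed-form amplitude."""
    best = None
    for mu in mu_grid:
        for sigma in sigma_grid:
            shape = [gaussian(x, 1.0, mu, sigma) for x in xs]
            # For a fixed shape, the optimal amplitude is a linear least-squares solution.
            amplitude = sum(s * y for s, y in zip(shape, ys)) / sum(s * s for s in shape)
            sse = sum((y - amplitude * s) ** 2 for s, y in zip(shape, ys))
            if best is None or sse < best[0]:
                best = (sse, amplitude, mu, sigma)
    return best[1], best[2], best[3]


def r_squared(xs, ys, amplitude, mu, sigma):
    """Coefficient of determination of the fitted curve."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum(
        (y - gaussian(x, amplitude, mu, sigma)) ** 2 for x, y in zip(xs, ys)
    )
    return 1.0 - ss_residual / ss_total


# Hypothetical data for one participant (NOT from the study):
# asynchrony in ms (negative = audio lead) and proportion of "synchronous" responses.
offsets_ms = [-400, -300, -200, -100, 0, 100, 200, 300, 400]
rates = [0.00, 0.03, 0.08, 0.44, 0.80, 0.85, 0.38, 0.12, 0.01]

amplitude, pss, sigma = fit_gaussian(
    offsets_ms, rates, range(-200, 201, 10), range(40, 301, 10)
)
r2 = r_squared(offsets_ms, rates, amplitude, pss, sigma)
print(f"PSS = {pss} ms, sigma = {sigma} ms, R^2 = {r2:.3f}")
```

The fitted mean plays the role of the point of subjective simultaneity. A gradient-based optimizer would normally replace the grid search; the grid keeps the example dependency-free.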
Figure 6. Points of subjective simultaneity (PSS) for all stimulus contents and durations, plotted for each participant as well as the overall mean. The PSS values spread across a large range of audiovisual asynchrony from ≈100 ms audio lead to ≈250 ms audio lag.
Figure 7. Perceived synchrony distributions for each content type, fitted to the displayed mean points that correspond to the average of all participants' means for the presented asynchrony and content conditions. The overlaid windows of temporal integration are calculated from the mean FWHM values, resulting in some discrepancy between the derived audio lead and lag thresholds and the visual curve examples.
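For a Gaussian curve, the full width at half maximum follows directly from the standard deviation: FWHM = 2·√(2·ln 2)·σ ≈ 2.355·σ. The sketch below derives a window of temporal integration from a PSS and σ, assuming the window is centred on the PSS and spans one FWHM, with negative values denoting audio lead. Both the centring and the sign convention are assumptions, and the input values are hypothetical rather than taken from the study.

```python
import math


def fwhm_from_sigma(sigma_ms: float) -> float:
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma_ms


def integration_window(pss_ms: float, sigma_ms: float) -> tuple:
    """Window bounds (audio-lead side, audio-lag side), assumed centred on the PSS."""
    half_width = fwhm_from_sigma(sigma_ms) / 2.0
    return pss_ms - half_width, pss_ms + half_width


# Hypothetical values: PSS at 50 ms audio lag, sigma of 100 ms.
lead_bound, lag_bound = integration_window(50.0, 100.0)
print(f"FWHM = {fwhm_from_sigma(100.0):.1f} ms")
print(f"window = [{lead_bound:.1f}, {lag_bound:.1f}] ms")
```

Averaging FWHM values across participants and then centring the window on a mean PSS need not match a curve fitted to the averaged data, which is consistent with the discrepancy the caption notes.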
Summary of paired-comparison t-tests for FWHM and PSS; significant results are marked with asterisks.
| FWHM | Chess vs. Drums | |
| Chess vs. Speech | ||
| Drums vs. Speech | ||
| PSS | Chess vs. Drums | |
| Chess vs. Speech | ||
| Drums vs. Speech | ||
| Chess (13 s) vs. Chess (1 s) | ||
| Drums (13 s) vs. Drums (1 s) | ||
| Speech (13 s) vs. Speech (1 s) |
Results from ANOVAs run separately for d' (sensitivity) and β (response bias); significant results are marked with asterisks.
| β | Asynchrony direction | |
| Duration | ||
| Content | ||
| Asynchrony direction*Duration | ||
| Asynchrony direction*Content | ||
| Duration*Content | ||
| Asynchrony direction*Duration*Content | ||
| d' | Asynchrony direction | |
| Duration | ||
| Content | ||
| Asynchrony direction*Duration | ||
| Asynchrony direction*Content | ||
| Duration*Content | ||
| Asynchrony direction*Duration*Content |
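The two measures analysed above have standard signal detection definitions: d' = z(H) − z(F) and β = exp((z(F)² − z(H)²) / 2), where H is the hit rate, F the false-alarm rate, and z the inverse of the standard normal cumulative distribution. The sketch below applies those textbook formulas. How hits and false alarms were defined in the synchrony judgement task is not stated in this excerpt, so the rates used here are hypothetical.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity: distance between the z-transformed hit and false-alarm rates."""
    return _STANDARD_NORMAL.inv_cdf(hit_rate) - _STANDARD_NORMAL.inv_cdf(false_alarm_rate)


def beta(hit_rate: float, false_alarm_rate: float) -> float:
    """Response bias as a likelihood ratio; 1.0 indicates no bias."""
    z_hit = _STANDARD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STANDARD_NORMAL.inv_cdf(false_alarm_rate)
    return math.exp((z_fa ** 2 - z_hit ** 2) / 2.0)


# Hypothetical rates (NOT from the study). Rates of exactly 0 or 1 are undefined
# under the z-transform and would need a correction before use.
sensitivity = d_prime(0.8, 0.2)
bias = beta(0.8, 0.2)
print(f"d' = {sensitivity:.3f}, beta = {bias:.3f}")
```

With symmetric rates such as 0.8 and 0.2, β equals 1, which corresponds to an unbiased criterion.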
Figure 8. Response bias β measures for long and short chess, drums and speech sequences, averaged across participants. Error bars represent the 95% confidence intervals; no significant contrasts were found with the Holm-Bonferroni adjustments.
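The Holm-Bonferroni adjustment named in the Figure 8 caption is a step-down procedure: the p-values are sorted in ascending order, and the k-th smallest (counting from zero) is compared against α / (m − k), stopping at the first non-significant test. The sketch below implements that standard procedure. The function name, the α level, and the p-values are illustrative assumptions, not values from the study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        # The threshold relaxes as fewer hypotheses remain.
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break  # every remaining (larger) p-value is also retained
    return reject


# Hypothetical p-values for three pairwise contrasts.
decisions = holm_bonferroni([0.01, 0.04, 0.03])
print(decisions)  # prints [True, False, False]
```

Here 0.01 passes the first threshold (0.05 / 3), but 0.03 fails the second (0.05 / 2), so the procedure stops and 0.04 is retained as well even though it is below 0.05.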
Figure 9. Signal detection d' sensitivity values, grouped according to content type and stimulus duration, averaged across participants. Error bars represent the 95% confidence intervals and asterisks denote significant contrasts.