In everyday life, our senses are flooded with a plethora of sensory inputs. To interact effectively, the brain should combine signals when they come from common events and process signals independently when they arise from different events. Integrating information across the senses is crucial for an animal's survival, enabling faster and more accurate responses. A critical question is how the ability to integrate signals across the senses depends on sensory experience during neurodevelopment.
Previous elegant neurophysiological work has shown that early audiovisual experience is critical for the alignment of spatial receptive fields across visual (eye‐centred) and auditory (head‐centred) reference frames (Hyde & Knudsen, 2002; Stein et al., 2014a). Neurons in the superior colliculus of dark‐reared cats failed to show a non‐linear response amplification for multisensory relative to unisensory stimuli, typically observed in normally reared cats (Stein et al., 2014b; Wallace et al., 2004). Likewise, selective exposure to independently sampled auditory and visual signals during neurodevelopment reduces the fraction of audiovisual neurons with superadditive responses to audiovisual stimuli (Xu et al., 2012). By contrast, exposure to collocated audiovisual signals appears to sharpen the spatio‐temporal window in which superadditive responses can be observed.
A recent study in EJN (Smyre et al., 2021) started to probe the functional relevance of those neurophysiological changes. As one example of abnormal audiovisual statistics, it investigated how dark rearing influences an animal's audiovisual localization behaviour. The study presented cats, reared normally or in darkness, with auditory, visual or collocated audiovisual stimuli from six locations along the azimuth. Further, catch trials were included, on which no stimuli were presented. The animal had to approach the location of the stimulus or refrain from making a response on catch trials.
Therefore, the task combined stimulus detection and spatial localization components. For stimulus detection, a multisensory benefit was observed predominantly for normally reared animals. Dark‐reared animals were worse at detecting audiovisual stimuli than normally reared animals despite comparable or even better detection of unisensory stimuli.
A more complex picture arose for spatial localization, as here dark rearing profoundly affected localization accuracy even under unisensory stimulation. To enable a comparison between normally and dark‐reared animals, the study developed predictions for audiovisual spatial localization using a statistical facilitation model. Inspired by the classical race model typically used for response time analyses (Miller, 1982; Otto & Mamassian, 2012), the authors applied the statistical facilitation model to response choices, thereby enabling predictions for audiovisual localization accuracy under independent processing assumptions. Specifically, the facilitation model sampled auditory and visual estimates independently and selected the more accurate one for making an overt response. Because in reality the brain cannot assess the accuracy of its own auditory and visual estimates, the model's predictions form an extreme upper bound for audiovisual localization accuracy under the assumption of independent processing. Nevertheless, the localization accuracy of normally reared animals exceeded the statistical facilitation predictions. By contrast, the localization performance of dark‐reared animals did not pass this upper bound, suggesting that the development of audiovisual integration abilities relies on exposure to audiovisual inputs during sensitive periods of neurodevelopment.
Further insights into how exposure to abnormal audiovisual spatial statistics affects localization behaviour can be provided from a normative Bayesian perspective (Ernst & Banks, 2002; Körding et al., 2007; Noppeney, 2021).
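Before turning to that perspective in detail, the logic of the statistical facilitation model described above can be made concrete with a minimal simulation. The sketch below is an illustration under stated assumptions, not the analysis code of Smyre et al. (2021): the six azimuth angles, the Gaussian noise levels `sigma_a` and `sigma_v`, and the function names are all hypothetical. On each simulated trial an auditory and a visual response are sampled independently and an oracle keeps whichever lies closer to the target, which is why the resulting accuracy is an upper bound under independent processing.

```python
import random

# Hypothetical stimulus positions in degrees azimuth. The task used six
# locations; the actual angles here are an assumption for illustration.
LOCATIONS = [-75, -45, -15, 15, 45, 75]


def nearest_location(estimate):
    """Map a noisy spatial estimate onto the closest response location."""
    return min(LOCATIONS, key=lambda loc: abs(loc - estimate))


def simulate_facilitation(target, sigma_a, sigma_v, n_trials=20000, seed=0):
    """Return (auditory, visual, facilitation-bound) localization accuracy.

    sigma_a and sigma_v are assumed Gaussian noise levels (in degrees) of
    the unisensory estimates; they are free parameters, not fitted values.
    """
    rng = random.Random(seed)
    hits_a = hits_v = hits_av = 0
    for _ in range(n_trials):
        # Independent processing: each modality yields its own response.
        resp_a = nearest_location(rng.gauss(target, sigma_a))
        resp_v = nearest_location(rng.gauss(target, sigma_v))
        # Oracle selection: keep whichever response is closer to the target.
        resp_av = min((resp_a, resp_v), key=lambda r: abs(r - target))
        hits_a += resp_a == target
        hits_v += resp_v == target
        hits_av += resp_av == target
    return hits_a / n_trials, hits_v / n_trials, hits_av / n_trials


acc_a, acc_v, acc_bound = simulate_facilitation(target=15, sigma_a=20.0, sigma_v=10.0)
# With independent samples the bound approaches p_a + p_v - p_a * p_v.
analytic_bound = acc_a + acc_v - acc_a * acc_v
print(f"A: {acc_a:.3f}  V: {acc_v:.3f}  bound: {acc_bound:.3f}  analytic: {analytic_bound:.3f}")
```

In this sketch the oracle response is correct whenever at least one unisensory response is correct, so observed audiovisual accuracy above `acc_bound` would be difficult to reconcile with independent processing of the two signals.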
A Bayesian observer sets a benchmark of optimal performance against which observers' behaviour can be compared. According to normative Bayesian inference, observers should combine signals that are known to come from one source weighted by their relative reliabilities (i.e. inverse of noise or variance) into more precise perceptual estimates (Ernst & Banks, 2002; Fetsch et al., 2012, 2013). Indeed, accumulating research has shown that human observers integrate spatial signals from vision and audition near‐optimally into more reliable spatial estimates (Alais & Burr, 2004) though modest deviations from optimality have been noted (Meijer et al., 2019).
Recent modelling efforts have moved towards more complex situations, in which signals can come from common or separate causes. In these situations with causal uncertainty, observers need to infer the signals' causal structure by combining noisy correspondence cues (e.g. spatial disparity or temporal synchrony) with prior expectations about the signals' causal structure. Intriguingly, human and non‐human observers have been shown to gracefully transition from sensory integration to segregation for increasing audiovisual spatial disparity or asynchrony consistent with normative models of Bayesian Causal Inference (Acerbi et al., 2018; Aller & Noppeney, 2019; Cao et al., 2019; Körding et al., 2007; Magnotti et al., 2013; Mohl et al., 2020; Rohe et al., 2019; Rohe & Noppeney, 2015a, 2015b; Wozny et al., 2010). Observers integrate audiovisual signals when they are close in time and space, yet keep them separate at large spatio‐temporal conflicts.
A central aim for future research is to define the role of early audiovisual experience in shaping how animals perform causal inference and arbitrate between sensory integration and segregation. How does exposure to abnormal audiovisual spatial statistics affect the key computational ingredients of Bayesian causal inference?
From the perspective of efficient coding (Wei & Stocker, 2015), the brain may mould receptive fields and tuning functions to optimize the sensory representations for the abnormal audiovisual input statistics, which may in turn alter Bayesian causal inference via changes in the likelihood function. For instance, less precise spatial representations would make the arbitration between sensory integration and segregation less reliable. Further, exposure to independently sampled auditory and visual signals may reduce binding of spatio‐temporally coincident audiovisual signals. Critically, however, profound changes in sensory experience during neurodevelopment may not only affect key parameters of Bayesian causal inference but even result in multisensory processing that deviates from normative computational principles.
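The computational ingredients referred to above can be illustrated with a minimal sketch of Bayesian causal inference for spatial localization, following the standard Gaussian formulation of Körding et al. (2007). Several choices here are assumptions made for illustration: the function name, the parameter values, the zero‐centred spatial prior, and the use of model averaging as the decision function (other decision functions are possible). The sensory noise levels `sigma_a` and `sigma_v` stand in for the likelihood, and `p_common` for the prior expectation of a common cause.

```python
import math


def causal_inference(x_a, x_v, sigma_a, sigma_v, sigma_p=30.0, p_common=0.5, mu_p=0.0):
    """Return (posterior probability of a common cause, auditory estimate).

    x_a, x_v: noisy auditory and visual measurements (degrees azimuth).
    sigma_a, sigma_v: assumed sensory noise; sigma_p, mu_p: spatial prior.
    All default values are hypothetical, not fitted to any data set.
    """
    var_a, var_v, var_p = sigma_a ** 2, sigma_v ** 2, sigma_p ** 2

    # Likelihood of both measurements given one common source (C = 1).
    denom_c1 = var_a * var_v + var_a * var_p + var_v * var_p
    quad_c1 = ((x_a - x_v) ** 2 * var_p
               + (x_a - mu_p) ** 2 * var_v
               + (x_v - mu_p) ** 2 * var_a) / denom_c1
    like_c1 = math.exp(-0.5 * quad_c1) / (2 * math.pi * math.sqrt(denom_c1))

    # Likelihood given two independent sources (C = 2).
    quad_c2 = ((x_a - mu_p) ** 2 / (var_a + var_p)
               + (x_v - mu_p) ** 2 / (var_v + var_p))
    like_c2 = math.exp(-0.5 * quad_c2) / (
        2 * math.pi * math.sqrt((var_a + var_p) * (var_v + var_p)))

    # Combine the likelihoods with the causal prior via Bayes' rule.
    post_common = like_c1 * p_common / (
        like_c1 * p_common + like_c2 * (1.0 - p_common))

    # Reliability-weighted fusion versus the segregated auditory estimate.
    fused = (x_a / var_a + x_v / var_v + mu_p / var_p) / (
        1 / var_a + 1 / var_v + 1 / var_p)
    segregated = (x_a / var_a + mu_p / var_p) / (1 / var_a + 1 / var_p)

    # Model averaging: weight both estimates by the causal posterior.
    estimate = post_common * fused + (1.0 - post_common) * segregated
    return post_common, estimate


# Small versus large audiovisual disparity with precise representations.
p_near, est_near = causal_inference(x_a=5.0, x_v=0.0, sigma_a=8.0, sigma_v=2.0)
p_far, est_far = causal_inference(x_a=40.0, x_v=0.0, sigma_a=8.0, sigma_v=2.0)
# Same large disparity, but with less precise spatial representations.
p_far_noisy, _ = causal_inference(x_a=40.0, x_v=0.0, sigma_a=24.0, sigma_v=12.0)
print(f"P(common): near {p_near:.3f}, far {p_far:.5f}, far with broad noise {p_far_noisy:.3f}")
```

With these hypothetical parameters the observer favours a common cause at small disparity, pulling the auditory estimate towards the visual signal, and favours separate causes at large disparity. Broadening the sensory noise leaves the causal posterior at the same large disparity far less decisive, which is one way to picture how less precise spatial representations could make the arbitration between integration and segregation less reliable.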
CONFLICT OF INTEREST
The author declares no conflict of interest.
PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/ejn.15576.