Di Fu, Cornelius Weber, Guochun Yang, Matthias Kerzel, Weizhi Nan, Pablo Barros, Haiyan Wu, Xun Liu, Stefan Wermter.
Abstract
Selective attention plays an essential role in acquiring and using information from the environment. Over the past 50 years, research on selective attention has been a central topic in cognitive science. Compared with unimodal studies, crossmodal studies are more complex but necessary for solving real-world challenges, in both human experiments and computational modeling. Although a growing number of findings on crossmodal selective attention have shed light on humans' behavioral patterns and neural underpinnings, a much better understanding is still necessary to yield the same benefit for intelligent computational agents. This article reviews studies of selective attention in unimodal (visual and auditory) and crossmodal (audiovisual) setups from the multidisciplinary perspectives of psychology and cognitive neuroscience, and evaluates different ways to simulate analogous mechanisms in computational models and robotics. In this interdisciplinary review, we discuss the gaps between these fields and provide insights into how psychological findings and theories can be applied in artificial intelligence from different perspectives.
Keywords: auditory attention; computational modeling; crossmodal learning; deep learning; selective attention; visual attention
Year: 2020 PMID: 32174816 PMCID: PMC7056875 DOI: 10.3389/fnint.2020.00010
Source DB: PubMed Journal: Front Integr Neurosci ISSN: 1662-5145
Figure 1(A) Neuroanatomical model of bottom-up and top-down attentional processing in the visual cortex. The dorsal system (green) executes the top-down attentional control. FEF, frontal eye field; IPS, intraparietal sulcus. The ventral system (red) executes the bottom-up processing. VFC, ventral frontal cortex; TPJ, temporoparietal junction (adapted from Corbetta and Shulman, 2002); (B) Cortical oscillation model of attentional control in visual and auditory sensory areas. The posterior medial frontal cortex (pMFC) modulates selective attention by the excitation of task-relevant processing and the inhibition of task-irrelevant processing. Theta oscillations facilitate the communication between the pMFC and lateral prefrontal cortex (LPFC) (purple arrow). Gamma oscillations and alpha oscillations are promoted in task-relevant and task-irrelevant cortical areas, respectively (gray arrows) (adapted from Clayton et al., 2015).
Main theories of visual selective attention based on various processing pathways.
| Theory | Core claim | Processing pathway |
| --- | --- | --- |
| Stimulus-driven Theory (1992) | Singletons automatically capture visual attention | Bottom-up |
| Goal-driven Theory (1992) | Individuals' intentions determine attentional capture | Top-down |
| Contingent Capture Hypothesis (1992) | Capture is contingent on attentional control settings induced by task demands | Top-down |
| Biased Competition Theory (1995) | Responses to distractors around the target are inhibited | Bottom-up & Top-down |
| Signal Suppression Hypothesis (2010) | The salience signal automatically generated by singletons can be suppressed | Bottom-up & Top-down |
Figure 2(A) Visual saliency model. Features are extracted from the input image. The center-surround mechanism and normalization are used to generate the individual feature saliency maps. Finally, the saliency map is generated by a linear combination of different individual saliency maps (adapted from Itti et al., 1998); (B) Auditory saliency model. The structure of the model is similar to the visual saliency model by converting sound inputs into a frequency “intensity image” (adapted from Kayser et al., 2005).
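The feed-forward pipeline of Figure 2A (feature extraction, center-surround contrast, normalization, linear combination) can be sketched in a few lines. The sketch below is not the original implementation: it substitutes a simple box blur for the Gaussian pyramids of Itti et al. (1998) and a min-max rescaling for their normalization operator, and the channel definitions and kernel sizes are illustrative assumptions.

```python
import numpy as np

def box_blur(img, k):
    """Separable k-by-k moving-average blur with edge padding
    (a crude stand-in for the Gaussian pyramids of Itti et al.)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    kernel = np.ones(k) / k
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def center_surround(img, c=3, s=9):
    """Contrast between a fine 'center' and a coarse 'surround' scale."""
    return np.abs(box_blur(img, c) - box_blur(img, s))

def normalize(m):
    """Min-max rescaling to [0, 1] (simplified normalization step)."""
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def saliency_map(rgb):
    """Combine intensity and color-opponency feature maps linearly."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    rg = r - g                 # red-green opponent channel
    by = b - (r + g) / 2.0     # blue-yellow opponent channel
    maps = [normalize(center_surround(f)) for f in (intensity, rg, by)]
    return normalize(np.mean(maps, axis=0))

# A red singleton on a gray background should dominate the saliency map.
img = np.full((32, 32, 3), 0.5)
img[12:20, 12:20] = [1.0, 0.0, 0.0]
sal = saliency_map(img)
```

The same skeleton carries over to the auditory model of Figure 2B: once the sound is converted into a time-frequency "intensity image," the identical center-surround and combination machinery applies.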
Figure 3(A) Auditory selective attention model with interaction between bottom-up processing and top-down modulation. The compound sound enters bottom-up processing in the form of segregated units, and the units are then grouped into streams. After segregation and competition, the foreground sound stands out from the background noise. The wider arrow represents the salient object with higher attentional weights. Top-down attentional control can modulate processing at each stage (adapted from Bregman, 1994; Shinn-Cunningham, 2008). (B) The "where" and "what" cortical pathways of auditory attention processing. Within the dorsal "where" pathway, the superior frontal gyrus (SFG) and superior parietal (SP) areas activate during sound localization. Within the ventral "what" pathway, the inferior frontal gyrus (IFG) and auditory cortex activate to recognize the object (adapted from Alain et al., 2001).
Figure 4Locally Excitatory, Globally Inhibitory Oscillator Network (LEGION) (adapted from Wang and Terman, 1995).
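LEGION (Figure 4) is built from Terman-Wang relaxation oscillators coupled by local excitation and a global inhibitor. A minimal sketch of a single, uncoupled oscillator is shown below, using the standard two-variable dynamics reported by Wang and Terman (1995); the local coupling terms and the global inhibitor are omitted for brevity, and the parameter values are illustrative. A stimulated unit (positive input I) cycles between active and silent phases, while an unstimulated unit settles to a fixed point; LEGION exploits this property, together with synchrony within segments and desynchrony between them, to label segments by oscillation phase.

```python
import numpy as np

def terman_wang(I, steps=60000, dt=0.005, eps=0.02, beta=0.1, gamma=6.0):
    """Euler-integrate one Terman-Wang relaxation oscillator.

    x: fast excitatory activity, y: slow recovery variable,
    I: external (stimulus) input. Coupling and the global
    inhibitor of the full LEGION network are omitted here.
    """
    x, y = -2.0, 0.0
    xs = np.empty(steps)
    for t in range(steps):
        dx = 3.0 * x - x**3 + 2.0 - y + I        # cubic fast nullcline
        dy = eps * (gamma * (1.0 + np.tanh(x / beta)) - y)  # slow recovery
        x += dt * dx
        y += dt * dy
        xs[t] = x
    return xs

# A stimulated oscillator (I > 0) keeps cycling between active and
# silent phases; an unstimulated one (I < 0) settles to a stable
# low-activity fixed point.
active = terman_wang(I=0.8)
silent = terman_wang(I=-0.5)
```

In the full network, jumping oscillators pull their stimulated neighbors onto the active branch (local excitation), while the global inhibitor prevents different segments from being active at the same time (global inhibition).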
Figure 5(A) Human crossmodal integration and attentional control. The black and gray arrows denote the feed-forward bottom-up stimulus saliency processing and the green arrows denote the top-down modulation of attention. The yellow dashed arrows represent the recurrent adjustment (adapted from Talsma et al., 2010); (B) Artificial neural networks of crossmodal integration. The crossmodal integration mechanisms are used to realign the input from visual and auditory modalities (adapted from Parisi et al., 2017, 2018).