
How plausible is a subcortical account of rapid visual recognition?

Maxime Cauchoix, Sébastien M Crouzet

Year: 2013   PMID: 23450981   PMCID: PMC3583107   DOI: 10.3389/fnhum.2013.00039

Source DB: PubMed   Journal: Front Hum Neurosci   ISSN: 1662-5161   Impact factor: 3.169

Abstract

Primates recognize objects in natural visual scenes with great rapidity. The ventral visual cortex is usually assumed to play a major role in this ability (“high-road”). However, the “low-road” alternative frequently proposed is that the visual cortex is bypassed by a rapid subcortical route to the amygdala, especially in the case of biologically relevant and emotional stimuli. This paper highlights the lack of evidence from psychophysics and computational models to support this “low-road” alternative. Most importantly, the timing of neural responses invites a serious reconsideration of the low-road role in rapid processing of visual objects.

The speed of sight

The rapid and accurate processing of complex visual scenes has been demonstrated by Thorpe and colleagues using the rapid visual categorization protocol (Thorpe et al., 1996), in which participants reported the presence of animals in natural scenes as soon as 250 ms after image onset. This result sets strong time constraints on the neural mechanisms underlying object categorization. Diagnostic category information might actually be available even earlier, since selective eye movement responses can be produced only 100–120 ms after stimulus onset (Kirchner and Thorpe, 2006; Crouzet et al., 2010). What neural mechanisms could account for such rapid vision?

The cortical “high-road”

A widely held view is that object recognition results from the interplay of hierarchically organized areas along the ventral visual stream (DiCarlo et al., 2012), from the primary visual cortex (V1) through extrastriate visual areas (V2 and V4) to the inferotemporal cortex (IT), where high-level visual representations are encoded. To reconcile this view with the short behavioral latencies observed in rapid categorization tasks, several authors have suggested that a pure feedforward sweep of activity through the ventral stream might be sufficient to perform core object recognition (Thorpe et al., 1996; Serre et al., 2007b).

The subcortical “low-road”

On the other hand, a subcortical shortcut, the so-called “low-road,” might seem to be a plausible alternative. This hypothesis finds its origin in the rapid amygdala responses reported by LeDoux (1996) during auditory fear conditioning. In a series of experiments in rodents, he delineated a quick route that bypasses the cortex and reaches the amygdala directly via the thalamus. Such a subcortical shortcut would, in specific cases such as threatening situations, enable the rapid initiation of appropriate defense responses even before the sensory cortices become involved. Furthermore, since the amygdala has been linked to emotion recognition (particularly fear) in humans (Adolphs et al., 1994), this alternative pathway was proposed as an explanation for rapid, automatic, and unconscious reactions among humans and monkeys to biologically relevant visual stimuli (Öhman and Mineka, 2001; Johnson, 2005; Öhman, 2005; Vuilleumier, 2005; Tamietto et al., 2009; Tamietto and de Gelder, 2010; de Gelder et al., 2011).

Here we argue that there is no convincing evidence in support of the “low-road” theory when extended to rapid visual object processing in primates. To preface our arguments: first, real-world object categorization requires computational properties that have not yet been found in a subcortical pathway (see “Real-world Recognition Requires Selectivity and Invariance”). Second, among the characteristics attributed to the “low-road,” we argue that genuine rapidity has not yet been demonstrated appropriately (see “What is Rapid Visual Processing?”). Finally, we will demonstrate how the low-road hypothesis is at odds with neural latencies reported in the amygdala and the visual cortex (see “Ventral Stream Visual Cortex is Activated Before the Amygdala”). Altogether, these arguments point to an earlier role for cortical areas and suggest a serious reconsideration of the role of the “low-road” in rapid vision.

Real-world recognition requires selectivity and invariance

To support recognition, a neural system needs to reach a high level of selectivity while dealing with the inherent variability of sensory input. This balance between selectivity and invariance is a hallmark feature of visual recognition in primates, and remains a challenge for computer vision. In macaque monkeys, selective neural responses to complex objects are typically found in IT (DiCarlo et al., 2012). These neuronal responses are also tolerant to changes in retinal position, scale, or pose of the object (Hung et al., 2005). Studies using intracranial recordings in human epileptic patients have also shown that neural responses from the visual cortex provide a categorical signal tolerant to changes in scale and position (Liu et al., 2009). Driven by results from electrophysiology, a plausible model of how selectivity and invariance could be built through the ventral stream has emerged. It is based on two successive operations, template-matching and non-linear pooling, repeated at each stage of the ventral hierarchy (Serre et al., 2007b). Such hierarchical models have been shown to accurately mimic primate rapid categorization performance (Serre et al., 2007b; Crouzet and Serre, 2011) and neural responses of the visual ventral stream (Serre et al., 2007a).

Among subcortical structures, human single-unit studies showed that the amygdala contains neurons selective to categories or objects such as animals, famous faces, or places (Kreiman et al., 2000; Quiroga et al., 2005; Mormann et al., 2011). Interestingly, these neurons are highly invariant, since they respond not only to various pictures of their preferred objects but also to their written or spoken names. However, there is currently no model of how this high level of both selectivity and invariance could be built from a direct subcortical route. A more reasonable assumption would thus be that the amygdala gets its input from high-level areas of the ventral stream, rather than from the thalamus (the shortcut “low-road” route).
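The two operations named above, template-matching followed by non-linear pooling, can be illustrated with a deliberately tiny one-dimensional sketch. This is not the model of Serre et al. (2007b), which operates on images over several stages; the function names, the Gaussian-like tuning function, and the toy inputs are all assumptions chosen for illustration. The sketch only shows why a max over positions makes the response tolerant to where the preferred pattern appears.

```python
import math


def template_match(signal, template):
    """Tuning stage (hypothetical helper): similarity between the template
    and every window of the signal, using exp(-squared distance)."""
    n = len(template)
    responses = []
    for start in range(len(signal) - n + 1):
        window = signal[start:start + n]
        dist2 = sum((w - t) ** 2 for w, t in zip(window, template))
        responses.append(math.exp(-dist2))
    return responses


def max_pool(responses):
    """Pooling stage (hypothetical helper): keep the strongest response
    across all positions, discarding where it occurred."""
    return max(responses)


# Toy inputs (assumed): the same "feature" at two positions, and a blank input.
template = [0.0, 1.0, 0.0]
left = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
blank = [0.0] * 6

pooled_left = max_pool(template_match(left, template))
pooled_right = max_pool(template_match(right, template))
pooled_blank = max_pool(template_match(blank, template))

print(pooled_left, pooled_right, pooled_blank)
```

In this toy case the pooled response is identical for the two positions of the feature (invariance) and lower for the blank input (selectivity). Stacking such tuning and pooling stages is the general idea behind the hierarchical models cited above.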

What is rapid visual processing?

Numerous studies have investigated affective stimulus processing with short image presentation and masking protocols to show that emotions such as fear can be processed unconsciously and “rapidly” (Bar et al., 2006; Öhman et al., 2007; Adolphs, 2008). While there is no doubt that masking is a powerful experimental tool to reveal unconscious sensory processing, it does not provide information on the genuine rapidity of visual processing. In backward masking protocols, the stimulus onset asynchrony (SOA, the time interval between target and mask onset) is a measure of the visual uptake time (or temporal resolution), not of the time required for complete visual processing. In other words, even in a perfect pipeline model of the visual system, the mask interference would only give information about the time spent at each stage, and not about the cumulative time for all stages (VanRullen, 2011). For example, the fact that fear information can be extracted from faces masked after an SOA of 39 ms (Bar et al., 2006) informs us about the minimal visual uptake time necessary for fear processing but does not say anything about the time at which fear information is available to trigger behavioral responses.

The speed of processing for object or scene categorization has been extensively studied using rapid categorization protocols. Using minimal reaction time measurements (the time at which correct responses start to significantly outnumber incorrect ones), it has been shown that humans can categorize images as containing an animal in only 250 ms (Thorpe et al., 1996), while monkeys can perform the same task by 180 ms (Fabre-Thorpe et al., 1998). Even faster, reliable saccades toward faces and animals can be triggered as soon as 100–120 ms after image onset (Kirchner and Thorpe, 2006; Crouzet et al., 2010). As far as we know, there is no evidence for faster processing of emotional stimuli, as would be predicted by the “low-road” hypothesis.
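The minimal reaction time defined above (the time at which correct responses start to significantly outnumber incorrect ones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the cited studies: the 10 ms bin width, the one-sided exact binomial test at alpha = 0.05, the lack of any correction for multiple bins, the function names, and the reaction times are all hypothetical.

```python
from math import comb


def binomial_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5): chance of at least k correct
    responses out of n if correct and incorrect were equally likely."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def minimal_rt(correct_rts, incorrect_rts, bin_ms=10, alpha=0.05):
    """Return the start (in ms) of the earliest bin in which correct responses
    significantly outnumber incorrect ones, or None if no bin qualifies."""
    all_rts = correct_rts + incorrect_rts
    if not all_rts:
        return None
    last_bin = int(max(all_rts) // bin_ms)
    for b in range(last_bin + 1):
        lo, hi = b * bin_ms, (b + 1) * bin_ms
        n_correct = sum(lo <= rt < hi for rt in correct_rts)
        n_incorrect = sum(lo <= rt < hi for rt in incorrect_rts)
        n = n_correct + n_incorrect
        if n > 0 and n_correct > n_incorrect and binomial_tail(n_correct, n) < alpha:
            return lo
    return None


# Made-up reaction times in ms (assumed data, for illustration only).
correct_rts = [232, 236, 241, 244, 248, 250, 251, 252, 253, 254, 256, 257, 259]
incorrect_rts = [231, 238, 245, 255]

result = minimal_rt(correct_rts, incorrect_rts)
print(result)  # 250 for this toy data set
```

With these toy data the 230 ms and 240 ms bins contain too few or too evenly split responses to reach significance, while the 250 ms bin holds 8 correct responses against 1 incorrect. The measure therefore tracks when behavior first becomes reliably accurate, which is a cumulative processing time, unlike the SOA in a masking experiment.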

Ventral stream visual cortex is activated before the amygdala

Most of the studies on humans investigating the role of the amygdala in visual processing used fMRI and PET scans (Morris et al., 1999; Whalen et al., 2004; see Pessoa and Adolphs, 2010 and Vuilleumier, 2005 for reviews). These two techniques, because of their poor temporal resolution, do not allow conclusions about the temporal dynamics of stimulus processing. Despite this limitation, it was assumed that amygdala responses to emotional stimuli, notably to fear-inducing stimuli, were based on a rapid low-road activation (Öhman and Mineka, 2001; Vuilleumier, 2005). A review of electrophysiological studies reporting neural latencies suggests a clearly different picture. Many studies investigating the properties of IT cells have reported selective responses to shapes, faces, or object categories occurring as soon as 70–100 ms after stimulus onset (Perrett et al., 1982; Li et al., 1993; Tovee et al., 1994; Liu and Richmond, 2000; Hung et al., 2005; see Mormann et al., 2008 for a review). Similarly, in human epileptic patients, intracranial field potentials (IFPs) recorded from the occipito-temporal cortex were object-category selective as early as 100 ms after stimulus onset (Liu et al., 2009). These category-selective latencies are compatible with the rapid behavioral responses observed in natural scene categorization tasks (Thorpe et al., 1996; Fabre-Thorpe et al., 1998; Kirchner and Thorpe, 2006; Girard et al., 2008). Furthermore, this early ventral stream activity has been shown to be causally linked with behavioral responses in monkeys (Afraz et al., 2006) and humans (Pitcher et al., 2007; Sadeh et al., 2011).

On the other hand, selective responses to visual features or objects in the monkey amygdala tend to have longer latencies (Gothard et al., 2007). One single-unit study (Leonard et al., 1985) compared the two pathways directly (in the same monkeys) and showed that neurons in the STS (superior temporal sulcus, top of the ventral stream) had latencies (90–140 ms) that systematically preceded those from the amygdala (110–200 ms). In humans, two intracranial recording studies have tested the existence of rapid amygdala responses to emotional stimuli, but the earliest responses were reported at 200 ms (Krolak-Salmon et al., 2004) and 250–500 ms (Rutishauser et al., 2011), which is much slower than the fast occipito-temporal selectivity reported for object categories (Liu et al., 2009). Moreover, the amygdala responses observed for emotional stimuli did not occur earlier than what is generally reported for object categories (Mormann et al., 2008). Among the medial temporal lobe structures (i.e., perirhinal cortex, entorhinal cortex, hippocampus, and amygdala), the amygdala is actually the one with the slowest visual responses (average latencies of 271 ms in the perirhinal cortex for example). The pattern of neural latencies observed in both humans and monkeys thus clearly argues for a cortical “high-road” precedence.

Conclusion

In this paper we questioned the hypothesis that a subcortical low-road could account for the speed of sight. Several observations from psychophysics, computational modeling, and electrophysiology strongly suggest that the low-road account is mostly incompatible with the characteristics of rapid visual categorization. On the contrary, a large collection of evidence confirms that the cortical high-road, through the visual ventral stream, can accomplish a rapid, selective, and invariant analysis of the scene. The latency of neural visual activation and the response characteristics in the amygdala clearly suggest that its involvement in visual processing is downstream of the ventral visual cortex, after core object recognition has been performed. Thus, contrary to what is commonly acknowledged, rapid and automatic processing of visual objects is likely to depend on the cortex, while subcortical structures would be involved in slower (probably higher-level) processing. This conclusion conforms with recent results and reviews pointing out the unexpected role of subcortical structures in high cognitive functions (Parvizi, 2009). The amygdala, for instance, is now thought to play a major role in the evaluation of the biological significance of stimuli (Pessoa and Adolphs, 2010), and the pulvinar, which shows dense connections with many cortical areas, has recently been shown to play a role in regulating information transmission across the visual cortex (Saalmann et al., 2012).
References (10 of 43 listed)

Review 1.  Neural bases of the non-conscious perception of emotional signals.

Authors:  Marco Tamietto; Beatrice de Gelder
Journal:  Nat Rev Neurosci       Date:  2010-09-02       Impact factor: 34.870

2.  Fast saccades toward faces: face detection in just 100 ms.

Authors:  Sébastien M Crouzet; Holle Kirchner; Simon J Thorpe
Journal:  J Vis       Date:  2010-04-28       Impact factor: 2.240

3.  Timing, timing, timing: fast decoding of object information from intracranial field potentials in human visual cortex.

Authors:  Hesheng Liu; Yigal Agam; Joseph R Madsen; Gabriel Kreiman
Journal:  Neuron       Date:  2009-04-30       Impact factor: 17.173

4.  Unseen facial and bodily expressions trigger fast emotional reactions.

Authors:  Marco Tamietto; Lorys Castelli; Sergio Vighetti; Paola Perozzo; Giuliano Geminiani; Lawrence Weiskrantz; Beatrice de Gelder
Journal:  Proc Natl Acad Sci U S A       Date:  2009-10-05       Impact factor: 11.205

5.  Emotion in the brain: of low roads, high roads and roads less travelled.

Authors:  Beatrice de Gelder; Jack van Honk; Marco Tamietto
Journal:  Nat Rev Neurosci       Date:  2011-06-15       Impact factor: 34.870

Review 6.  Emotion processing and the amygdala: from a 'low road' to 'many roads' of evaluating biological significance.

Authors:  Luiz Pessoa; Ralph Adolphs
Journal:  Nat Rev Neurosci       Date:  2010-11       Impact factor: 34.870

7.  Corticocentric myopia: old bias in new cognitive sciences.

Authors:  Josef Parvizi
Journal:  Trends Cogn Sci       Date:  2009-08       Impact factor: 20.229

Review 8.  A quantitative theory of immediate visual recognition.

Authors:  Thomas Serre; Gabriel Kreiman; Minjoon Kouh; Charles Cadieu; Ulf Knoblich; Tomaso Poggio
Journal:  Prog Brain Res       Date:  2007       Impact factor: 2.453

Review 9.  Fear, faces, and the human amygdala.

Authors:  Ralph Adolphs
Journal:  Curr Opin Neurobiol       Date:  2008-08-12       Impact factor: 6.627

10.  Ultra-rapid categorisation in non-human primates.

Authors:  P Girard; C Jouffrais; C H Kirchner
Journal:  Anim Cogn       Date:  2008-02-08       Impact factor: 3.084

