Michael H Herzog, Aaron M Clarke.
Abstract
In classical models of object recognition, first, basic features (e.g., edges and lines) are analyzed by independent filters that mimic the receptive field profiles of V1 neurons. In a feedforward fashion, the outputs of these filters are fed to filters at the next processing stage, pooling information across several filters from the previous level, and so forth at subsequent processing stages. Low-level processing determines high-level processing. Information lost at lower stages is irretrievably lost. Models of this type have proven to be very successful in many fields of vision, but have failed to explain object recognition in general. Here, we present experiments showing, first, that, similar to demonstrations from the Gestaltists, figural aspects determine low-level processing (as much as the other way around). Second, performance on a single element depends on all the other elements in the visual scene. Small changes in the overall configuration can lead to large changes in performance. Third, grouping of elements is key. Only if we know how elements group across the entire visual field can we determine performance on individual elements. These findings challenge the classical stereotypical filtering approach, which is at the very heart of most vision models.
Keywords: Gestalt; Verniers; crowding; feedback; object recognition
Year: 2014 | PMID: 25374535 | PMCID: PMC4205941 | DOI: 10.3389/fncom.2014.00135
Source DB: PubMed | Journal: Front Comput Neurosci | ISSN: 1662-5188 | Impact factor: 2.380
Figure 1. Left: A typical hierarchical, feedforward model, where information processing starts at the retina, proceeds to the LGN, then to V1, V2, V4, and IT. Decisions about stimuli are made in the frontal cortex. Center: Lower visual areas have smaller receptive fields, while neurons in higher areas have gradually increasing receptive field sizes, integrating information over larger and larger regions of the visual field. Right: Lower visual areas, such as V1, code for basic features such as edges and lines. Higher-level neurons pool information over multiple low-level neurons with smaller receptive fields and code for more complex features. There is thus a hierarchy of features. Figure adapted from Manassi et al. (2013).
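The feedforward pooling scheme described in this caption can be illustrated with a toy one-dimensional sketch. This is not the authors' model or any published implementation: the function names (`edge_filter`, `pool`, `feedforward`, `receptive_field_sizes`), the use of max pooling, and the filter and pooling widths are all assumptions chosen for illustration. The sketch shows two properties named in the text: receptive fields grow at each stage, and information discarded at a lower stage (here, the position of an edge) cannot be recovered at a higher one.

```python
def edge_filter(signal):
    """Stage 1 (assumed): local contrast units, each seeing 2 adjacent pixels."""
    return [abs(signal[i + 1] - signal[i]) for i in range(len(signal) - 1)]


def pool(responses, width=2):
    """Later stages (assumed): max-pool non-overlapping groups of `width` units."""
    return [max(responses[i:i + width]) for i in range(0, len(responses), width)]


def feedforward(signal, n_pool_stages=3):
    """Run the hierarchy and return the responses at every stage."""
    stages = [edge_filter(signal)]
    for _ in range(n_pool_stages):
        stages.append(pool(stages[-1]))
    return stages


def receptive_field_sizes(n_pool_stages=3, filter_width=2, pool_width=2):
    """Receptive field size (in input pixels) of one unit at each stage."""
    sizes = [filter_width]
    spacing = 1  # distance in pixels between neighbouring units of a stage
    for _ in range(n_pool_stages):
        sizes.append(sizes[-1] + (pool_width - 1) * spacing)
        spacing *= pool_width
    return sizes


# Two hypothetical 9-pixel inputs with a single edge at different positions.
edge_in_middle = [0, 0, 0, 0, 1, 1, 1, 1, 1]
edge_at_left = [0, 1, 1, 1, 1, 1, 1, 1, 1]

print(receptive_field_sizes())          # [2, 3, 5, 9]
print(feedforward(edge_in_middle)[1])   # first pooling stage still differs
print(feedforward(edge_at_left)[1])
print(feedforward(edge_in_middle)[2])   # second pooling stage is identical
print(feedforward(edge_at_left)[2])
```

In this sketch the two inputs are distinguishable after the first pooling stage but yield the same responses from the second pooling stage onward, so no later stage can tell them apart. A purely feedforward scheme of this kind has no route by which the global configuration could change the low-level responses, which is the property the experiments reported here call into question.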
Figure 2. Vernier offset discrimination as a function of stimulus configuration. (A) The reference stimulus is the unflanked vernier shown in (a). Enclosing the vernier in a square deteriorates performance. Adding further squares leads to increasingly better performance. (B) We replicated the results with the squares (b, c). In addition, rotating the flanking squares to form diamonds (d) undoes the effect of grouping and reinstates the crowding effect. From Manassi et al. (2013).