
Metacontrast masking and the cortical representation of surface color: dynamical aspects of edge integration and contrast gain control.

Abstract

This paper reviews recent theoretical and experimental work supporting the idea that brightness is computed in a series of neural stages involving edge integration and contrast gain control. It is proposed here that metacontrast and paracontrast masking occur as byproducts of the dynamical properties of these neural mechanisms. The brightness computation model assumes, more specifically, that early visual neurons in the retina and in cortical areas V1 and V2 encode local edge signals whose magnitudes are proportional to the logarithms of the luminance ratios at luminance edges within the retinal image. These local edge signals give rise to secondary neural lightness and darkness spatial induction signals, which are summed at a later stage of cortical processing to produce a neural representation of surface color, or achromatic color, in the case of the chromatically neutral stimuli considered here. Prior to the spatial summation of these edge-based induction signals, the weights assigned to local edge contrast are adjusted by cortical gain mechanisms involving both lateral interactions between neural edge detectors and top-down attentional control. We have previously constructed and computer-simulated a neural model of achromatic color perception based on these principles and have shown that our model gives a good quantitative account of the results of several brightness matching experiments. Adding to this model the realistic dynamical assumptions that 1) the neurons that encode local contrast exhibit transient firing rate enhancement at the onset of an edge, and 2) the effects of contrast gain control take time to spread between edges, results in a dynamic model of brightness computation that predicts the existence of Broca-Sulzer transient brightness enhancement of the target, Type B metacontrast masking, and a form of paracontrast masking in which the target brightness is enhanced when the mask precedes the target in time.


Keywords: achromatic color; brightness; brightness induction; edge integration; lightness; masking; metacontrast; paracontrast; type B masking

Year: 2008 PMID: 20517518 PMCID: PMC2864963 DOI: 10.2478/v10053-008-0034-z

Source DB: PubMed Journal: Adv Cogn Psychol ISSN: 1895-1171

Introduction

A longstanding tenet of cognitive psychology holds that retinal images are transformed through a series of neural stages from a pixel-based code to higher-order cognitive codes whose properties more closely mirror those of phenomenally perceived objects. The history of the field is, to a large extent, the history of debates concerning the nature of these transformations, the representations corresponding to transformation stages, and their instantiation in neural hardware (Boden, 2006; Lachman, Lachman, & Butterfield, 1979). Over the last few decades, this serial view of the nature of cognitive information processing in the brain has progressively eroded as neurophysiological data has come to light documenting the importance of parallel processing, lateral connections, and feedback in the construction and maintenance of visual representations. For example, both lateral neural interactions (Gilbert & Wiesel, 1989; Grinvald, Lieke, Frostig, & Hildesheim, 1994; Hirsch & Gilbert, 1991; Kapadia, Westheimer, & Gilbert, 2000; Mizobe, Polat, Pettet, & Kasamatsu, 2001; Stettler, Das, Bennett, & Gilbert, 2002) and re-entrant feedback from higher cortical areas (Lamme, 1995; Lamme, Rodriguez-Rodriguez, & Spekreijse, 1999; Lamme & Spekreijse, 2000; Lamme, Super, & Spekreijse, 1998; Lamme, Zipser, & Spekreijse, 2002; Lee, Mumford, Romero, & Lamme, 1998; Zipser, Lamme, & Schiller, 1996) have been shown to play important roles in the development of neural responses in area V1, an area that was believed a few decades ago to be the home of cells that act as linear spatial filters, passively extracting local edge information. 
According to our current understanding, the temporally earliest neural responses in V1 do, in fact, encode the local contrast at edges, but network responses modify these local edge responses later in time (Lamme, 1995; Lamme, Rodriguez-Rodriguez, & Spekreijse, 1999; Lamme & Spekreijse, 2000; Lamme, Super, & Spekreijse, 1998; Lamme, Zipser, & Spekreijse, 2002; Lee, Mumford, Romero, & Lamme, 1998; Zipser, Lamme, & Schiller, 1996). The early, edge-based responses thus act as “seeds” from which the subsequent network responses self-organize into the complex patterns that form the basis of our conscious perceptions. What is early visual cortex doing with these early edge-based responses? What is the function of the network responses? The results of several recent neurophysiological studies suggest that at least one of the functional roles played by neural activity in V1 is to support the neural representation of surfaces in the visual environment. This paper consists of two parts. The first part describes a model of the cortical computation of surface color based on the idea that color computation involves just a few additional mechanisms beyond the initial edge-based responses in V1, namely, a mechanism that spatially integrates extended edge responses and a mechanism that controls the neural gain applied to these extended edge responses. 
Both types of mechanisms have been documented to exist in the early cortical visual areas V1 and V2 (Cornelissen, Wade, Vladusich, Dougherty, & Wandell, 2006; Gilbert & Wiesel, 1989; Grinvald, Lieke, Frostig, & Hildesheim, 1994; Haynes, Lotto, & Rees, 2004; Hirsch & Gilbert, 1991; Hung, Ramsden, Chen, & Roe, 2001; Kapadia, Westheimer, & Gilbert, 2000; Kinoshita & Komatsu, 2001; Lee, Mumford, Romero, & Lamme, 1998; MacEvoy, Kim, & Paradiso, 1998; Mizobe, Polat, Pettet, & Kasamatsu, 2001; Rossi & Paradiso, 1999; Rossi, Rittenhouse, & Paradiso, 1996; Sasaki & Watanabe, 2004; Stettler, Das, Bennett, & Gilbert, 2002), although the theory presented here is new. The second part of the paper consists of a theory of metacontrast masking based on the neural model of surface color computation. There it is shown that the model predicts the existence of Type B metacontrast masking, as well as paracontrast brightness enhancement of the target. While the arguments for the color computation model are well-supported by recent psychophysical and neural data, the metacontrast masking model is more speculative.

Perceptual evidence for edge integration in achromatic color perception

As a prelude to describing the cortical model of surface color computation, it may be helpful to review some basic facts of spatial color vision. It is well known that the perceived color of a target patch can be strongly influenced by the surrounding spatial context. A chromatic surround tends to induce a tint in the target having a hue complementary to that of the surround (Chevreul, 1839/1967; Goethe, 1810/1970; Hering, 1874/1964; Hurvich, 1981; Jameson & Hurvich, 1964). Similarly, an achromatic gray patch looks darker when it is surrounded by a white surface than it does when it is surrounded by a black surface. These perceptual effects are referred to as simultaneous color contrast and simultaneous lightness (or brightness) contrast, respectively. In what follows, we will restrict our discussion to achromatic stimuli to keep things simple. Figure 1 illustrates a perceptual phenomenon that is related to, but not identical with, simultaneous lightness contrast. Here, two identical achromatic disk-and-ring (DAR) stimuli are presented against a background consisting of a luminance gradient. The DAR on the left is positioned against a dark portion of the gradient background and the DAR on the right is positioned against a light portion of the background. The DAR positioned against the dark portion of the background looks lighter than the one that is positioned against the light portion. The key observation for present purposes is that the dark background not only lightens the ring portion of each DAR, which is contiguous with background; it also lightens the central disk, which is not contiguous with the background. We might have imagined otherwise. The background might potentially have affected the appearance of the immediately adjacent surface only; or alternatively the dark background might have lightened the ring on the left which, in turn, might have darkened the left disk. 
In fact, the latter (false) outcome is predicted by the color model of Jameson and Hurvich (1964). The fact that the local background affects the lightness of both the ring and the disk provides an important clue to the nature of lightness processing and is one of the main pieces of support for the lightness model presented below.

Figure 1.

Demonstration of edge integration in lightness perception. The disks and rings on the two sides of the display have identical luminances, but appear lighter when viewed against a dark background. The contrast effect induced by the background applies not only to the ring, which shares a border with the background, but also to the disk, which does not. The disk lightness is also affected by its luminance contrast with respect to the ring (simultaneous contrast). Quantitative studies of lightness matching have shown that the lightness of a target disk is determined by a weighted sum of the local log luminance ratios evaluated at the disk/ring and ring/background borders. The fact that the disks in Figure 1 are affected not only by the immediately adjacent ring but also by the noncontiguous background is consistent with the idea that the lightness of each target disk depends on a sum of the luminance contrasts of the disk/ring edge and the ring/background edge corresponding to that disk (Arend, Buehler, & Lockheart, 1971; Gilchrist, 1988; Popa & Rudd, in preparation; Reid & Shapley, 1988; Rudd, 2001; Rudd, 2003a, 2003b; Rudd & Arrington, 2001; Rudd & Zemach, 2004, 2005, 2007; Rudd, in preparation; Rudd & Popa, 2004a, 2004b, 2007; Shapley & Reid, 1985). That is, surrounding the disk with a lighter ring tends to make it look dark because of the contrast of the disk/ring edge, but the contrast of the ring/background edge produces an additional achromatic color induction effect that either lightens or darkens the disk depending on whether the background is lighter or darker than the ring. I will refer here to the idea that edge contrasts are summed perceptually across space to compute colors as edge integration. 
The idea of edge integration was introduced into the color perception literature by Edwin Land, whose Retinex theory of color vision (Land, 1977, 1983, 1986; Land & McCann, 1971) was one of the earliest biologically-inspired computational vision models and is still influential in the image processing and human vision literatures. For the last several years, my colleagues and I have been developing a quantitative edge integration model that makes predictions that are more consistent with perceptual data on lightness matching than are the predictions of Retinex theory. Our model modifies the Retinex edge integration algorithm in some simple but important ways. To understand the model, a little math is required. The reader is reminded of a basic fact of high school mathematics: that multiplying a series of numbers is mathematically equivalent to adding the logarithms of those numbers. It follows that multiplying the local luminance ratios that Retinex computes at luminance borders within the Mondrian across space is mathematically equivalent to summing the logarithms of the edge ratios. The Retinex lightness computation model gives equal weight to each of the log luminance ratios that is summed, but our edge integration model modifies the equal weight rule by postulating the existence of several new principles that determine how the various edges in the scene are weighted in computing the lightness of a target. The edge weighting rules that we have discovered to date are listed below. These edge weighting rules, when combined with the idea that lightness is computed from a weighted sum of log luminance ratios at edges, form the basis for our edge integration model. The edge weighting rules:
1) Edge weights diminish as a function of distance from the target whose lightness is being computed (Reid & Shapley, 1988; Rudd, 2001; Rudd & Arrington, 2001; Rudd & Zemach, 2004, 2005, 2007; Shapley & Reid, 1985).
2) Edge weights depend on the contrast polarity of the edge whose log luminance ratio is being weighted; that is, the edge weight will be different depending on whether the dark side of the edge, or the light side, points towards the target region whose lightness is being computed (Popa & Rudd, in preparation; Rudd & Popa, 2004a, 2004b, 2007; Rudd & Zemach, 2004, 2005, 2007). A corollary of this principle is that edges that are perpendicular to the target edges do not contribute to the target lightness at all (Zemach & Rudd, 2007).
3) Edge weights vary as a function of the proximities, contrasts, and contrast polarities of other edges in the scene, including, but not restricted to, the target edge (Popa & Rudd, in preparation; Rudd & Popa, 2004a, 2007).
4) Edge weights are subject to top-down attentional control (Rudd, in preparation).
It should be emphasized that these edge-weighting principles are only the ones that have been documented to date. The visual system is quite likely to apply other edge weighting principles as well. For example, although we have not performed experiments with stimuli in which disparity cues are present to help segment surfaces in depth, we would expect the likelihood of edge integration to depend on whether the target is perceived to be located in the same depth plane as the contextual edges that may potentially contribute to the edge integration computation. I will next discuss the psychophysical evidence that has led us to propose these edge-weighting principles and I will describe the edge integration model that we have built to instantiate them. Following the exposition of the edge integration model, I will discuss how the model might be extended into the time domain to account for brightness suppression in metacontrast masking.
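The relation between Retinex-style multiplication of edge ratios and a weighted sum of log luminance ratios can be illustrated with a minimal Python sketch. The function names, the luminance values, and the weights (0.3 for the remote edge, 1.0 for the target edge) are hypothetical choices made only for illustration; they are not fitted values from the studies cited above, and the sketch implements only the distance-dependent discounting of rule 1.

```python
import math

def edge_log_ratios(luminances):
    """Log luminance ratios (inner/outer) along a path ordered from the
    outermost region to the target, e.g. [background, ring, disk]."""
    return [math.log10(inner / outer)
            for outer, inner in zip(luminances, luminances[1:])]

def retinex_sum(luminances):
    """Equal-weight edge integration: every edge counts fully."""
    return sum(edge_log_ratios(luminances))

def weighted_edge_integration(luminances, weights):
    """Weighted sum of log luminance ratios; weights are listed in the
    same outermost-to-innermost order as the edges."""
    ratios = edge_log_ratios(luminances)
    if len(weights) != len(ratios):
        raise ValueError("need one weight per edge")
    return sum(w * r for w, r in zip(weights, ratios))

# Hypothetical luminances (arbitrary units): background, ring, disk.
path = [10.0, 40.0, 20.0]

# Multiplying the edge ratios is equivalent to summing their logarithms.
product_of_ratios = (40.0 / 10.0) * (20.0 / 40.0)
equal_weight = retinex_sum(path)

# Illustrative weights only: the remote (outer) edge is discounted.
discounted = weighted_edge_integration(path, [0.3, 1.0])
```

With equal weights the sum collapses to the log of the disk/background ratio, so the ring has no net effect. Discounting the remote edge leaves the darkening contribution of the disk/ring edge only partly offset by the ring/background edge, so the decremental disk is assigned a lower value than under the equal-weight rule.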

Edge weights depend on the distance between the edge and the target

Rudd and Zemach (2004) carried out a study of brightness matching using DAR stimuli consisting of decremental disks surrounded by lighter rings. Before discussing their experiments and results, it is necessary to clarify my use of terminology. Technically, lightness refers to perceived reflectance and brightness to perceived luminance. But in most studies of “brightness” matching, including that of Rudd and Zemach (2004), the observer is not given specific instructions to judge either perceived reflectance or perceived luminance, so it is unclear exactly what attribute of the stimulus is matched. The term “brightness” is typically used to refer to the attribute of appearance that is matched in such experiments, although use of the term in such a context does not imply that the observer matched the stimuli in terms of their perceived luminance. As a general rule, the matches made in experiments in which the observers are instructed to match on perceived reflectance are different from those that are made when the observers are instructed to match on perceived luminance or to make a naïve appearance match (Arend & Spehar, 1993a, 1993b; Rudd, in preparation). In my previous work, I have advocated using the term achromatic color to refer to the stimulus attribute that is matched in naïve matching studies because the term “achromatic color”, like the term “color”, can refer to either perceived surface properties or properties of self-luminous stimuli. Here I will use the colloquial term “brightness” to refer to this attribute in order to avoid the more awkward term “achromatic color” and because brightness is the term that is usually employed to refer to this attribute in the masking literature. In any case, the conclusions that I state in this paper hold regardless of whether the subject is asked to match on perceived reflectance or on perceived luminance, or to perform a naïve appearance match. 
Rudd and Zemach (2004) fitted the results of their naïve appearance matching experiment with an edge integration model based on the assumption that the disk color is computed from a weighted sum of the local log luminance ratios evaluated at the border of the disk and at the outer border of the surround ring. According to this model, the following brightness matching condition should hold at the match point:
$$w_1 \log\frac{D_M}{R_M} + w_2 \log\frac{R_M}{B} = w_1 \log\frac{D_T}{R_T} + w_2 \log\frac{R_T}{B} \qquad (1)$$
where $D_T$ represents the luminance of the target disk, whose brightness was judged in the experiment; $D_M$ represents the luminance of the matching disk, which was adjusted by the observer to achieve a brightness match between the two disks; $R_T$ represents the luminance of the ring surrounding the target, which was varied by the experimenter; $R_M$ represents the constant luminance of the ring surrounding the matching disk; $B$ is the constant background luminance; and $w_1$ and $w_2$ are the weights assigned to the log luminance ratios at the inner and outer edges of the surround ring, respectively, by the edge integration algorithm. Equation (1) has been solved to obtain an expression for the logarithm of the model observer’s matching disk setting as a function of the luminance of the ring surrounding the target (Rudd & Zemach, 2004, 2005). The solution leads to the prediction that a log-log plot of the matching disk luminances versus the luminance of the target ring will be a straight line having a slope equal to $w_2/w_1 - 1$. By fitting a linear regression model to a plot of experimental data and estimating the slope, we can estimate the ratio $w_2/w_1$ of the weights associated with the outer and inner edges of the surround ring. The brightness matching study performed by Rudd and Zemach (2004) using decremental disks yielded weight ratio estimates ranging from 0.21 to 0.36 for four observers (Rudd & Zemach, 2004, Experiment 1). 
We repeated this experiment with rings of various widths and found that the weight ratio estimates decreased monotonically with increases in the ring width (Rudd & Zemach, 2004, Experiment 2). This latter result is consistent with the assumption that the weights assigned to edges in the computation of the target color diminish with distance from the test or matching disk. The brightness matching equation (1), which is based on a weighted sum of log luminance ratios, was found to provide a better fit to the data from these experiments than did appearance models based on luminance matching ($w_1 = w_2$), ratio matching ($w_2 = 0$; Wallach, 1948, 1963, 1976), or a weighted sum of the local Michelson contrasts evaluated at the inner and outer edges of the ring (Reid & Shapley, 1988; Shapley & Reid, 1985).
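The slope-based estimate of the weight ratio can be sketched in a few lines of Python. The sketch assumes that the matching condition has the weighted-sum form described in the text, with the symbols D_T, D_M, R_T, R_M for the target and matching disk and ring luminances and w1, w2 for the inner and outer edge weights; solving that condition for log D_M gives log D_M = log D_T + (w2/w1 - 1)(log R_T - log R_M). All luminance values and the "true" ratio of 0.3 are hypothetical, and the data are noise-free, so this only demonstrates that the regression recovers the ratio it was given.

```python
import math

def predicted_log_match(log_d_t, log_r_t, log_r_m, weight_ratio):
    """Log matching-disk luminance predicted by the weighted-sum matching
    condition, where weight_ratio = w2 / w1 (outer over inner edge weight).
    The background luminance cancels and does not appear."""
    return log_d_t + (weight_ratio - 1.0) * (log_r_t - log_r_m)

def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical stimulus values (arbitrary luminance units).
true_ratio = 0.3
log_d_t = math.log10(20.0)   # fixed target disk
log_r_m = math.log10(40.0)   # fixed ring around the matching disk
log_r_t_values = [math.log10(r) for r in (25.0, 32.0, 40.0, 50.0, 63.0)]

# Simulated (noise-free) matching-disk settings of a model observer.
log_d_m_values = [predicted_log_match(log_d_t, x, log_r_m, true_ratio)
                  for x in log_r_t_values]

# Slope of the log-log plot, and the weight ratio recovered from it.
slope = least_squares_slope(log_r_t_values, log_d_m_values)
estimated_ratio = slope + 1.0
```

In this parameterization, luminance matching corresponds to a slope of 0 (weight ratio 1) and ratio matching to a slope of -1 (weight ratio 0), so the fitted slope locates an observer between these two extremes.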

Edge weights depend on the contrast polarities of the perceptually integrated edges

In a follow-up study (Rudd & Zemach, 2005), observers performed brightness matches with disks that were luminance increments with respect to their surround rings. The disks and rings were identical in size to those used in Experiment 1 of our study with decremental disks. The weight ratio estimates obtained in the study with incremental targets ranged from 0.64 to 0.95. These weight ratio estimates can be converted to quantitative measures of the magnitude of the brightness induction from the surround: that is, to measures of the degree to which manipulating the surround luminance influences the target brightness (Rudd & Zemach, 2005). According to this measure, incremental targets were subject to a 5-36% brightness induction effect from the surround, where a 100% contrast effect is defined as a match based on equal disk/ring luminance ratios and a 0% contrast effect is defined as a match based on the disk luminance alone. By comparison, the matches performed with decrements indicated a 60-80% surround induction effect. The magnitude of the contrast effect obtained when the targets were increments was both considerably smaller (3.25 times smaller, on average) and more variable than that of the contrast effect obtained when the targets were decrements. From the point of view of edge integration theory, the main difference between the stimuli used in the two studies was the contrast polarity of the disk edge. In the 2004 study using decremental targets, the disk edge was dark-inside, while in the 2005 study using incremental targets, the disk edge polarity was light-inside. In both studies, the outer edge of the surround ring was light-inside. The differences between the weight ratio estimates obtained in the two studies might therefore be attributed to differences in the relative weights given to dark-inside and light-inside edges. 
Taking this as our working hypothesis, we conclude that the weights associated with edges of the light-inside type are, on average, about 3.25 times smaller than the weights associated with edges of the dark-inside type, all other things being equal. This difference quantifies the well-known asymmetry in the magnitudes of the surround induction effects obtained in previous studies of achromatic color matching performed with incremental and decremental stimuli (Agostini & Bruno, 1996; Bressan & Actis-Grosso, 2001; Gilchrist, Kossyfidis, Bonato, Agostini, Cataliotti, Spehar et al., 1999; Heinemann, 1955, 1972; Hess & Pretori, 1884/1970; Jacobsen & Gilchrist, 1988; Kozaki, 1963, 1965; Wallach, 1948, 1963, 1976; Whittle & Challands, 1969).
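The conversion from weight ratio to percent induction can be made explicit with a small Python sketch. The formula used here, 100 × (1 − w2/w1), is an inference from the two anchor points given in the text (ratio matching, with weight ratio 0, counts as 100%; luminance matching, with weight ratio 1, counts as 0%) and is an assumption rather than a formula quoted from the source. It does, however, reproduce the reported ranges from the reported weight ratios.

```python
def induction_percent(weight_ratio):
    """Percent surround induction implied by a weight ratio w2/w1,
    assuming a linear scale between luminance matching (ratio 1 -> 0%)
    and ratio matching (ratio 0 -> 100%)."""
    return 100.0 * (1.0 - weight_ratio)

# Weight-ratio ranges reported in the text.
decrement_ratios = (0.21, 0.36)   # Rudd & Zemach (2004), Experiment 1
increment_ratios = (0.64, 0.95)   # Rudd & Zemach (2005)

decrement_induction = [induction_percent(r) for r in decrement_ratios]
increment_induction = [induction_percent(r) for r in increment_ratios]
```

Under this assumption the decrement ratios map to roughly 64-79% induction and the increment ratios to roughly 5-36%, in line with the 60-80% and 5-36% figures quoted above.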

The role of contrast gain control in achromatic color perception

In addition to the evidence for edge integration in brightness perception cited above, we have also found evidence for interactions between edges, where the term edge interaction refers to the presence of an additional term in the brightness matching equation (not included in Equation (1)) involving products of log luminance ratios evaluated at separate edges; for example, a term of the form k log(D/R)log(R/B), where k is a constant. Multiplicative terms of this sort must be added to the brightness matching equation to account for statistically-significant quadratic components seen in the log-log plots of matching disk luminance versus test ring luminance obtained in matching experiments performed with both incremental and decremental DAR stimuli (Rudd & Zemach, 2004, 2005, 2007; Vladusich, Lucassen, & Cornelissen, 2006). That is, log D vs log R plots are better fit by regression models based on parabolic curves than by models based on straight lines, although the parabolic curves are sometimes sufficiently straight to be well-approximated as straight lines. In some studies and for some observers, the amount of variance explained by the quadratic component was as small as a fraction of a percent, while in other studies and for other observers, the amount of variance explained by the quadratic component was large. The curvature in the log D vs log R plot tends to be most pronounced when the DAR stimuli are presented against a light background and thus when the contrast polarity of the outer edge of the surround ring is dark-inside (Rudd & Zemach, 2007). Under these conditions, the curvature of the log D vs log R plot may be large enough to rule out the use of a linear approximation. When the background field is dark, deviations from the linear model are typically negligible, although such deviations can be detected using statistical methods (Rudd & Zemach, 2004, 2005, 2007; Vladusich, Lucassen, & Cornelissen, 2006). 
The need to include edge interaction effects in the brightness matching equation was first noted by Rudd (2001; Rudd & Arrington, 2001), who proposed a mechanistic model to account for the edge interactions. Like the model corresponding to Equation (1), which does not include such interactions, the model of Rudd and Arrington assumes that the brightness of a target region is computed from a weighted sum of brightness induction signals derived from multiple borders. However, the Rudd-Arrington model makes the further assumption that spatially spreading color filling-in signals originating from remote edges are partially “blocked” by the target border. The hypothesis was originally proposed to account for data from matching experiments carried out with target disks surrounded by two rings, rather than one ring. But the idea behind the model is perhaps best illustrated using the example of a test disk surrounded by a single ring (Rudd & Zemach, 2007). According to the edge integration model discussed above in the context of the experiments of Rudd and Zemach (the model without blockage), the brightness of such a disk is computed from a weighted sum of the log luminance ratios associated with the inner and outer borders of the ring. That is,
$$\Phi = w_1 \log\frac{D}{R} + w_2 \log\frac{R}{B} \qquad (2)$$
where the symbol $\Phi$ denotes the magnitude of a neural signal on which judgments of the disk appearance are based. The blockage model modifies Equation (2) so that the effective weight associated with the outer border of the surround ring depends on the log luminance ratio of the disk/ring border. The modified equation for the magnitude of the neural signal associated with the disk brightness is
$$\Phi = w_1 \log\frac{D}{R} + w_2 \left(1 - \beta \left|\log\frac{D}{R}\right|\right) \log\frac{R}{B} \qquad (2a)$$
Rudd and Arrington suggested that Equation (2a) is the signature of an underlying brightness filling-in mechanism in which the brightness induction signal originating from the outer ring edge is partially blocked, in a contrast-dependent manner, by the inner ring edge. 
According to this interpretation of Equation (2a), the magnitude of the brightness induction signal that is produced by the outer border of the ring and contributes to the disk brightness would be $w_2 \log(R/B)$ in the absence of blockage, but a percentage $\beta|\log(D/R)|$ of this induction signal is prevented from reaching the disk by a contrast-dependent blockage at the disk/ring border. Rudd and Arrington proposed the blockage interpretation in the context of a filling-in theory of brightness induction. Brightness filling-in theories assert that induction signals originating from borders diffuse like dye within a spatiotopic cortical map of the retinal image to fill in regions lying between borders (Arrington, 1994; Cohen & Grossberg, 1984; Gerrits, de Haan, & Vendrik, 1966; Gerrits & Timmermann, 1969; Gerrits & Vendrik, 1970; Grossberg & Mingolla, 1985; Grossberg & Todorovic, 1988; Paradiso & Hahn, 1996; Paradiso & Nakayama, 1991; Pessoa, Thompson, & Noe, 1998; Rossi & Paradiso, 1996, 1999; Sasaki & Watanabe, 2004). According to such theories, edge-based induction signals are blocked (or, in the case of the Rudd-Arrington model, partially blocked) by other borders that these spreading neural signals encounter while diffusing within a cortical map of the visual scene. The absolute value sign appearing in the term for the percent of the signal that is blocked in Equation (2a) is necessitated by the fact that the proportion of the filling-in signal that is blocked is assumed to be physiologically instantiated as a firing rate of a cortical neuron and firing rates must necessarily be positive. The firing rate is, in turn, assumed to be proportional to the log luminance ratio of the disk-ring edge that is encoded by the edge detector neuron whose neural activity blocks the filling-in signal. 
The log luminance ratio can be either positive or negative depending on the contrast polarity of that edge, but the firing rate must be positive; so the absolute value sign is required to map the log luminance ratio of the blocking edge into the firing rate associated with a neuron that encodes the edge contrast. The edge contrast polarity is assumed to be implicitly encoded by the polarity preference of the edge detector neuron that does the blocking (labeled line). The proportionality constant β determines the percentage of the filling-in signal that is blocked as a function of the log luminance ratio of the disk-ring edge. This constant is referred to as the blocking coefficient. Rudd and Zemach (2007) fit Equation (2a) to the data from brightness matching experiments carried out with DAR stimuli having all four possible combinations of inner and outer ring edge contrast polarities. In addition to the matching data from the experiments cited above, in which DAR stimuli with incremental and decremental disks were presented against dark backgrounds, Rudd and Zemach analyzed data from two new matching experiments in which incremental and decremental DARs were presented against light backgrounds. Although brightness matching equation (2a) was found to provide an excellent fit to the data from all four experiments, the sign of the blocking coefficient was found to vary with the contrast polarity of the inner ring border. The fact that the “blocking” coefficient is sometimes negative rules out a mechanistic interpretation of the equation in terms of the partial blocking of a diffusing color signal, because such an interpretation would then require that a negative proportion of the induction signal originating from the outer edge be blocked in those conditions where β is negative, which is clearly nonsensical. 
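The blockage model, and the reason a negative blocking coefficient is hard to interpret, can be illustrated with a minimal Python sketch. It assumes the form implied by the verbal description: the outer-edge induction signal w2 log(R/B) is reduced by a fraction β|log(D/R)| before being added to the inner-edge term w1 log(D/R). The function names and all numerical values are hypothetical.

```python
import math

def blocked_fraction(d, r, beta):
    """Fraction of the outer-edge induction signal assumed to be blocked
    at the disk/ring border: beta * |log(D/R)|."""
    return beta * abs(math.log10(d / r))

def phi_linear(d, r, b, w1, w2):
    """Edge integration without blockage: weighted sum of log ratios."""
    return w1 * math.log10(d / r) + w2 * math.log10(r / b)

def phi_blockage(d, r, b, w1, w2, beta):
    """Edge integration with contrast-dependent partial blockage of the
    outer-edge signal (assumed form, see lead-in)."""
    passed = 1.0 - blocked_fraction(d, r, beta)
    return w1 * math.log10(d / r) + w2 * passed * math.log10(r / b)

# Hypothetical decremental disk on a lighter ring, dark background.
d, r, b = 20.0, 40.0, 10.0
w1, w2 = 1.0, 0.3

no_block = phi_blockage(d, r, b, w1, w2, beta=0.0)
partial_block = phi_blockage(d, r, b, w1, w2, beta=0.5)

# A negative coefficient implies a negative "blocked" proportion.
negative_fraction = blocked_fraction(d, r, beta=-0.5)
```

With β = 0 the expression reduces to the plain weighted sum. Expanding the product shows that the blockage term contributes a component proportional to |log(D/R)| log(R/B), which is the kind of multiplicative edge interaction term discussed above.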
Because of this problem, my colleagues and I (Popa & Rudd, in preparation; Rudd & Popa, 2004a, 2004b, 2007; Rudd & Zemach, 2007) have recently proposed an alternative neural mechanism to account for the edge interaction effects that have now been seen in several studies (Rudd & Arrington, 2001; Rudd & Zemach, 2004, 2005, 2007; Vladusich, Lucassen, & Cornelissen, 2006). This alternative mechanism explains the edge interaction effects on the basis of a cortical gain control process by which the spike rates of cortical edge detector neurons in the cortical map of the image are modified by the activities of other nearby edge detector units. The theory combines this cortical gain control mechanism with a neural edge integration process that is assumed to occur at a later stage of visual processing. This model is able to account for the results of all of the brightness matching studies that have been analyzed to date. In what follows, I will refer to this model that combines edge integration and contrast gain control as the contrast gain control model, for short. The contrast gain control model differs from the blockage model by assuming not only that the effect of an induction signal originating from the outer ring edge can be influenced by the local contrast of the disk edge (as in the blockage model), but that an induction signal derived from the disk edge can also be influenced by the local contrast of the outer ring edge (Rudd & Zemach, 2007). The contrast gain control model further assumes that the gain control is strongest when the edges are close together and diminishes in magnitude as a linear function of distance (Rudd & Popa, 2007). A diagram illustrating the various stages of neural processing contributing to the computation of brightness in the contrast gain control model is presented in Figure 2.

Figure 2.

Schematic diagram illustrating the stages involved in computing the brightness of a light target surrounded by a dark ring viewed against a light background, according to the edge integration model with contrast gain control. The graph at the top of the figure, labeled “luminance,” shows a one-dimensional cross-section of the stimulus profile. This stimulus comprises the input to the edge integration computation. The graph below that, labeled “neural edge code,” shows the locations in which edge detector neurons encode the presence and the log luminance ratios of luminance borders in the input image. Separate neurons are assumed to encode edges having different contrast polarities. The third graph in the figure illustrates the fact that the responses of the edge encoding units that are nearer to the target disk are weighted more heavily in the computation of target brightness than are the responses of remote edge encoding units. Contrast gain control acting between the inner and outer edges of the surround ring also contributes to the steady state values of the weights applied to the two edges. The bottom graph shows the profile of the target brightness, which is computed from the weighted sum of the log luminance ratios of the disk/ring and ring/background edges. The inner edge, which has a light-inside contrast polarity, lightens the target to a degree that depends on the weighted log luminance ratio of the inner edge. The outer edge, which has a dark-inside contrast polarity, darkens the target to a degree that depends on the weighted log luminance ratio of the outer edge. Since the absolute magnitude of the weighted log luminance ratio at the inner edge is larger than the absolute magnitude of the weighted log luminance ratio at the outer edge, the target will appear light, rather than dark, relative to the background. 
These assumptions have been formalized mathematically (Popa & Rudd, in preparation; Rudd & Popa, 2007) and are expressed in Equation (3), which gives the brightness of a disk surrounded by a ring of homogeneous luminance. In that equation, w1* and w2* are the weights that would be assigned to the inner and outer ring edges in the achromatic color computation in the absence of any gain-modulating influence (i.e., if there were no other nearby edges, or the log luminance ratios of the nearby edges were equal to zero); d is the ring width (i.e., the distance between the inner and outer ring edges); the symbol [ ]+ signifies the half-wave rectification operation, which returns either the value of the expression in brackets or the value zero, whichever is larger; and the model parameters v_ij and s_ij, one pair for each direction of interaction between the edges, are interpreted as follows. The parameter v_ij specifies the rate at which the magnitude of the gain applied to edge j by a gain control signal originating from edge i grows as a function of the absolute value of the local log luminance ratio of edge i. The sign of v_ij determines whether the gain-modulating signal directed from edge i to edge j acts to increase (plus sign) or decrease (negative sign) the gain applied to neurons encoding the log luminance ratio of edge j. The parameter s_ij represents the maximum spatial spread of the neural gain-modulating signal directed from edge i to edge j. The expression within the half-wave rectification brackets models the fact that the magnitude of the contrast gain modulation decreases as a linear function of the distance between the edges. The half-wave rectification ensures that no gain modulation will occur when this distance exceeds the spatial range s_ij of the gain control directed from i to j. As in the blockage model, the log luminance ratios log(D/R) and log(R/B) are assumed in the contrast gain control model to be neurally instantiated as firing rates.
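The displayed form of Equation (3) does not appear in the text above. As a sketch only, one expression that is consistent with the verbal definitions just given is shown below. Several details are assumptions rather than features confirmed by the source: the additive constant 1 inside each gain factor, the subscript convention (a subscript ij denotes a signal directed from edge i to edge j, with 1 the inner and 2 the outer ring edge), the symbol Φ for disk brightness, and the [s − d]+ form of the linear fall-off with distance.

```latex
\Phi \;=\; w_1^{*}\Bigl(1 + v_{21}\,[\,s_{21} - d\,]_{+}\,\bigl|\log(R/B)\bigr|\Bigr)\log(D/R)
\;+\; w_2^{*}\Bigl(1 + v_{12}\,[\,s_{12} - d\,]_{+}\,\bigl|\log(D/R)\bigr|\Bigr)\log(R/B)
```

In this form each edge weight reduces to its unmodulated value (w1* or w2*) when the other edge has zero log luminance ratio or lies beyond the spatial range of the gain control, as the definitions require.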
Again, these firing rates must necessarily be positive and are often modeled by half-wave rectifying the output of a model spatial receptive field. It follows that different cortical neurons will encode the log luminance ratio at an edge depending on the contrast polarity of that edge. For example, negative log luminance ratios will be encoded by neurons responding to dark-inside edges. When log(D/R) is negative, we therefore assume that the log luminance ratio of the disk-ring edge is encoded by a half-wave rectifying neuron whose firing rate represents the quantity [log(R/D)]+, which in this case is the same thing as the absolute value of log(D/R), and that the firing rate is given a negative synaptic weight in the neural edge integration computation. In this way, the quantity log(D/R) can be represented by synaptically-weighted neural firing rates, even though the rates are positive and the log luminance ratio is negative. That is, the positive firing rate of an edge-detector neuron will inhibit the activities of the higher-order neurons that encode the disk brightness or, equivalently, excite higher-order neurons that encode the disk darkness. When log(D/R) is positive, on the other hand, the log luminance ratio of the disk-ring edge will be encoded by a different neuron: a neuron whose receptive field is in the same location as that of the first neuron but whose firing rate represents the quantity [log(D/R)]+. The response of this second neuron will be given a positive weight in the edge integration computation. Equation (3) also involves terms with absolute values of log luminance ratios, such as |log(D/R)|. These terms also must be neurally instantiated in order to realize the contrast gain control mechanism proposed in the model.
Again, the firing rates of two different edge detector units, having receptive fields located at the same retinal position and being sensitive to edges having the same orientation, will represent either the mathematical quantity [log(D/R)]+ or the mathematical quantity [log(R/D)]+, depending on whether the edge detector responds preferentially to edges of the light-inside or the dark-inside type. Because these cortical neurons half-wave rectify their inputs, whenever one of these two neurons fires the other will be silent. It follows that the outputs of the pair of neurons must be summed in order to compute the quantity |log(D/R)|, independent of the contrast polarity of the disk-ring edge, as required by Equation (3). Similar considerations apply to the computation of the log luminance ratios and the absolute values of the log luminance ratios corresponding to all the edges in the image. Thus, we see that the neural mechanisms described above would suffice to instantiate the mathematical operations required by Equation (3) in a neurally-plausible manner. Popa and Rudd (in preparation) have developed a computer program for the purpose of simulating this neural model of edge integration and contrast gain control. We have used our program to simulate the data from some brightness matching experiments in which the luminance of a test square surrounded by a frame and the frame width were independently varied. By fitting the model to these new data, we have discovered that the sign of the gain modulation term (that is, whether the contrast gain control originating from any particular edge detector unit acts to increase or to decrease the gain of a neighboring unit) depends on the preferred contrast polarities of the two units and, therefore, on the contrast polarities of the luminance borders that drive them.
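The opponent-pair encoding described above can be illustrated with a minimal sketch. It is not the simulation program of Popa and Rudd; the function names, the use of base-10 logarithms, and the unit synaptic weights (+1 and -1) are assumptions made purely for illustration. The sketch shows that two half-wave rectified, non-negative firing rates suffice to recover both the signed quantity log(D/R) and its absolute value.

```python
import math

def half_wave(x):
    """Half-wave rectification: return x if it is positive, otherwise 0."""
    return max(x, 0.0)

def encode_edge(log_ratio):
    """Return the non-negative rates of the two polarity-specific units.

    light_inside represents [log(D/R)]+ and dark_inside represents
    [log(R/D)]+; at most one of the two is nonzero for a given edge.
    """
    light_inside = half_wave(log_ratio)
    dark_inside = half_wave(-log_ratio)
    return light_inside, dark_inside

def signed_value(light_inside, dark_inside):
    """Opposite-signed weights (assumed +1 and -1) recover log(D/R)."""
    return 1.0 * light_inside - 1.0 * dark_inside

def absolute_value(light_inside, dark_inside):
    """Summing the pair of rates yields |log(D/R)|."""
    return light_inside + dark_inside

# Hypothetical luminances: a disk of 20 units inside a ring of 80 units.
light, dark = encode_edge(math.log10(20.0 / 80.0))
print(signed_value(light, dark), absolute_value(light, dark))
```

For a dark-inside edge such as this one, only the dark-inside unit is active; its rate enters the edge integration sum with a negative weight and enters the gain control term through the unsigned sum.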

Neurophysiological evidence for edge-based color induction mechanisms in areas V1 and V2

It has been known since the early single-cell recording studies of Hubel and Wiesel (1959, 1968, 1977) that neurons in area V1 respond preferentially to properly oriented bars or edges presented within the classical receptive fields of these neurons. The results of recent physiological studies indicate that neurons in V1 (and V2) play a much larger role in perceptual organization and surface perception than the picture of neurons in these areas as mere edge detectors suggests. For example, it has been shown that neurons in these cortical areas are sensitive to Gestalt stimulus properties such as figure-ground segmentation (Lamme, 1995; Lamme, Rodriguez-Rodriguez, & Spekreijse, 1999; Lee, Mumford, Romero, & Lamme, 1998; Zipser, Lamme, & Schiller, 1996) and border ownership (Friedman, Zhou, & von der Heydt, 2003; Qiu & von der Heydt, 2005; von der Heydt, Friedman, & Zhou, 2003; von der Heydt, Zhou, & Friedman, 2003; Zhou, Friedman, & von der Heydt, 2000). The role played by V1 and V2 in the representation of surface lightness, brightness, and color is less clear, but we know that at least some of the neurons in these areas respond to modulation of border contrast outside of their classical receptive fields (Cornelissen, Wade, Vladusich, Dougherty, & Wandell, 2006; Kinoshita & Komatsu, 2001; MacEvoy, Kim, & Paradiso, 1998; Rossi & Paradiso, 1999; Rossi, Rittenhouse, & Paradiso, 1996; Vladusich, Lucassen, & Cornelissen, 2006), which suggests that these neurons help to mediate spatial color induction from borders and may even form the stage of neural processing that is most closely associated with the perceptual filling-in of surface color (Haynes, Lotto, & Rees, 2004; Hung, Ramsden, Chen, & Roe, 2001; Kinoshita & Komatsu, 2001; Lee, Mumford, Romero, & Lamme, 1998; MacEvoy, Kim, & Paradiso, 1998; Rossi & Paradiso, 1999; Rossi, Rittenhouse, & Paradiso, 1996; Sasaki & Watanabe, 2004).
Of special interest from the standpoint of edge integration theory is a recent fMRI study by Cornelissen et al. (2006) showing long-range edge responses that span a distance of about 18 mm on the cortical surface, which is well beyond the spatial limits of the classical V1 and V2 receptive fields. In terms of visual angle, the span of these long-range edge responses is about 5-10 deg, which corresponds roughly to the spatial spread of the achromatic color induction effects measured in psychophysical studies (Cole & Diamond, 1971; Diamond, 1953, 1955; Dunn & Leibowitz, 1961; Hong & Shevell, 2004; Leibowitz, Mote, & Thurlow, 1953; Reid & Shapley, 1988; Rudd & Zemach, 2004). Cornelissen et al. suggested that long-range edge responses in V1 and V2 might subserve the function of either edge integration, or “contextual influences on the edge,” or both. This raises the possibility that at least some of the neural processes predicted by our contrast gain control model may be carried out in areas V1 and V2. The extended edge responses might represent the activities of neural processes that “reach out” to adjust the weights of other nearby edge detector units (contrast gain control), or they might correspond to the edge-based color induction signal itself, or they might reflect a mixture of these two types of activity. The spatial summation of edge-based induction signals that is required to account for the achromatic color matching results has not been explicitly investigated by neurophysiologists. This summation might also take place in either V1 or V2. Or it might be carried out at a higher level of the visual system. Area V4 seems a likely site of the neural edge integration operation, since the outputs of V1 and V2 neurons project to V4 and the large receptive fields of V4 neurons would allow for a spatial summation over many degrees of visual angle, as is required to account for the psychophysical data.
The latter suggestion is also consistent with the proposition, put forth by several previous investigators, that V4 plays a special role in color constancy (Bartels & Zeki, 2000; Clarke, Walsh, Schoppig, Assal, & Cowey, 1998; Kennard, Lawden, Morland, & Ruddock, 1995; Kentridge, Heywood, & Cowey, 2004; Smithson, 2005; Walsh, 1999; Zeki, Aglioti, McKeefy, & Berlucchi, 1999; Zeki & Marini, 1998), since the purpose of edge integration is to help achieve constancy (Land, 1977, 1983, 1986; Land & McCann, 1971). In the remainder of this paper, I will extend this model into the time domain to devise a dynamic brightness perception model that accounts for the existence of both metacontrast and paracontrast masking phenomena.

Possible relationship of metacontrast to edge integration and contrast gain control

In this section of the paper, I will discuss how the brightness computation model presented above might relate to metacontrast masking. I will not present any new masking data, but I will propose a theory of metacontrast and discuss how this theory could be tested in future experiments. In metacontrast masking, a mask that follows the target in time suppresses the target brightness. The mask often (but not always) has its greatest effect when it follows the target by a delay of about 50-100 milliseconds. When the target brightness is measured as a function of the temporal delay between the target and the masking stimulus, a U-shaped brightness function is obtained. The U-shaped brightness function is often taken to be one of the characteristic features of metacontrast masking (Alpern, 1953; Breitmeyer, 1984; Breitmeyer & Öğmen, 2006). Although situations do occur in which the brightness function associated with metacontrast masking is a monotonic rather than a U-shaped function of time, I will here restrict my discussion to the special case of U-shaped (Type B) metacontrast masking (Breitmeyer, 1984), leaving it for future work to extend the model presented here to account for monotonic metacontrast masking functions. Early metacontrast studies typically employed either an oriented bar as the target and flanking bars as the mask (e.g., Alpern, 1953), or a disk as the target and a surround ring as the mask (e.g., Werner, 1935). 
The potency of the mask was found to be greatest when: 1) the mask followed the target with the correct stimulus onset asynchrony (SOA); 2) the target and mask edges were in close spatial proximity (Alpern, 1953; Breitmeyer, 1984; Fry, 1934; Kolers, 1962; Kolers & Rosner, 1960; Levine, Didner, & Tobenkin, 1967; Stigler, 1926; Weisstein & Growney, 1969); and 3) the mask had a large contrast energy relative to that of the target (Breitmeyer, 1978a; Breitmeyer, 1984; Fehrer & Smith, 1962; Kolers, 1962; Spencer & Shuntich, 1970; Stewart & Purcell, 1974). Several investigators have noted the special importance of border contour in metacontrast masking (Breitmeyer, 1984; Kolers, 1962; Weisstein, 1971; Werner, 1935). For example, Werner (1935) found that metacontrast was strongest when the borders of the target and mask were most similar. Weisstein (1971) obtained a U-shaped masking curve by masking a small disk target with a larger disk mask. She interpreted her results in terms of the hypothesis that metacontrast entails interactions between edges, rather than interactions between surfaces or objects. In fact, many of the same stimulus factors that control the strength of edge interactions in metacontrast displays – e.g., spatial proximity, contour similarity, and border contrast polarity (Becker & Anstis, 2004; Breitmeyer, 1978b) – have also been shown to influence the strength of edge integrations in the perception of static brightness displays (Bindman & Chubb, 2004a, 2004b; Hong & Shevell, 2004; Popa & Rudd, in preparation; Reid & Shapley, 1988; Rudd, 2001, 2003a; Rudd & Arrington, 2001; Rudd & Popa, 2004a, 2004b, 2007; Rudd & Zemach, 2004, 2005, 2007; Vladusich, Lucassen, & Cornelissen, 2006; Zemach & Rudd, 2007). The similarities between the brightness suppression that occurs in metacontrast masking and the contrast gain control phenomena observed in studies using static DAR displays are provocative. 
These similarities suggest that perhaps both phenomena might be accounted for by the same underlying mechanism or mechanisms. The theory of metacontrast masking presented here is based on the idea that metacontrast occurs at a stage of neural processing at which edges interact and at which multiple edges may influence the target brightness via the mechanism of edge integration, but at which an object representation has not yet been formed. The fact that metacontrast can occur when the target and mask are presented to separate eyes implies a cortical locus for the interaction (Breitmeyer, 1984; Kolers & Rosner, 1960; May, Grannis, & Porter, 1980; Schiller & Smith, 1968; Stigler, 1926; Weisstein, 1971; Werner, 1940). Many theories have been advanced to account for the U-shaped metacontrast masking function. Francis (2000) has presented a useful classification of some of these theories (see also Francis & Cho, 2006; Francis & Herzog, 2004). Perhaps the most common type of theory invokes a mechanism whereby an afferent neural signal originating from the mask overtakes in time and inhibits a corresponding neural signal from the target (Breitmeyer, 1984; Breitmeyer & Ganz, 1976; Breitmeyer & Öğmen, 2006; Stigler, 1926). But the “overtake and inhibit” hypothesis is far from universally accepted. Francis considers several other mechanistic accounts of metacontrast, and Reeves (1982) has argued that the U-shaped masking function does not result from a single process — such as inhibition of the target by the mask at some preferred delay — but rather from two separate processes that each produce a monotonic change in the target brightness as a function of the temporal delay between target and mask. 
For an in-depth review of both the basic data on metacontrast and a larger body of theories that have been proposed to account for it, the interested reader is referred to review articles by Alpern (1952), Weisstein (1972), Lefton (1973), Breitmeyer (1984), Francis (2000), and Breitmeyer and Öğmen (2006). In the remainder of the present paper, I will confine my remarks to the discussion of the hypothesis that metacontrast phenomena should be viewed as a byproduct of the dynamical properties of brightness computation by human visual cortex and that metacontrast masking results, more specifically, from the dynamics of edge integration and contrast gain control.

Metacontrast masking from edge integration dynamics

To investigate the possible connection between edge integration and metacontrast, we first need to address the problem of how the edge integration model might be extended into the time domain. For concreteness, we will analyze the metacontrast paradigm introduced by Weisstein (1971), in which a target disk is followed in time by a larger masking disk. This is a particularly simple stimulus display from the standpoint of the edge integration model, since it involves only one target edge and one mask edge. The Weisstein display differs from the static DAR stimuli used in our previous matching experiments only in that a temporal delay is imposed between the onset of the target edge and the onset of the more distant edge. To study metacontrast with such a stimulus, it is best to present the target and masking disks to different eyes; otherwise brightness masking (Turvey, 1973) occurs at brief stimulus onset asynchronies in addition to the Type B metacontrast effect, which is seen at longer SOAs. Thus, a W-shaped masking function is obtained when the target and mask are presented to the same eye (Weisstein, 1971). In considering the dynamics of edge integration, it is important to take into consideration the so-called Broca-Sulzer effect: the brightness of a flashed stimulus is temporally enhanced at stimulus onset (Alpern, 1963; Boynton, 1961; Breitmeyer, 1984; Broca & Sulzer, 1902, 1904; Stainton, 1928) (see Figure 3). The Broca-Sulzer effect is likely due to transient components of the firing rates of early visual neurons (see, for example, Saito & Fukada, 1986), and is closely related to Crawford masking (Breitmeyer, 1984; Crawford, 1947).

Figure 3.

Broca-Sulzer brightness enhancement occurs at stimulus onset for high intensity incremental targets. Here flash brightness is plotted as a function of duration for flashes of different luminances. Data from Hart (1987).
By adding the assumption that the neural responses to edge contrast exhibit such transient components to the other postulates of the edge integration model, we arrive at a model that can account for some of the known properties of metacontrast masking and that also makes testable theoretical predictions. Our main focus will be on applying this model to Weisstein’s masking paradigm in which the target and mask are both disks, so that the target and mask each have a single edge. The response of the model to such a stimulus is much simpler to analyze than is the model response to the more typical masking stimulus in which the target is a disk, the mask is a ring, and there is potentially a gap between them. The latter stimulus includes three edges that could produce fairly complex cortical interactions between edge detector units, given that any pairwise combination of edge detector responses may be subject to two-way gain control interactions. Whereas in the case of the Weisstein stimulus we only need to keep track of two gain control signals (the outward and inward directed signals acting between the target and mask edges), a total of six gain control signals could come into play when the masking stimulus is a ring. According to the edge integration model, when a light target disk (i.e., a target that is a luminance increment with respect to its immediate surround) is presented in isolation against a dark background field, the disk brightness will be completely determined by the log luminance ratio at its border. As a result of the Broca-Sulzer effect, the disk will appear transiently brighter immediately after its onset than it does in the steady state.
It is well-known that the steady-state brightness of a disk viewed in isolation obeys Stevens’ brightness law, which states that the brightness of a static target viewed in the dark is proportional to the target luminance raised to approximately the 1/3 power (Rudd & Popa, 2007; Stevens, 1953, 1961, 1967, 1975; Stevens & Marks, 1999). The exponent of the brightness law decreases from about 1/2 to about 1/3 as the flash duration is increased from 0.5 to 1000 msec (Aiba & Stevens, 1964; Raab, 1962; Stevens, 1966; Stevens & Hall, 1966). The exponent of Stevens’ law can be viewed as an “exponential gain” applied to the target (Rudd & Popa, 2007; Whittle, 1994). According to the edge integration model, the gain applied to the target reflects the gain of neural edge detector units in early visual cortex (Popa & Rudd, in preparation; Rudd & Popa, 2004a, 2004b, 2007). These findings all follow from the edge integration model if it is assumed that the outputs of the edge detector units that encode the edges of the target exhibit a transient increase in their firing rates at stimulus onset, an assumption that is well-supported by physiology. Now suppose that a second, larger, masking disk is presented to the eye contralateral to the one that sees the target. Further suppose that the mask is presented after a variable interstimulus interval (ISI) following the target disk presentation, as in Weisstein’s experiment. In her experiment, the target and masking disks were both luminance increments with respect to their immediate surrounds, but we will begin here by analyzing the situation in which the masking disk is a luminance decrement with respect to its surround (i.e., the background field). In this case, the dark side of the mask edge faces the incremental disk target (see Figure 4).

Figure 4.

A metacontrast masking paradigm modeled after the experiment of Weisstein (1971). A target consisting of an incremental disk displayed against a dark background is shown to one eye. Following a dark interstimulus interval of variable duration, a masking stimulus consisting of a decremental disk, larger in size than the target disk, is displayed to the contralateral eye. This stimulus differs from Weisstein's in that here the masking disk is dark, whereas in Weisstein's paradigm the target and mask both consisted of bright disks displayed against dark backgrounds. In both experimental paradigms, the target and mask each have only one edge.
When the stimulus onset asynchrony is very short (i.e., SOA = 0), the target and mask onsets are simultaneous. In that case, according to the edge integration model, the target brightness will be determined by a weighted sum of contributions from the target and mask edges, as in the case of a static DAR stimulus. At the onset of both target and mask, the amplitudes of both of these components will be temporarily boosted by the transient neural activity in the edge detector neurons that encode the edges. As these transient activities decay, there may be a change in the target brightness, but this decay will be monotonic in time since the weighted sum of two monotonically decaying functions will also exhibit monotonic decay. The decay of transient activity cannot by itself account for Type B metacontrast masking, since any explanation of Type B metacontrast requires a mechanism that can produce a U-shaped masking function under the appropriate stimulus conditions. What if we delay the onset of the mask relative to that of the target? One consequence of this delay will be that the observer has a longer time to view the target in isolation without its brightness being influenced by the darkness induction signal contributed by the mask.
But when the mask does come on, the darkness induction signal that it generates will initially be particularly potent as a result of the transient component of the neural response to the mask edge and this will produce a transient darkening of the disk measured relative to the target brightness that would obtain if the target and mask were both left on indefinitely. If we assume that the brightness percept is calculated at each infinitesimal moment in time, then we would expect the target to appear bright, then somewhat darker, then somewhat brighter. Whether the observer reports the target as being suppressed by the delayed mask or not would depend on when he or she reads out the target brightness from the neural code. We need to make an additional assumption about how the temporal readout occurs before we can make an unambiguous brightness prediction based on this dynamical brightness computation model. In what follows, we will assume that the target brightness is computed over a finite integration time that includes the period in which the target is viewed in isolation and at least some of the period in which the neural activations generated by the target and mask overlap in time (Bloch, 1885; Breitmeyer, 1984). This assumption seems reasonable because it would be optimal for the observer to report the target brightness without it being influenced by the brightness suppression introduced by the mask via the edge integration mechanism. But we know that the observer does not behave optimally: there is, in fact, some brightness suppression due to an interaction between the mask and the target. Given this temporal linking hypothesis, we conclude that delaying the mask in time can only make the target more visible compared to the case where the SOA is zero, since the only effect of delaying the mask is to potentially reduce the percentage of the target integration time in which the neural response to the mask affects the target brightness. 
Thus, the U-shaped metacontrast masking function is not predicted from a model that combines transient and sustained neural activations with edge integration alone. But, to this point, we have ignored the potential influence of contrast gain modulations acting between edges. It is these interactions that are proposed to be responsible for metacontrast masking. To predict the contribution of contrast gain control to the target brightness dynamics it seems reasonable to assume that it will also take some time for the contrast gain control originating from an edge to be felt at the location of the mask edge. The target onset is therefore expected to generate a spatially spreading contrast gain control signal having a “wave front” that travels outward from the target edge and modulates the gains of any active nearby edge detector neurons that it encounters. Through a secondary action, this spreading gain control signal will, according to the edge integration model, modulate the amplitudes of any lightness or darkness induction signals that are generated by these nearby edge detector neurons. Based on our previous experiments with static brightness matching displays (Popa & Rudd, in preparation; Rudd & Popa, 2007), we anticipate that the contrast gain control signal will act either to amplify or attenuate these induction signals, depending on the particular combination of contrast polarities of the interacting edges. This may sound like a vague prediction, but the direction of the gain modulation – either amplifying or attenuating – can be predicted on the basis of the results of our past modeling of brightness matches performed with static displays composed of these same combinations of inner and outer edges (Popa & Rudd, in preparation; Rudd & Popa, 2007).
For the combination of target and mask edge contrast polarities considered here, the gain control acting from the target edge onto the mask edge is known from our past work to be amplifying and the gain control acting from the mask edge onto the target edge is known to be attenuating. The edge integration model asserts that the target brightness is computed from a spatial sum of induction signals derived from these two edges, so a gain control acting either from the mask edge to the target edge or from the target edge to the mask edge would be expected to influence the target brightness. The transient activity generated by neural edge detector units at edge onset will be inherited by any gain control modulation that is exerted by those units onto other, nearby, edge detector neurons. Thus, the spreading gain control wave front should also exhibit a wave crest, which will produce either a transient increase or a transient decrease in the gain of any edge detector that it encounters. The transient gain modulation produced by this traveling wave crest will be in the same direction as the sustained gain change (i.e., either amplifying or attenuating), but of greater magnitude. Since the influence of contrast gain control takes time to spread between neural edge detector units, the effect of this transient boost in gain modulation strength will be to produce a time-delayed transiently-enhanced amplification or attenuation of any lightness or darkness induction signals that are generated by the nearby edge detector units. The time delay corresponding to the peak gain modulation will increase with increasing spatial separation between the gain-modulating edge and the gain-modulated edge. 
The results of our previous experiments with static DAR displays lead us to expect that the gain control that operates from a light-inside target border to a dark-inside mask border will amplify, rather than attenuate, the strength of the darkness induction signal originating from the mask border (Popa & Rudd, in preparation; Rudd & Popa, 2007). We expect this to be the case because the strength of the darkness induction signal associated with an outer ring border increases when either the contrast of an inner ring border is increased or the borders are moved closer together by decreasing the width of the surround ring (Popa & Rudd, in preparation; Rudd & Popa, 2007). This behavior could account for the U-shaped metacontrast masking function in the following way. Suppose that the transient gain amplification of the darkness induction signal originating from the mask edge occurs at the same time that the neurons responding to the mask edge are exhibiting the regular transient activation that occurs at mask edge onset. These two transient amplification effects will combine multiplicatively (because gain control interactions are multiplicative by nature) to produce a particularly potent amplification of the darkness induction signal originating from the mask edge. This potent darkness induction signal will then sum with the lightness induction signal from the target edge to determine the target brightness, according to the basic assumption of the edge integration model. Note that the multiplicative “double-whammy” amplification of the darkness induction signal will only occur if the mask onset is delayed with respect to that of the target onset by the right time interval. Thus, when the mask edge is delayed relative to the target edge by the right interval, we expect that a brightness suppression of the target (i.e., metacontrast masking) will result.
In order for the double-whammy darkness induction signal amplification to explain metacontrast, it is necessary that the gain increase applied to the mask edge by the double-whammy is more than sufficient to compensate for any tendency for the target brightness to be spared from temporally integrating with the darkness-inducing mask edge as a result of the target-mask delay. We assume that during part of the visual integration time the target is presented in isolation, and thus would normally appear bright, but during the rest of the visual integration time, the lightness and darkness induction signals elicited by target and mask edges overlap in time and bind spatially through the mechanism of edge integration to determine the overall target brightness. Given the appropriate temporal delay, the darkness that is induced in the target by the mask edge during the time that the target and mask bind spatially is sufficiently potent – as a result of the delayed contrast gain control effect – that the overall integrated brightness signal is smaller than it would be if the target and mask were presented either simultaneously (short SOA) or with a large temporal separation (long SOA). In the latter case, of course, the target and mask will not bind spatially at all. Figure 5 presents a diagram illustrating how Type B metacontrast masking is produced by the dynamics of edge integration and contrast gain control in the case just discussed, in which an incremental target disk is followed in time by a larger dark masking disk (where a “dark” masking disk here means dark relative to the larger surround or background field). To my knowledge, this experiment has not been performed and thus the theory makes a novel prediction: that metacontrast masking should occur with this display.
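The qualitative argument above can be checked with a toy numerical sketch. This is not the Popa and Rudd simulation program, and every numerical value in it is a hypothetical choice made for illustration: the exponential form of the onset transient, its amplitude and 20 ms decay, the 60 ms time taken by the gain signal to reach the mask edge, the 200 ms integration window, the edge weights, and unit log luminance ratios. Only the gain control acting from the target edge onto the mask edge is included; the attenuating signal in the opposite direction is omitted for simplicity.

```python
import math

def edge_response(elapsed_ms, transient_gain=4.0, decay_ms=20.0):
    """Edge-unit firing rate elapsed_ms after edge onset (0 before onset).

    Sustained level 1 plus an exponentially decaying onset transient
    (assumed form and values).
    """
    if elapsed_ms < 0:
        return 0.0
    return 1.0 + transient_gain * math.exp(-elapsed_ms / decay_ms)

def integrated_brightness(soa_ms, v=1.0, spread_delay_ms=60.0,
                          window_ms=200.0, w_target=1.0, w_mask=0.3,
                          dt=1.0):
    """Target brightness averaged over an integration window.

    The window starts at target onset (t = 0). The mask edge turns on at
    soa_ms. The gain applied to the mask edge inherits the target-edge
    response, delayed by spread_delay_ms; v = 0 removes gain control.
    """
    total = 0.0
    for k in range(int(window_ms / dt)):
        t = k * dt
        lightness = w_target * edge_response(t)
        gain = 1.0 + v * edge_response(t - spread_delay_ms)
        darkness = w_mask * gain * edge_response(t - soa_ms)
        total += (lightness - darkness) * dt
    return total / window_ms

soas = list(range(0, 201, 10))
with_gain = [integrated_brightness(s) for s in soas]
without_gain = [integrated_brightness(s, v=0.0) for s in soas]
best_soa = soas[with_gain.index(min(with_gain))]
print("SOA of strongest masking (ms):", best_soa)
```

Under these assumed values, the curve without gain control rises monotonically with SOA, as argued earlier, whereas the curve with a delayed, transient gain signal has its minimum at an intermediate SOA, where the mask-onset transient and the arriving gain transient coincide. The location and depth of the minimum depend entirely on the illustrative parameters.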
We next derive the model predictions for an experiment which has been performed; that is, the experiment of Weisstein (1971) mentioned earlier, in which a light target disk is masked by the delayed onset of a larger light masking disk.

Figure 5.

Proposed explanation of metacontrast based on edge integration and contrast gain control. The top graph in the figure shows the luminance profile of the stimulus. The target outline is indicated by a dotted line to signify that the target appears in an earlier frame than the mask (solid line). The presentation of the mask activates neural edge encoding units having the appropriate contrast polarity sensitivities and receptive fields at the locations of the mask edges (solid lines in the second graph). During the period in which the mask is presented, there may also be persisting activations in the edge encoding neurons that were activated by the target edges (dotted lines in the second graph). Both types of neural activations will potentially contribute to the target brightness, to a degree that depends on the edge weights. The third graph illustrates a case in which the weighted values of the neural activations corresponding to the target and mask edges happen to be identical. The edge weights are affected by two different processes. First, the target brightness computation algorithm tends to weight the target edge more heavily than it weights the more distant mask edge, all other things being equal. Second, a time-delayed contrast gain modulation acting from the target edge onto the mask edge will tend to boost the weight applied to the mask edge, with a particularly strong transient boost occurring at the optimal delay for metacontrast. In the hypothetical case illustrated, the darkness-inducing effect of the mask edge exactly cancels the lightness-inducing effect of the target edge, which results in the target brightness being neither higher nor lower than that of its immediate surround; thus, the target is made invisible. More generally, the target brightness may be modulated to a variable degree by the contrast gain control mechanism, with the largest target suppression effect occurring at the optimal SOA for metacontrast masking.
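The cancellation case described in the caption can be written as a one-line worked equation. The notation here is assumed for illustration only: \Phi_T is the target brightness relative to its immediate surround, c_T and c_M are the log luminance ratios encoded at the target and mask edges, w_T and w_M are the corresponding edge weights, and g is the momentary gain factor applied to the mask edge.

```latex
\[
  \Phi_T \;=\; w_T\,c_T \;+\; g\,w_M\,c_M ,
  \qquad c_T > 0 \ \text{(light-inside target edge)},\quad
  c_M < 0 \ \text{(dark-inside mask edge)}
\]
\[
  \Phi_T = 0
  \quad\Longleftrightarrow\quad
  g\,w_M\,\lvert c_M \rvert \;=\; w_T\,c_T
\]
```

Because w_M is ordinarily smaller than w_T for the more distant edge, equality in this sketch requires either a larger contrast magnitude at the mask edge or a gain factor g sufficiently greater than 1, as would occur during the transient boost.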
If the contrast polarity of the mask edge is reversed, as in Weisstein’s 1971 masking study, the transient gain modulation is attenuating, rather than amplifying (Rudd & Popa, in press). Since the presence of the mask edge in that case tends to lighten, rather than darken, the target, the transient attenuation of the lightness induction signal generated by the mask edge will also result in metacontrast masking. In Weisstein’s experiment, the target and mask borders both had contrast polarities of the light-inside type. For this combination of edge contrast polarities, the target and mask edges should both make a positive contribution to the target brightness as a result of edge integration. At short SOAs, the target should appear particularly bright as a result of the transient activation of neurons that encode the contrasts of the target and mask edges. This transient activation is expected to dissipate over time, resulting in a monotonic decrease in the target brightness, if the potential contribution of gain control interactions occurring between the target and mask edges is neglected. Next consider the effects of adding contrast gain control. For static disk-and-ring stimuli in which both edges are light-inside, we have shown in our previous work (Popa & Rudd, in preparation; Rudd & Popa, 2007) that the contrast gain control acting from the disk edge onto the outer ring edge in a DAR display acts to attenuate the lightness induction signal generated by the outer edge. In the Weisstein paradigm we would thus expect to see, given an appropriate time delay between the target and the mask, a transient suppression of the lightness-inducing effect of the mask edge on the target brightness. Again, the gain control dynamics, when combined with the basic assumption of edge integration, predict the U-shaped masking function that is the hallmark of metacontrast masking.
The gain control model thus predicts that metacontrast should be observed regardless of the contrast polarity of the mask edge. The theory also predicts that there should be forward brightness modulation effects analogous to the backward masking effects already described. Such effects have been previously studied and are known as paracontrast masking (Breitmeyer, 1984; Breitmeyer & Öğmen, 2006; Breitmeyer, Kafaligonul, Öğmen, Mardon, Todd, & Siegler, 2006). According to the model, paracontrast masking results from gain control processes that are initiated by the onset of the mask and act, after a time delay, to modulate the gain applied to the target edge. On the basis of our parametric model fits to DAR brightness matching data from experiments with static displays (Popa & Rudd, in preparation; Rudd & Popa, 2007), we expect these forward masking effects to be brightness enhancing, rather than brightness suppressing, when the target and mask each comprise a single edge. This prediction holds regardless of the contrast polarity of the mask edge, as long as the target disk is a luminance increment with respect to its immediate surround. A full justification for this claim is given in an upcoming paper (Popa & Rudd, in preparation). It is not yet clear how the magnitudes of the transient forward and backward brightness modulation effects might be expected to compare, but it seems likely that definite predictions regarding the relative magnitudes of the forward and backward brightness effects could be made on the basis of the parameter estimates obtained from fitting the model to brightness matches made with static stimuli. In a recent study, Breitmeyer et al. (2006) studied paracontrast masking using a stimulus consisting of a dark disk target surrounded by a dark masking ring, with a spatial gap between the disk and ring. Their experimental results suggest that paracontrast consists of at least three separate effects: one involving excitation and two involving inhibition.
As stated above, a stimulus containing a disk and a surround ring that is separated from the target disk by a spatial gap includes three edges and thus is expected to elicit considerably more complex gain control interactions than would the single-target-edge, single-mask-edge stimuli discussed above. It would not be surprising to discover that the former stimulus could generate three or more gain modulation effects having different time courses. But specific predictions remain to be worked out.
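For the simpler single-target-edge, single-mask-edge case, the forward (paracontrast) gain modulation can be sketched with a toy simulation. This is only an illustration under assumed dynamics: the exponential onset transient, the fixed integration window, the rule that the gain on the target edge is a delayed copy of the mask-edge response, and all parameter values and names are hypothetical. The mask edge's own induction signal is deliberately left out, so the sketch isolates the gain effect on the target edge.

```python
import math

# Hypothetical parameter values, for illustration only.
TAU_MS = 20.0          # decay time constant of the onset transient
TRANSIENT_AMP = 4.0    # extra firing at edge onset, relative to the sustained level
GAIN_DELAY_MS = 50.0   # time for gain control to spread from mask edge to target edge
GAIN_STRENGTH = 0.5    # strength of the enhancing gain modulation
WINDOW_MS = 200.0      # visual integration time, measured from target onset


def edge_response(u_ms):
    """Firing of an edge unit u_ms after its edge appears (zero before onset)."""
    if u_ms < 0:
        return 0.0
    return 1.0 + TRANSIENT_AMP * math.exp(-u_ms / TAU_MS)


def enhancement_ratio(mask_lead_ms, dt_ms=1.0):
    """Gain-modulated target-edge signal divided by the same signal with no mask.

    Time t is measured from target onset; the mask appears at t = -mask_lead_ms.
    """
    with_gain = 0.0
    alone = 0.0
    for i in range(int(WINDOW_MS / dt_ms)):
        t = i * dt_ms
        r_target = edge_response(t)
        # Gain driven by the mask-edge response, arriving GAIN_DELAY_MS later.
        gain = 1.0 + GAIN_STRENGTH * edge_response(t + mask_lead_ms - GAIN_DELAY_MS)
        with_gain += gain * r_target * dt_ms
        alone += r_target * dt_ms
    return with_gain / alone


if __name__ == "__main__":
    for lead in range(0, 201, 25):
        print(lead, round(enhancement_ratio(lead), 3))
```

In this sketch the ratio exceeds 1 at every mask lead time and peaks when the lead equals the assumed gain delay, which corresponds to a transient brightness enhancement of the target. A stimulus with three edges would need additional gain terms with their own delays, which the sketch does not attempt.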


1. Lateral connectivity and contextual interactions in macaque primary visual cortex.

Authors: Dan D Stettler; Aniruddha Das; Jean Bennett; Charles D Gilbert
Journal: Neuron Date: 2002-11-14 Impact factor: 17.173

2. Simultaneous brightness induction as a function of inducing and test-field luminances.

Authors: E G Heinemann
Journal: J Exp Psychol Date: 1955-08

3. Effects of surround articulation on lightness depend on the spatial arrangement of the articulated region.

Authors: Iris K Zemach; Michael E Rudd
Journal: J Opt Soc Am A Opt Image Sci Vis Date: 2007-07 Impact factor: 2.129

4. Stevens's brightness law, contrast gain control, and edge integration in achromatic color perception: a unified model.

Authors: Michael E Rudd; Dorin Popa
Journal: J Opt Soc Am A Opt Image Sci Vis Date: 2007-09 Impact factor: 2.129

5. Brightness perception and filling-in.

Authors: M A Paradiso; K Nakayama
Journal: Vision Res Date: 1991 Impact factor: 1.886

Review 6. Feedforward, horizontal, and feedback processing in the visual cortex.

Authors: V A Lamme; H Supèr; H Spekreijse
Journal: Curr Opin Neurobiol Date: 1998-08 Impact factor: 6.627

7. The retinex theory of color vision.

Authors: E H Land
Journal: Sci Am Date: 1977-12 Impact factor: 2.142

Review 8. Intensity functions in sensory systems.

Authors: S S Stevens
Journal: Int J Neurol Date: 1967

9. Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual cortex.

Authors: A Grinvald; E E Lieke; R D Frostig; R Hildesheim
Journal: J Neurosci Date: 1994-05 Impact factor: 6.167

10. Synaptic physiology of horizontal connections in the cat's visual cortex.

Authors: J A Hirsch; C D Gilbert
Journal: J Neurosci Date: 1991-06 Impact factor: 6.167
