Competition with and without priority control: linking rivalry to attention through winner-take-all networks with memory.

Svenja Marx¹, Gina Gruenhage, Daniel Walper, Ueli Rutishauser, Wolfgang Einhäuser.

Abstract

Competition is ubiquitous in perception. For example, items in the visual field compete for processing resources, and attention controls their priority (biased competition). The inevitable ambiguity in the interpretation of sensory signals yields another form of competition: distinct perceptual interpretations compete for access to awareness. Rivalry, where two equally likely percepts compete for dominance, explicates the latter form of competition. Building upon the similarity between attention and rivalry, we propose to model rivalry by a generic competitive circuit that is widely used in the attention literature: a winner-take-all (WTA) network. Specifically, we show that a network of two coupled WTA circuits replicates three common hallmarks of rivalry: the distribution of dominance durations, their dependence on input strength ("Levelt's propositions"), and the effects of stimulus removal (blanking). This model introduces a form of memory by forming discrete states and explains experimental data better than competitive models of rivalry without memory. This result supports the crucial role of memory in rivalry specifically and in competitive processes in general. Our approach unifies the seemingly distinct phenomena of rivalry, memory, and attention in a single model with competition as the common underlying principle.

Keywords: attention; binocular rivalry; modeling; psychophysics; vision; winner-take-all network

Year: 2015 PMID: 25581077 PMCID: PMC4376592 DOI: 10.1111/nyas.12575

Source DB: PubMed Journal: Ann N Y Acad Sci ISSN: 0077-8923 Impact factor: 5.691

Introduction

When confronted with complex and potentially ambiguous input, human sensory systems have to deal with two forms of competition. First, different items in the visual field compete for processing resources; second, different possible interpretations of the sensory signal compete for perceptual awareness.

Attention as biased competition

The first form of competition is typically resolved by attention, enhancing one stimulus at the expense of the other.1 This is most evident in the framework of biased competition,2 where attention corresponds to resolving competition by setting biases (i.e., controlling priority) according to task demands.3 Biased competition has become one of the most influential attention models4,5 and is supported by ample physiological evidence: when two stimuli are brought into a cell's receptive field (RF), of which one alone would drive the cell and the other would not, the cell's response to the combined stimulus falls in between the two individual responses, as a consequence of competition. When, however, one stimulus is attended, the neuron's response quickly behaves as if only the attended stimulus were present in the RF; that is, competition is biased in favor of the attended stimulus.6

Attention and memory

Attention and visual working memory are tightly linked.7,8 For example, items held in working memory can interfere with attentional selection and vice versa.9–12 Consistent with such evidence, an early formalization of the biased competition idea, Bundesen's theory of visual attention13 (TVA) and its later neural implementation (neural theory of visual attention, NTVA),14 describes attention as a race of competing items for visual short-term memory. TVA formalizes the interplay of Broadbent's two mechanisms of attention:15 filtering, the mechanism for the selection of items, and pigeonholing, the mechanism to allocate evidence to categories. Since filtering represents the probability of an item to be selected, while pigeonholing represents the probability of a category to be selected, their complementary functions parallel the aforementioned two forms of competition: filtering resolves competition between items; pigeonholing resolves competition between different categories, including different perceptual interpretations. NTVA14 provides a neuronal implementation of these mechanisms that is consistent with physiological data. In NTVA, filtering and pigeonholing are related to specific neural mechanisms, namely the allocation of RFs to select elements and of gain control to select categories. In an extension of the NTVA, a Poisson counter model is used to explain how during visual identification mutually confusable stimuli can be resolved.16 It implies that while the stimulus is analyzed, temporary categorizations are made at a constant Poisson rate. The response is then based on the category that was chosen most frequently. Thereby, the Poisson counter model provides a mechanism by which the interplay of attention and memory can resolve competition between distinct perceptual interpretations of a visual stimulus.

Rivalry as a model for competition

The second form of competition, the competition of perceptual interpretations for awareness, is unavoidable during natural vision. Because of the inherent ambiguity when mapping the outside world on the receptive surface,17 prior knowledge is needed to infer the most likely interpretation. Such prior knowledge can manifest itself in terms of fixed rules about object structure (with Gestalt laws as a prime example18) or be formalized in terms of Bayesian prior distributions,19–23 which may be flexibly adapted to environmental and motor constraints.24 On the basis of sensory input alone, many perceptual alternatives may be equally likely, but the combination of this likelihood with the prior assumptions allows the sensory system to arrive at a unique interpretation of the world. If no sufficiently strong prior information is available to resolve the ambiguity in the input, the system will nonetheless perceive one unique interpretation at any point in time, but the dominant interpretation alternates over time. This phenomenon is referred to as rivalry, which can be induced either through bi- or multistable figures, such as geometrical figures that alternate in three-dimensional interpretation,25,26 figure-ground reversals,27,28 or overlaid patterns that alternate between compound and constituents,29,30 or as “binocular rivalry,” when two sufficiently distinct patterns are presented to either eye31 (for review, see Ref. 32). Most forms of rivalry have several properties in common.33,34 The times that a certain percept dominates follow a leptokurtic (heavy-tailed) distribution35 and respond in a well-defined manner to changes in input strength (Levelt's propositions36).

The role of memory in rivalry

If the stimulus is removed (“blanked”) for a considerable duration (>500 ms) during a rivalry task, the probability that the same perceptual interpretation reemerges after the blank increases substantially.37,38 Thus, blanking stabilizes the percept. In contrast, for short blank durations (<500 ms), the percept tends to destabilize and thus the alternative percept is more likely to emerge after the blank than expected by chance.37 The time course of the blanking effect is reminiscent of a recently proposed “third stage” in visual working memory encoding that protects an item from deletion when its processing takes longer than the completion of a competition epoch,39 and it is tempting to speculate that the stabilization of the blanked percept is a consequence of such protective maintenance. Stabilization of the percept across extended periods of blanking indicates that a form of memory—in this case, the dominant percept before onset of the blank period—plays a role in rivalry. Additional evidence for the role of memory in rivalry comes from experiments with tri-stable rivalry (i.e., a stimulus with three possible percepts). In these experiments, the sequence of states is not Markovian (i.e., previous percepts influence processing of the current perceptual state40). For brief intermittent presentations, the dominant percept is location specific: at a given location of the visual field, the same percept is dominant at onset after blanking throughout.41 Although these biases in onset rivalry are highly variable between observers, they remain stable within the same individual over weeks. This suggests involvement of long-term memory. Taken together, blanking, the non-Markovian property of tri-stable rivalry, and the observer-specific location bias of onset rivalry show that rivalry is influenced by a number of memory processes that operate on a variety of time scales.

A common framework for rivalry and attention as competitive processes—winner-take-all circuits

In neuronal circuits, competition is frequently implemented by winner-take-all (WTA) circuits. WTA behavior emerges if a population of excitatory neurons is recurrently connected to itself and shares a common inhibitory signal42,43 with sufficiently high gain. Such recurrent connectivity is a building block of neocortical circuitry44–46 and is readily implemented in neuromorphic hardware.47 WTA networks can model arbitrary state machines,48 states can remain in the absence of input, and state transitions can be triggered by external input given the current state. WTA circuits have frequently been used in models of attention. The output stage of the saliency map,49 which must select a winning location, is typically implemented as a WTA circuit. More deeply, attention models can be built by cascading WTA circuits50 or by implementing WTA mechanisms between visual filters.51 In a related architecture, Hahnloser et al.52 argue that a recurrently coupled map alone cannot implement attention to a region of the map, but rather propose an excitatory reciprocal coupling between the map and a “pointer” map, whose neurons are more broadly tuned in space. Here, we propose to exploit the structural similarity between rivalry and attention as forms of competition and present a WTA model of rivalry. We start with a generic WTA model rooted in neocortical physiology43 and test the extent to which it replicates the dominance distributions and Levelt's propositions as main hallmarks of rivalry. We then demonstrate that the required memory state emerges from the network's dynamics. The model predicts interactions between blanking duration and input strength that we subsequently test experimentally.

Materials and methods

Modeling

Our aim is to construct a comprehensive model of rivalry that replicates the three key features common to all rivalry processes: leptokurtic dominance distributions, Levelt's propositions, and the role of memory, in particular for the phenomena related to stimulus blanking. We propose that a network consisting of two coupled WTA circuits exhibits all these features. For comparison, we also analyze representatives of other modeling approaches that have been proposed for rivalry by embedding them in the WTA framework. Specifically, we here compare three models of rivalry (Fig. 1), which serve as prototypes for broad classes of rivalry models: first, networks of self-exciting units with mutual inhibition (model 1); second, the same network augmented with an adaptation mechanism in the excitatory units (model 2); third, our new approach, two coupled networks that implicitly form a memory state (model 3). In this section, the models are outlined; for a detailed mathematical description, implementation details, and parameter choices (Table S1), the reader is referred to the methods described in the Supporting Information.

Figure 1

Network models: the three models tested in this study. (A) Model 1: a single WTA circuit; each excitatory unit is recurrently coupled to itself with weight α and to the inhibitory unit with weight β2. In turn, the inhibitory unit is coupled to both excitatory units with weight β1, but not to itself. Input is applied to both excitatory units, and the perceptual states are recorded directly from these units. (B) Model 2: identical to model 1, except that both excitatory units are adapting (see Methods in the Supporting Information for details). (C) Model 3: two WTA circuits, as used in model 1, are coupled by connecting their excitatory units across circuits; all connections between the circuits have the same weight ϕ, but feedback connections cross between the two sets of neurons representing different states. Input is applied to map I, and percepts are recorded from map P.

The fundamental circuit for all three cases is a single WTA network. This network consists of two excitatory units and one inhibitory unit. Each excitatory unit excites itself and projects to a global inhibitory unit, which, in turn, projects back to both excitatory units. Units are mean-rate approximations of the activity of a group of individual neurons, and activity is modeled with respect to average rates rather than individual spike times. Since there is no explicit mapping from the time units in the simulation to real time, we consistently use a unit of 1000 steps of the Euler integration (see Methods in the Supporting Information) as the time unit when reporting modeling data and parameters. Simulations were performed in Matlab (The MathWorks, Natick, MA, USA) on the basis of the code available at http://www.ini.uzh.ch/~urut/DFAWTA (Ref. 48); each condition (combination of input currents) was simulated five times with different random noise patterns.

Model 1: mutual inhibition

For model 1, the fundamental WTA circuit is considered in isolation (Fig. 1A). Even though the inhibitory unit is modeled explicitly, this network corresponds to a network with self-exciting units that mutually inhibit each other,53 except for the delayed inhibition caused by the inhibitory unit. To probe the model, input currents with Gaussian noise are applied to both excitatory units. The value of the input current represents the sensory input corresponding to one of the possible percepts. The activity level of the excitatory units determines which percept is currently dominant. The percept belonging to one unit is considered dominant whenever the activity of the respective unit exceeds double the activity of the other unit. The remaining periods constitute transition periods, which were not considered further in the present context, neither experimentally nor in modeling.
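As an illustration of the circuit just described, the following is a minimal sketch of a two-unit WTA with one shared inhibitory unit, integrated with the Euler method. The rectified-linear rate equation, the time constant of 1, and all parameter values (`alpha`, `beta1`, `beta2`, `noise_sd`, `dt`) are assumptions made for this example; the actual equations and the values of Table S1 are given in the Supporting Information and are not reproduced here. The function name `simulate_wta` is hypothetical.

```python
import random


def simulate_wta(inputs=(6.5, 6.5), alpha=1.2, beta1=3.0, beta2=0.25,
                 noise_sd=1.0, n_steps=20000, dt=0.01, seed=0):
    """Euler-integrate two excitatory units sharing one inhibitory unit.

    Assumed rate equations (illustrative, not the published ones):
        dx_i/dt = -x_i + max(0, I_i + noise + alpha * x_i - beta1 * y)
        dy/dt   = -y   + max(0, beta2 * (x_1 + x_2))
    Returns a list of (x1, x2) activity pairs, one per integration step.
    """
    rng = random.Random(seed)
    x1 = x2 = y = 0.0
    trace = []
    for _ in range(n_steps):
        # Noisy input currents to the two excitatory units.
        in1 = inputs[0] + rng.gauss(0.0, noise_sd)
        in2 = inputs[1] + rng.gauss(0.0, noise_sd)
        # Self-excitation (alpha) minus shared inhibition (beta1), rectified.
        target1 = max(0.0, in1 + alpha * x1 - beta1 * y)
        target2 = max(0.0, in2 + alpha * x2 - beta1 * y)
        # The inhibitory unit is driven by both excitatory units (beta2)
        # and has no self-coupling.
        target_y = max(0.0, beta2 * (x1 + x2))
        x1 += dt * (target1 - x1)
        x2 += dt * (target2 - x2)
        y += dt * (target_y - y)
        trace.append((x1, x2))
    return trace


if __name__ == "__main__":
    # Noise-free run with slightly stronger input to unit 1.
    final_x1, final_x2 = simulate_wta(inputs=(6.5, 6.0), noise_sd=0.0,
                                      n_steps=5000)[-1]
    print(final_x1 > 2 * final_x2)
```

With these assumed weights and no noise, the unit receiving the stronger input suppresses the other, so it counts as dominant under the factor-of-2 criterion. Whether and how often the winner alternates under noise depends on parameters that are not tuned in this sketch.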

Model 2: mutual inhibition with adaptation

Many models of rivalry assume a form of “fatigue” or habituation; that is, if a percept has been dominant for some time, its representation fatigues and thus the other percept becomes dominant. On a neuronal level, the equivalent of such fatigue is neuronal adaptation. In model 2, we implement adaptation by adding an additional term to each excitatory unit (Fig. 1B). Otherwise, model 2 is identical to model 1. This results in a model of mutual inhibition with adaptation, akin to the model used in Ref. 54.

Model 3: two coupled circuits, implicit memory state

When the external input is removed from a single WTA network, its activity relaxes back to zero and it therefore has no memory. As this is in conflict with experimental evidence, in particular with the increased survival probability of a percept after prolonged stimulus removal (blanking), we consider a third model that implicitly implements a memory state in its dynamics. To do so, we couple two of the WTA circuits as used in model 1 (Fig. 1C). One of the circuits (I, with excitatory units i1 and i2) represents the input layer, while the other (P, with excitatory units p1 and p2) represents the perception layer, from which activity is “recorded.” Importantly, the feedforward connections from the units of the input circuit (i1 and i2) project to the corresponding perception units (p1 and p2, respectively), while the feedback projections map onto the input unit corresponding to the alternate percept (p2 to i1 and p1 to i2). This network can maintain its current winner (state), even if the external input is removed.43 Such persistent activity that is maintained in the absence of external input endows the network with a memory, because the state active during the removal of external input is maintained (see Ref. 48 for details). In the context of rivalry, this makes the percept after the blank period has ended conditional on the state (i.e., the percept) before stimulus removal.

Model input; simulation of blanking

In typical simulation runs, constant input with Gaussian noise is supplied to both input neurons (model 1/2: u1 and u2; model 3: i1 and i2) of the network. Stimulus strength is set by the mean current applied. For simulating stimulus removal and reappearance (blanking), we in addition model a sensory neuron that is located upstream (i.e., lower in the visual hierarchy) to the rivalry circuit. This is done by modulating the injected current accordingly: at stimulus onset, the current transiently rises to thrice the sustained value, followed by a rapid exponential decay to the sustained value (α function t/τ × exp(1 − t/τ), with time constant τ = 0.025). Stimulus offset is modeled by the current decaying to baseline level (activity of 0.1) in the shape of a hyperbolic tangent (half life: 0.080). Stimulus durations were fixed to 1.0 and blank duration was varied between 0.1 and 2.0 in steps of 0.1 (all times in units of simulated time as defined above).
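The onset transient and offset decay described above can be written out as a short sketch. How the α function is scaled so that the current peaks at three times the sustained value, and how the stated half life maps onto the hyperbolic tangent, are not specified in this section. The forms below are assumptions chosen only to match the stated numbers, and the function names are hypothetical.

```python
import math


def onset_current(t, sustained, tau=0.025):
    """Current at time t after stimulus onset (t in simulated time units).

    Assumption: the alpha function t/tau * exp(1 - t/tau), which peaks at 1
    when t == tau, is added with weight 2 on top of the sustained level, so
    the peak current equals three times the sustained value.
    """
    alpha = (t / tau) * math.exp(1.0 - t / tau)
    return sustained * (1.0 + 2.0 * alpha)


def offset_current(t, sustained, baseline=0.1, half_life=0.080):
    """Current at time t after stimulus offset.

    Assumption: the current falls from the sustained level toward baseline
    as 1 - tanh(k * t), with k chosen so that half of the drop has occurred
    at t == half_life.
    """
    k = math.atanh(0.5) / half_life
    return baseline + (sustained - baseline) * (1.0 - math.tanh(k * t))


if __name__ == "__main__":
    for t in (0.0, 0.025, 0.1, 0.5):
        print(t, onset_current(t, sustained=5.0))
    for t in (0.0, 0.08, 0.5):
        print(t, offset_current(t, sustained=5.0))
```

A sustained value of 5.0 is used here purely as an example input strength.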

Analysis of modeling data

To mimic an instruction in which observers report exclusive dominance, we define a percept to be dominant whenever the respective unit's activity exceeds the other unit's activity by at least a factor of 2. Periods in which none of the percepts are dominant according to this definition are defined as transition periods. We define the dominance duration of a percept as the time from its onset to its offset irrespective of whether the same or another percept follows. For further analysis, we excluded every first and last dominance duration since these are not restricted by the network dynamics but by onset and offset of the simulation. Since dominance durations within a trial cannot be expected to follow a Gaussian distribution, we quantify the distribution of dominance durations in each condition by the distribution's median. For comparison to the experimental data and among models, we normalize all dominance durations by the condition with highest input to both eyes. For quantification of the dominance of one state, which is independent of the respective median dominance duration, all dominance durations of this percept obtained in one simulation period were added. Denoting the resulting sums for the two percepts as D1 and D2, respectively, we define the relative dominance as:

$$\text{relative dominance} = \frac{D_2 - D_1}{D_1 + D_2}$$
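The dominance criterion and the summary measures of this subsection can be sketched as follows. The relative dominance is computed as (D2 − D1)/(D1 + D2); this form is an assumption adopted here because it matches the properties stated in the Results (zero for equal dominance, positive when percept 2 dominates, about ±0.5 when one percept dominates 75% of the time). Function names and the toy activity values are hypothetical.

```python
from statistics import median


def dominance_labels(a1, a2, factor=2.0):
    """Label each sample 1 or 2 if that unit exceeds `factor` times the
    other unit's activity, and 0 (transition) otherwise."""
    labels = []
    for x1, x2 in zip(a1, a2):
        if x1 > factor * x2:
            labels.append(1)
        elif x2 > factor * x1:
            labels.append(2)
        else:
            labels.append(0)
    return labels


def dominance_runs(labels, dt=1.0):
    """Return (percept, duration) for each uninterrupted dominance period,
    dropping the first and the last one, which are cut by simulation
    onset and offset."""
    runs = []
    start = 0
    for k in range(1, len(labels) + 1):
        if k == len(labels) or labels[k] != labels[start]:
            if labels[start] != 0:
                runs.append((labels[start], (k - start) * dt))
            start = k
    return runs[1:-1]


def relative_dominance(runs):
    """(D2 - D1) / (D1 + D2), with D1 and D2 the summed dominance durations."""
    d1 = sum(d for p, d in runs if p == 1)
    d2 = sum(d for p, d in runs if p == 2)
    if d1 + d2 == 0:
        raise ValueError("no dominance periods available")
    return (d2 - d1) / (d1 + d2)


if __name__ == "__main__":
    a1 = [5, 5, 1, 1, 1, 1, 3, 5, 5, 1, 1]
    a2 = [1, 1, 5, 5, 5, 5, 3, 1, 1, 5, 5]
    runs = dominance_runs(dominance_labels(a1, a2))
    print(runs, median(d for _, d in runs), relative_dominance(runs))
```

Two dominance periods of the same percept that are separated by a transition period are counted as separate durations, in line with the definition "irrespective of whether the same or another percept follows."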

Analysis of blanking

To quantify the effect of blanking, we define a percept's survival probability as the number of blanking intervals for which the dominant percept before and after the blank was identical divided by the total number of blanks. To closely match our experimental instructions and to account for human reaction time, the dominant percept for each presentation interval is measured 0.1 time units after the onset of the respective presentation. Similarly, we define a switch probability across blanks as the number of blanking intervals for which the dominant percept before and after the blank was different, divided by the total number of blanks. Since, occasionally, observers (and models) do not report a dominant percept during a presentation interval, survival probability and switch probability do not necessarily add up to 1 and the difference of their sum and 1 quantifies such failures to report. Unlike in behavior, the recorded units signal a percept even during the blanking period. Such “hallucinations” can easily be suppressed by an additional downstream gating mechanism that allows a percept to reach awareness only if input is present. Here, we do not model this explicitly, but merely ignore the period of the blank as such for further analysis.
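A minimal sketch of these two measures, under the assumption that the percept around each blank has already been coded as 1, 2, or 0 (no dominant percept reported); the coding scheme and the function name are hypothetical.

```python
def blank_probabilities(before, after):
    """Survival probability, switch probability, and failure-to-report rate.

    `before[k]` and `after[k]` hold the dominant percept (1 or 2) in the
    presentation interval preceding and following blank k, or 0 if no
    dominant percept was reported in that interval.
    """
    if len(before) != len(after) or not before:
        raise ValueError("need two equally long, non-empty sequences")
    n_blanks = len(before)
    survived = sum(1 for b, a in zip(before, after) if b != 0 and a == b)
    switched = sum(1 for b, a in zip(before, after)
                   if b != 0 and a != 0 and a != b)
    survival = survived / n_blanks
    switch = switched / n_blanks
    # Whatever is neither a survival nor a switch is a failure to report.
    return survival, switch, 1.0 - survival - switch


if __name__ == "__main__":
    print(blank_probabilities([1, 1, 2, 2, 0], [1, 2, 2, 0, 1]))
```

Both probabilities share the total number of blanks as denominator, which is why their sum can fall below 1.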

Behavioral experiments

Observers

Five observers (age: 21–26 years; four female) participated in experiment 1; five (age: 24–26 years, three female) participated in experiment 2. Two observers participated in both experiments. All had normal or corrected-to-normal vision and were naive to the purpose of this study. All gave written consent before the experiment. The experiments conformed with the Declaration of Helsinki and were approved by the local ethics committee (Ethikkommission FB04).

Stimuli

Each eye was presented with one sinusoidal grating (3.4 cycles per degree; mean luminance = 25.1 cd/m²), oriented +45° in one eye and −45° in the other. Gratings had full contrast in a circular patch of 0.3° radius outside of which contrast fell off with a Gaussian profile (SD = 0.11°). To facilitate fusion, the patch was surrounded by an alignment annulus (radius = 1°, width = 0.06°) of white noise of the same mean luminance. The contrasts of the gratings were adjusted to each individual's detection threshold, which was defined as the 75% correct level as identified by a 2AFC QUEST55 procedure. For none of our observers was there any significant difference in threshold between their eyes. Similar to the methods described in Ref. 56, the lowest contrast used was 0.75 log10 units above this threshold; the highest contrast was 100% Michelson contrast, and the four contrast levels in between were logarithmically spaced. This defines six contrast levels in each individual, which hereafter are referred to as contrast 1 to contrast 6.
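The construction of the six contrast levels can be illustrated with a small calculation. The sketch assumes that "0.75 log10 above threshold" means a contrast 10^0.75 times the threshold contrast and that contrast is expressed as a fraction (1.0 = 100% Michelson contrast). The threshold value used in the example is hypothetical, as is the function name.

```python
import math


def contrast_levels(threshold, n_levels=6, log_units_above=0.75):
    """Logarithmically spaced contrasts from 10**log_units_above times the
    detection threshold up to full contrast (1.0)."""
    lowest = threshold * 10 ** log_units_above
    if not 0.0 < lowest < 1.0:
        raise ValueError("lowest contrast must lie between 0 and 1")
    log_low = math.log10(lowest)
    step = (0.0 - log_low) / (n_levels - 1)   # log10(1.0) == 0
    return [10 ** (log_low + k * step) for k in range(n_levels)]


if __name__ == "__main__":
    # Hypothetical observer with a detection threshold of 1% contrast.
    for level, contrast in enumerate(contrast_levels(0.01), start=1):
        print(f"contrast {level}: {100 * contrast:.1f}%")
```

Because the spacing is logarithmic, adjacent levels differ by a constant ratio rather than a constant difference.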

Apparatus

Stimuli were presented separately to each eye through a stereoscope on two 21-in. Samsung Syncmaster CRT screens at a viewing distance of 30 cm by an Optiplex Dell computer running Matlab with a Psychophysics toolbox extension (http://psychtoolbox.org; Refs. 57,58). Each screen had a resolution of 1280 × 1024 pixels, a refresh rate of 85 Hz, and was γ corrected to achieve the same linear mapping of pixel values to stimulus luminance in both eyes. Eye position was monitored throughout the experiment by an Eyelink-1000 (SR Research, Osgoode, ON, Canada) eye-tracking device, and the device was calibrated at the onset of each trial; in the context of this study, however, eye-tracking data are not considered further.

Procedure: experiment 1

After identifying the individual's detection threshold, observers performed 72 2-min experimental trials, chunked into 12 blocks of 6 trials. For these trials, each of the six contrast levels was combined with any other level and this was done for both possible assignments of grating orientation to eye (72 = 6 × 6 × 2). Order of these trials was random. To control for changes in overall behavior and for normalization purposes, between experimental trials 3 and 4 of each block, an additional control trial that used full-contrast stimuli in each eye (contrast level 6) was inserted. For these 12 control trials, we did not observe any significant change for any individual in any measure of interest. To have an equal number of experimental trials in each condition, in the main analysis control trials were used only for normalization of dominance durations and not analyzed otherwise. Participants reported their current percept by pressing and holding one of two buttons on a game pad. They were instructed to fixate the gratings throughout an experimental session, to report a percept only when it appeared clearly dominant, and to refrain from any button press when both percepts appeared about equal.
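The trial structure of experiment 1 can be made concrete with a short sketch that builds one randomized schedule. The tuple layout, the labels for the two orientation-to-eye assignments, the use of a seeded shuffle, and the function name are all assumptions for illustration; the source does not describe how the random order was generated or which orientation assignment the control trials used.

```python
import itertools
import random


def build_schedule(n_levels=6, block_size=6, seed=0):
    """Return a list of blocks; each block holds `block_size` experimental
    trials plus one full-contrast control trial between trials 3 and 4.

    An experimental trial is (left_contrast, right_contrast, assignment),
    with `assignment` naming which eye sees the +45 degree grating.
    """
    rng = random.Random(seed)
    levels = range(1, n_levels + 1)
    assignments = ("+45 left", "+45 right")   # hypothetical labels
    trials = list(itertools.product(levels, levels, assignments))
    rng.shuffle(trials)
    control = (n_levels, n_levels, "control")
    blocks = []
    for start in range(0, len(trials), block_size):
        block = trials[start:start + block_size]
        block.insert(3, control)   # after experimental trial 3
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    schedule = build_schedule()
    print(len(schedule), "blocks of", len(schedule[0]), "trials")
    print(schedule[0])
```

With the default arguments this yields 12 blocks of 7 trials, that is, 72 experimental trials and 12 control trials.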

Procedure: experiment 2

In the second experiment, each trial started with 90 s of continuous presentation of the rivalrous stimuli, followed by a 180-s period of intermittent presentation and another 90 s of continuous presentation. In the intermittent part, stimuli were repeatedly presented for 0.5 s and removed for a fixed period of blanking. Across trials, three different contrasts (levels 2, 4, 6 to both eyes) and four different blanking durations (0.5, 1, 2, and 4 s) were used, resulting in 12 trials per participant. Since presentation duration was short, participants were instructed to press the button indicating their percept only once during each presentation period or shortly afterward and press no button during the blanking periods. Otherwise, the procedure was identical to experiment 1.

Data analysis

Akin to the analysis of the modeling data, we define each period during a trial in which exactly one button was pressed as a dominance period for the respective percept, and other periods as transition periods. In each trial, the time before the first button press and the last dominance period, which the trial end interrupted, were excluded from analysis. To normalize for interindividual differences in group analyses and comparisons to modeling, all dominance durations were divided by the median dominance duration of the 12 control trials with full contrast to both eyes. Relative dominance, dominance durations, and switch rate are then defined analogously to the analysis of the modeling data. Since we did not observe any differences between grating orientations, we pooled dominance durations across orientations. For the analysis of relative dominance and switch rate, we separate by left and right eye, resulting in an effective 6 × 6 design with six levels for left-eye contrast and six levels for right-eye contrast. Since we did not observe any eye to be preferred for any observer, analysis of median dominance durations, where the contrast to the eye whose dominance duration is considered (ipsilateral eye) has to be distinguished from the other eye (contralateral eye), is pooled across eyes.

Results

To compare different computational models of rivalry, we simulated three different networks of increasing complexity. Model 1 is a single WTA circuit with mutual inhibition and noisy input; model 2 adds adaptation; and model 3, by combining two WTA circuits, an implicit memory state. We assessed each model according to three hallmarks of rivalry: dominance distributions, Levelt's propositions, and the effects of periodic stimulus removal, and compared the predictions to new experimental data. To simulate rivalry, noisy external inputs were provided to units u1 and u2 in the single WTA cases or to units i1 and i2 of the circuit I in the double WTA case, respectively. Input strength was modeled by adjusting the mean of the input currents and adding Gaussian noise of constant standard deviation. In the behavioral experiments, input strength was given by the log contrast of the stimulus relative to the individual's threshold.

Example data, dominance durations, and dominance

In the experimental data, all observers experienced rivalry, and valid dominance data (exactly one button pressed, no interruption by trial end) were obtained for 85.8% of the total time (range across observers and conditions: 76.3–95.4%). The remaining time consists of periods of mixed percepts, transition periods, and discarded data at the beginning and end of the trial. All three model networks show bistable behavior (Fig. 2), which allows us to define periods of perceptual dominance. Akin to the experimental instruction to report a percept only if it is clearly dominant, we define a percept to be dominant in simulation if its unit's activity exceeds twice the activity of the other unit. Using this criterion and the same end-of-trial exclusion as in the experimental data, we can, again averaged over all conditions, identify a dominant percept for 68.6% (range: 0–95.6%) of time for model 1, 87.3% (81.8–91.1%) of time for model 2, and 89.9% (88.8–91.1%) for model 3. Except for model 1, in which for some asymmetric input conditions one percept is dominant throughout, yielding no valid data, the amount of usable data is therefore comparable to the experimental situation.

Figure 2

Raw activity. Activity traces for the three models when noisy input (strength 6.5 for models 1 and 2 and strength 5.5 for model 3) is applied to both eyes (see Methods in the Supporting Information for units). All models show bistable behavior, with the excitatory units (blue, red) alternating in dominance. The currently dominant percept, according to the definition used throughout, is indicated by the red and blue bars on top of each plot for models 1 and 2, and for the percept units of model 3. Green trace represents the activity of the inhibitory unit.

Distribution of dominance durations

To address the most typical rivalry situation in which both percepts are about equally strong, we first consider the conditions in which stimuli of the same strength were presented (symmetric input). This was done by injecting currents of the same mean into each input unit (simulation) or presenting stimuli of the same contrast to each eye (experiment). While absolute values of dominance durations and the spread of distributions typically vary largely between individuals and rivalry type,59 nearly all rivalry types exhibit leptokurtic (heavy-tailed) dominance durations. Our experiment 1 confirms this tendency, showing leptokurtic dominance distributions for all contrast levels tested (minimal kurtosis: 6.3, with values larger than 3 implying leptokurtic distributions; Fig. 3). Across all symmetric input conditions, all models show a kurtosis larger than 3, with minimum values over conditions of 5.1 (model 1), 9.9 (model 2), and 4.2 (model 3). However, models 1 and 2 show an abundance of short dominance durations (Fig. 3, left panels and respective insets) as compared to model 3 (and to a lesser extent our experimental data). Nonetheless, all models qualitatively replicate the leptokurtic distribution of dominance durations that is common to nearly all rivalry phenomena.
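The kurtosis criterion used above can be illustrated with a short sketch. It assumes the non-excess (Pearson) definition, the fourth central moment divided by the squared variance, under which a Gaussian has a kurtosis of 3; whether the authors applied a bias correction is not stated in this section. The function name and the toy data are hypothetical.

```python
def kurtosis(values):
    """Pearson kurtosis m4 / m2**2 (population moments, no bias correction).

    Values above 3 indicate a leptokurtic (heavy-tailed) distribution.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / m2 ** 2


if __name__ == "__main__":
    light_tailed = [-1.0, 1.0] * 50
    heavy_tailed = [0.0] * 98 + [10.0, -10.0]
    print(kurtosis(light_tailed), kurtosis(heavy_tailed))
```

In the toy data, a few extreme values among many small ones push the kurtosis far above 3, which is the pattern a heavy-tailed set of dominance durations would produce.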

Figure 3

Dominance distributions. Example distributions of dominance durations for the three models and experiment 1 for a medium input strength. Modeling data are based on a single simulation run and experimental data on a single individual. Dominance durations are pooled over both percepts. Insets depict finer resolution for the left-most bin (model 1) or two left-most bins (model 2), corresponding to five time units to ease comparison with model 3.

Levelt's propositions

For a more detailed analysis of the dependence of rivalry on input strength, we consider situations in which both input strengths are varied independently. For a broad range of rivalry phenomena, the dependence of dominance, dominance durations, and switch rates on the two input strengths then follows certain rules, typically referred to as Levelt's propositions.36 Here, we test the extent to which our models reproduce Levelt's propositions and again compare the data to our experimental observation.

Levelt's first proposition: increase of stimulus strength in one eye will increase the predominance of the stimulus

For all models and the experimental data, we calculate the relative dominance of each combination of input strengths (input currents or contrast levels, respectively). By definition, a relative dominance of 0 corresponds to equal dominance of either percept, positive values to dominance of percept 2 or the right eye, and negative values to dominance of percept 1 or the left eye. Consistent with Levelt's first proposition, we find relative dominance to increase when input to the right eye or the corresponding input unit u2 or i2 is increased, to decrease when input to the left eye (or unit u1 or i1) is increased, and to fall around 0 when the input to both is the same (Fig. 4A). Quantitatively, however, there are substantial differences: model 1, the single WTA circuit, only has a narrow band around equal input strength in which dominance does not get stuck at the extreme. When input is applied asymmetrically, there is no mechanism to release the nondominant state from suppression as soon as noise becomes negligible. Adaptation in model 2 counters this effect, and the extremes are approached in a shallower fashion. Importantly, a qualitatively very similar behavior is observed for the double WTA network of model 3, even though there is no explicit adaptation mechanism at the level of an individual unit. The experimental data also show a broad range and smooth variation as do models 2 and 3. Unlike those models, however, experimental data reach the extremes of full dominance, while these models do not exceed a relative dominance of about ±0.5 (i.e., one input dominating for 75% of time) for the input range tested. Nonetheless, the double WTA (model 3) and the single WTA with adaptation (model 2) similarly capture the smooth transition of relative dominance from one eye to the other when input strength is changed.

Figure 4

Levelt's propositions. (A) Levelt's first proposition tested for the three models and data of experiment 1; relative dominance is color coded individually per panel. In the panel for model 1, some simulations are stuck within the same state throughout and, because the last period is excluded in all analyses, no data are available there (indicated in gray). (B) Levelt's second proposition: log dominance duration for one eye (ipsilateral eye) while input strength to this eye and to the other eye (contralateral eye) are varied independently. Data are collapsed over both eyes (left/right) or units (i1/i2, p1/p2). Log scale is used for illustration; correlations are computed on the original data. (C) Levelt's third and fourth propositions: dependence of switch rate on input strength to either eye.

Levelt's second proposition: increase of stimulus strength in one eye will not affect the mean dominance time for the same eye

Of Levelt's four propositions, the second is arguably the most counterintuitive and has been challenged recently.56,60 The resulting revised version of this proposition states that “changes in contrast of one eye affect the mean dominance duration of the highest contrast eye.”56 To analyze our models and data with respect to Levelt's second proposition, we plot the dominance duration of one eye/input unit as a function of the input of this (ipsilateral) eye/input unit and the other (contralateral) eye/input unit (Fig. 4B). For analysis, we fix one input strength and vary the other (i.e., we proceed either along rows or columns of the panels in Fig. 4B). We can then consider the dominance durations of either the “fixed” input or the “variable” input. For illustration, Figure S1 shows some of the data of Figure 4B in this representation. The modified version of Levelt's second proposition predicts that the median dominance duration of the percept receiving higher input strength should vary most. In the extreme cases of highest and lowest fixed input strength, this would result in decreasing dominance durations of the percept receiving fixed input strength in the first case and in increasing dominance durations of the percept receiving variable input strength in the second, while the dominance durations of the respective other percept remain stable. At lowest fixed input strength, all networks show qualitatively the same behavior as the experimental data (Fig. S1A); namely, the median dominance duration of the percept receiving variable input increases with increasing input strength, while the median dominance duration of the percept receiving fixed input strength stays largely constant. The simulated and experimental data are thus in line with the modified version of Levelt's second proposition.

However, when the fixed input strength is increased, the single WTA models behave differently from the double WTA model: only the double WTA model is consistent with the experimental data (Fig. S1B and C). To quantify this, we compute correlations between fixed input strength and median dominance durations for each level of variable input strength and vice versa (i.e., we compute correlations within either each row or column of the panels in Fig. 4B). For experiment and model 3, correlations between input strength and median dominance durations of the percept receiving variable input are strictly positive and significant for all input strengths (double WTA: all r(9) > 0.96, all P < 3.7 × 10^−6; experiment: all r(4) > 0.90, all P < 0.019). In contrast, dominance durations of the input receiving fixed input strength and input strength to the variable input are correlated negatively for all input strengths (double WTA: r(9) < −0.96, P < 1.82 × 10^−5; experiment: r(4) < −0.89, P < 0.014). The single WTA without adaptation (model 1) still tends toward a negative correlation for the fixed input and a positive correlation for the variable input, even though these are not always significant (fixed: r(9) < −0.50, P < 0.11; variable: r(9) > 0.61, P < 0.045). In contrast, the single WTA with adaptation (model 2) shows positive correlations between median dominance durations and input strength for both the percept receiving variable input and the percept receiving fixed input (all r(9) > 0.85, all P < 0.00089), which is, for the variable input, the exact opposite of the experimental observation. Hence, models 1 and 3 replicate the modified version of Levelt's second proposition, whereas a single WTA with adaptation (model 2) shows qualitatively different behavior.
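The row-wise and column-wise correlation analysis described above can be sketched as follows. This is an illustrative outline only: the grid layout (`durations[i][j]` holding the median dominance duration at fixed-input level i and variable-input level j) and all function names are assumptions, and the Pearson coefficient is written out by hand so that the example is self-contained.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlations_along_rows(strengths, durations):
    """One r per row: input strength vs. median dominance duration.

    Each row holds one input fixed while the other runs through `strengths`
    (hypothetical layout, assumed for illustration).
    """
    return [pearson_r(strengths, row) for row in durations]


def correlations_along_columns(strengths, durations):
    """Same analysis with the roles of rows and columns exchanged."""
    return [pearson_r(strengths, list(col)) for col in zip(*durations)]
```

If r(9) and r(4) follow the common convention of reporting n − 2 degrees of freedom, each correlation would be based on 11 input levels in simulation and 6 contrast levels in the experiment; this reading is an inference from the notation rather than a statement from the text.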

Levelt's third proposition: increase of stimulus strength to one eye will increase the alternation frequency

Levelt's third and fourth propositions are closely linked. Both make predictions about the alternation frequency (here: switch rate) when input strength is varied. The switch rate is the number of switches in dominance per unit time (simulation time or seconds). For the present purpose, every transition from one percept to the other, or from one percept to the same percept with a transition period in between (same-state transition), is considered a switch. Levelt's third proposition states that when the input strength or contrast to one unit/eye is fixed, increasing the input strength/contrast to the other results in a higher switch rate. In its revised version,60 the proposition instead states that switch rate is “maximal at and symmetric around equi-dominance.” Our experimental data, which reach up to high-contrast levels, confirm the revised version of the proposition (Fig. 4C, right). Model 1 replicates this property, but switch rates rapidly drop to 0 when leaving equi-dominance (Fig. 4C, left). The single WTA with adaptation (model 2), in turn, does not show the symmetry around equi-dominance (Fig. 4C, second panel). In contrast, the double WTA (model 3) shows a distribution of switch rates that is symmetric (Fig. 4C, third panel) and maximal around equi-dominance. Hence, only model 3 qualitatively captures the revised version of the third proposition and is in line with the experimental data.
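The switch definition given above, which counts same-state transitions as switches, can be made concrete with a short sketch. The per-sample label encoding (1 or 2 for a dominant percept, 0 for a transition period without clear dominance) and the function names are assumptions for illustration; the exclusion of the last period used in the analyses is not modeled here.

```python
def count_switches(labels):
    """Count switches in a sequence of per-sample dominance labels.

    Labels: 1 or 2 for a dominant percept, 0 for no clear dominance
    (a transition period). This encoding is assumed for illustration.
    """
    switches = 0
    last_dominant = None  # percept of the most recent dominance period
    previous = 0          # label of the previous sample
    for label in labels:
        if label != 0 and label != previous:
            # A new dominance period starts at this sample.
            if last_dominant is not None:
                # Counted whether the percept changed or the same percept
                # returned after a transition (same-state transition).
                switches += 1
            last_dominant = label
        previous = label
    return switches


def switch_rate(labels, sample_duration):
    """Switches per unit time (simulation time or seconds)."""
    total_time = len(labels) * sample_duration
    if total_time <= 0:
        raise ValueError("need a non-empty trace and a positive sample duration")
    return count_switches(labels) / total_time
```

For the label sequence `[1, 1, 0, 2, 2, 0, 2, 1]`, this counts three switches: 1 to 2, the same-state transition 2 to 2 across a transition period, and 2 to 1.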

Levelt's fourth proposition: increase of stimulus strengths in both eyes will increase the alternation frequency

This proposition predicts an increase of switch rate when stimulus strength is increased simultaneously in both units/eyes, which would be reflected by an increase of switch rate along the diagonals in Figure 4C. Again, models 1 and 2 deviate qualitatively from this prediction by showing a decrease along the diagonal toward increasing input strength (model 1: r(9) = −0.94, P = 2.2 × 10^−5; model 2: r(9) = −0.98, P = 1.0 × 10^−7). In contrast, model 3 qualitatively captures the increase with increasing input strength, which we also observe in our experimental data (model 3: r(9) = 0.67, P = 0.024; experimental data: r(4) = 0.97, P = 9.4 × 10^−4). In sum, even though models 1 and 2 capture some aspects of rivalry, only model 3 is, at least on a qualitative level, in line with the experimental observations.

Blanking

Another key phenomenon of rivalry is blanking: after the stimulus is removed intermittently for a sufficiently long time (>500 ms), the percept stabilizes,37,38 whereas it destabilizes when the blanking duration is shorter than about 500 ms.37 Stability here means that the same percept is dominant before and after the blank (see Methods). Blanking is an example of the involvement of memory in rivalry. The double WTA model (model 3) has memory for the current percept (its state). We next test this model's ability to replicate the main features of blanking. Experimentally (experiment 2) and in simulation (Fig. 5A), we vary blanking duration and input strength (Fig. 5B). Example traces of activity in model 3 already indicate that the model may replicate the tendency for longer blank durations to lead to more stabilization (Fig. 5C), in line with the experimental example (Fig. 5D).

Figure 5

Blanking, model output. (A) Time course of an experimental blanking trial. Blank intervals are not to scale. (B) Input function for modeling blanking in the models; typical example with added noise. At onset there is a steep rise with exponential decay to the sustained activity; at offset a smooth relaxation to baseline. (C) Example traces of model 3 for three different blanking durations. Blue and red curves correspond to neurons p1 and p2, respectively. (D) Experimental data for blanking in a single subject. Percept changes more frequently in the absence of blanks (<90 s, >270 s), and stabilization depends on blank duration.

Quantitatively, we investigate blanking with respect to survival probability: the number of times a percept reemerges after the blank divided by the number of all blanks (Fig. 6A). This number would be 0 if percepts perfectly alternated, 1 if the same percept were always present, and 0.5 if alternations were random (as there is no bias toward either percept in simulation or experiment). As a consequence of the definition of dominance in simulation and the instruction to report a percept only when it was clearly dominant, a dominant percept is not always identifiable during the presentation (especially for models 1 and 2). Hence, we also analyze switch probability as the fraction of blanks after which the other percept emerges (Fig. 6B). The difference between 1 and the sum of switch and survival probability (Fig. 6C) provides the fraction of unidentifiable transitions through a blanking period. Not surprisingly, the two single WTA models (models 1 and 2) do not replicate the blanking phenomenon. Once the input has decayed (cf. Fig. 5B), no information about the preceding state is left, and switch and survival probability are similar (Fig. 6A and B, left columns). In addition, there are many situations (up to 67.2%; Fig. 6C, left columns) in which the presentation time does not allow for a clear dominant percept to emerge after a blank. In contrast, the double WTA model (model 3) replicates the increase of survival probability with increasing blanking duration (Fig. 6A, third panel) and the corresponding decrease of switch probability across the blank (Fig. 6B, third panel). In addition, there are fewer presentations (up to 25.2%) during which a dominant percept cannot be identified, and these situations occur mainly at short blanking durations (Fig. 6C). This is in line with the experimental data, where no dominant percept was reported in up to 18.8% of the total experiment time. This happened primarily at short blanking durations, possibly due to the short time between presented stimuli. The model makes an important further prediction, namely that survival probability should decrease with stronger input. Our experimental data (Fig. 6, right column) confirm this prediction, at least qualitatively.
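The three blanking measures defined above (survival probability, switch probability, and the fraction of unidentifiable transitions) can be computed as in the following sketch. The input encoding, a list of (before, after) percept pairs with `None` marking a presentation without an identifiable dominant percept, is a hypothetical choice made for illustration.

```python
def blanking_probabilities(blanks):
    """Survival, switch, and unidentified fractions over all blanks.

    `blanks` is a list of (before, after) pairs, one per blank period.
    Each entry is 1 or 2 for an identified dominant percept, or None if no
    dominant percept could be identified (hypothetical encoding).
    """
    n = len(blanks)
    if n == 0:
        raise ValueError("need at least one blank")
    # Same percept before and after the blank.
    survived = sum(1 for b, a in blanks if b is not None and b == a)
    # Both percepts identified, but they differ across the blank.
    switched = sum(1 for b, a in blanks
                   if b is not None and a is not None and b != a)
    survival = survived / n
    switch = switched / n
    # Remaining blanks: transition could not be classified.
    unidentified = 1 - (survival + switch)
    return survival, switch, unidentified
```

With four blanks `[(1, 1), (1, 2), (2, 2), (None, 1)]`, the sketch returns a survival probability of 0.5, a switch probability of 0.25, and an unidentified fraction of 0.25.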

Figure 6

Blanking, model results. (A) Survival probability, (B) switch probability, and (C) their sum for the three models and the data of experiment 2. Different line colors indicate different input strengths (consistent within each column as given in the top-row panels).

Discussion

In this paper, we argue that rivalry can be understood as the result of a competition, just as attention can be understood as competition with priority control. WTA networks have been suggested as models of attention,52,61,62 and combining two WTA circuits results in networks that have memory states.48 Here, we demonstrated that a WTA model with state dependence replicates all key features of rivalry. In contrast, we found that simpler models without a memory state were unable to reproduce key aspects of rivalry. In particular, stateless models were unable to reproduce the phenomenon of blanking. We conclude that memory plays an important role in competitive processes. Our model provides a first approach to how rivalry, attention, and memory can be integrated into a single neuronally motivated model.

Limits of the present model

The present model was constructed to reproduce the key aspects common to nearly all forms of rivalry. As such, the model does not reproduce each and every aspect of any given rivalry experiment. In particular, we did not explicitly model time constants in a quantitative fashion. Furthermore, the input strength will depend on experimental details (as does the definition of what constitutes input strength in the first place63), and in the case of blanking, the survival probabilities can in some cases take far lower values than those found in our simulations and experiments. Models of a specific rivalry phenomenon would then have to include the upstream sensory circuits that realistically represent the input, whereas here we made only the reasonable but simplifying assumption that log stimulus contrast maps linearly to input currents. A specific model would also need to include the motor representation of the effector used to report the percept64,65 and a notion of the rivalry stimulus’ spatial extent to capture the spatial dynamics of dominance transitions.66,67 Unlike in the experimental data and in contrast to their excess in models 1 and 2, extremely short dominance durations are absent for the double WTA model. To some extent this is a tradeoff between switching and memory, and to some extent it is a consequence of our definition of dominance (twice the other activity). This criterion was chosen to mimic the notion of (near) exclusive dominance in the experimental condition, and indeed, periods of no report were similar in frequency in model 3 and the experiment. While this is clearly a limitation of the present model, which has no natural mapping of its time axis to experimental time, a more detailed downstream readout and modeling of the spatial distribution of dominance at any given point in time will presumably allow relaxation of this criterion. All these restrictions notwithstanding, with the double WTA we succeeded in modeling key properties that are common to all forms of rivalry in a single model: leptokurtic distributions, Levelt's propositions, and blanking.
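Two of the simplifying choices named in this section, the dominance criterion (twice the other activity) and the linear mapping from log stimulus contrast to input current, can be written down compactly. The sketch below is illustrative only: the strict inequality, the return encoding (1, 2, or 0 for no clear dominance), and the `slope` and `offset` parameters are assumptions, not values or conventions taken from the study.

```python
import math


def dominant_percept(activity_p1, activity_p2, ratio=2.0):
    """Return 1 or 2 if that unit's activity exceeds `ratio` times the
    other's activity, and 0 if neither unit is clearly dominant."""
    if activity_p1 > ratio * activity_p2:
        return 1
    if activity_p2 > ratio * activity_p1:
        return 2
    return 0


def contrast_to_current(contrast, slope, offset):
    """Input current as a linear function of log stimulus contrast.

    `slope` and `offset` are placeholder parameters for illustration.
    """
    return slope * math.log(contrast) + offset
```

A less strict `ratio` would classify more samples as dominant and would therefore admit shorter dominance periods, which is one way to see why the criterion interacts with the absence of very short durations in the double WTA model.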

Other modeling approaches for rivalry

Many attempts have been made to model rivalry, each capturing specific aspects. Many models of rivalry replicate the leptokurtic distribution of dominance durations.54,68–73 Some of them also account for Levelt's second proposition, even though all of them tested only its original version, fixed at highest input strength, and did not investigate behavior at other fixed input strengths.54,68,71 Levelt's fourth proposition has also been simulated by some of the existing models,68,72 but how switch rate behaves under asymmetrical input has not been reported. The stabilizing effect of stimulus removal has been replicated over a large range of blanking and presentation durations by Noest et al.74 and has been refined with a multi-timescale extension by Brascamp et al.75 to cover their experimental findings. Still, this model is specifically designed to account for blanking behavior and percept choice at stimulus onset, leaving Levelt's propositions unaddressed. Wilson54 extended his network to incorporate memory and thereby replicated the basic stabilizing effect of blanking, but left the functional relation between blank duration and survival probability unaddressed. The model of Gigante et al.73 also accounts for blanking but leaves Levelt's propositions unaddressed. Thus, most networks perform well in replicating some of the key hallmarks of rivalry, but rarely are all of them addressed in a single framework. Only very few networks target all characteristics of rivalry, and those that do have not investigated the whole range of input strengths. Hence, the double WTA network we presented here is the first to address Levelt's propositions, as well as blanking, for a wide range of input strengths.

Rivalry and memory

The key motivation for using the double WTA network is the memory state implicitly modeled by its dynamics. Consequently, only this network was able to reproduce the phenomenon of blanking. Of particular note is that this form of memory also resulted in replicating Levelt's propositions. For a long time, rivalry was considered a memoryless process; successive dominance durations were thus assumed to be independent and the timing of switches to be unpredictable.35,76,77 Recently, this notion has been challenged experimentally, from short timescales of a few transitions40,78 up to long-term fluctuations.79 In addition, some physiological measures that have been tied to rivalry, such as eye position,25,80,81 (micro-)saccade frequency,82,83 eye blinks, and pupil size,84,85 can also be used as predictors of subsequent dominance,81,84 again arguing for some information about subsequent states being available and thus against a memory-free process.

Rivalry and attention

As discussed above, attention and rivalry are conceptually similar competitive processes. Many of the markers of rivalry, including eye position, saccades, and pupil size, are also markers of attentional processes. In addition, attention and rivalry are related behaviorally. Von Helmholtz already noted in his discussion of Schröder's26 staircase and related multi-stable figures that he could volitionally switch his percept.17 Similarly, for binocular rivalry, von Helmholtz17 states that he could exert attentional control to keep one pattern dominant: for an “arbitrary” amount of time for a simple line stimulus, and by performing a task (e.g., counting) with the respective percept for more complex patterns. Recent research agrees with this notion: although transitions in rivalry seem to be spontaneous, some degree of volitional control can be exerted,86,87 and usefulness for the task can increase the dominance of the corresponding perceptual state.88 Attention to a stimulus speeds up rivalry switching,89,90 and if attention is withdrawn from a stimulus, it has a stabilizing effect similar to stimulus removal; that is, rivalry is essentially abolished.91,92 While the relation between attention and rivalry is a topic of intense research, these phenomena have so far been regarded as separate. In contrast, we here propose that both constitute a form of competition. This study is a first attempt to include memory and rivalry in a common model that has also been used to model attention. We expect that the present model can be extended to explicitly model the interactions between rivalry, memory, and attention. In summary, our model suggests that competition, with or without priority control, is a fundamental principle that links seemingly distinct phenomena.

76 in total

1. Attention activates winner-take-all competition among visual filters.

Authors: D K Lee; L Itti; C Koch; J Braun
Journal: Nat Neurosci Date: 1999-04 Impact factor: 24.884

2. Bayesian integration in sensorimotor learning.

Authors: Konrad P Körding; Daniel M Wolpert
Journal: Nature Date: 2004-01-15 Impact factor: 49.962

Review 3. Neuronal circuits of the neocortex.

Authors: Rodney J Douglas; Kevan A C Martin
Journal: Annu Rev Neurosci Date: 2004 Impact factor: 12.449