
Fast and Conspicuous? Quantifying Salience With the Theory of Visual Attention.

Alexander Krüger¹, Jan Tünnermann¹, Ingrid Scharlau¹.

Abstract

Particular differences between an object and its surrounding cause salience, guide attention, and improve performance in various tasks. While much research has been dedicated to identifying which feature dimensions contribute to salience, much less regard has been paid to the quantitative strength of the salience caused by feature differences. Only a few studies systematically related salience effects to a common salience measure, and they are partly outdated in the light of new findings on the time course of salience effects. We propose Bundesen's Theory of Visual Attention (TVA) as a theoretical basis for measuring salience and introduce an empirical and modeling approach to link this theory to data retrieved from temporal-order judgments. With this procedure, TVA becomes applicable to a broad range of salience-related stimulus material. Three experiments with orientation pop-out displays demonstrate the feasibility of the method. A 4th experiment substantiates its applicability to the luminance dimension.


Keywords: Bayesian inference; computational modeling; salience; theory of visual attention; visual attention

Year: 2016 PMID: 27168868 PMCID: PMC4862317 DOI: 10.5709/acp-0184-1

Source DB: PubMed Journal: Adv Cogn Psychol ISSN: 1895-1171

Introduction

As early as 1890, William James (1890, p.416) described a kind of attention caused by “an instinctive stimulus, a perception which, by reason of its nature rather than its mere force, appeals to one of our normal congenital impulses”. Though over a century old and in an uncommon wording, the quote expresses the idea that some objects trigger basic attentional mechanisms that all humans share. These mechanisms are feature-specific instead of being based on sensory strength. This description fits the current idea of stimulus-driven or bottom-up attention. For both James’ description and the modern perspective, however, there remains the question which features attract such attention. Among James’ rather uncommon examples are strange things, moving things, bright things, and metallic things. From today’s knowledge, we would argue that it is not simply the properties, but the context in which the object occurs which are of great importance. This relation is captured by the term salience (among others) which describes a local feature difference that attracts attention. Thus, a bright stimulus among other bright stimuli would not attract much attention, and neither would an object moving in the same direction and with the same speed as other moving objects. James’ (1890) initial question which features are essential for guiding attention has been extensively studied within visual attention research (for a summary see Wolfe & Horowitz, 2004). However, much less research has addressed the strength of salience dimensions and their quantitative influence on attention, which is the focus of the present article. If you want to be seen, would it be better to be moving, or to be bright—or even metallic? There are several, mostly model-based approaches to answer this question. 
Early visual processing is based on the receptive fields of neurons tuned to particular features (e.g., Hubel & Wiesel, 1959, 1968), which are the source of bottom-up influences on perception and attention (for a review see Treue, 2003). The strength of these neurophysiological responses depends on the strength of the presented features (Zhang, Zhaoping, Zhou, & Fang, 2012). This strength and combinations of features of varying strength have predominantly been tackled using methods from engineering (e.g., Itti & Koch, 2001b; Zhao & Koch, 2013). Computational modeling approaches allow to simulate retinotopic salience maps for natural input images (for a review see Frintrop, Rome, & Christensen, 2010). Different mathematical strategies have been explored to compute a salience value for every location in the image. Because of the difficulties of solving these problems algorithmically, machine learning techniques have been employed (Itti & Koch, 2001b; Zhao & Koch, 2013). Although such approaches may be applied in computer vision, it is unclear if they correspond to salience in human attention. For instance, many computational models such as that by Itti and Koch (2001a) predict that a higher luminance contrast attracts more attention. Einhäuser and König (2003) experimentally manipulated the luminance contrast of images. The participants in their study had to carefully study natural and modified natural images. The correlation of luminance contrast and fixation probability, however, failed to confirm the model prediction. The neurophysiological salience model by Li (2002) makes quantitative predictions about human performance in salience related tasks. Li assumes that the strength of salience is represented implicitly by the firing rate of retinotopic neurons in V1 that encode specific features or combinations of features. This model accounts qualitatively for a wide range of empirical findings like search asymmetries in visual search (e.g., Li, 1999). 
It simulates the neurophysiological processing of the visual information by a complex recurrent artificial neuronal network (Li, 2001). The firing rate of these artificial neurons can hence be regarded as a quantitative prediction. However, the model cannot yet account quantitatively for experimental data. Another model focusing on salience-related human performance is the fourth version of the Guided Search model by Wolfe (2007). In this model, salience is handled by a module for the bottom-up guidance of attention. This guidance is modeled by individual channels tuned to specific features (e.g., steep, shallow, left, and right for orientation). It contains a simple mathematical function for the contribution of each orientation channel. Salience itself is then computed by pairwise comparisons of these values for all visible objects. Wolfe states that the precise shape of the function that determines the contribution of a channel to overall salience is not critical for the qualitative performance of the model. This statement makes it questionable whether the model may provide good quantitative predictions on this level although it qualitatively accounts for a wide range of empirical findings on visual search. As Wolfe himself concedes, not all quantitative aspects of human behavior in terms of response times and errors can be successfully predicted. In conclusion, models do not yet provide a general explanation of the quantitative strength of salience. Some attempts to establish a quantitative measure of salience are based on the analysis of behavioral data. Among the few studies in this line of research are those by Nothdurft (1993, 2000). He asked participants to compare the conspicuousness of two singletons that are unique elements embedded in a display of homogeneous background elements. Each stimulus whose salience was to be measured was presented with a stimulus that was salient due to a luminance difference. 
To measure the salience of a stimulus, the salience of the reference (luminance) stimulus was systematically increased. By this means, Nothdurft (2000) related the feature dimensions motion, orientation, luminance, and color to each other and also compared combinations of features from different dimensions. He quantified salience by relating a salient stimulus to the luminance difference that would create the same salience via approximation of psychometric functions and calculating what one might call the point of subjective equal salience. This approach comes close to a general and theoretically well-founded quantification. Unfortunately, the results are difficult to replicate. While we could replicate Nothdurft’s findings using orientation and luminance, we also found that many participants showed no regular psychometric functions but rather a behaviour strongly influenced by guessing (unpublished pilot study). Similar difficulties were reported by Koene and Zhaoping (2007). Starting from this need for a better behavioral method to quantify salience, Huang and Pashler (2005) came up with a search task for the biggest and brightest square in a display of several objects. The location of a small probe on its left or right side had to be reported to verify that the target was found. The dependent variable was the response time for a correct report. The display was randomly filled with other distractor squares. Salience was measured in these trials by introducing a salient key distractor. Its salience was quantified by examining the effect of the feature differences on response times. Via this quantification, Huang and Pashler related luminance and size to each other. An additional aspect impeding the measurement of salience is its time course. Regarding the time course, several different ideas were discussed (e.g., Egeth & Yantis, 1997), with two types of temporal dynamics being especially important for the study of salience. 
(1) Salience-based progression of attention (e.g., Koch & Ullman, 1985) describes the shift of attention from the most salient spot in an image to the second most salient spot and so forth. (2) Time course of salience describes how the strength of salience effects varies over time. Salience effects increase from display onset to 100 or 150 ms (e.g., Couffe, Mizzi, & Michael, 2016; Kean & Lambert, 2003) and decay after approximately 300 ms. Evidence for this time course—which resembles the time course of attention (Olivers, 2007)—comes from a variety of different paradigms: probe detection (Dombrowe, Olivers, & Donk, 2010; Donk & Soesman, 2010), TOJs (Donk & Soesman, 2011), saccadic selection (Donk & van Zoest, 2008), and saccadic trajectories (van Zoest, Donk, & Van der Stigchel, 2012). This research implies that it is crucial to measure salience at specific points in time (a condition not met by Huang & Pashler, 2005). The approaches discussed above consider or measure performance as an indicator of attention. They spend less effort on the quantification of salience itself. An approach that might provide such a quantification is Bundesen’s Theory of Visual Attention (TVA; Bundesen, 1998). It comprises a psychologically inspired, general formal explanation of visual attention and selection processes and allows to infer attentional weights for specific objects in a display. The attentional weight determines if an object is encoded in visual short-term memory (VSTM)—and if so, how quickly it is encoded—that is, its processing speed. These parameters can possibly be used as a general quantification of salience in the sense that the strength of salience is the attentional weight of an object. Although promising on an abstract level, TVA has only rarely been used to investigate salience (e.g., Nordfang, Dyrholm, & Bundesen, 2013). 
A possible reason is that in the item-report paradigms commonly used with TVA, the potential stimulus material is restricted to highly overlearned categories like digits and letters. The experimental paradigm requires a categorization because probabilities of stimulus categorizations are estimated. Hence, TVA is not directly applicable to salience research. Recently, however, Tünnermann, Petersen, and Scharlau (2015) paved the way for such an application. Originally, they investigated whether the relatively faster perception of an attended stimulus in a pair is caused by speeded processing of this attended stimulus or decelerated processing of its unattended counterpart. Along with TVA-based item report, participants judged the temporal order (temporal-order judgment; TOJ) in which the stimuli appeared. Tünnermann et al. found that the attentional benefit originates from a combination of speeding up the attended and slowing down the unattended stimulus. This conclusion is based on a conventional TVA analysis. In the Discussion, however, they sketched a new approach. They suggested that data from TOJ might be directly modeled by TVA to obtain TVA’s attention parameters. At first sight, this might not seem ground-breaking, but the proposed method offers applying TVA-based analysis to any kind of stimulus. The aim of the present paper is to test the feasibility of this approach. In a nutshell—details will be explained below in two sections on TVA and modeling of TOJ data—the method consists of having observers judge the temporal order of two arbitrary visual stimuli. The interval between the stimuli is varied over trials. Application of TVA to the observers’ judgments allows computing of processing speed, attentional weights, and overall attentional processing capacity. By manipulating the features of the stimulus, this method allows us to quantify salience in the form of these parameters. 
This approach can provide a theoretically well-founded, general quantification of salience.

The Theory of Visual Attention (TVA)

The present section provides a short summary of the relevant parts of TVA as a formal theory. Key terms for the modeling as well as the experiments are introduced, most importantly attentional weight and processing capacity. The section cannot, however, provide a full introduction to TVA, for which we refer the interested reader to sources such as those by Bundesen (1998) and Bundesen, Habekost, and Kyllingsbæk (2005). TVA was introduced as a unified theory of visual recognition and attentional selection. The theory achieves this by mathematically formalizing the processes associated with the processing of visual objects from presentation towards encoding in VSTM. This processing is described as a race for representation in one of the limited slots in VSTM. Stimuli race independently and in parallel. The race is influenced by many factors. Among them are the total number of elements competing for representation, the distribution of attention across the stimuli, and the categories to which the stimuli potentially belong. In order to explain the formalization of this process, we proceed backwards from the arrival in VSTM to the appearance of the stimuli. TVA assumes that the arrival times of stimuli in VSTM are exponentially distributed. Although the theory is fleshed out for multiple stimuli, the present approach is a simpler case: In the derivation proposed by Tünnermann et al. (2015) on the basis of TOJs, only two targets are encoded. Thus, the VSTM limitation can be ignored, which simplifies formalization. Returning to the event of encoding an object into VSTM, the probability that an object x is encoded before time t can then be expressed by the cumulative distribution function:

$$F_x(t) = \begin{cases} 1 - e^{-\upsilon_x (t - t_0)} & \text{if } t > t_0 \\ 0 & \text{otherwise} \end{cases}$$

The two cases that are distinguished in the equation emerge from the assumption that there is a maximal ineffective exposure duration t0. This is the longest interval that is still too short to provide enough sensory evidence for the race to start at all.
If t ≤ t0, there is no chance that the processing of x finishes, whereas for t > t0 there is a chance that processing has been completed. This probability depends on the exposure duration and the processing rate υx. The unit of this rate is categorizations per second, and it is composed of:

$$\upsilon_x = \sum_{i \in R} \upsilon(x, i)$$

The equation is based on the idea that different categorizations are possible for object x. The set R contains these categories, and the processing rate υ(x,i), with i ∈ R, expresses the speed of the particular categorization that x belongs to category i. This i can, for example, refer to the property of having a particular color or a certain orientation. Descending deeper into the formalization, the processing rate is defined as:

$$\upsilon(x, i) = \eta(x, i)\, \beta_i\, \frac{\omega_x}{\sum_{z \in S} \omega_z}$$

This equation introduces three important factors: η(x,i), the strength of the sensory evidence that x belongs to category i; βi, a decision bias for category i; and the relative attentional weight for x, given by its own weight ωx divided by the sum of the weights of all objects in the visual field. All objects in the visual field are contained in the set S. The weights are defined by the weight equation:

$$\omega_x = \sum_{j \in R} \eta(x, j)\, \Pi_j$$

This equation again includes the sensory evidence for x as η(x,j) and a new variable Πj, the pertinence value, which is a selection bias for category j. The products are summed over the set of all categories R. The present approach concentrates on the parameters attentional weight ω, processing speed υ, and overall processing capacity C. The processing speed describes how quickly a representation in VSTM is built up. The sum of the processing speeds of all objects is the processing capacity. The attentional weight corresponds to the relative advantage of a stimulus and expresses how much attention is allocated to this object in comparison to the others. (The biases Π and β are both held constant in the context of the present experiments and are hence not estimated.)
Based on this admittedly swift introduction of the formalization the reader may deem TVA too cumbersome for dealing with comparably simple salience displays. This formalization, however, offers advantages. Firstly, TVA allows precise quantification and provides psychologically meaningful parameters, such as processing speed, which can be applied to a broad range of perceptual and attentional phenomena. Secondly, salience research can be related to other phenomena that have already been studied with TVA, such as, for example, feature-difference (bottom-up) and feature-relevance (top-down) interactions (Nordfang et al., 2013). Finally, because of its precise quantitative nature, the TVA framework can be used for generating quantitative hypotheses.
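
The rate and weight equations of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the category labels, and all numeric values for η, β, and Π are hypothetical and are not taken from the experiments reported here.

```python
import math


def attentional_weights(eta, pertinence):
    """Weight equation: w_x is the sum over categories j of eta(x, j) * Pi_j."""
    return {
        x: sum(eta[x][j] * pertinence[j] for j in pertinence)
        for x in eta
    }


def processing_rates(eta, beta, pertinence):
    """Rate equation summed over categories: v_x = sum_i eta(x, i) * beta_i * w_x / sum_z w_z."""
    weights = attentional_weights(eta, pertinence)
    total_weight = sum(weights.values())
    return {
        x: sum(eta[x][i] * beta[i] for i in beta) * weights[x] / total_weight
        for x in eta
    }


def p_encoded_by(t, rate, t0):
    """Probability that an object is encoded before time t (exponential race)."""
    if t <= t0:
        return 0.0  # exposure too short for the race to start
    return 1.0 - math.exp(-rate * (t - t0))


# Hypothetical sensory evidence, decision biases, and pertinence values.
eta = {
    "probe": {"tilted": 40.0, "upright": 5.0},
    "reference": {"tilted": 5.0, "upright": 40.0},
}
beta = {"tilted": 0.5, "upright": 0.5}
pertinence = {"tilted": 2.0, "upright": 1.0}

rates = processing_rates(eta, beta, pertinence)
capacity = sum(rates.values())  # processing capacity C as the sum of all rates
```

With these made-up values the object with the larger attentional weight receives the larger share of the capacity, which is the relation the present approach exploits.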

Modeling TOJ Data by TVA

TVA was initially applied to multi-element displays of highly overlearned stimuli, such as letters or numbers from which all or several belonging to a certain category had to be reported. The stimuli have to be masked to derive the assumed performance. Both features—highly overlearned and maskable stimuli—have so far restricted the general applicability of TVA. As already mentioned, Tünnermann et al. (2015) discussed a TOJ model derived from TVA equations which renders TVA applicable to all kinds of visual stimuli and also does away with the necessity of masking. They did so by introducing a temporal-order task and relating the psychometric functions derived from this task mathematically to the distributions assumed by TVA. In the following section, we will explain briefly how TOJ data can be modeled with TVA. For more detail, we refer the reader to the original article. In the TOJ paradigm, the temporal order of two onsets has to be judged. We call these two targets Tprobe and Treference. In the experiments presented later, they will have different properties according to the experimental variable, but at present these names are just used to make them distinguishable. They appear with a variable interval between them. The dependent variable is the amount of judgments for Tprobe. If Tprobe precedes Treference with a large interval, judgments in favor of Tprobe will be frequent. If the other stimulus leads, the proportion of judgments for Tprobe will be low. If Tprobe and Treference are comparable, and the two stimuli are presented simultaneously, the participants’ performance should reach chance level. However, subjective perception can deviate markedly from objective events. Such judgments can, for example, be systematically influenced by attention. If one of the stimuli is attended-to in advance, this stimulus will be perceived earlier. This phenomenon is called prior entry (Spence & Parise, 2010). 
In terms of the judgments, this effect becomes evident in an increased proportion of reporting the attended stimulus as being perceived first. TOJ data can be fitted with psychometric functions. Possible mathematical descriptions of psychometric functions include the cumulative distribution of the normal distribution, logistic, Weibull, and Gumbel functions, of which the former two are most widely employed (for more formal descriptions and how to fit these functions see Kuss, Jäkel, & Wichmann, 2005; Wichmann & Hill, 2001a, 2001b). These functions have at least two parameters, the most important of which describe the center of the function and its slope. The center, at which both judgments are equally likely, is usually interpreted as the point of subjective simultaneity (though see Weiß & Scharlau, 2011). The slope is an indicator of discrimination performance. Importantly, it is a matter of debate which of the functions mentioned above should be used because none of them is particularly supported by theory. Hence, also the interpretation of the functions and their parameters is limited. In contrast to psychometric functions, TVA offers parameters deeply rooted in psychological theory. As an additional advantage, they can also be interpreted readily. For instance, the parameter v corresponds to processing speed. Its unit is stimuli processed per second. This model carries more information than the point of subjective simultaneity and discrimination performance which measure only performance, not the processes that drive this performance. Each data point of a psychometric function is equivalent to the proportion of one event being encoded first. This connection is illustrated in Figure 1 for the judgment of a salient and a non-salient stimulus (the main conditions in the experiments reported below). 
Each of the points, sampled from the psychometric function, depends on the process depicted above the function: According to the TVA-based model, each of the two bars represents a race to VSTM. The results of these two races are compared which determines the participant’s judgment. Each race is influenced by the objective onset and its speed. The process is, however, still a stochastic process—that is, these variables do not fully determine the outcome.

Figure 1.

Cognitive model. The bars in the upper part represent the races to VSTM. Formally, these races depend on the processing rates. The rates υsp and υsr from the salience condition of the experiments are shown as examples. The proportion of “salient first” judgments depends on the comparison of both races. SOA = Stimulus Onset Asynchrony.

As proposed by Tünnermann et al. (2015), the chance of Tprobe being encoded first can be described with the parameters of TVA. It can be expressed by three parameters: υp (the processing speed of Tprobe), υr (the processing speed of Treference), and Δt, which incorporates the SOA and the maximal ineffective exposure durations as Δt = SOA + t0p − t0r, where t0p and t0r are the maximal ineffective exposure durations for the two stimuli. They are assumed to be equal in the context of the present experiments. In terms of these parameters, the probability of Tprobe being encoded first can be expressed for Δt < 0 as:

$$P(T_{probe}\ \text{first}) = 1 - e^{-\upsilon_p |\Delta t|} + e^{-\upsilon_p |\Delta t|}\, \frac{\upsilon_p}{\upsilon_p + \upsilon_r}$$

where $1 - e^{-\upsilon_p |\Delta t|}$ describes the probability that Tprobe is fully encoded before Treference starts the race to VSTM. The term $e^{-\upsilon_p |\Delta t|}$ is the probability that Tprobe is not encoded before Treference starts its race. In that case, the probability of encoding Tprobe first is given by Luce’s choice axiom as υp / (υp + υr). For Δt ≥ 0 it holds that:

$$P(T_{probe}\ \text{first}) = e^{-\upsilon_r |\Delta t|}\, \frac{\upsilon_p}{\upsilon_p + \upsilon_r}$$

Here, analogously, $e^{-\upsilon_r |\Delta t|}$ denotes the probability that Treference is not encoded before Tprobe starts its race. If this happens, the probability of Tprobe being encoded first is again given by Luce’s choice axiom. To estimate the TVA parameters introduced in this section, a suitable statistical model is needed. We use Bayesian statistics for modeling and data analysis because Bayesian methods are particularly well-suited for inference under an assumed model (Little, 2006). We implemented a generative model based on the mathematical description of TVA, visualized in the hierarchical graphical Bayesian model of Figure 2. Table 1 shows how the variables (nodes) are formally defined.
The graphical model describes the relation between the raw data and the TVA parameters on the group level. As an intermediate step, the TVA parameters are estimated per participant. The graphical model depicted in Figure 2 belongs to one group or condition in an experiment. Each further condition is modeled analogously. If there are at least two groups, their group parameters represented at the very top can be compared. On the group level, the mean of attentional weight is represented by node ωsp m. Because of technical reasons the variance of the estimated attentional weight is represented as a separate variable node ωsp τ. Similarly, the capacity mean and variance are represented by the upper two C nodes. Additionally, we can infer the group-level processing speed for both targets as represented by the upper υ nodes. However, they do not provide additional information because they depend on the weight and capacity, as indicated by the direction of the arrows. For further information on the exact nature of the Bayesian parameter estimation process, please refer to Appendix A.
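
As a minimal sketch, the TOJ model described in this section can be written as a single Python function. The function and argument names are hypothetical, and the sketch assumes that rates are given in encodings per second, that times are given in seconds, and that negative values of Δt mean that Tprobe leads. The helper that splits a capacity into two rates follows the relations υp = C·ω and υr = C·(1 − ω) listed in Table 1.

```python
import math


def p_probe_first(v_p, v_r, soa, t0_p=0.0, t0_r=0.0):
    """Probability that the probe wins the race to VSTM.

    v_p, v_r: processing rates of probe and reference (1/s).
    soa: stimulus onset asynchrony in seconds (negative: probe leads).
    t0_p, t0_r: maximal ineffective exposure durations in seconds.
    """
    delta_t = soa + t0_p - t0_r
    luce = v_p / (v_p + v_r)  # chance of winning when both are racing
    if delta_t < 0:
        # The probe races alone for |delta_t| before the reference starts.
        probe_still_racing = math.exp(-v_p * abs(delta_t))
        return (1.0 - probe_still_racing) + probe_still_racing * luce
    # The reference races alone for |delta_t| before the probe starts.
    reference_still_racing = math.exp(-v_r * abs(delta_t))
    return reference_still_racing * luce


def rates_from_capacity(capacity, w_probe):
    """Split the capacity C into probe and reference rates via the probe weight."""
    return capacity * w_probe, capacity * (1.0 - w_probe)
```

For equal rates and simultaneous onsets the function returns .5, and a probe weight above .5 shifts the whole curve towards “probe first” judgments, which is how an attentional advantage appears in this model.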

Figure 2.

Table 1.

Variables of the Hierarchical Bayesian Graphical Model (See Figure 2)

Variable	Explanation
ω_npj∼Normal(ω_npm, ω_npτ)	Attentional weight (probe)
ω_nrj=1−ω_npj	Attentional weight (reference)
ν_np=mean(ν_npj), j∈participants	Processing rate (probe)
ν_nr=mean(ν_nrj), j∈participants	Processing rate (reference)
C_nj∼Normal(C_nm, C_nτ)	Processing capacity
ν_npj=C_nj·ω_npj	Participant processing rate (probe)
ν_nrj=C_nj·ω_nrj	Participant processing rate (reference)
θ_nj,i←P_A(ν_np, ν_nr, SOA)	Probability of “Probe first”
y_nj,i∼Binomial(θ_nj,i, n_nj,i)	Count of “Probe first” responses
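
Read from top to bottom, the table describes how data are generated for one condition. The following Python sketch simulates that process forward; it is not the estimation procedure used in the paper. Several choices are assumptions made only for illustration: the second parameter of each normal distribution is treated as a standard deviation, sampled weights and capacities are clipped to plausible ranges, the participant-level rates are used to compute each response probability, and all numeric values are hypothetical.

```python
import math
import random

SOAS_MS = [-100, -80, -60, -40, -20, 0, 20, 40, 60, 80, 100]


def p_probe_first(v_p, v_r, soa_s):
    """TOJ model with equal t0 for both targets, so delta_t equals the SOA."""
    luce = v_p / (v_p + v_r)
    if soa_s < 0:  # probe leads
        still_racing = math.exp(-v_p * abs(soa_s))
        return (1.0 - still_racing) + still_racing * luce
    return math.exp(-v_r * abs(soa_s)) * luce


def simulate_condition(rng, n_participants, w_mean, w_sd, c_mean, c_sd, n_trials):
    """Draw participant parameters, then binomial "probe first" counts per SOA."""
    data = []
    for _ in range(n_participants):
        # Participant-level weight and capacity, clipped (assumption).
        w_p = min(max(rng.gauss(w_mean, w_sd), 0.01), 0.99)
        capacity = max(rng.gauss(c_mean, c_sd), 1.0)
        # Rates follow deterministically from capacity and weight.
        v_p, v_r = capacity * w_p, capacity * (1.0 - w_p)
        counts = {}
        for soa in SOAS_MS:
            theta = p_probe_first(v_p, v_r, soa / 1000.0)
            # Binomial draw as a sum of Bernoulli trials.
            counts[soa] = sum(rng.random() < theta for _ in range(n_trials))
        data.append({"w_p": w_p, "C": capacity, "counts": counts})
    return data


example = simulate_condition(
    random.Random(1), n_participants=3,
    w_mean=0.5, w_sd=0.05, c_mean=60.0, c_sd=10.0, n_trials=32,
)
```

Bayesian estimation runs this process in reverse: given the observed counts, it infers which weights and capacities are credible.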

Hierarchical Bayesian graphical model of the data of the salience condition. The salience condition is indicated by the index s. The same model applies for the neutral condition n. The group level, the variables in the highest layer, estimate TVA parameters for a particular condition. This layer was compared to the neutral condition (see Table 1). SOA = Stimulus Onset Asynchrony.

The following four experiments test the viability of the proposed method in salience research. To this end, we combined TOJs with salience displays. In Experiment 1, the order of stimulus onsets had to be judged. This experiment was most similar to common TOJ experiments. In Experiment 2, stimulus offsets were judged, and the stimuli of Experiment 3 flickered for a short duration. We investigated whether salience increased processing speed and attentional weights. Finally, Experiment 4 was conducted to show the applicability to the luminance dimension as well as the sensitivity of the method.

Experiment 1

Experiment 1 is based on the hypothesis that the onset of an orientation singleton achieves an increased attentional weight and is hence encoded to VSTM more quickly. It was carried out as a proof of concept to show that TVA can be successfully applied to salience research via the general TOJ method outlined by Tünnermann et al. (2015). To this end, it had to meet the requirements of both salience studies and TOJ research, requiring us to combine multi-element displays from salience research with temporally distributed targets in the most direct way possible. The participants judged the temporal order in which two targets appeared in a display of 17 × 17 bars. A center section of these displays is exemplarily shown in Figure 3. The salience display consisting of homogeneous background stimuli was shown first. The targets appeared later. One of the targets could differ in orientation whereas the other one was always non-salient—that is, of the same orientation as the background elements.

Figure 3.

Visualization of the stimulus sequence of Experiments 1 to 4. Stimuli are identical to those of the experiments, but displays have been scaled for visibility. The salience display was shown 150 ms before the probe event. The event to be judged was the onset (Experiment 1), offset (Experiment 2), or flicker (Experiments 3 and 4; depicted as white coronae). Only the salience conditions are shown. These conditions comprise a salient probe stimulus. The neutral conditions of the experiments featured a non-salient probe stimulus equal to the reference stimulus. These conditions are not depicted. The arrow depicts the flow of time. SOA = Stimulus Onset Asynchrony.

This combination of multi-element displays and stimulus onsets is the most direct way of checking the applicability of the method. Unfortunately, however, it is questionable whether target onsets allow salience effects to show up. Firstly, the blanks at the locations of the future targets may act as salient stimuli because they violate the background pattern (Li, 2002). Secondly, results on the temporal course of salience suggest that salience is used to gradually distribute attention over the display (Dombrowe et al., 2010): After a 30 ms delay, the salience effect is very small in comparison to its peak at 120 ms. Salience information thus might not be available initially. Finally, the onset information may be so strong that it masks any effects of salience. Because the present experiment serves as a proof of concept, this is no severe disadvantage. If the methodology works as expected, we will be able to precisely describe the reported temporal order with the help of the proposed model, independent of whether an effect of salience is present on the group level. Following this proof of concept, Experiments 2 and 3 will look into effects of salience themselves.

Method

Participants

A total of 20 students at Leuphana University of Lüneburg (5 male and 15 female; Mage = 23.9 years, range 20-33) participated in Experiment 1. Seven participants took part in an additional session and one participant in three sessions. Within Bayes methodology, such variation can be taken into account in the parameter estimation for the individual participants which improves precision. The higher precision on the individual level also affects the parameter estimation on the group level. All participants reported normal or corrected-to-normal visual acuity and received a payment of 8 Euro per hour.

Apparatus

The experiment was conducted in a dimly lit experimental booth. A Windows 7 computer with a dedicated graphic card and an Iiyama Vision Master Pro512 22 inches (40.4 cm × 30.3 cm) CRT monitor was used for stimulus presentation. The refresh rate was set to 100 Hz, the resolution to 1,024 × 768 pixels with 32-bit colors. The vsync signal was used for timing the experiment. The experiment was programmed using PsychoPy (Peirce, 2007). The distance to the screen was 50 cm. Participants responded with the hand corresponding to the location that had to be reported. The control key on the bottom left and the enter key on the bottom right corner of the keyboard were used for responses.

Stimuli

Each trial started with a fixation cross in the center of the screen. After a delay of 900 ms, the participants saw a 17 × 17 array of bars. The array corresponded to 34.99° × 34.99° of visual angle. Bar length was 1.07° of visual angle and width 0.18°. The fixation cross occupied the middle of the array. The background color of the screen was set to gray, RGB (96, 96, 96), equivalent to 6.98 cd/m², while bars and fixation cross were white, RGB (224, 224, 224), equivalent to 66.2 cd/m². Each bar stimulus belonged to one of three logical categories which were not necessarily visually distinguishable. These categories are background elements, target Treference, and target Tprobe. While the background elements and Treference were always homogeneously oriented, the orientation of Tprobe varied between a 0° difference to the background in the neutral condition and the maximal orientation contrast of 90° in the salience condition. The orientation of the non-salient elements was chosen randomly for each trial. The targets were presented at fixed positions on the left and right of the fixation cross with an eccentricity of 8.24° of visual angle. Both positions were empty when the array was initially presented. Tprobe was always presented 150 ms after the onset of the array of background elements. This duration was not jittered because salience effects decay over time as reported by, for example, Donk and van Zoest (2008), and the TOJ required a temporal window of -100 ms to +100 ms around this value. Treference was shown with an SOA of -100, -80, -60, -40, -20, 0, 20, 40, 60, 80, or 100 ms. After a display duration of 300 ms, all bars vanished. The number of trials varied with the SOA because the variance is expected to increase towards the 0 SOA. Twenty-four trials were presented for each of the -100, -80, 80, and 100 ms SOAs, 32 trials for each of the -60, -40, 40, and 60 ms SOAs, and 48 trials for each of the -20, 0, and 20 ms SOAs.
The participants had to respond via a keystroke with either the left ctrl or the right enter key. The side at which Tprobe appeared was chosen randomly. The next trial started automatically with a delay of 1 s with a 100 ms jitter.
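
As a small illustration of the SOA sampling described above, the trial counts can be written down as a lookup table and summed. This is only a sketch: the name `TRIALS_PER_SOA` is hypothetical, and the text does not state here whether the counts apply per condition or per session, so the code makes no claim about that.

```python
# Trials per SOA (in ms), as listed in the Stimuli section.
# The dictionary name is a hypothetical choice for this illustration.
TRIALS_PER_SOA = {
    -100: 24, -80: 24, 80: 24, 100: 24,   # outer SOAs, fewest trials
    -60: 32, -40: 32, 40: 32, 60: 32,     # intermediate SOAs
    -20: 48, 0: 48, 20: 48,               # SOAs near 0, most trials
}

soas = sorted(TRIALS_PER_SOA)
total_trials = sum(TRIALS_PER_SOA.values())

print(soas)
print(total_trials)
```

The denser sampling near an SOA of 0 follows the stated reasoning that response variance is expected to be largest there.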

Procedure

Participants were instructed to fixate the cross in the center of the screen throughout each trial. Their task was to report which element occurred first, the left or the right one, by pressing the left or right key, respectively. There was no time pressure. The experiment started with a training phase of 40 trials that included feedback about errors. There was no feedback after the training. After every 50 trials, a break was initiated, which was ended by a keypress. The experiment lasted approximately 45 min.

Results

The judgments whether the left or right stimulus appeared first were converted into the judgment whether Tprobe appeared first. Remember that Tprobe is the stimulus that stands out from its surroundings in the salience condition. As can be seen in Figure 4, the participants generated typical sigmoid TOJ data. All individual data showed this pattern which allowed us to apply the model (see the section “Modeling TOJ data by TVA” for details).
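
To make the link between processing rates and sigmoid TOJ data concrete, the following is a minimal sketch of a two-stimulus race with exponentially distributed encoding times, which is the encoding assumption TVA makes. It is not the hierarchical Bayesian model fitted in this paper. The function name, the sign convention for the SOA (negative = reference event first), and the omission of any minimal exposure duration are assumptions made for illustration.

```python
import math

def p_probe_first(soa_ms, v_probe_hz, v_ref_hz):
    """Probability that the probe is encoded before the reference.

    Assumed (hypothetical) convention: soa_ms < 0 means the reference
    event precedes the probe event by |soa_ms|; soa_ms > 0 means it follows.
    Encoding times are independent exponentials with the given rates.
    """
    d = abs(soa_ms) / 1000.0  # lead time in seconds, because rates are in Hz
    tie = v_probe_hz / (v_probe_hz + v_ref_hz)  # outcome once both are racing
    if soa_ms < 0:
        # Reference leads: the probe can only win if the reference is
        # still unencoded when the probe event starts.
        return math.exp(-v_ref_hz * d) * tie
    # Probe leads (or both start together): the probe wins during its
    # head start, or otherwise wins the race that follows.
    head_start = 1.0 - math.exp(-v_probe_hz * d)
    return head_start + (1.0 - head_start) * tie

# Illustrative rates in the range reported for Experiment 1.
curve = [p_probe_first(s, 24.9, 23.3) for s in range(-100, 101, 20)]
print([round(p, 3) for p in curve])
```

With equal rates the function returns .5 at an SOA of 0, and a higher probe rate shifts the whole curve towards "probe first" judgments, which is the pattern the weight parameter is meant to capture.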

Figure 4.

Plot of raw data (mean of judgment frequency per SOA over all participants) and posterior predictive for the salient and neutral condition of Experiment 1. This plot shows predicted data based on the estimated parameters. SOA = Stimulus Onset Asynchrony.

Bayesian statistics yields a full probability distribution of the model parameters, a point estimate of each parameter, which is provided by the mode of the respective distribution, and an easily interpretable measure of the certainty with which the parameter was estimated. Broad probability distributions correspond to vague estimates. This information is expressed by the highest density interval (HDI) of the distribution, the interval on the x-axis that contains the 95% most probable parameter values. The most interesting variables in the hierarchical Bayesian graphical model are on the group level because they allow us to compare the salience and neutral conditions. The relation between the weight for Tprobe in the salience condition, ωsp, and its counterpart in the neutral condition, ωnp, shows whether salience influences the attentional parameters. The parameter distributions for the weights are depicted in Figure 5. The parameter estimates show that ωsp = .507 and ωnp = .516 differ only slightly. Interestingly, the value .5 is not among the 95% most probable values for ωnp; that is, attention is not distributed equally across the two targets in the neutral condition. Because all elements were equally salient in this condition, visual properties cannot be the cause of the higher attentional weight for Tprobe. The temporal properties, however, offer an explanation: Tprobe was always shown 150 ms after display onset. This fixed interval made it predictable. In order to measure the effect of salience unbiased by that of temporal expectation, we subtracted the deviation of the ωnp parameter from the expected neutral weight of .5 from the ωsp parameter.
The corrected weight is ωsp clean = .493. The correction shifts the weight of the salience condition ωsp in the opposite of the expected direction, which would be an increased weight for the salient stimulus. As explained earlier, the effect is small and hence again, ωnp and ωsp clean differed only slightly.
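
The correction for temporal expectation described above amounts to a simple subtraction, sketched here on the rounded point estimates. The function name is hypothetical. Applied to the rounded values, it gives about .491, close to the reported .493; the text does not specify whether the correction was applied to point estimates or to the full posterior distributions, so exact agreement should not be expected.

```python
def corrected_weight(w_salient, w_neutral, w_balanced=0.5):
    """Remove the neutral condition's deviation from the balanced weight.

    w_salient  : estimated probe weight in the salience condition
    w_neutral  : estimated probe weight in the neutral condition
    w_balanced : weight expected when attention is split equally
    """
    return w_salient - (w_neutral - w_balanced)

# Rounded point estimates from Experiment 1.
w_clean = corrected_weight(0.507, 0.516)
print(round(w_clean, 3))
```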

Figure 5.

Estimated attentional weights (ω) for the probe stimuli of Experiment 1, salience condition (ωsp = weight for the salient probe) in blue and neutral condition (ωnp = weight for the neutral probe) in red. The weights for the reference stimuli are 1 minus the weight of the respective probe.

The processing rates for the stimuli are very similar. All are in the range of 23.3 Hz to 24.9 Hz. This result is to be expected when both weights and capacities are similar (see Figure 6).
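
Why similar weights and capacities imply similar rates can be seen from the standard TVA relation for two stimuli, in which each rate is the overall capacity times the relative attentional weight. The sketch below assumes that this relation carries over to the model used here and applies it to the rounded point estimates (the capacities are those reported for Experiment 1). Because posterior modes need not satisfy the relation exactly, the results only approximate the reported range.

```python
def rates_from_capacity(capacity_hz, w_probe):
    """Split the capacity C between probe and reference by the probe weight."""
    v_probe = capacity_hz * w_probe
    v_ref = capacity_hz * (1.0 - w_probe)
    return v_probe, v_ref

# Rounded point estimates for Experiment 1 (C in Hz, probe weight).
salient = rates_from_capacity(49.4, 0.507)
neutral = rates_from_capacity(48.1, 0.516)

print([round(v, 1) for v in salient])
print([round(v, 1) for v in neutral])
```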

Figure 6.

Estimated processing rates (υ) for Experiment 1. The processing rates of the salience condition (υsp = processing rate for the salient probe; υsr = processing rate for the reference in the salient probe displays) are shown in blue, those of the neutral condition (υnp = processing rate for the neutral probe; υnr = processing rate for the reference in the neutral probe displays) in red. The darker distributions belong to the probe stimulus and the lighter distributions belong to the reference stimulus.

The processing capacity was similar in both conditions with Cs = 49.4 Hz and Cn = 48.1 Hz (see Figure 7). The distribution of the difference between the two capacities is centered on 0. Hence a difference is very unlikely. Importantly, this allows one to compare the attentional weights across conditions because it can be assumed that the same process distributes the same resources differently in the two conditions.

Figure 7.

Estimated processing capacities (C) for Experiment 1 in the salience condition (Cs = capacity for salient stimulus) in blue and the neutral condition (Cn = capacity for neutral stimulus) in red. The difference of 0 is in the highest density interval (HDI) if both distributions are subtracted, which indicates that the overall processing capacity was similar in both conditions.

The posterior predictive (Figure 4) serves two purposes: It is a plausibility check of the model and compresses the evidence for the parameters in a prediction for new data. Because the parameters are given as distributions, the certainty of the predicted data can be indicated by the color gradient in the figure. For the current experiment, the conditions overlap strongly; that is, salience does not affect processing speed or attentional weights, and consequently the judgments are similar in both conditions.
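
The HDI check used for the capacity comparison (is 0 inside the 95% HDI of the difference?) can be sketched from posterior samples as follows. The samples below are synthetic draws generated for illustration only; they are not the posterior of this study, and the standard deviation of 3 Hz is an arbitrary assumption. The `hdi` helper is a hypothetical implementation that takes the narrowest window covering the requested mass.

```python
import numpy as np

def hdi(samples, mass=0.95):
    """Narrowest interval that contains `mass` of the samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))            # number of samples inside the interval
    widths = x[k - 1:] - x[: n - k + 1]   # width of every window of k samples
    i = int(np.argmin(widths))            # start index of the narrowest window
    return x[i], x[i + k - 1]

# Synthetic stand-ins for posterior samples of the capacities (NOT real data).
rng = np.random.default_rng(0)
c_salient = rng.normal(49.4, 3.0, size=20_000)
c_neutral = rng.normal(48.1, 3.0, size=20_000)

diff = c_salient - c_neutral
lo, hi = hdi(diff)
zero_in_hdi = bool(lo <= 0.0 <= hi)
print(zero_in_hdi)
```

Subtracting the samples before computing the interval, rather than comparing two separate HDIs, is what "if both distributions are subtracted" refers to in the figure caption.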

Discussion

Staying close in design to the well-established TOJ paradigm while using multi-stimulus displays yielded plausible data that resembled psychometric functions. The TVA-based model was successfully applied to model the data. It was possible to estimate parameter distributions for individual participants as well as on the group level. The estimated processing rates are comparable to what has been found in earlier TVA studies (e.g., Finke et al., 2005). In sum, this allows us to use TOJs on multi-element displays in order to compute TVA-based attentional parameters.

Although one stimulus was clearly salient due to its 90° orientation difference, this salience increased neither its attentional weight nor its processing rate in comparison to its counterpart from the neutral condition. Salience thus had no influence on the distribution of attention as measured by TVA parameters. This result cannot be attributed to a lack of sensitivity: The fact that the neutral weight (.5) was located outside of the HDI for the neutral condition (likely due to the fixed time of the Tprobe onset) indicates the sensitivity of the approach. That is, if present, even small differences between attentional parameters of Treference and Tprobe should have been detected. The absence of a salience effect on attentional parameters might be explained by the lack of a delay between the property which is supposed to guide attention (the local contrast) and the events which are relevant for the TOJ, that is, the onsets. TVA assumes that the sensory evidence for onset and local contrast is available equally fast. In the V1-salience model by Li (2002), however, it is assumed that salience is computed by pyramidal cells and interneurons that interact locally and reciprocally in their layer. The onset, in contrast, can be processed by a simple feed-forward network (VanRullen & Koch, 2003).
If the sensory evidence for salience is indeed not available fast enough, this would explain why the attentional weights are unaffected by salience. This explanation also fits the results of Dombrowe et al. (2010) on the time course of salience. The following experiments changed the temporal feature of the targets. The events to be judged are target offsets in Experiment 2 and brief flickers in Experiment 3.

Experiment 2

In Experiment 2, the onsets used in Experiment 1 were replaced with offsets. Offsets are susceptible to attentional effects (Vingilis-Jaremko, Ferber, & Pratt, 2008). We hypothesized that the presence of the salience-generating property prior to the event (offset) should cue the event and hence lead to a higher attentional weight. Again, this should lead to a quicker encoding into VSTM. The offset at the potentially salient position occurred 150 ms after the onset of the display. As shown by Donk and Soesman (2010), effects of orientation salience should be present in this time range. A total of 20 participants (9 male and 11 female; Mage = 22.6, range 19-47), including the authors, participated in Experiment 2. All of them were students or members of Leuphana University of Lüneburg or Paderborn University. Each participant reported normal or corrected-to-normal visual acuity and completed one session. All participants except for the authors received a payment of 8 Euro per hour. The apparatus was the same as in Experiment 1. The same stimuli as in Experiment 1 were used. Because this time the temporal order of offsets had to be judged, all elements (background elements, Treference and Tprobe) were shown after the initial presentation of the fixation cross. The offsets of the two targets occurred with the same timing as the onsets in Experiment 1. The procedure was the same as in Experiment 1 except that participants were instructed to judge which element disappeared first. This is depicted in Figure 3. Similar to Experiment 1, the data resembled psychometric functions. Hence, it was possible to apply the model and estimate the parameters. A summary of the raw data is given in Figure 8.

Figure 8.

Plot of raw data (mean of judgment frequency per SOA over all participants) and posterior predictive for the salient and neutral condition of Experiment 2. This plot shows predicted data based on the estimated parameters. SOA = Stimulus Onset Asynchrony.

The attentional weights on the group level are, again, most informative about whether attention was deployed unequally. In contrast to Experiment 1, the attentional weight for the probe in the salience condition, ωsp clean = .393, clearly differed from the equal weight distribution, as shown in Figure 9. As in Experiment 1, the attentional weight ωnp = .526 in the neutral condition deviated from the balanced value of .5. We suppose this deviation to be a consequence of the timing, which differed for probe and reference stimulus. The weight in the salience condition was again corrected (uncorrected ωsp = .423), such that the small shift in weight likely due to timing does not affect the measurement of salience. The processing rate for the salient probe, υsp = 23.4 Hz, was lower than that for the neutral probe, υnp = 31.6 Hz (see Figure 10 for their distributions). The processing capacity, as shown in Figure 11, was constant over the conditions, which allowed the comparison of weights across conditions. The comparison of the judgment data and the posterior predictive in Figure 8 shows that the model is able to fit the data and provides a reasonable description of them.

Figure 9.

Estimated attentional weights (ω) for the probe stimuli of Experiment 2, salience condition (ωsp = weight for the salient probe) in blue and neutral condition (ωnp = weight for the neutral probe) in red. The weights for the reference stimuli are 1 minus the weight of the respective probe.

Figure 10.

Estimated processing rates (υ) for Experiment 2. The processing rates of the salience condition (υsp = processing rate for salient probe; υsr = processing rate for reference in salience displays) are shown in blue, those of the neutral condition (υnp = processing rate for neutral probe; υnr = processing rate for reference in neutral displays) in red. The darker distributions belong to the probe stimulus and the lighter distributions belong to the reference stimulus.

Figure 11.

Estimated processing capacities (C) for Experiment 2 in the salience condition (Cs, in blue) and the neutral condition (Cn, in red). The difference of 0 is in the highest density interval (HDI) if both distributions are subtracted, which indicates that the overall processing capacity was similar in both conditions.

Replacing the onset from Experiment 1 with the offset led to a distinct and measurable salience effect. The attentional weights shifted away from the salient to the non-salient stimulus. Contrary to theory, the salient stimulus received less attentional weight and hence was processed more slowly than the non-salient target, which in turn means that the offset of the salient target raced more slowly to VSTM. This finding is not in line with the results by Vingilis-Jaremko et al. (2008), which originally motivated the use of offset events. Other findings on time perception and from designs based on response time, however, are more compatible with the results of Experiment 2. For example, New and Scholl (2009) reported that the subjective duration of an attended stimulus is longer than the duration of an unattended one, which contributes to a delayed perceived offset of the attended stimulus.
Similarly, Rolke, Ulrich, and Bausenhart (2006) showed that the response to a cued offset takes longer than the response to an uncued offset. They conclude that attention delays the perceived stimulus offset. Furthermore, the absence of a stimulus can be salient if it violates the local pattern (Li, 2002). Replacing the stimuli with gaps might hence have caused an unwanted manipulation of salience. Although we cannot offer a full explanation yet, it is likely that the unexpected direction of the effect caused by salience is due to the offset event. This event does not only probe salience but also manipulates it. Independent from this unexpected finding, Experiment 2 however substantiated the validity of the method proposed in the present paper. The TVA-based analysis was applicable to the data and yielded interpretable parameters. Experiment 3 makes a final attempt at disclosing effects of salience with this method by keeping the salience display as constant as possible.

Experiment 3

Although a salience effect was measured successfully in Experiment 2, its direction was unexpected. We hypothesized that the offset event was responsible for this because it changed the salience display permanently. Therefore, a short flicker was used in Experiment 3. The flicker prevents a permanent change of the salience display. Again, salience is supposed to increase the attentional weight and thus speed up the processing of the probe stimulus. A total of 19 persons (2 male and 17 female; Mage = 22.0, range 19-28) participated in Experiment 3. All of them were students at Leuphana University of Lüneburg. Each participant completed one session, reported normal or corrected-to-normal visual acuity and again received a payment of 8 Euro per hour. The apparatus was the same as in Experiment 1. The same stimuli as in Experiment 1 were used. To avoid the effects of both onset and offset, the targets flickered. The flicker was realized by removing each target for 80 ms. The timing was otherwise similar to Experiments 1 and 2. The procedure was the same as in Experiment 1 except that the participants were instructed to judge whether the first flicker was on the left or the right of the fixation cross. This procedure is depicted in Figure 3. As in the previous experiments, it was possible to apply the model to the TOJ data and to derive the parameters. For illustration, the averaged responses per SOA are given in Figure 12.

Figure 12.

Plot of raw data (mean of judgment frequency per SOA over all participants) and posterior predictive for the salient and neutral condition of Experiment 3. This plot shows predicted data based on the estimated parameters. SOA = Stimulus Onset Asynchrony.

As already suggested by the different trends in the figure, the attentional weights show a clear and distinct advantage for the salient Tprobe, with ωsp clean = .643 compared with ωnp = .518 for the non-salient target. We again found a small attentional effect due to the fixed interval between the onset of the display and the event occurring at the Tprobe stimulus and corrected for it as explained in the results of Experiment 1 (uncorrected ωsp = .658). The estimated weight distributions are shown in Figure 13.

Figure 13.

Estimated attentional weights (ω) for the probe stimuli of Experiment 3, salience condition (ωsp) in blue and neutral condition (ωnp) in red. The weights for the reference stimuli are 1 minus the weight of the respective probe.

This result also means that processing speed changed: The salient element is processed faster (υsp = 27.5 Hz) than its non-salient counterpart from the neutral condition (υnp = 20.6 Hz), while the reference stimulus from the salience condition is processed more slowly (υsr = 13.2 Hz) than its counterpart (υnr = 18.09 Hz). All estimated rate parameters are shown in Figure 14. The rates can be interpreted as a shift of resources from the non-salient reference stimulus to the salient probe stimulus in the salience condition.

Figure 14.

Estimated processing rates (υ) for Experiment 3. The processing rates of the salience condition (υsp = rate for the salient probe; υsr = rate for the reference in the salient display) are shown in blue, those of the neutral condition (υnp = rate for the neutral probe; υnr = rate for the reference in the neutral probe display) in red. The darker distributions belong to the probe stimulus and the lighter distributions belong to the reference stimulus.

Figure 15.

Estimated processing capacities (C) for Experiment 3 in the salience condition (Cs; blue) and the neutral condition (Cn; red). The difference of 0 is in the highest density interval (HDI) if both distributions are subtracted, which indicates that the overall processing capacity was similar in both conditions.

Finally, as the last result of modeling, the posterior predictive shows a distinct shift between the salient and neutral condition, as depicted in Figure 12. The two conditions show almost no overlap. This shift indicates that the salient Tprobe is perceived earlier, in perfect accord with the parameters and the summary of the raw data discussed above.

Experiment 3 yielded a salience effect that increased the attentional weight on the salient stimulus and hence its processing speed. This is in line with both the salience and TVA literature and shows that TVA can be used to quantify the effects of salience on processing. This quantification happens in terms of the individual processing speed and the attentional weight. The attentional weight describes the allocation of attention across all relevant stimuli and has the advantage of measuring salience in relation to the other stimulus in the display. Attentional weights are directly comparable if overall capacity is the same. The processing speed is a second possible measure of salience. Though attentional weight is theoretically more sound, processing speed is directly comparable even if the capacity does not stay the same. With a value of ωsp clean = .642, the shift from the neutral weight of .5 is very clear. Note that the TOJ method is rather conservative in this respect because both targets have to be encoded. This makes extreme values for the attentional weight close to 0 or 1 very unlikely.
To the best of our knowledge, this is the first study in which TOJs manipulated by salience were sufficiently sampled to show the full psychometric function and the occurrence of systematic shifts in the report probability. The occurrence of this shift was already assumed by Donk and Soesman (2011). Because only one-half of the suspected psychometric function was sampled in their experiment, the actual function was not derivable. Both the data presented in Figure 12 and the posterior predictive show the expected shift in the function which speeded processing of a flickering salient element compared to a flickering non-salient element. This experiment shows that salience can lead to prior entry as already reasoned by Donk and Soesman. The size of the change in attentional weights (as inferred from the HDI and the posterior predictive) indicates that the proposed method will be appropriate to prove effects of different size, including small effects: There is nearly no overlap between the expected psychometric functions for the salient and neutral condition. This means that smaller shifts will also be detectable, as, for instance, can be expected when smaller local differences would be used. The small but reliable effect of fixed time of the Tprobe shows that the method is sensitive enough for small effects. The arguments why to prefer the TVA model over the classical analysis by psychometric functions are theoretical ones, as explained in the Introduction. We, however, also conducted a conventional analysis of psychometric functions which the interested reader finds in the Appendix. It is in accord with the present results but provides less information.

Experiment 4

Experiment 3 showed the feasibility of the proposed method. Experiment 4 was designed as a test of the generality of our approach. We furthermore analyzed feature differences smaller than the admittedly large difference between 0° and 90° in Experiments 1 to 3. To this end, we used a high-salience condition and a low-salience condition, operationalized by stimulus luminance. A total of 30 persons (14 male and 16 female; Mage = 25.7, range 19-48), including all authors, participated in Experiment 4. All were students or members of Paderborn University. Each participant completed one session, reported normal or corrected-to-normal visual acuity and again received a payment of 8 Euro per hour (except for the authors). The apparatus was the same as in Experiment 1. The same stimuli as in Experiment 1 were used, except that salience was manipulated in the luminance dimension. In the low-salience condition, a dark gray probe with RGB (80, 80, 80) (4.03 cd/m2) was used. In the high-salience condition, the probe was black RGB (0, 0, 0) (0.31 cd/m2). To keep the experiment as short as possible, the neutral condition without a salient probe was omitted. We did this because Experiment 3 already showed what can be theoretically assumed: This condition yields a weight of .5 for the target—that is, attention is distributed equally between the two visually equal targets. The procedure was the same as in Experiment 3. Again, the raw data were typical TOJ data (see Figure 16). The attentional weight for the probe in the high-salience condition ωhp = .582 was higher than in the low-salience condition ωlp = .539, which implies a difference of .043 in attentional weight. The parameter distributions are shown in Figure 17.

Figure 16.

Plot of raw data (mean of judgment frequency per SOA over all participants) and posterior predictive for the high-salience and low-salience condition of Experiment 4. This plot shows predicted data based on the estimated parameters. SOA = Stimulus Onset Asynchrony.

Figure 17.

Estimated attentional weights (ω) for the probe stimuli of Experiment 4, high-salience condition (ωhp) in blue and low-salience condition (ωlp) in red. The weights for the reference stimuli are 1 minus the weight of the respective probe.

Figure 18 depicts the processing rates. This figure shows that the difference between the high- and low-salience condition lies mainly in the processing of the non-salient reference stimulus: High- and low-salience probes were processed nearly equally fast with a rate of υhp = 18.7 and υlp = 18.3. The processing speed of the reference stimulus, however, varied strongly with condition, with a rate of υhr = 13.3 in the high-salience and one of υlr = 17.1 in the low-salience condition. This is important for the theoretical explanation (see below).

Figure 18.

Estimated processing rates (υ) for Experiment 4. The processing rates of the high-salience condition (υhp = rate for the highly salient probe; υhr = rate for the reference in the high-salient probe displays) are shown in blue, those of the low-salience condition (υlp = rate for the lowly salient probe; υlr = rate for the reference in the low-salient probe displays) in red. The darker distributions belong to the probe stimulus and the lighter distributions belong to the reference stimulus.

Figure 19.

Estimated processing capacities (C) for Experiment 4 in the high-salience condition (Ch; blue) and the low-salience condition (Cl; red). The difference of 0 is in the highest density interval (HDI) if both distributions are subtracted, which indicates that the overall processing capacity was similar in both conditions.

The posterior predictive, presented in Figure 16, shows an asymmetrical distribution. This accords with the processing speeds shown in Figure 18: The processing of the reference targets is affected more than the processing of the probes.

Experiment 4 expanded the scope of the present method to the luminance dimension and tested two quantitative levels of salience. As expected, both singletons received increased attentional weight, and this increase scaled with their salience: The highly salient probe received more attentional weight than the less salient probe. Thus, this fourth experiment shows that the proposed method is applicable to features other than orientation, which is a promising result for further generalization. Furthermore, Experiment 4 indicated that quantitative differences in salience lead to quantitative differences in attentional weights. This result promises to enlarge the scope of our method to a general quantitative model of salience. Note, however, that this difference seems to be caused by slower processing of the reference stimuli. Faster processing of the highly salient compared to the less salient probe contributed only slightly to this difference.

General Discussion

The Theory of Visual Attention (TVA) can serve as a foundation for quantifying visual salience. We showed this by conducting four experiments. All experiments substantiate the soundness of the model, which combines TOJs and TVA. Experiment 1 demonstrated the applicability of the suggested method in general. This was achieved by combining salience displays and TOJs. Experiment 2 tested the effects of salience on attentional weights and processing speed. Although in principle successful (the experiment indeed measured effects on weight and speed), it was not entirely satisfying because attentional weights favored the non-salient stimulus, which was processed faster than the salient one. As we reasoned that the offsets we used in Experiment 2 might not have been optimal because they caused (possibly) salient gaps in the bar array, we replicated the experiment with flickering stimuli in Experiment 3. This experiment showed the full relevant data pattern: The salient stimulus received more attentional weight and was processed faster than the non-salient one. Attention was withdrawn from the non-salient stimulus and redistributed to the salient one. In Experiment 4, we applied the flicker procedure to the luminance dimension in order to demonstrate its applicability to other stimulus dimensions as well as its sensitivity and its usefulness for a quantification of salience effects. All aims were successfully reached by Experiment 4: Salient stimuli drew attention towards themselves, and there was a difference in weights and processing speeds between highly and less salient stimuli. Beyond the comparison of individual model parameters, Experiments 3 and 4 have both shown that salience redistributes resources according to feature differences. Attention which is dedicated to the salient stimulus is withdrawn from the reference stimulus. Importantly, this relation is not predefined by the TVA model.
Because the processing rate of each stimulus is modeled as an independent process, it is possible that only the salient stimulus gains while the speed of the race stays constant for the reference stimulus. (Such a rate increase would result in a capacity difference between conditions.) Although we focused on a measure of salience, this may be understood as evidence for parallel processing rather than a guided serial processing as in the Guided Search models by Wolfe (e.g., 1994, 2007) that predict an increase of attention for salient stimuli. Independent of the salience-related results, the proposed method of combining salience displays with TOJs and TVA parametrization was successful in all four experiments: All yielded psychometric functions as well as plausible parameters including the attentional weights and processing speeds of the two targets as well as the overall processing capacity. To sum up, the combined TVA/TOJ method proposed in the present paper seems a promising tool. Further studies could test and model the quantitative relationship between salience values and attentional weights in more detail, for instance, by employing several levels of salience instead of only two. Also, different salience dimensions could be compared directly via attentional weights, relating the salience of, say, a colored singleton to an orientation or luminance singleton. This is, however, beyond the scope of the present article. We propose the presented procedure to measure the strength of salience because this strength can be quantitatively expressed in a theoretically meaningful parameter of a tried and tested theory. Different from earlier approaches, the method is not limited to specific salience dimensions because the task is largely independent of the type of elements. Also, it is not limited to a reference stimulus like the methods proposed by Nothdurft (2000), and Huang and Pashler (2005). 
A further advantage is that no assumptions about contested issues such as the relative contribution of top-down and bottom-up influences have to be made to apply the present approach. While Theeuwes (2004, 2010, 2013), for example, takes the stance that salience captures attention inevitably, other researchers claim that all salience effects are modulated by top-down task sets (e.g., Ansorge & Becker, 2014; Folk, Remington, & Johnston, 1992; Yantis & Egeth, 1999). Our method provides a useful salience measure for both perspectives. Furthermore, interactions between bottom-up and top-down influences can be studied within the TVA framework. Nordfang et al. (2013) have developed a TVA extension that tackles this problem (see also Bundesen, Vangkilde, & Petersen, 2015). Both feature contrast and task relevance are modeled as individual variables affecting the attentional weight. That is, these authors already proposed a model for the interaction of bottom-up and top-down influences on attention. Its empirical application is, however, restricted to the partial report and the stimuli suitable for the partial report, whereas our TOJ-based approach can deal with all kinds of stimuli. Besides effects of salience, we consistently detected a small effect on the attentional weight in the neutral conditions of Experiments 1 to 3. All visual features were equal for the two targets in these conditions, except for their timing. While the timing of Treference varied according to the SOA, Tprobe was always shown at a fixed point in time. With this procedure, the strength of salience is not distorted by the time course of salience. As a trade-off, we accepted the chance that an effect of predictability would occur, which indeed was the case. The formal model, however, allowed us to correct for it. Note that this finding is well in line with results from Vangkilde, Coull, and Bundesen (2012), who investigated the effect of temporal predictability on perception.
They examined effects of timing on t_0, the minimal effective exposure duration, and the processing speed υ, whereas we detected an influence on the attentional weight ω of the predictable stimulus and its υ parameter. The precision with which the small effect was detected is promising for future studies.

A further aspect concerning the timing of the experiment is the presentation duration of the display prior to the TOJ. Although we kept it equal in all conditions, the duration of the salience display can be decreased or increased. In this way, effects of presentation duration, as studied in research on the time course of salience, can be related to attentional weight. Note, however, that the TOJ might not be optimal for this because it requires a minimal time range for the two stimuli to be presented.

Besides the advantages of theory and Bayesian analysis, there are also drawbacks to the method proposed in the present paper. A weak point is that a TOJ requires a temporal event that can stand out against the salience manipulation without overriding it. For the attentional weight advantage of salient stimuli, the type of change did matter. As demonstrated by Experiment 1, salience has next to no influence on the attentional weight when onsets are detected. The precision of the approach can, however, be used to further investigate the reason for this finding, for example, to test whether onset information is available before salience information.

To sum up, the metrics of TVA allow a precise, general, and sensitive quantification of the effects of salience. These effects can be expressed in plausible parameters backed by theory. Unlike earlier approaches, the present method is not limited to specific stimuli, and presentation duration can be controlled well to take the time course of salience into account. Given the sensitivity of the method, it is likely that gradual changes of local differences can be tested.
Also, the method allows combining salience from different dimensions and thus offers the possibility to examine whether salience effects exhibit an underlying general metric. That is, the approach discussed in the present paper might offer a new method of quantifying visual salience.
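
As an illustration of how the parameters discussed above relate to one another, the following minimal Python sketch computes the processing rates of two targets from an overall capacity and a relative attentional weight, and from these the probability of a "probe first" judgment. It is a sketch under stated assumptions, not the authors' model code: it assumes exponentially distributed encoding times, relative weights that sum to 1, an identical minimal exposure duration for both targets (so that it cancels), and a sign convention in which a positive SOA means the probe leads. The function names and the SOA convention are hypothetical.

```python
import math


def processing_rates(capacity, weight_probe):
    """Split the overall capacity C (items/s) between probe and reference.

    Assumption: the two relative attentional weights sum to 1, so the
    reference receives the share 1 - weight_probe.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if not 0.0 <= weight_probe <= 1.0:
        raise ValueError("weight_probe must lie in [0, 1]")
    return capacity * weight_probe, capacity * (1.0 - weight_probe)


def p_probe_first(soa, capacity, weight_probe):
    """Probability that the probe is encoded before the reference.

    soa is given in seconds; soa > 0 means the probe leads (hypothetical
    convention). Encoding times are assumed to be exponential, so the race
    is memoryless once both stimuli are being processed.
    """
    v_probe, v_ref = processing_rates(capacity, weight_probe)
    win_race = v_probe / (v_probe + v_ref)
    if soa >= 0:
        # The probe may finish during its head start; otherwise both race.
        done_early = 1.0 - math.exp(-v_probe * soa)
        return done_early + (1.0 - done_early) * win_race
    # The reference leads: the probe wins only if the reference has not
    # finished during its head start and the probe then wins the race.
    return math.exp(-v_ref * (-soa)) * win_race


if __name__ == "__main__":
    # Psychometric function for a probe with a weight advantage.
    for soa_ms in (-100, -50, 0, 50, 100):
        p = p_probe_first(soa_ms / 1000.0, capacity=50.0, weight_probe=0.6)
        print(f"SOA {soa_ms:+4d} ms: P(probe first) = {p:.3f}")
```

In this sketch, a weight above 0.5 shifts the whole psychometric function toward "probe first" responses, while a change in capacity alters its steepness. This is one way to read the distinction drawn above between weight effects and capacity differences.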

Table A1. Prior Distributions

ω_np^m ∼ dunif(0, 1)

ω_np^τ ∼ dt(0, 1, 1)T(0,)

C_n^m ∼ dunif(0, 500)

C_n^τ ∼ dt(0, 1, 1)T(0,)

Note. The index m denotes the mean and the index τ denotes the precision for the processing capacity C_n and the attentional weight ω_np.
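
To give a feel for the priors listed in Table A1, the following Python sketch draws samples from them. It rests on an assumption that the source does not state explicitly: that the table uses JAGS/BUGS-style notation, in which dunif(a, b) is a uniform distribution and dt(0, 1, 1)T(0,) is a Student t distribution with location 0, precision 1, and 1 degree of freedom, truncated to non-negative values (that is, a half-Cauchy distribution with scale 1). The function and dictionary key names are hypothetical.

```python
import math
import random


def half_cauchy(rng):
    """Draw from a half-Cauchy(0, 1) distribution.

    Uses the inverse CDF of the standard Cauchy, tan(pi * (u - 0.5)),
    and folds it onto the non-negative axis.
    """
    return abs(math.tan(math.pi * (rng.random() - 0.5)))


def sample_priors(n_draws, seed=0):
    """Draw n_draws samples from each prior in Table A1.

    Assumption: JAGS-style reading of the table (see the lead-in).
    """
    rng = random.Random(seed)
    return {
        "w_mean": [rng.uniform(0.0, 1.0) for _ in range(n_draws)],
        "w_precision": [half_cauchy(rng) for _ in range(n_draws)],
        "C_mean": [rng.uniform(0.0, 500.0) for _ in range(n_draws)],
        "C_precision": [half_cauchy(rng) for _ in range(n_draws)],
    }


if __name__ == "__main__":
    draws = sample_priors(5, seed=1)
    for name, values in draws.items():
        print(name, [round(v, 3) for v in values])
```

Under this reading, the priors on the means are flat over their plausible ranges, while the heavy-tailed priors on the precisions are only weakly informative.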

46 in total

1. A saliency map in primary visual cortex.

Authors: Zhaoping Li
Journal: Trends Cogn Sci Date: 2002-01-01 Impact factor: 20.229

2. Visual selective behavior can be triggered by a feed-forward process.

Authors: Rufin VanRullen; Christof Koch
Journal: J Cogn Neurosci Date: 2003-02-15 Impact factor: 3.225

3. Simultaneity and temporal order perception: Different sides of the same coin? Evidence from a visual prior-entry study.

Authors: Katharina Weiss; Ingrid Scharlau
Journal: Q J Exp Psychol (Hove) Date: 2010-08-06 Impact factor: 2.143

Review 4. Top-down and bottom-up control of visual selection.

Authors: Jan Theeuwes
Journal: Acta Psychol (Amst) Date: 2010-05-26

5. Bayesian inference for psychometric functions.

Authors: Malte Kuss; Frank Jäkel; Felix A Wichmann
Journal: J Vis Date: 2005-05-27 Impact factor: 2.240

6. Usability of a theory of visual attention (TVA) for parameter-based measurement of attention I: evidence from normal subjects.

Authors: Kathrin Finke; Peter Bublak; Joseph Krummenacher; Søren Kyllingsbaek; Hermann J Muller; Werner X Schneider
Journal: J Int Neuropsychol Soc Date: 2005-11 Impact factor: 2.892

