Overt and covert attention to location-based reward.

Abstract

Recent research on the impact of location-based reward on attentional orienting has indicated that reward factors play an influential role in spatial priority maps. The current study investigated whether and how reward associations based on spatial location translate from overt eye movements to covert attention. If reward associations can be tied to locations in space, and if overt and covert attention rely on similar overlapping neuronal populations, then both overt and covert attentional measures should display similar spatial-based reward learning. Our results suggest that location- and reward-based changes in one attentional domain do not lead to similar changes in the other. Specifically, although we found similar improvements at differentially rewarded locations during overt attentional learning, this translated to the least improvement at a highly rewarded location during covert attention. We interpret this as the result of an increased motivational link between the high reward location and the trained eye movement response acquired during learning, leading to a relative slowing during covert attention when the eyes remained fixated and the saccade response was suppressed. In a second experiment participants were not required to keep fixated during the covert attention task and we no longer observed relative slowing at the high reward location. Furthermore, the second experiment revealed no covert spatial priority of rewarded locations. We conclude that the transfer of location-based reward associations is intimately linked with the reward-modulated motor response employed during learning, and alternative attentional and task contexts may interfere with learned spatial priorities.

Keywords: Covert attention; Eye movements; Overt attention; Reward; Spatial attention

Year: 2017 PMID: 29100871 PMCID: PMC5773241 DOI: 10.1016/j.visres.2017.10.003

Source DB: PubMed Journal: Vision Res ISSN: 0042-6989 Impact factor: 1.886

Introduction

Current understanding of visual selective attention indicates that both covert and overt attention (attending to a location in our periphery or making an eye movement to that location) are linked by a common neural architecture, by means of a shared frontoparietal network (Beauchamp et al., 2001, Corbetta, 1998, de Haan et al., 2008). The extent of this overlap has, however, been controversial. The premotor theory of attention proposed by Rizzolatti, Riggio, Dascola, and Umilta (1987) postulated that covert attention is akin to the programming of an eye movement. Evidence in support of this theory comes from studies reporting higher detection accuracy at a target location coinciding with the endpoint of a saccade (Deubel and Schneider, 1996, Dore-Mazars et al., 2004), better target detection at a saccade goal even when explicitly directed to attend somewhere else (Hoffman & Subramaniam, 1995), and a gradual build-up of attention at a saccade goal, reaching a peak immediately prior to saccade onset (Deubel, 2008, Dore-Mazars et al., 2004). Investigation of the frontal component of the oculomotor network, the frontal eye fields (FEF), has revealed disruption of saccades and shifts in spatial attention after applying transcranial magnetic stimulation (TMS) or microstimulation to FEF neurons (Beckers et al., 1992, Moore and Fallah, 2001). Furthermore, microsaccades or ‘fixational eye movements’ observed during covert visual search (Martinez-Conde et al., 2004, Martinez-Conde et al., 2013) likely reflect covert attentional shifts (Engbert and Kliegl, 2003, Otero-Millan et al., 2008), supporting the concept of a common oculomotor neural underpinning for overt saccades and fixational eye movements in covert attentional settings. 
However, there have also been numerous studies which report findings not in line with the predictions of premotor theory, including the ability to endogenously attend to stimulus locations other than the saccade goal without disturbing the eye movement (Kowler, Anderson, Dosher, & Blaser, 1995), and a lack of facilitation of visual perception for probes presented at a saccade goal (Hunt & Kingstone, 2003). In a recent review, Smith and Schenk (2012) suggest that the main consistent finding from studies on premotor theory is that only exogenous (stimulus-driven) attention is dependent on saccade preparation (Henik et al., 1994, Sereno et al., 2006, Smith and Ratcliff, 2004). In their study examining the time-course of exogenous and endogenous effects, Belopolsky and Theeuwes (2012) found that although saccade preparation accompanied shifts in covert attention due to both exogenous and endogenous (goal-driven) cues, the saccade program to the attended location for endogenous cues was suppressed shortly after a covert attentional shift had been completed. Smith and Schenk (2012) propose that an alternative to premotor theory, the biased competition account of visual attention (Desimone, 1998, Desimone and Duncan, 1995), may provide a more appropriate framework to incorporate the empirical findings garnered from assessing the validity of premotor theory. In the biased competition account, competition between neural representations is integrated across sensory and motor systems, converging on a single ‘winning’ representation. Physically salient items in the environment have a strong representation, but competition may also be biased towards less physically salient stimuli by endogenous factors such as current goals in working memory (Soto, Hodsoll, Rotshtein, & Humphreys, 2008). 
The lateral intraparietal area (LIP) in the parietal lobe and the FEF have been implicated in target selection in both covert attention and saccades, comprising selective spatial receptive fields (Bisley and Goldberg, 2010, Thompson and Bichot, 2005). LIP in particular has been proposed to act as an integrated priority map of top-down and bottom-up signals for behaviorally relevant stimuli (Bisley & Goldberg, 2010). Attentional orienting has generally been described in terms of exogenous and endogenous control (Chelazzi et al., 2013, Corbetta and Shulman, 2002, Desimone and Duncan, 1995, Theeuwes, 2010). In recent years, it has been suggested that these two forms of attentional orienting do not fully account for the behavior and biases observed in the attentional orienting literature (Awh, Belopolsky & Theeuwes, 2012). Much research has shown that stimuli associated with reward can capture attention and the eyes, even when attending to the rewarded stimulus contradicts selection goals (Anderson et al., 2011a, Anderson et al., 2011b, Failing and Theeuwes, 2014, Hickey et al., 2010, Theeuwes, 2010, Le Pelley et al., 2015, McCoy and Theeuwes, 2016). Awh and colleagues therefore developed a framework incorporating past selection history with existing models, leading to an integrated priority map for attentional control (Awh, Belopolsky, & Theeuwes, 2012). In line with the selection history component described in the model of Awh et al. (2012), behavioral research has shown attentional orienting to the location of a non-salient cue that had acquired value through reward learning (Failing & Theeuwes, 2014). Similarly, eye movements have been observed to land closer to high compared to low reward-signaling distractors (Bucker et al., 2015, McCoy and Theeuwes, 2016). It has recently been suggested that reward learning of particular locations relies upon spatial priority maps, specifically when multiple potential targets compete for attention (Chelazzi et al., 2014). 
In the study of Chelazzi and colleagues, locations in space were first trained with reward associations, i.e. responding to a target at a particular location consistently led to a high chance of high reward. In a subsequent test phase, letter or digit targets appeared at one, two or none of the possible spatial locations, with distractor non-alphanumeric characters at the remaining locations, and participants had to detect and identify the target alphanumeric stimuli. A competitive advantage was found for targets presented in spatial locations previously associated with high compared to low reward. Participants could also correctly report two targets more often when the targets appeared in opposite visual hemifields, supporting previous findings that the two hemispheres can process information in parallel (Alvarez and Cavanagh, 2005, Luck et al., 1989, Sereno and Kosslyn, 1991). Mutual inhibition between hemifields is assumed to be less than within hemifield, due to less overlap of neuronal receptive fields for stimuli presented in different hemifields (Hickey and Theeuwes, 2011, Mounts and Gavett, 2004). The competitive integration model of saccade programming further suggests that stimuli placed closer together within the visual field lead to combined and integrated saccade activation centered at a location between these two stimuli (Godijn and Theeuwes, 2002, Trappenberg et al., 2001). Other studies examining the effects of spatial-based reward on covert attention have reported a decrease in manual response time to targets presented at a previously rewarded location (Hickey et al., 2014, Stankevich and Geng, 2014). However, very little research has been carried out in humans on how location-specific reward affects eye movements to those locations, or on how location-based reward mechanisms might transfer from overt to covert attention. 
In one relevant study, different groups of participants performed a task that required overt responses in one group or covert responses in the other group (Camara, Manohar, & Husain, 2013). The results showed similar reward effects across the two groups, namely that when a particular location was associated with high compared to low monetary reward (training phase), participants later freely chose the location that had previously been more often associated with the high reward. They also found that distractors captured gaze (overt group) or increased errors (covert group) more when they appeared at the high compared to low reward location in the subsequent test phase. Although this study shows similar influences across the two attentional domains, individual participants always used the same response mode in both phases, with no within-participant transfer across attentional settings. Thus, it does not provide insight into how an individual’s learning of location-based reward translates from overt to covert attention. The present study was therefore designed to investigate whether and how reward associations based on spatial location translate from overt eye movements to covert attention within individual participants. We hypothesized that if reward associations can indeed be tied to different locations in space, and if overt and covert attentional orienting depend on overlapping neuronal populations, representing the integration of exogenous, endogenous and reward-related factors, then both overt and covert attentional measures should display similar spatial-based reward learning. Specifically, we expected this to be evident by reduced saccade latency and manual reaction time (RT) to stimuli presented at locations associated with higher reward value.
Due to previous research on the effect of visual hemifield, we designed the experiment to maximally separate attentional allocation towards the high and low reward locations by placing them in opposite visual hemifields. In this way, attention to rewarded locations should not be influenced by strong within-hemifield integration or inhibition, and the outcome can be assumed to be driven by the absolute reward value at a given location. We tested our hypotheses using two different task contexts: one learning phase in which saccades were made towards a salient stimulus presented at locations associated with high, low, or no reward, and one pre-training baseline and post-training test phase in which participants fixated at the center and carried out a covert visual discrimination task with stimuli presented at these same locations. The only consistent parameter across tasks in this experiment was the relative spatial positions of the stimuli, i.e., all stimulus features were different across tasks. In this way, we wished to determine the entirely spatial nature of reward learning across the two types of attentional orienting.

Experiment 1

Materials and methods

Participants

Twenty-four participants (9 female; mean 23.6 ± 3.1 years old) with normal or corrected-to-normal vision gave written informed consent to take part in the study. The experiment was approved by the Scientific and Ethical Review Committee of the VU University Amsterdam and was carried out in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki). Participants were given fixed monetary compensation of €9 for undergoing eye tracking and received a reward bonus of up to €9.90 based on their performance during the training phase.

Stimuli and apparatus

The experiment and stimuli were created using OpenSesame (version 2.9.7) software for Windows (Mathôt, Schreij, & Theeuwes, 2012). We set the software to use a legacy back-end built on top of PyGame. Stimuli were presented on a 22-inch Samsung SyncMaster 2233RZ monitor (1680 × 1050 pixels resolution, 120 Hz refresh rate). Participants were tested in a sound-attenuated, dimly-lit room. The viewing distance was held constant at 70 cm using a chin rest. Eye movements were recorded using an Eyelink 1000 tracker (SR Research Ltd., Canada) with 1000 Hz temporal resolution and a 0.2° of visual angle spatial resolution. This tracker consisted of an infrared video-based tracking system to compute the pupil center and pupil size of both eyes. Data were collected and analyzed for the right eye only.
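As a rough guide to how the visual angles reported here map onto screen pixels, the following sketch converts degrees to pixels for the reported viewing distance (70 cm) and resolution (1680 × 1050). The physical screen width is not reported; it is assumed here from a nominal 22-inch diagonal with square pixels, and the function names are illustrative only.

```python
import math

VIEW_DIST_CM = 70.0          # reported viewing distance
RES_X, RES_Y = 1680, 1050    # reported resolution
DIAG_INCH = 22.0             # assumption: nominal diagonal equals the viewable diagonal


def screen_width_cm(diag_inch=DIAG_INCH, res_x=RES_X, res_y=RES_Y):
    """Physical width derived from the diagonal and the pixel aspect ratio."""
    diag_cm = diag_inch * 2.54
    return diag_cm * res_x / math.hypot(res_x, res_y)


def deg_to_px(deg, dist_cm=VIEW_DIST_CM):
    """Pixels spanned by an eccentricity of `deg` measured from the screen center."""
    size_cm = dist_cm * math.tan(math.radians(deg))
    return size_cm * RES_X / screen_width_cm()


print(round(deg_to_px(6.6)))   # eccentricity of the stimulus circle
print(round(deg_to_px(3.0)))   # radius of the fixation and target regions
```

Under these assumptions the 6.6° stimulus eccentricity corresponds to roughly 287 pixels; a different physical screen width would change this value.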

Training phase

Training of eye movements (overt attention) constituted the main part of the experimental session, lasting approximately 35 minutes including breaks. In this task participants had to make a quick eye movement to a target location (see Fig. 1a). Each trial began with participants focusing on the central fixation-cross (0.09° × 0.09°) presented on a grey background (rgb = (100,100,100), 5 cd/m²), and pressing the spacebar to start the trial. As soon as the trial was initiated, four dark grey circles (rgb = (54,54,54), 19 cd/m²) appeared at four locations equally spaced on an imaginary circle with a radius of 6.6°. At this point, participants continued to remain fixated at the center. After a short jittered interval (200–800 ms), three out of the four circles turned red (rgb = (255,0,0), 34 cd/m²), with only one circle remaining the original grey color. The fixation cross disappeared the instant these circles turned red. Participants were instructed to make a quick eye movement to the remaining grey circle, within 500 ms of the change in stimulus color. They heard a low error beep (20 ms, 250 Hz sine wave) if they did any of the following: moved their eyes from central fixation before the circles turned red, looked first to a red circle before looking to the target grey circle, or failed to make a saccade to the grey circle within the 500 ms time limit. In this way, participants were trained to make a quick and direct eye movement to the target location. Participants first completed a baseline block, with no feedback other than when the trial was incorrect. They subsequently completed three blocks in which rewarding feedback was also presented at specific locations. Of the four target locations, two locations had a special rewarding status and two were used as control locations where no reward was ever given. Of the two reward locations, one was associated with a high reward 80% of the time, and the remaining 20% of trials corresponded to low reward.
The other reward location adhered to the opposite contingency; low reward feedback was presented on 80% of the trials, with high reward on 20%. Feedback was presented as yellow text ‘10’ or ‘1’ (0.5°/0.25° × 0.5°, rgb = (216,216,0), 82 cd/m²) at the center of the circle to which an eye movement was made. This corresponded to a monetary payment of 10 cents or 1 cent per high or low reward trial respectively. There was no feedback given at the two remaining control (no reward) locations. Participants were informed that they would lose 1 cent for an incorrect response. This deduction was implemented to encourage participants to always look to the target location for each trial, even if no reward was to be expected at that location. The target grey circle was presented at each location 30 times per block, leading to 120 trials in total per block. Participants received general feedback at the end of each block, indicating their accuracy and average saccadic latency. They were also presented with their accumulated earnings ¾ of the way through each block. To ensure that participants stayed attentive to the reward feedback, a question appeared pseudo-randomly after some trials asking how much they had received in that trial – “10c, 1c, or nothing?”, with instructions on which arrow key to press for each answer. This question was posed only after correct trials, up to eight times per block. In each block the question appeared no more than twice for each possible target location, so that attention was not inadvertently drawn to any one particular location. Finally, since previous research has shown effects of visual hemifield on stimulus competition and integration, the high and low reward locations were always presented in opposite visual hemifields. Eight separate location-contingent layouts (one per participant) covered all possible across-hemifield combinations of high and low reward locations (see Fig. 1b for two such layouts).
This structure was repeated three times across all 24 participants.
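The reward contingencies above can be checked against the maximum bonus of €9.90 mentioned in the Participants section. The sketch below assumes that the 80/20 split was realized exactly within each block (24 and 6 of the 30 trials per reward location) and that every trial was answered correctly; the text states only the probabilities, so the exact per-block counts are an assumption.

```python
TRIALS_PER_LOCATION = 30   # per block, from the text
REWARD_BLOCKS = 3
HIGH_CENTS, LOW_CENTS = 10, 1
P_MAJOR = 0.8              # probability of a location's dominant feedback


def max_block_earnings_cents():
    """Earnings in one block if every trial is correct and the 80/20 split is exact."""
    major = round(TRIALS_PER_LOCATION * P_MAJOR)   # 24 trials
    minor = TRIALS_PER_LOCATION - major            # 6 trials
    high_loc = major * HIGH_CENTS + minor * LOW_CENTS
    low_loc = major * LOW_CENTS + minor * HIGH_CENTS
    return high_loc + low_loc                      # control locations pay nothing


total_cents = REWARD_BLOCKS * max_block_earnings_cents()
print(total_cents / 100)   # 9.9
```

Under these assumptions the high reward location contributes 246 cents and the low reward location 84 cents per block, which is consistent with the stated maximum bonus.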

Fig. 1

(a) Training phase trial sequence. (b) Reward location contingencies for two sample participants.
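The eight across-hemifield layouts described in the training phase can be enumerated with a short sketch. It assumes that the four locations lie on the diagonals of the imaginary circle (polar angles of 45°, 135°, 225° and 315°); the exact angles are not stated in the text, and the names used here are illustrative.

```python
from itertools import permutations

# Assumption: the four locations sit on the diagonals of the imaginary circle.
LOCATIONS_DEG = (45, 135, 225, 315)


def hemifield(angle_deg):
    """'R' for locations right of the vertical meridian, 'L' for those left of it."""
    return "R" if angle_deg in (45, 315) else "L"


def angular_separation(a, b):
    """Smallest polar angle between two locations, in degrees."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)


def across_hemifield_layouts():
    """All ordered (high, low) location pairs that fall in opposite hemifields."""
    layouts = []
    for high, low in permutations(LOCATIONS_DEG, 2):
        if hemifield(high) != hemifield(low):
            controls = tuple(a for a in LOCATIONS_DEG if a not in (high, low))
            layouts.append({"high": high, "low": low, "control": controls,
                            "separation": angular_separation(high, low)})
    return layouts


layouts = across_hemifield_layouts()
print(len(layouts))                                         # 8
print(sum(lay["separation"] == 90 for lay in layouts))      # 4
print(sum(lay["separation"] == 180 for lay in layouts))     # 4
```

With three participants per layout, this arrangement yields 12 participants with high and low reward locations 90° apart and 12 with them 180° apart.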

Test phase

The test phase was a covert attention task completed twice by each participant, once before and once after the training phase (see Fig. 2). Each test session lasted approximately 10 minutes. The first test acted as a baseline, to obtain participants’ button-press response times (RTs) to targets at specific spatial locations before any reward manipulation had occurred. Trials were presented in a random order in both sessions. The same instructions were provided on screen before each session, explicitly informing participants that no reward would be earned during this part. Participants were instructed to maintain fixation on the central fixation-cross during all trials, and to make no eye movements to the stimuli. The trial began with a short random interval (200–800 ms) showing only the fixation cross (0.09° × 0.09° visual angle) presented on a grey background (same as training phase: rgb = (100,100,100), 5 cd/m²). A pre-mask display with four yellow figure-eight templates (0.67° × 0.91°, rgb = (216,216,0), 82 cd/m²) was then presented, with the stimuli equally spaced at four spatial locations on an imaginary circle with a radius of 6.6°, i.e., at exactly the same locations as those used in the training phase. After another short jittered interval (200–800 ms), lines from these stimuli disappeared, leaving the letters ‘C’, ‘F’, ‘H’, and ‘P’. These letters were visible on the screen for 200 ms. The target letter was ‘C’, to which participants had to make a button press response: pressing the right arrow key when the gap was on the right of the letter (a standard ‘C’), or the left arrow key when the gap was on the left of the letter (a mirrored ‘C’). Participants had 1000 ms from target onset to make their response, while remaining fixated. They were instructed to give a response as quickly and accurately as possible. Participants heard a low beep tone (20 ms, 250 Hz sine wave) whenever they gave an incorrect response or failed to respond in time.
Trials were self-paced, so participants could begin the next trial whenever they wanted by fixating and pressing the spacebar. The target was presented 30 times at each of the four locations, with 15 of these presentations requiring either a left or right arrow response, leading to 120 trials in total in each of the baseline and test sessions.

Fig. 2

Baseline and test phase trial sequence in Experiment 1.

Data analysis

Data were analyzed using custom-made Python (version 2.7) scripts, Data Viewer (version 2.3.1; SR Research) and SPSS software (version 23.0; Chicago, IL). For the training phase, saccade latencies of the first eye movement of each trial were analyzed. An eye movement was considered a saccade when eye velocity exceeded 35°/s or eye acceleration exceeded 9500°/s², and the saccade was deemed complete when the velocity/acceleration fell below that threshold. Saccade latency was defined as the time interval between the display stimuli turning red (leaving one target grey circle) and the initiation of the saccade response. Data were analyzed for correct first saccades, containing no blinks, which started from within 3° of fixation and landed within 3° of the target location. Saccade latencies less than 80 ms (anticipation errors) and greater than 500 ms (timeout) were excluded from further analysis. Latencies greater than 3 standard deviations from the mean were removed from the analysis. Manual response times (RTs) were collected during the test phases. The RTs were calculated as the difference between the time the pre-mask turned into letter stimuli and the time of the button press. Data were analyzed for correct trials with no blinks and no saccades made outside of the fixation region of interest (3° visual angle), with RTs greater than 200 ms (responses <200 ms were considered anticipation errors) and less than 1000 ms (timeout). RTs greater than 3 standard deviations from the mean were removed from the analysis. The validity of the main findings was corroborated in a Bayesian framework using Bayes factors with the BayesFactor package (version 0.9.12-2) in R (R Development Core, 2015). To assess whether there was any effect of duration of the pre-mask on manual RTs in the test phase we carried out a generalized linear mixed effects regression in R using the lme4 package (Bates, Maechler, Bolker, & Walker, 2015).
The analyzed data included all correct trials for each participant in which the target was presented at the high or low reward location. RT, the dependent measure, was modeled as an inverse Gaussian distribution, with reward location and phase (baseline or test) included as fixed effects, pre-mask duration as a mean-centered covariate and subject as a random intercept. Further models were established to include interaction terms, to assess changes across phase and whether pre-mask duration differentially affected RTs to rewarded locations. An ANOVA was carried out on the original main effects model and each new model including interaction terms, for model fit comparison. Akaike information criterion (AIC) and Bayesian information criterion (BIC) for each model were also compared to see which one best explained the data (lower AIC and BIC values indicate a better-fitting model).
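The saccade latency exclusions described in this section can be sketched as a two-step filter. This is a minimal illustration rather than the authors' script: it assumes that the 3 SD trimming is applied per participant, to the latencies that survive the 80–500 ms window, and in both directions around the mean; none of these details is specified in the text.

```python
from statistics import mean, stdev

MIN_LATENCY_MS = 80    # faster responses are treated as anticipation errors
MAX_LATENCY_MS = 500   # slower responses are timeouts
SD_CUTOFF = 3.0


def filter_latencies(latencies_ms):
    """Apply the fixed window first, then trim values beyond 3 SD of the remaining mean."""
    in_window = [x for x in latencies_ms if MIN_LATENCY_MS <= x <= MAX_LATENCY_MS]
    if len(in_window) < 2:
        return in_window
    m, sd = mean(in_window), stdev(in_window)
    if sd == 0:
        return in_window
    # Assumption: two-sided trimming around the participant's own mean.
    return [x for x in in_window if abs(x - m) <= SD_CUTOFF * sd]
```

The same structure would apply to the manual RTs of the test phase, with the window set to 200–1000 ms.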

Results

Exclusions in overt task

For the 24 participants who took part in the training session, an average of 0.86% (SD 1.07%) of each participant’s trials were discarded due to saccade onset latencies of less than 80 ms, and 3.97% (SD 4.12%) were discarded for saccade onset latencies greater than the timeout of 500 ms (participants were given feedback indicating the error for these trials). A further 0.32% (SD 0.67%) of trials were removed due to blinking either immediately before or during the first saccade. Of the remaining trials, 80.07% (SD 12.28%) of first saccades landed within 3 degrees of the target, and 19.30% landed outside of this region. Only those first saccades landing within the target region were used for further analysis.

Reward learning in overt task

Mean saccadic onset latencies were calculated per participant per block (one baseline block and three reward blocks). Latencies in the reward blocks were obtained according to whether the target was at the high reward, low reward or no reward (control) spatial locations for that participant (see Fig. 1b for examples of participant-specific spatial-reward contingencies). Saccade latencies across reward blocks and locations are presented in Fig. 3. Saccade latency improvements at each location were calculated per participant by subtracting the mean latency at that location in each reward block from the mean latency of all trials in the first baseline block in which no reward was given. Data were pooled across the two control locations. A repeated-measures ANOVA was carried out on these improvements across blocks, with reward location (high, low, or no reward) and reward block (1–3) as factors. This revealed a main effect of reward location (F(2,46) = 3.78, p = .03, η² = 0.141) and a reward location * block interaction (F(4,92) = 3.17, p = .017, η² = 0.121). Probing this interaction, pairwise comparisons (Bonferroni corrected for multiple comparisons) were made between each pair of high, low and no reward locations in each block. This revealed a significant difference between the high and no reward locations in the final block (t(23) = 3.32, p = .009), with a 15 ms reduction in saccade latency to the high relative to no reward locations (22 ms vs 6 ms for high versus no reward locations, respectively). There was also a reliable difference in improvement between the low and no reward locations in the final reward block (t(23) = 3.46, p = .006), with an 18 ms reduction in saccade latency to the low relative to no reward location in the final block (24 ms vs 6 ms for low versus no reward locations respectively). No differences were found between the high and low reward locations in any block (all p > .1).
Participants were presented with the question ‘how much did you receive in the last trial’ on average 18 (SD 3) times across the three reward blocks, to ensure they paid attention to the feedback given. This was not a constant number across participants as it depended on participants responding correctly to trials, while ensuring as far as possible that the question was presented a similar number of times for the four locations and was dispersed evenly across each block. Participants responded correctly to 91% (SD 12%) of these questions.
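The improvement scores entered into the ANOVA above (baseline mean minus the mean latency at each location in a reward block) can be sketched as follows. The dictionary keys and function name are hypothetical, and the sketch assumes that the two control locations are pooled at the trial level; averaging the two location means instead would be an equally plausible reading of the text.

```python
from statistics import mean


def latency_improvements(baseline_ms, reward_block_ms):
    """Baseline mean minus the mean latency at each reward condition.

    baseline_ms: all valid latencies from the no-reward baseline block.
    reward_block_ms: dict mapping 'high', 'low', 'control_a', 'control_b'
                     to the valid latencies of one reward block.
    Positive values mean faster saccades than at baseline.
    """
    base = mean(baseline_ms)
    # Assumption: control trials are pooled before averaging.
    control = reward_block_ms["control_a"] + reward_block_ms["control_b"]
    return {
        "high": base - mean(reward_block_ms["high"]),
        "low": base - mean(reward_block_ms["low"]),
        "none": base - mean(control),
    }
```

One such dictionary per participant and reward block gives the 3 (location) × 3 (block) design analyzed here.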

Fig. 3

Experiment 1 training phase: saccade latency per block for each of the rewarded locations. Error bars are ± SE normalized for within-subject design (Cousineau, 2005, Loftus and Masson, 1994).
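The within-subject error bars can be illustrated with a minimal sketch of the normalization described by Cousineau (2005): each participant's mean is subtracted from their scores and the grand mean is added back before the standard error is computed per condition. Whether the authors applied any further correction is not stated, so this is one plausible reading, and the function name is illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def within_subject_se(data):
    """Standard error per condition after removing between-participant offsets.

    data: list of per-participant lists, one value per condition (same order).
    """
    grand_mean = mean(v for row in data for v in row)
    normalized = [[v - mean(row) + grand_mean for v in row] for row in data]
    n = len(data)
    n_cond = len(data[0])
    return [stdev([row[c] for row in normalized]) / sqrt(n) for c in range(n_cond)]
```

When participants differ only by a constant offset, the normalized scores are identical across participants and the resulting error bars shrink to zero, which is the intended property for a within-subject design.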

Exclusions in covert task

The test phase was designed to be relatively simple, thereby promoting good performance so that RTs of a sufficient number of correct trials could be analyzed across locations. Accuracies in the pre-training baseline and post-training test phase reflected this, with correct responses to 93.16% (SD 9.52%) and 96.84% (SD 3.14%) of trials in each phase respectively. There were no manual RTs occurring quicker than 200 ms in either the baseline or test phase. 5.07% (SD 8.66%) and 1.25% (SD 2.87%) of all trials from the baseline and test phases respectively were removed due to RTs greater than the timeout of 1000 ms. 8.33% (SD 9.02%) and 3.44% (SD 4.54%) of trials were removed due to blinks during the first fixation of a trial. 1.84% (SD 2.14%) and 1.39% (SD 2.49%) of trials were discarded due to any saccades made outside of the fixation zone (>3° of visual angle). Exclusions are reported as a percentage of all trials; however, the reasons for removing a given trial may overlap, e.g. one trial may have been both incorrect and contained blinks. After these exclusions, trials with RTs greater than 3 SD were removed, leaving 88.19% (SD 10.29%) and 93.26% (SD 5.00%) of all trials from each phase available for analysis.

Improvements in covert task

Reaction times in the covert task at each location before and after training are presented in Fig. 4a, with the corresponding RT improvements at each location in Fig. 4b. As in the training phase, RTs for the no reward locations were pooled in each of the baseline and test phases, and baseline to test phase improvement was calculated from these averages. A repeated-measures ANOVA was carried out on the change in RT from the baseline to test phase with reward location (high, low, or no reward) as a factor. We found a significant main effect of reward location (F(2,46) = 5.439, p = .008, η2 = 0.191). Bonferroni-corrected pairwise comparisons on the gains at each of the high, low and no reward locations revealed a significant difference in RT gain between the high and low reward locations, with a relative slowing at the high reward location (t(23) = 3.03, p = .018; 15.87 ms vs. 36.20 ms for improvement at the high and low reward location respectively). There was a smaller and non-significant difference between the high and no reward locations, with a relative slowing at the high location (15.87 ms vs. 27.19 ms for high and no reward locations respectively; t(23) = 1.72, p >.1). The difference between low and no reward locations was also not significant (t(23) = 1.76, p >.1). To ensure the feasibility of collapsing RTs across the two no reward locations, one of the no reward locations was taken as that location in the same left/right visual hemifield as high reward, and the other as the location in the same hemifield as low reward. These were different locations depending on the subject-specific spatial-reward contingencies (Fig. 1b). Improvements in RTs to these two no reward locations were very similar, with no differences in gain between locations (27.23 ms vs. 27.14 ms for location in the high reward hemifield vs. location in low reward hemifield respectively; t(23) = 0.015, p = .988).
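The pairwise comparisons reported above rest on a paired-samples t statistic computed over per-participant RT gains, followed by a Bonferroni adjustment. The sketch below shows both steps using only the standard library; it assumes three comparisons per family (high vs. low, high vs. no, low vs. no) and leaves the conversion from t to an uncorrected p value to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for matched lists x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def bonferroni(p_uncorrected, n_comparisons=3):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_uncorrected * n_comparisons)
```

With 24 participants, `paired_t` returns 23 degrees of freedom, matching the t(23) values reported here.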

Fig. 4

Experiment 1 test phase. (a) Manual RTs for each of the rewarded locations, before and after the training phase, and (b) improvements in manual RTs for each of the rewarded locations. Error bars are ± SE normalized for within-subject design.

An additional analysis using Bayes factors was carried out on the test phase data, to confirm the validity of the main finding. A Bayesian paired-samples t-test was conducted on the difference in RT gain (from baseline to test phase) between the high and low reward locations. This was implemented using a standard Cauchy distribution with a scale of as the prior for the alternative hypothesis. A Bayes factor of 7.52 was found in favor of the alternative compared to the null hypothesis, indicating that the data were over seven times more likely under the hypothesis that RT gains differed between the high and low reward locations than under the null hypothesis.

Improvements for relative reward locations

Due to the counterbalanced design of this experiment it was possible to split the test phase data in half, with 12 participants in one group who received the high and low reward at locations spaced 180 polar degrees (°) apart (in opposite visual hemifields along both the vertical and horizontal meridian), and the remaining 12 participants in another group with the high and low reward locations spaced 90° apart (in opposite visual hemifields along the vertical axis only). In each subset, the no reward locations were thus perfectly matched with respect to the reward locations, e.g. in the 180° condition, the two no reward locations were also 180° apart. We conducted a repeated-measures ANOVA on the change in manual RTs from baseline to test phase in the covert attention task with reward location (high, low, or no) as a within-subject factor, and angle between high and low reward (90° or 180°) as a between-subject factor (see Fig. 5). This revealed a reward location * group interaction (F(2,44) = 4.86, p = .012, η2 = 0.181). Bonferroni-corrected pairwise comparisons on the difference in RT improvements between the high and low reward location across the two groups revealed significantly more slowing at the high compared to low reward location for the group that viewed those locations 180° apart during training (t(22) = 3.50, p = .006; with a high versus low difference of 3.23 ms and 37.44 ms for the 90° and 180° group, respectively). This high vs. low reward RT difference in the 180° group was very robust (t(11) = 4.56, p <.001). There was also a significant high vs. no reward location difference in the 180° group (t(11) = 2.79, p = .032).

Fig. 5

Experiment 1 RT improvements at each location for participants with high and low reward locations separated by 90° or 180°. Error bars are ± SE normalized for within-subject design.
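The figure captions describe error bars as SE normalized for a within-subject design without naming the method. The sketch below assumes a Cousineau-style normalization, in which each participant's own mean is removed and the grand mean is added back before the SE is computed per condition. The function name and data layout are hypothetical.

```python
import math
import statistics

def within_subject_se(data):
    """data[p][c] holds the value for participant p in condition c.

    Assumed method: subtract each participant's mean, add the grand mean,
    then compute the standard error per condition on the normalized values.
    """
    n = len(data)
    grand_mean = statistics.mean(x for row in data for x in row)
    normalized = [
        [x - statistics.mean(row) + grand_mean for x in row]
        for row in data
    ]
    return [statistics.stdev(col) / math.sqrt(n) for col in zip(*normalized)]
```

Between-participant differences in overall speed drop out under this normalization, so the bars reflect only the variability relevant to comparisons across locations.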

We again used Bayes factors to corroborate these findings. A Bayes factor analysis of the RT differences between the high and low reward location of the 180° group showed very strong evidence in favor of substantial slowing at the high relative to low reward location, with this alternative hypothesis 120 times more likely to explain the data compared to the null hypothesis (BF = 120). Analysis of the difference between the high and no reward locations in the 180° group also provided evidence in support of a relative slowing at the high reward location (BF = 7.99).

Effect of pre-mask duration

Given the design of the test phase, it was possible that the presentation of the pre-mask display before the target screen influenced attentional allocation to the different locations prior to target onset. For example, if participants attended preferentially to the high reward location during pre-mask presentation, longer pre-mask durations may have led to inhibition of return to that location, resulting in longer manual RTs if the target were subsequently presented there. The pre-mask screen was presented for a random jittered interval on each trial. In order to test its influence on RTs, we ran a generalized linear mixed effects regression model on trials in which the target was presented at either the high or low reward location. This was carried out on data from both the baseline and test phase, with a fixed-effect variable encoding the task phase (see Methods for full description of analysis). In addition to phase and reward location as fixed main effects variables, and pre-mask duration as a covariate, the inclusion of a phase * reward location interaction term resulted in a better model fit (improved model fit over main effects model: χ2(1) = 10.60, p = .001, with lower AIC and BIC). This analysis revealed a main effect of pre-mask duration, with longer durations resulting in longer RTs (t = 2.69, p = .007). It also confirmed the previous critical finding, namely a reward location * phase interaction (t = 3.26, p = .001), and the strong main effect of phase on RT reduction (t = 8.00, p <.001). Including a pre-mask duration * reward location interaction term did not improve model fit compared to the main effects only model (χ2(1) = 0.14, p = .712, with higher AIC and BIC). Overall, we have shown that the pre-mask duration influenced manual RTs during the covert attention task, but did not interact with the observed effect of reward location on RTs to targets presented at the high and low reward locations.
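The nested model comparisons above (a χ2 statistic on one degree of freedom, together with AIC and BIC) can be illustrated with a short sketch. It assumes a likelihood ratio test between two nested models fitted by maximum likelihood. The log-likelihood values, parameter counts, and number of observations would come from the fitted models and are hypothetical inputs here.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def likelihood_ratio_test(loglik_reduced, loglik_full):
    """LRT for nested models that differ by a single parameter."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return stat, chi2_sf_1df(stat)

def aic(loglik, n_params):
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2.0 * loglik

# Reported statistic for adding the phase * reward location interaction
print(round(chi2_sf_1df(10.60), 3))  # 0.001
```

Applied to the reported χ2(1) = 0.14 for the pre-mask duration * reward location term, the same function gives a p-value of about .71, in line with the reported .712 given rounding of the statistic.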

Discussion

This experiment was designed to determine whether reward-based biases of attention to locations in space would persist across different tasks involving overt or covert attention. During an eye movement training phase, observers learned which locations were associated with obtaining reward. Saccade latencies became faster to rewarded than to non-rewarded locations across this phase. The outcome of the covert test phase was, however, the opposite to what was expected: manual RTs to targets presented at the high reward location were slower than to all other locations. If acquired biases of spatial attention are persistent and generalize across stimuli and task contexts as argued by Chelazzi et al. (2014), we should have found benefits for directing attention to the high reward location relative to other locations. Specifically, Chelazzi et al. actually found no differences in RT to high and low reward locations during learning, but still found spatial priority for the high reward location in the test phase when there was greater competition for attention between locations. Our finding implies that reward based training involving eye movements does not result in the shaping of a “general” spatial priority map, at least not a map that persists across tasks and motor outputs. Since participants had to keep fixated throughout the test phase, we interpret the relative slowing at the high reward location as the result of greater suppression of the learned, reactive eye movement response. Increased, biased competition between rewarded locations in the test phase was likely, since the target was deciphered by covert visual search of equally salient stimuli, compared to the exogenous, bottom-up capture of the salient target in the training phase. Furthermore, distractors were the same as each other during training but were different from each other in the test phase, and were as similar to the target as they were to each other in the test phase. 
According to hypotheses on target-distractor dissimilarity and distractor-distractor similarity (Duncan and Humphreys, 1989, Treisman, 1991), this implies that responding to the target during the test phase was more difficult than during training, and was therefore more susceptible to biased competitive effects. An increased motivational link between the high reward location and the saccade response during learning could have manifested in greater inhibition of the reflexive response and increased RTs to that location during covert attention. Such inhibition can arise from greater competition in neurons tuned to specific spatial locations in brain regions associated with fixational eye movements and the generation of saccades – the superior colliculus (SC), FEF, and LIP. Evidence in support of this interpretation comes from research on the anti-saccade task. In this task, participants are required to look to the mirror position of a target upon its presentation, i.e. a target presented at 4 degrees directly to the left of fixation requires a saccade to a location 4 degrees to the right. Studies using this task have been carried out in an effort to dissociate stimulus encoding from saccade preparation (Bell et al., 2000, Hallett, 1978, Amador et al., 1998). In a review by Munoz and Everling (2004), the anti-saccade task is described in terms of an incongruent stimulus-response mapping. When the stimulus and response are compatible, the appearance of the stimulus automatically activates the correct saccade response. However, when they are incongruent the automatic response has to be aborted and the correct response (moving the eyes away) must be prepared. Similar to evidence from manual response tasks on congruency effects, such an abort mechanism takes time and results in longer reaction times for incongruent responses (Hommel & Prinz, 1997). 
In their review, Munoz and Everling (2004) describe the neural mechanisms of two important processes involved in the anti-saccade task. The first and relevant process for the current study is the suppression of the automatic pro-saccade to the target stimulus. This suppression arises from the interplay between neurons in the rostral and caudal SC. These two populations of neurons were initially termed ‘fixation’ and ‘saccade’ neurons respectively, as they displayed reciprocal activity during fixation and eye movements (Munoz et al., 1991, Munoz and Wurtz, 1993a, Munoz and Wurtz, 1993b). More recently, a number of studies have found both rostral and caudal SC to be involved in attending to a target (Cavanaugh & Wurtz, 2004), target selection (Carello & Krauzlis, 2004), and saccade trajectory (Gandhi & Keller, 1999). Emerging findings indicate that fixation is an equilibrium state in which activity distributed across the left and right SC determines gaze direction (Goffart, Hafed, & Krauzlis, 2012), with the SC containing a single map of behaviorally relevant goal locations (Hafed & Krauzlis, 2008). Fixational eye movements or microsaccades can therefore result from transient imbalances between fluctuating target position activity in the SC (Hafed and Krauzlis, 2008, Hafed et al., 2009). Although the SC drives saccade execution, and has been extensively researched in relation to saccade threshold and initiation (Sparks et al., 1976, Pettit et al., 1999, Wurtz and Goldberg, 1972), several other brain areas play important roles in saccade processes. These include the LIP, FEF, supplementary eye field (SEF), dorsolateral prefrontal cortex (DLPFC) and basal ganglia, forming recurrent connections with each other and projecting their outputs to the SC (Glimcher, 2003, Hikosaka et al., 2006, Leigh and Zee, 1999, Schall and Thompson, 1999). 
Indeed, Munoz and Everling (2004) postulated that it is likely that the signals arising in the SC and FEF, during the suppression of the automatic pro-saccade in the anti-saccade task, come from the SEF, DLPFC or the SNr. The SNr is particularly relevant in the current study, as it is modulated by the caudate nucleus (CN) of the basal ganglia, a crucial brain area for reward processing (Hikosaka et al., 2006, Nakahara et al., 2002, Nakamura and Hikosaka, 2006, Yasuda et al., 2012). Our additional analysis on the role of relative reward locations in learning indicates that RT slowing at the high compared to low reward location can be explained mainly by those participants who received high and low reward at mirror-opposite locations (180° apart) during reward learning. As such, the neuronal responses encoding saccades to those two locations are generally assumed to be maximally independent, and potentially demonstrate a bias towards the high reward location on the SC saccade map of behaviorally relevant goal locations. In summary, we propose that the larger value associated with the high reward location increased activity in the SC neurons encoding its location, thereby making successful inhibition of the learned, reflexive saccades to that location more effortful and time-consuming. The manual response to the target was subsequently delayed relative to responses to other locations associated with lower value.

Experiment 2

In Experiment 1, participants were instructed to stay fixated throughout the test phase. The experiment was specifically designed in this way to ensure covert attention to the target stimulus was fully spatial-based relative to fixation. In Experiment 2, we wanted to gather evidence regarding our subsequent interpretation that it was this specific imposition that led to the relative slowing at the high reward location. We therefore carried out a very similar experiment, but removed the requirement to maintain fixation in the baseline and test phase. Participants fixated the central cross only to begin a trial and did not hear a beep (which indicated an incorrect response in Experiment 1) if they moved their eyes. Experiment 2 was therefore designed to address two questions: 1) whether the relative slowing at the high reward location in Experiment 1 was due to suppression of reactive saccadic eye movements and 2) whether overt spatial-based reward learning leads to similar improvements when covertly attending these locations, when the eyes are free to move. Twenty-four new participants (15 female; mean 26 ± 6 years old) with normal or corrected-to-normal vision gave written informed consent to take part in the study. All experimental procedures were the same as in Experiment 1. The experimental setup was almost identical to that used in Experiment 1 (Fig. 1, Fig. 2), except for the presentation order of stimulus screens in the baseline and test phase. In this experiment, the target stimulus screen was presented first for 200 ms, before the pre-mask screen appeared. The pre-mask thus became a standard mask in this experiment and stayed on the screen until the end of the trial. 
The order of the stimulus screens was reversed to minimize eye movements before the target screen appeared; if the original order remained, the onset of the pre-mask might have encouraged eye movements towards the stimuli, meaning that covert attention at the moment of target onset might not necessarily have been considered spatial-based in relation to the current gaze direction. The location-reward associations were counterbalanced across participants as in Experiment 1. Data analysis was carried out in the same way as in Experiment 1. An additional analysis was carried out here on eye movement behavior during the test phase. Participants were not given explicit instructions regarding eye movements in this experiment, aside from needing to fixate to start a trial, so we observed a range of saccade and microsaccade behavior and strategy across participants. Based on extensive research in recent years on the role of microsaccades in visual attention (Goffart et al., 2012, Martinez-Conde et al., 2013, Otero-Millan et al., 2008), we included saccades of any size in this analysis. To make the most use of saccades and statistical power with this analysis, and to also get an appropriate estimate of overt attentional priority, we analyzed saccades according to the following criteria: 1) any saccade regardless of its size made after display onset and before the manual response, and 2) the saccade had to land with an endpoint in one of the quadrants of the screen associated with the location being analyzed (high, low, or no reward). These saccades were assessed regardless of where the target was actually presented; given that there were equal numbers of targets at each location across the task, any eye movements towards one particular location should reflect attentional priority to that position. 
For the 24 participants who took part in the training session, an average of 0.53% (SD 0.70%) of each participant’s trials were discarded due to saccade onset latencies of less than 80 ms, and 3.53% (SD 2.73%) were discarded because of saccade onset latencies greater than the timeout of 500 ms (participants were given feedback indicating the error for these trials). A further 0.79% (SD 1.60%) of trials were removed due to blinking either immediately before or during the first saccade. Of the remaining trials, 82.37% (SD 9.67%) of first saccades landed within 3 degrees of the target, and 17.63% landed outside of this region. Only the first saccade of each trial landing within the target region was used for further analysis. As in Experiment 1, saccade latency improvements at each location were calculated by subtracting the mean latency at that location in each reward block from the mean latency of all trials in the first block in which no reward was given (absolute saccade latencies for reward blocks can be seen in Fig. 6). A repeated-measures ANOVA with reward location (high, low, or no reward) and reward block (1–3) as factors revealed a main effect of reward location (F(2,46) = 4.24, p = .02, η2 = 0.156). There was no interaction between reward and block (p >.1), demonstrating a consistent effect of reward from early in the reward blocks. Pairwise comparisons (Bonferroni corrected) were made between each pair of high, low and no reward locations. This showed a consistent difference between the high and no reward locations across blocks (t(23) = 2.80, p = .03; with a high vs. no reward improvement in saccade latency of 5.95 ms vs. −3.25 ms respectively). There were no significant differences between the high and low reward locations or between the low and no reward locations (all p >.1). 
Participants were presented with the question ‘how much did you receive in the last trial’ on average 18 (SD 2) times across the three reward blocks, to ensure they paid attention to the feedback given. Participants responded correctly to 89% (SD 15%) of these questions.
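The training-phase preprocessing and the latency improvement measure described above can be sketched as follows. The thresholds (80 ms, 500 ms, 3 degrees) come from the text; the trial dictionary layout, the field names, the use of block 0 for the initial no-reward block, and the inclusive boundaries are assumptions made for illustration.

```python
from statistics import mean

MIN_LATENCY_MS = 80     # faster saccades were discarded
MAX_LATENCY_MS = 500    # timeout for saccade onset
TARGET_RADIUS_DEG = 3   # first saccade had to land this close to the target

def is_valid(trial):
    """Keep blink-free trials whose first saccade was timely and on target."""
    return (not trial["blink"]
            and MIN_LATENCY_MS <= trial["latency_ms"] <= MAX_LATENCY_MS
            and trial["landing_error_deg"] <= TARGET_RADIUS_DEG)

def latency_improvements(trials):
    """Return {(block, location): baseline mean - cell mean} in ms.

    Block 0 is assumed to be the first block, in which no reward was given.
    Positive values mean saccades became faster than in that block.
    """
    kept = [t for t in trials if is_valid(t)]
    baseline = mean(t["latency_ms"] for t in kept if t["block"] == 0)
    cells = {}
    for t in kept:
        if t["block"] > 0:
            cells.setdefault((t["block"], t["location"]), []).append(t["latency_ms"])
    return {cell: baseline - mean(values) for cell, values in cells.items()}
```

Under this sign convention, the reported 5.95 ms at the high reward location is a speeding relative to the first block, and −3.25 ms at the no reward locations is a slight slowing.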

Fig. 6

Experiment 2 training phase: saccade latency per block for each of the rewarded locations. Error bars are ± SE normalized for within-subject design.

Accuracies in the pre-training baseline and post-training test phases were 82.85% (SD 14.84%) and 86.04% (SD 12.36%) respectively. No trials were discarded due to manual RTs quicker than 200 ms in either the baseline or test phase. 3.13% (SD 3.51%) and 1.08% (SD 1.83%) of all trials were removed due to RTs greater than the timeout of 1000 ms. 5.17% (SD 9.04%) and 4.72% (SD 6.70%) of trials were discarded due to blinks during the first fixation of a trial. Finally, 17.15% (SD 14.84%) and 13.96% (SD 12.36%) of trials were removed for an incorrect button response. After these exclusions, trials with RTs greater than 3 SD were removed, leaving 78.82% (SD 14.44%) and 81.49% (SD 11.94%) of all trials from each phase available for full analysis. RT improvement from baseline to test phase at each location is shown in Fig. 7. A repeated-measures ANOVA was carried out with reward location (high, low, or no reward) as a factor. We found no main effect of reward location (F(2,46) = 0.05, p = .951), with very similar improvements for all locations (26 ms, 28 ms and 25 ms improvements for high, low and no reward locations respectively). To ensure similarity in responses to no reward, one of the no reward locations was again assigned as the location in the same hemifield as high reward, and the other as the location in the same hemifield as low reward. There were again no differences in RT gain between the two no reward locations (26 ms vs. 25 ms for no reward location in the high vs. low reward hemifield, respectively).
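The sequence of manual RT exclusions described above can be sketched as a small filter. The 200 ms and 1000 ms limits come from the text. The reference for the 3 SD criterion is not specified, so the code assumes, purely for illustration, a cutoff relative to the mean of one participant's remaining trials in one phase, applied in both directions. The field names are hypothetical.

```python
from statistics import mean, stdev

def clean_manual_rts(trials, fast_ms=200, timeout_ms=1000, sd_cutoff=3.0):
    """Apply the exclusions in order and return the trials kept for analysis."""
    # Step 1: drop anticipations, timeouts, blink trials, and incorrect responses.
    kept = [t for t in trials
            if fast_ms <= t["rt_ms"] <= timeout_ms
            and not t["blink"]
            and t["correct"]]
    if len(kept) < 2:
        return kept
    # Step 2 (assumed rule): drop RTs more than sd_cutoff SDs from the mean.
    rts = [t["rt_ms"] for t in kept]
    m, s = mean(rts), stdev(rts)
    return [t for t in kept if abs(t["rt_ms"] - m) <= sd_cutoff * s]
```

Running the filter separately on each participant's baseline and test data would yield per-phase retention percentages of the kind reported above.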

Fig. 7

Experiment 2 baseline/test phase results. Manual RTs for each location, before and after the training phase. Error bars are ± SE normalized for within-subject design.

As in Experiment 1, here we also used Bayes factors to confirm the test phase results. This analysis is especially useful in this case, as Bayes factors can provide evidence not only for an alternative hypothesis, but also for the null hypothesis. A Bayesian paired-samples t-test was conducted on the difference in RT gain (from baseline to test phase) between the high and low reward locations. Comparing the alternative hypothesis (that these two variables were different) to the null hypothesis, we obtained a BF of 0.22. Expressed relative to the null, this corresponds to some evidence in favor of the null hypothesis (BF = 4.57), with the data over four times more likely under the hypothesis that RT gains at the high and low reward locations did not differ. Finally, since we found a large difference between manual RTs to the high and low reward location specifically in the 180° group in Experiment 1, we also analyzed the RTs separated by group in Experiment 2. A repeated-measures ANOVA was performed on RT improvement with reward location (high, low, or no) as a within-subject factor, and angle between high and low reward (90° or 180°) as a between-subject factor. In contrast to Experiment 1, this did not show any interaction between reward and group (F(2,44) = 0.05, p = .95), indicating that relative improvements at the high, low and no reward locations did not differ between groups.

Saccades in covert task

Since participants were not given explicit instructions regarding eye movements during the baseline and test phase of Experiment 2, aside from the need to fixate to begin each trial, it was possible to analyze changes from the baseline to test phase in the number of eye movements made to specific locations. Participants differed greatly in strategy, with some making very few saccades across the whole phase while others executed many eye movements (an across-subject average of 0.46 saccades per trial were made in the baseline phase (SD 0.40) and 0.43 saccades per trial in the test phase (SD 0.52)). Saccades also varied in size (from micro to regular saccades), occurred several or few times per trial, and could be executed at any point in the trial. Saccades for a given trial were analyzed according to the criteria outlined in the Data Analysis Section 3.1.3. Across participants, there were on average 28 saccades (SD 28) made between stimulus onset and manual response across all trials in the baseline phase, and 24 saccades (SD 35) made in the test phase. Two participants made no saccades during stimulus presentation in the baseline phase, with all participants making saccades in the test phase. For a given participant, the number of saccades towards each location in each phase was calculated as a percentage of the total number of saccades the participant made in the given timeframe in that phase. To assess whether saccades to the two no reward locations could be collapsed, we first carried out a two-tailed paired samples t-test as in previous analyses (comparing the no reward location in the same hemifield as high reward to the no reward location in the same hemifield as low reward). In contrast to our previous findings, we found a significant difference in saccades made towards those two locations, with more saccades made towards the no reward location in the low reward hemifield (−12.91% vs. 
5.08% change in saccades towards no reward in high reward hemifield compared to no reward in low reward hemifield; t(23) = 2.93, p = .008). Since these could not be collapsed to give average behavior across both locations, we carried out two separate repeated-measures ANOVAs with high, low and each of the no reward locations as a within-subject factor and group (90° or 180°) as a between-subject factor. Changes in saccade percentage from baseline to test phase per location can be seen in Fig. 8. The first ANOVA used the no reward location in the same hemifield as high reward; this showed no effect of group or group interaction (both p >.1), but did reveal a significant main effect of reward location (F(2,44) = 4.47, p = .017, η2 = 0.169), with a significant linear decrease in saccade percentage from high to no reward value (10.55%, 5.70%, and −12.91% for high, low and no reward locations respectively; F(1,22) = 6.77, p = .016, η2 = 0.235 for linear contrast). Pairwise comparisons (Bonferroni corrected) showed a significant difference between the high and no reward locations (t(23) = 2.66, p = .049) and a marginal difference between low and no reward (t(23) = 2.66, p = .065), with no significant difference between the high and low reward locations (p >.1). A similar analysis using the no reward location in the same hemifield as low reward showed no main effects or interactions of reward or group (all p >.1).
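The saccade measure used in this analysis, the share of a participant's saccades directed at each location and its change from baseline to test, can be sketched as below. The location labels and function names are hypothetical. Treating a phase with no saccades as 0% at every location is an assumption, since the text does not say how the two participants without baseline saccades were handled.

```python
def saccade_percentages(counts):
    """Convert per-location saccade counts for one phase into percentages."""
    total = sum(counts.values())
    if total == 0:
        # Assumed handling: no saccades in this phase counts as 0% everywhere.
        return {location: 0.0 for location in counts}
    return {location: 100.0 * n / total for location, n in counts.items()}

def percentage_change(baseline_counts, test_counts):
    """Change in saccade percentage per location, test minus baseline."""
    base = saccade_percentages(baseline_counts)
    test = saccade_percentages(test_counts)
    return {location: test[location] - base[location] for location in base}
```

For example, a participant with 5 saccades to each of the four locations at baseline and 8, 6, 2 and 4 at test would show changes of +15, +5, −15 and −5 percentage points.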

Fig. 8

Experiment 2 baseline/test phase results. Change in percentage of saccades from baseline to test phase for each location. “No(high)” is the no reward location residing in the same hemifield as high reward, and “No(low)” is the no reward location in the same hemifield as low reward. Error bars are ± SE normalized for within-subject design.

The second experiment was established to provide an explanation for the outcome of Experiment 1; did the requirement to remain fixated and suppress saccades lead to the relative slowing seen at the high reward location? In Experiment 2, participants were free to move their eyes when the target was presented, and we observed that this no longer led to slowing at the high reward location, with participants showing similar RT gains for all locations after training. In Experiment 1, we found the relative slowing at the high reward location during the test phase was especially robust in the subset of participants who were trained with the high and low reward locations separated by 180°. In Experiment 2, a different sample of participants underwent exactly the same reward location training, but there we observed no RT differences to any location in the adapted test phase, regardless of the relative separation between high and low reward during training. Although this experiment was not established to specifically examine eye movements and microsaccades during the test phase, it was still possible to analyze saccade behavior of participants. In recent years, research on fixational eye movements has been providing evidence for the role of microsaccades in visual attention, indicating a microsaccade-saccade continuum rather than two discrete neuronal populations for fixation and saccade generation (Otero-Millan et al., 2008, Hafed et al., 2009, Krauzlis et al., 2017). Our results demonstrate the greatest increase in percentage of saccades (regardless of size) towards the high reward location. 
Specifically, when assessing locations with no reward in the same hemifield as high reward, we found a linear decrease for locations with decreasing reward value. When no and low reward were presented in the same hemifield, we observed similar eye movement behavior for both low and no reward locations (Fig. 8). This finding provides strong evidence for reward-driven within-hemifield competition. Previous research has shown slower manual response times to a target when presented in the same hemifield as a distractor compared to the contralateral hemifield (Hickey & Theeuwes, 2011), and a “global effect” in which the eyes land towards the center of gravity of two stimuli the closer they are placed together within a hemifield (Coren and Hoenig, 1972, Findlay, 1982). The greater increase in saccades towards the high reward location in the current study suggests its behavioral salience takes precedence and overt attentional priority is allocated there over the no reward location. Although we did not observe a significant difference in eye movements between groups or specifically between the high and low reward location in Experiment 2, the pattern of results, with the greatest increase towards the high, then the low, then the no reward location, indicates a prioritization or bias in saccade activity towards locations of higher reward value. The difference between the high reward and no reward locations in the same hemifield is further evidence of a high reward prioritization, as this within-hemifield reward-driven competition does not occur between low and no reward locations presented within the same hemifield. Taken together, and in accordance with the previously discussed literature on the anti-saccade task, we propose that the relative slowing at the high reward location in Experiment 1 was due to the requirement for greater suppression of the learned saccade to that location. 
Our original question, whether improvements in saccade latency to differentially rewarded locations led to similar improvements in covert attention to those locations, was also answered here: since differences in saccade latency were seen between high and no reward locations during training, but no differences were observed in the covert attention task, this leads us to conclude that the learning of reward through eye movements did not transfer to covert attention in which a manual rather than saccade motor response was required.

General discussion

The current study investigated the extent to which selection history biases in the form of reward learning transfer across tasks, context and motor outputs. Specifically, we examined whether reward-based biases of overt attention to locations in space would transfer to another task and attentional domain. The underlying concept is that through reward-based learning, spatial priority maps can be shaped such that specific locations on the priority map become behaviorally more salient than other locations (Chelazzi et al., 2014, Hickey et al., 2014). According to Chelazzi et al. (2014) acquired biases of spatial attention are persistent, nonstrategic in nature, and should generalize across stimuli and task contexts. Previous studies demonstrating the transfer of acquired biases of spatial attention have generally used attentional tasks involving manual responses during both training and test phases (Chelazzi et al., 2014, Hickey et al., 2014). Because covert and overt attention are assumed to be linked by a common neural architecture (Beauchamp et al., 2001, Corbetta, 1998) we wished to determine whether the priority map shaped by eye movements would transfer to a different task that required only covert attention. Our results show that the eye movement task was successful in shaping the priority map as saccades to rewarded locations were executed faster than to non-rewarded locations. This implies that particular locations within the priority map became behaviorally more salient. However, following the training task, observers were not faster to execute a manual response to a target presented at the behaviorally most salient location, and were in fact slower. A second experiment in which participants were not obliged to stay fixated during the covert attention task did not reveal any benefits in manual response times to the more behaviorally salient location. 
On the basis of these findings, we conclude that reward-based training involving eye movements does not result in the shaping of a “general” spatial priority map which affects covert deployment of attention. In both experiments of our study, the training of overt eye movements led to a decrease in saccade latency to all locations as the training progressed; a general learning effect seen in most experimental settings (Anderson et al., 2011a, Anderson et al., 2011b, Le Pelley et al., 2015). Throughout training, saccades to high and low reward locations became significantly faster than those to no reward locations. In addition to reward availability, another factor that may have played a role in this general improvement to rewarded locations was information sampling. Eye movements to rewarded locations were always followed by information in the form of a reward value (10 cents or 1 cent), with no information to be obtained at the no reward locations. The opportunity to gain information has been shown to drive and influence saccade behavior, particularly in novel settings (Gottlieb, 2014, Gottlieb et al., 2014). However, given the reward-modulated difference found in the test phase was actually between the two specific locations that led to informative feedback during training, it is unlikely that the presence of information at those locations explains the observed result of the covert task. The reward-modulated reduction in saccade latency during learning is in agreement with much previous research reporting reduced response times to rewarded objects or locations, for both overt and covert attention tasks (Failing and Theeuwes, 2014, Hickey et al., 2014, Milstein and Dorris, 2007, Theeuwes and Belopolsky, 2012). The past selection history bias described by Awh et al. 
(2012) is implemented here in the form of reinforcement learning, during which humans and animals voluntarily choose (or avoid) actions that have previously been reinforced in a positive (or negative) manner (Dayan and Balleine, 2002, Schultz, 2006, Skinner, 1938, Sutton, 1998). Repetition of a chosen action based on the reward received for that behavior is known to lead to a more accurate and speeded response, becoming automatic over time (Berridge and Robinson, 1998, Della Libera and Chelazzi, 2009, Hickey et al., 2010). This conversion from voluntary actions to automatic or habitual behavior is suggested to occur via distinct, parallel circuits in the brain (Dickinson et al., 1983, Hikosaka et al., 2017) and has been linked to maladaptive behaviors, such as various forms of addiction (Everitt and Robbins, 2005, Everitt and Robbins, 2013). The outcome of the covert attention task is different to that found in other studies demonstrating reduced manual RTs at spatial locations associated with higher reward (Chelazzi et al., 2014, Failing and Theeuwes, 2014, Stankevich and Geng, 2014). However, in all of these other studies the required response was only ever manual, in both the training and test phases. The covert discrimination task in both of our experiments revealed a decrease in manual RTs at all locations after training, again demonstrating a general practice effect. However, in Experiment 1 the covert task additionally showed a relative slowing of responses to the target when it was presented at the high reward location. Specifically, there was a significant increase in RT to the high compared to the low reward location. Both no reward locations showed identical reductions in RT. Since the two no reward locations were always in opposite hemifields to each other, this revealed a location-specific rather than general hemifield disadvantage in responses to the high reward location. 
The difference between the high and low reward location during the test phase shows that, although eye movement latencies followed a similar pattern for these locations during learning, this general reward-driven improvement did not transfer to the covert task. Instead, biased competition between rewarded locations led to a clear separation in how motivational or behavioral biases translated across the two tasks.

The relative slowing at the high reward location during the test phase of Experiment 1 may be explained by an increased motivational association between the high reward location and the eye movement response, resulting in slower manual responses to that location during covert attention, when the learned eye movement response was suppressed. Such a reward-based link between location and eye movements has been demonstrated previously in monkeys, specifying the SNr and CN of the basal ganglia as the brain regions responsible for this association (Ding and Hikosaka, 2006, Hikosaka et al., 2000, Hikosaka et al., 2014, Nakamura and Hikosaka, 2006, Hikosaka et al., 2006, Yamamoto et al., 2012, Yasuda et al., 2012). Since LIP has been proposed as an integrated priority map, and forms recurrent connections with FEF (Hikosaka et al., 2000), these two brain areas are relevant to both the biased competition model (Desimone & Duncan, 1995) and the past selection history framework (Awh, Belopolsky & Theeuwes, 2012). Extensive research on the anti-saccade task describes the suppression of the automatic pro-saccade to the target as the first step in performing an anti-saccade (Amador et al., 1998, Bell et al., 2000, Munoz and Everling, 2004). This saccade-inhibition account provides an explanation for the greater RT slowing observed at the high reward location in Experiment 1.
This was made especially clear via our analysis of those participants who were trained with the high and low reward locations at mirror-opposite locations to each other, representing the case most similar to the anti-saccade task. Our analogy to the anti-saccade task may be limited, since the covert attention task in our study was not designed to examine eye movements in one direction or the other, and instead was intended to constrain this overt behavior. However, given the extensive previous research on the role of fixational eye movements in visual attention, we believe such an analogy is useful and relevant for explaining the mechanisms involved.

The fact that different task contexts were used across the training and test phases of the current study may also explain why we did not observe a transfer from the training to the test phase, as was reported by Chelazzi et al. (2014). During the training phase of our experiment, participants were presented with circle stimuli at specific locations. Stimuli during the test phase were letter shapes presented at these locations, shown in a different color to the training stimuli. In the study of Chelazzi and colleagues, all stimuli were black and white and positioned inside square place-holders that were identical across the training and test phases. Due to this design, it is likely that in their study participants could more easily assign value to the consistent square objects tied to specific spatial locations. Our attempt to make the reward associations entirely spatial-based in nature, with no consistent stimulus features across tasks, may have interfered with such a transfer mechanism.

In conclusion, based on our findings we suggest that spatial-based reward priority is intimately linked to the learned motor response.


