Selective scanpath repetition during memory-guided visual search.

Jordana S Wynn¹, Michael B Bone¹, Michelle C Dragan², Kari L Hoffman³, Bradley R Buchsbaum¹, Jennifer D Ryan¹.

Abstract

Visual search efficiency improves with repetition of a search display, yet the mechanisms behind these processing gains remain unclear. According to Scanpath Theory, memory retrieval is mediated by repetition of the pattern of eye movements or "scanpath" elicited during stimulus encoding. Using this framework, we tested the prediction that scanpath recapitulation reflects relational memory guidance during repeated search events. Younger and older subjects were instructed to find changing targets within flickering naturalistic scenes. Search efficiency (search time, number of fixations, fixation duration) and scanpath similarity (repetition) were compared across age groups for novel (V1) and repeated (V2) search events. Younger adults outperformed older adults on all efficiency measures at both V1 and V2, while the search time benefit for repeated viewing (V1-V2) did not differ by age. Fixation-binned scanpath similarity analyses revealed repetition of initial and final (but not middle) V1 fixations at V2, with older adults repeating more initial V1 fixations than young adults. In young adults only, early scanpath similarity correlated negatively with search time at test, indicating increased efficiency, whereas the similarity of V2 fixations to middle V1 fixations predicted poor search performance. We conclude that scanpath compression mediates increased search efficiency by selectively recapitulating encoding fixations that provide goal-relevant input. Extending Scanpath Theory, results suggest that scanpath repetition varies as a function of time and memory integrity.

Keywords: Eyetracking; relational memory; scanpath; visual search

Year: 2016 PMID: 27570471 PMCID: PMC4975086 DOI: 10.1080/13506285.2016.1175531

Source DB: PubMed Journal: Vis cogn ISSN: 1350-6285

Visual search can often be a frustrating and laborious process. Indeed, games like “Where’s Waldo” and “I Spy” capitalize on the slow and serial nature of feature-based search. Yet, in our everyday lives we search for and locate common objects like keys and gloves with ease. What is it that makes these searches more efficient? Whereas search games rely on our ability to compare a set of presented visual features with a set of provided target features, real-world search capitalizes on information gained from prior experience. Over time, we acquire relational memory representations for the relative positions among items and surrounding contexts that are repeatedly encountered together. During naturalistic viewing, these relational representations, in concert with other endogenous factors such as task instructions (Borji & Itti, 2014; Castelhano, Mack, & Henderson, 2009; Draschkow, Wolfe, & Vo, 2014; Henderson, Shinkareva, Wang, Luke, & Olejarczyk, 2013; Tatler & Tatler, 2013; Yarbus, 1967) and semantic knowledge (De Graef, 1998, 2005; Draschkow et al., 2014; Einhäuser, Spain, & Perona, 2008; Henderson, Brockmole, Castelhano, & Mack, 2007; Henderson, Weeks, & Hollingworth, 1999; Loftus & Mackworth, 1978; Neider & Zelinsky, 2006a; Oliva, Torralba, Castelhano, & Henderson, 2003; Stirk & Underwood, 2007; Torralba, Oliva, Castelhano, & Henderson, 2006) can augment or override salient exogenous features for control of scanning behaviour (for review see Henderson, 2003; Vo & Wolfe, 2015). However, whereas bottom-up control mechanisms have been extensively studied and modelled (Itti & Koch, 2000, 2001; Mackworth & Morandi, 1967; Parkhurst, Law, & Niebur, 2002; Tatler, Baddeley, & Gilchrist, 2005; for review see Tatler, Hayhoe, Land, & Ballard, 2011) the mechanisms underlying memory-guided search efficiency gains have been scarcely investigated. 
Here, we examine the spatial and temporal similarities between sequences of eye movements during a repeated visual search task, with the aim of understanding how relational memory representations elicited by stimulus repetition guide efficient target detection. Research using eye movement monitoring suggests that eye movements are intimately linked to mnemonic processes (Brockmole & Henderson, 2006; Brockmole & Irwin, 2005; Hannula, Baym, Warren, & Cohen, 2012; Hannula, Ryan, Tranel, & Cohen, 2007; Hannula & Ranganath, 2009; Ryan & Cohen, 2004; Ryan, Hannula, & Cohen, 2007; Ryan & Villate, 2009; for review see Hannula, Althoff, Warren, Riggs, Cohen, & Ryan, 2010; Henderson & Hollingworth, 1998). Eye-movement-based repetition effects have been reported across a number of tasks and may provide a mechanism for the increased efficiency with which repeated stimuli are processed and subsequently recalled. Relative to novel stimuli, repeated stimulus displays elicit fewer fixations and regions sampled (Althoff & Cohen, 1999; Ryan, Althoff, Whitlow, & Cohen, 2000), more predictable viewing patterns (Althoff & Cohen, 1999), increased memory accuracy for target locations (Brockmole & Henderson, 2006), and speeded target search (Brockmole & Henderson, 2006; Chau, Murphy, Rosenbaum, Ryan, & Hoffman, 2011; Chun & Jiang, 1998, 2003; Peterson & Kramer, 2001; Tseng & Li, 2004). Faster detection of invariant targets in repeated relative to novel distracter configurations (i.e., contextual cueing) further suggests that the search time benefit conferred by memory for relations cannot be explained by memory for absolute target locations alone (Brockmole & Henderson, 2006; Chun & Jiang, 1998, 2003). Extending the role of relational memory in visual search to real-world scenes, Chau et al. (2011) tracked participants’ eyes while they performed a complex flicker change detection task. 
Both healthy humans and macaques detected targets faster in repeated displays relative to novel displays, and in humans this rapid target detection corresponded to explicit memory for targets. Indeed, remembered targets could be distinguished from forgotten targets based on search times alone. However, consistent with previous findings of relational memory and eye-movement-based memory deficits in individuals with amnesia (Hannula et al., 2007; Hannula et al., 2015; Ryan et al., 2000; for review see Hannula et al., 2010), the amnesic case DA, who has bilateral medial temporal lobe (MTL) damage, showed no explicit memory or search time benefits for repeated displays on the same task (Chau et al., 2011). Taken together, findings of increased search efficiency and reduced eye movement exploration during repeated search events, and the absence of these effects in an individual with amnesia, suggest that information maintained in memory regarding the relative positions among targets and distracters (i.e., relational memory) is used to support target detection during subsequent viewings. Critically however, it remains unclear how these relational representations guide efficient target detection during repeated search events. Some researchers have proposed that binding of display elements into a unitized representation underlies efficiency gains on repeated search trials (Fisk & Rogers, 1991; Schneider & Fisk, 1984; Shiffrin, 1988). On this account, processing of repeated stimuli is thought to be “fast, parallel” and “fairly effortless” (Schneider & Fisk, 1984). Yet, other researchers have suggested that when targets are discriminated on the basis of multiple conjunctive features, search proceeds serially (Treisman & Gelade, 1980). 
In line with this view, Noton and Stark’s (1971a, 1971b) seminal Scanpath Theory proposes that repeated displays are investigated sequentially and in the same manner in which they were initially encoded as “an alternating sequence of sensory and motor memory traces” or “scanpath” representing image features and the associations between them. According to the scanpath model, recapitulation of the novel viewing scanpath during subsequent viewing facilitates comparison of present perceptual input with stored sensory-motor memory representations, supporting memory retrieval. Although largely speculative, Scanpath Theory provides a meaningful framework for thinking about and interpreting the relationship between eye movements and memory. Applied to visual search, Scanpath Theory might suggest that during repeated search events, complementary sensorimotor and visual inputs cue memory for the visual display, which in turn facilitates memory for the target location in relation to the scene. Here, we suggest that scanpath recapitulation might provide a mechanism by which relational memory representations support efficient target detection during repeated search events. In its strictest interpretation, Scanpath Theory predicts that failure to repeat eye movements from image encoding at subsequent retrieval will result in memory errors, while successful memory will be accompanied by serial recapitulation of the encoding scanpath. 
Indeed, several studies have demonstrated scanpath recapitulation during viewing of repeated stimulus displays (Blackmon, Ho, Chernyak, Azzariti, & Stark, 1999; Foulsham & Underwood, 2008; Foulsham et al., 2012; Holm & Mantyla, 2007; Josephson & Holmes, 2002; Noton & Stark, 1971a, 1971b; Underwood, Foulsham, & Humphrey, 2009), search configurations (Choi, Mosley, & Stark, 1995; Henderson et al., 2007; Myers & Gray, 2010; Stark et al., 1992), and imagined stimuli (Brandt & Stark, 1997; Humphrey & Underwood, 2008; Johansson & Johansson, 2014; Johansson, Holsanova, & Holmqvist, 2006; Laeng & Teodorescu, 2002; for review see Ferreira, Apel, & Henderson, 2008). Notably, scanpath recapitulation in these experiments is greater than would be expected based on subject-specific viewing tendencies, chance, or visual saliency. Yet, few studies have assessed the correlation between scanpath repetition and memory, with those few yielding mixed results. Using simple grid stimuli, Laeng and Teodorescu (2002) found that the degree of similarity between the position of eye movements at perception and imagery predicted accuracy on a subsequent spatial memory test. Conversely, a study by Foulsham et al. (2012) found that only similarity in fixation durations predicted memory accuracy on a picture recognition task. In the visual search literature, as in the recognition literature, the relationship between scanpath recapitulation and memory-based efficiency gains remains poorly understood. Findings of fewer fixations (Althoff & Cohen, 1999) and speeded target detection (Brockmole & Henderson, 2006; Chun & Jiang, 1998, 2003) during repeated search events suggest that Scanpath Theory in its original formulation (anticipating serial, feature-by-feature fixation recapitulation) cannot account for search efficiency gains. Nonetheless, scanpath recapitulation has been reported for repeated search events, even when search efficiency is increased. 
Using a simple visual search task, Myers and Gray (2010) found that with repetition, scanpaths both decreased in length (number of fixations) and increased in similarity (within each repetition epoch), collectively termed “adaptive scanning”. Applying a more liberal interpretation of Scanpath Theory, Myers and Gray propose that scans from identical search arrays are more similar than random scans owing to the “packaging” of saccades into an efficient sequence that can be readily repeated when the same array is encountered again. Critically for the purposes of the present study, however, it remains unclear which parts of the scanpath are maintained in the “packaged sequence” as scanpaths decrease in length, and how this repetition relates to mnemonic performance. According to Scanpath Theory, image recognition is achieved by two component processes: “reproducing the successive eye-movement memories” and “verifying the successive feature memories” (Noton & Stark, 1971a). Assuming that repetition-based search efficiency gains rely on similar processes, we can make several predictions about the eye movement patterns that will be elicited during repeated visual search. First, we can predict that the scanpath made during novel search will be recapitulated during repeated search, reflecting mnemonic processes. This has been demonstrated previously in several studies (Choi et al., 1995; Henderson et al., 2007; Myers & Gray, 2010; Stark et al., 1992), however with poor saliency controls. Second, we can predict that recapitulation of the novel search scanpath at repeated search will correlate with search performance. Despite being a central feature of Scanpath Theory, the relationship between scanpath repetition and behavioural memory performance has been scarcely investigated. Finally, we can predict that scanpath recapitulation will vary as a function of memory integrity, indexed by age. 
If, as Scanpath Theory assumes, there is a direct and necessary link between scanpath repetition and memory, the degree of scanpath repetition should mirror the degree of memory impairment. Accordingly, we may expect to see a decline in recapitulation as a function of age-related memory loss. Alternatively, we might think of the scanpath as playing a supporting role in memory retrieval. In this case, we may expect to see greater scanpath recapitulation in older relative to younger adults to support the maintenance and retrieval of relational information given declining memory and perceptual processes. To test the prediction of scanpath recapitulation, we used a modified version of the string-edit distance (SED) method, a widely-used quantitative measure of scanpath similarity (Brandt & Stark, 1997; Choi et al., 1995; Hacisalihzade, Allen, & Stark, 1992; Josephson & Holmes, 2002; Privitera & Stark, 2000; Stark & Ellis, 1981; for review see Duchowski et al., 2010). While a strict interpretation of Scanpath Theory might predict complete reinstatement of the encoding scanpath at repeated search (or an SED score of zero), findings of increased search efficiency during repeated events suggest that this is not the case. An alternative prediction that has not yet been tested is that scanpath recapitulation during repeated viewing events is incomplete. Accordingly, to examine the extent to which there is scanpath recapitulation during subsequent viewings and how such recapitulation relates to visual search, we employed a sliding window analysis to assess the similarity (SED) between contiguous subsets of corresponding (same subject and image) novel and repeated viewing fixations at multiple time points across the search scanpath. This process critically enabled us to determine which elements of the scanpath are recapitulated during search through previously viewed scenes. 
Moreover, using the sliding window, we were able to evaluate our second prediction by examining how similarity at different time points correlates with search performance. Finally, to test the prediction that variability in relational memory function is associated with variability in scanpath repetition, we tested both younger and older adult participants. Across a variety of tasks, older adults demonstrate impaired memory for the relations among objects, including object-object associations (Castel & Craik, 2003; Moses, Ostreicher, & Ryan, 2010; Moses, Villate, Binns, Davidson, & Ryan, 2008; Naveh-Benjamin, 2000; Naveh-Benjamin, Hussain, Guez, & Bar-On, 2003; Naveh-Benjamin, Keshet Brav, & Levi, 2007; for review see Craik & Rose, 2012) and object-location associations (Brandstatt & Voss, 2014; Ryan, Leung, Turk-Browne, & Hasher, 2007; for review see Old & Naveh-Benjamin, 2008). As well, older adults often show a decline in volume and function in the hippocampus and extended MTL system (Chalfonte & Johnson, 1996; Dennis et al., 2008; Driscoll et al., 2003; Driscoll et al., 2009; Mitchell & Johnson, 2009; Old & Naveh-Benjamin, 2008), which are critical for relational memory function (Ryan & Cohen, 2003; Ryan et al., 2000; for review see Olsen, Moses, Riggs, & Ryan, 2012). Accordingly, if the scanpath is indeed related to relational memory processes, we should see differences in scanpath recapitulation across younger and older adults. The direction of these differences will additionally help to elucidate the nature of the scanpath itself and its relationship with memory processes. In the present study, we introduce a novel saliency-controlled and temporally-binned scanpath similarity analysis to more rigorously investigate the claims of Scanpath Theory during a repeated visual search task. 
Scanpaths were compared from corresponding novel and repeated visual search events during which younger adults (YA) and older adults (OA) searched for changing targets within flickering naturalistic scenes. Search effects were indexed using both behavioural (search time) and eye movement (number of fixations, fixation duration, scanpath similarity) measures and compared across age groups. Using a novel eye movement pattern similarity analysis (i.e., controlled scanpath similarity), we show that scanpath recapitulation during repeated search events is selective to initial and final novel viewing fixations. Extending previous work, we further demonstrate that scanpath recapitulation and the relationship between scanpath recapitulation and memory-guided search performance vary as a function of memory integrity (age) and time in the search process.

Materials and methods

Participants

Seventeen healthy young adults (five males; age: M = 22.8 years, SD = 3.1), ages 19–32, and 21 older adults (five males; age: M = 67.3 years, SD = 8.5), ages 55–80, participated in the study. All participants had corrected-to-normal vision. One younger adult was excluded for amblyopia. Four older adults were excluded for failure to comply with task instructions resulting in missing >15% of trial data (having fewer than 16 trials). One further older adult was removed due to an error in data recording. Data was analyzed from the 32 remaining participants (16 YA, 16 OA). Younger adults were recruited through York University’s Pond Road Residence and participated as volunteers. Older adults were recruited through the Rotman Research Institute’s adult participant pool and were compensated at a rate of $10 CAD/hr for their participation. All participants provided informed consent prior to participating in the experiment in accordance with the ethical guidelines of the Rotman Research Institute and York Human Participants Review Subcommittee.

Stimuli

Stimuli consisted of 20 naturalistic images (Figure 1) depicting a wide variety of real-world scenes including city and rural, indoor and outdoor, and scenes containing a variety of people, wildlife, buildings, and objects (described previously in Chau et al., 2011). Scenes were displayed at 1280 × 1024 pixel resolution. One object in each scene was manipulated in appearance (object removed) using Adobe Photoshop (San Jose, CA). Target objects were balanced for location (screen quadrant), size and animacy. Each image is seen both in its original (Figure 1A) and manipulated (Figure 1A’) form.

Figure 1.

Example stimulus image. (A) Unmanipulated image, (A’) Manipulated image where target is removed.

Apparatus

Eye movements were monitored throughout the experiment using a remote iView X infrared eye tracking system at a 60 Hz sampling rate (SensoMotoric Instruments, SMI, Berlin, Germany; described previously in Chau et al., 2011). Pupil and corneal reflectance values were sent from iView to Presentation software (NeuroBehavioral Systems, CA, USA), allowing for online detection of fixation locations and durations. Image selection, presentation timing, and response buttons were also controlled in Presentation. To minimize head movements, participants were required to place their chin on a chin rest positioned in front of a 38.0 cm by 30.5 cm display screen (1280 × 1024 pixel resolution). The screen was positioned approximately 50 cm away from young subjects and 62 cm away from older subjects. A 13-point eye movement calibration and validation was performed prior to the start of the experiment, between experimental blocks, and in the case of readjustment.
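The screen geometry above determines how pixel distances map onto visual angle. The following is a minimal sketch of that conversion, assuming a flat screen viewed head-on and using the reported screen width, resolution, and viewing distances; the function name and the 100-pixel example extent (the dispersion threshold reported under Data analysis) are chosen here for illustration only.

```python
import math

SCREEN_WIDTH_CM = 38.0   # reported display width
SCREEN_WIDTH_PX = 1280   # reported horizontal resolution


def pixels_to_degrees(extent_px, viewing_distance_cm):
    """Visual angle (degrees) subtended by a horizontal extent centred at fixation.

    Assumes a flat screen perpendicular to the line of sight (an idealisation).
    """
    extent_cm = extent_px * SCREEN_WIDTH_CM / SCREEN_WIDTH_PX
    return math.degrees(2 * math.atan((extent_cm / 2) / viewing_distance_cm))


# 100 pixels at the two reported viewing distances
young_deg = pixels_to_degrees(100, 50)   # younger adults, about 3.4 degrees
older_deg = pixels_to_degrees(100, 62)   # older adults, about 2.7 degrees
```

Under these assumptions, a 100-pixel extent corresponds to roughly three degrees at either viewing distance, slightly more for the younger group and slightly less for the older group.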

Procedure

The present experiment used a modified version of the flicker change detection visual search task used in Chau et al. (2011; Figure 2). Older adults completed the Montreal Cognitive Assessment (MoCA; Nasreddine et al., 2005), a brief standardized neuropsychological test developed to screen for cognitive impairment, prior to the start of the experiment. Eye movements were calibrated using 13-point calibration and validation. During the experiment, subjects viewed 20 novel real-world scene images in sequence for a maximum of 45 s each. Each trial consisted of the alternating presentation of an original (unmanipulated) scene (Figure 1A) and a changed (manipulated) scene (Figure 1A’), each displayed for 500 ms, with an interleaving grey screen (50 ms; see Figure 2). The interleaving grey screen makes the target object difficult to find, necessitating an active search strategy (Rensink, O’Regan, & Clark, 1997; Simons & Levin, 1997). Participants were instructed to “try to find the changing object” before the start of the experiment. Each trial was terminated following either a 1 s fixation on the target or a 45 s period. The end of each trial was signalled by a flickering of the changing target for 4 s by removing the interleaving grey screen. This made the change obvious to the viewer and ensured that all subjects had an opportunity to remember the target for subsequent viewings. Each trial was followed by a verbal report screen with three questions displayed in sequence: “Was this the first time you saw this picture?”, “Did you remember what the target object was?” and “Did you remember where the target object was?” Yes/no responses were recorded, but were not included in the present analysis. Each question was followed by a confidence rating, asking the participant to rate their confidence in their answer on a scale of 1–5 (1 = unsure, 5 = very sure). A black screen was presented between trials for 5 s. 
After all 20 novel images were searched (novel viewing), participants searched the same 20 images again in a new sequence (repeated viewing).

Figure 2.

A trial sequence from the modified flicker change detection/visual search task.

Data analysis

Fixation times and locations were calculated by iView X iTools IDF Event Detector using a dispersion-based algorithm with a minimum fixation duration of 80 ms and dispersion threshold of 100 pixels, or approximately three degrees of visual angle (see Salvucci & Goldberg, 2000, for a full description of the algorithm). Fixation times and locations were analyzed using MATLAB (Natick, MA). Search time was defined as the time from trial onset to the first fixation in the target region of interest (ROI). The algorithm for target detection selected the first fixation in the ROI that preceded 1 s of fixating in the ROI with a maximum of one fixation outside the ROI (Chau et al., 2011). Data was tracked from the left eye in all but two participants (1 OA, 1 YA). Fixations outside the screen dimensions were removed from all analyses. Trials were excluded from all analyses on the basis of calibration or target triggering errors, or where >20% of fixations were made off-screen. Trials on which the target was not found at novel viewing and trials on which fewer than three fixations were made at novel or repeated viewing were excluded from the scanpath similarity analysis to ensure that high similarity is not due to repetition of a direct path to the target. These trials were excluded only for the scanpath similarity analyses and were retained for all other analyses. The first fixation was included in the similarity analyses; however, the final fixation in each trial was removed prior to conducting similarity analyses to account for the fact that the final fixation in each trial was inside the target ROI. Finally, the correlation analysis excluded trials on which there were fewer than seven fixations at novel or repeated viewing. This was necessary to ensure that each trial contained sufficient fixations to conduct a temporally-binned analysis. In the older adults, MoCA scores ranged from 22–30 (where 26 and above is a pass), with a mean score of 25.82 (SD = 2.59). 
MoCA scores did not significantly correlate with any measures of task performance (Table A1 in Appendix) and as such were not a consideration for exclusion.

Table A1.

Correlation of MoCA score with measures of task performance in older adults.

Measure	Pearson correlation	Sig. (2-tailed)
Mean novel search time	−.111	0.682
Mean repeated search time	−.190	0.482
Mean novel # fixations	−.228	0.395
Mean repeated # fixations	−.223	0.406
Mean novel fixation duration	.175	0.516
Mean repeated fixation duration	.187	0.488

To determine whether search efficiency improved as a function of repetition, average search time, number of fixations, and fixation duration were computed for each trial. A 2 × 2 repeated measures ANOVA was performed on each measure, with repetition set (novel viewing or repeated viewing) as the within-subjects factor and age (older adult or younger adult) as the between-subjects factor. Planned comparisons were conducted to assess age differences in repetition benefits on each measure (repeated minus novel trials). To determine whether novel viewing scanpaths were recapitulated at repeated viewing, we devised a novel measure of scanpath similarity (called “controlled scanpath similarity”), based on the SED method (Brandt & Stark, 1997; Choi et al., 1995; Foulsham & Underwood, 2008; Hacisalihzade et al., 1992; Privitera & Stark, 2000; Stark & Ellis, 1981). In short, the SED method computes the minimum number of editing steps required to convert one eye movement sequence into another. Where fixations fall within the same region, no editing is needed. While multiple fixations in the same cluster are treated independently, the described fixation detection algorithm accounts for sampling overlap by treating multiple samples within a predefined spatial window as a single fixation. For the present study, we made several modifications to the SED method. First, where the SED method often uses a grid, we used a clustering algorithm to group proximate fixations according to data-driven regions of interest. Second, where the SED method yields a single score to reflect the similarity of two fixation sequences, we used a sliding window to generate multiple values reflecting the similarity of temporally contiguous subsets of corresponding novel and repeated viewing fixations across the search process. 
Finally, we devised a novel control measure in order to isolate the effect of image-specific memory on scanpath similarity from the influence of image saliency, described here as any non-mnemonic viewing guidance. Details of this method are described further below.
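For orientation, the unmodified SED method referred to above can be sketched as a standard Levenshtein distance over region-label strings. This is a generic textbook formulation, not the study's MATLAB code; the function name and the example strings are illustrative.

```python
def string_edit_distance(a, b):
    """Minimum number of insertions, deletions, and replacements (each cost 1)
    needed to turn region-label string `a` into `b` (Levenshtein distance)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            replace_cost = 0 if ca == cb else 1  # same region: no edit needed
            curr.append(min(prev[j] + 1,               # delete ca
                            curr[j - 1] + 1,           # insert cb
                            prev[j - 1] + replace_cost))
        prev = curr
    return prev[-1]


# Two hypothetical scanpaths that differ by one skipped region
example_sed = string_edit_distance("AFLBD", "AFBD")
```

A score of 0 means the two label strings are identical; larger scores indicate less similar scanpaths.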

Fixation clustering

In order to calculate scanpath similarity, fixations must be spatially subdivided according to some predefined criterion such that corresponding fixations on different strings will be considered “similar” if they fall within the same region of space. This is often accomplished by dividing the stimulus image into an evenly spaced grid. Critically however, grid lines can divide fixations on a single target object into two or more clusters. Another commonly used method of clustering is to define regions of interest around target objects. However, this method relies on a-priori assumptions of regional saliency. While saliency is often measured using perceptual features like orientation, luminance, and contrast (see Itti & Koch, 2000), research suggests that semantically salient items, identified as such based on prior knowledge and expectations, can similarly drive viewing (for review see Tatler et al., 2011). Given the above considerations, we used a clustering algorithm (Rodriguez & Laio, 2014) to define data-driven regions of interest based on image-specific viewing tendencies across subjects, with natural fixation clustering taken to reflect high regional saliency (see Figure 3). Clusters were defined for each image using all fixation points from each subject during novel and repeated viewing of that image. Cluster boundaries were selected such that the smallest possible cluster could not be smaller than the 100 pixel dispersion used to define fixation points and no two cluster centres could exist within the same 100 pixel dispersion (based on the dispersion algorithm in Salvucci & Goldberg, 2000). This method ensures that clusters cannot bisect regions that would be considered a single fixation. Fixation clusters are assumed to reflect a combination of bottom-up and top-down viewing influences, guiding eye movements to visually and semantically salient scene regions.
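The clustering step can be illustrated with a simplified version of density-peak clustering in the spirit of Rodriguez and Laio (2014). The sketch below is an approximation under stated assumptions, not the authors' implementation: the cutoff distance `d_c`, the centre threshold `delta_min`, the tie-breaking rule, and the function name are all hypothetical choices made here.

```python
import math


def density_peak_clusters(points, d_c=100.0, delta_min=100.0):
    """Assign a cluster label to each (x, y) fixation.

    rho   = number of other points closer than d_c (local density)
    delta = distance to the nearest point of higher density
    Points with delta > delta_min become cluster centres; every other point
    inherits the label of its nearest higher-density neighbour.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Rank by decreasing density; ties are broken by index (an assumption).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # densest point: use its largest distance
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j
    labels = [-1] * n
    next_label = 0
    for rank, i in enumerate(order):  # parents are always labelled first
        if rank == 0 or delta[i] > delta_min:
            labels[i] = next_label
            next_label += 1
        else:
            labels[i] = labels[parent[i]]
    return labels


# Two hypothetical groups of fixations, far apart on a 1280 x 1024 screen
fixations = [(100, 100), (110, 105), (95, 90), (500, 500), (510, 495), (490, 505)]
cluster_labels = density_peak_clusters(fixations)
cluster_string = "".join(chr(ord("A") + k) for k in cluster_labels)
```

Setting `delta_min` to the 100-pixel dispersion is one way to keep two centres from falling within the same dispersion window; the threshold actually used in the study may differ.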

Figure 3.

Across-subject fixations for Trial 1, clustered according to the described algorithm. Clusters are indicated by variations in shape and colour. (A) Scene image to which fixation clusters correspond, (B) Unclustered fixations from all subjects, (C) Clusters given centre of δ > 50.

Saliency control

While Scanpath Theory predicts that remembered images will elicit repetition of the pattern of eye movements produced at stimulus encoding, saliency-based models of eye movement monitoring make the same prediction based on the assumption that viewing at any given point in time will be directed to the most visually prominent or salient region of an image (Itti & Koch, 2000; Koch & Ullman, 1985; Treisman & Gelade, 1980). Accordingly, several studies using scanpath similarity as an index of memory performance have included a visual saliency control to account for the unchanging low-level visual features that attract gaze during both novel and repeated viewing (Itti & Koch, 2000; Koch & Ullman, 1985; Treisman & Gelade, 1980). Critically, however, these studies often overlook other non-mnemonic viewing influences like object-level information (Einhäuser et al., 2008), viewing biases (Tatler & Vincent, 2009), and semantic knowledge and expectations (Loftus & Mackworth, 1978; Yarbus, 1967), which have likewise been shown to predict eye movements, in some cases even better than visual-saliency-based models. Indeed, incorporating these factors into saliency models has been shown to significantly increase their predictive power (Cerf, Frady, & Koch, 2009; Einhäuser et al., 2008; Tatler & Vincent, 2009). Thus, to maximize the empirical validity of our measure, we account for both bottom-up and top-down viewing influences in our saliency-based control. Using the described clustering algorithm, we converted each subject’s novel viewing fixation sequence into a string of cluster values reflecting both the spatial and temporal ordering of fixations for the given image trial (i.e., A-F-L-B-D). Next, using the clusters assigned to each novel viewing fixation (from all subjects) by the described algorithm, we generated Markov probability matrices for each image in order to construct saliency-based control fixation sequences (scanpaths). 
The first matrix represents the probability that a fixation in cluster A (for example) will be followed by a fixation in cluster B (forward-computed), while the second matrix represents the probability that a fixation in cluster A was preceded by a fixation in cluster B (backward-computed). Additionally, for each cluster value we determined the probability that it was the first in the sequence and the probability that it was the last in the sequence. These probabilities were used to generate an expected sequence of fixations for each image, equal in length to the given subject’s novel viewing string for that trial. We generated the first and last fixations in the control sequence using the first and last fixation probability vectors described above. We then generated the first half of the sequence using the forward-computed probability matrix and the second half of the sequence using the backward-computed matrix. This method ensures that the control string is an accurate approximate representation of an average novel viewing sequence (particularly at the end of the sequence, due to the likelihood that the final fixations are located near the target). Where the number of fixations in the sequence was uneven, the middle point was generated by multiplying the forward-computed and backward-computed probability vectors and normalizing the result. The described process was repeated 50 times, yielding 50 Markov-probability-generated fixation sequences of equal length to the given fixation sequence. In short, by using real data from the encoding scanpaths of actual subjects, the saliency control scanpath captures any viewing tendencies that emerge across subjects. 
These include a tendency to preferentially view regions that are near the centre of the screen, regions high in semantic content, regions high in luminance or contrast, and regions proximal to the starting and ending points of search, as well as any additional regions that show disproportionate viewing across subjects.
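
The control-generation procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`fit_markov`, `draw`, `control_sequence`), the toy cluster strings, the exact split between the forward-generated and backward-generated halves, and the fallback to overall cluster frequencies when a transition has no observed data are all assumptions introduced here.

```python
import random
from collections import Counter, defaultdict


def fit_markov(scanpaths):
    """Count first/last clusters and forward/backward transitions across scanpaths."""
    first, last, overall = Counter(), Counter(), Counter()
    forward, backward = defaultdict(Counter), defaultdict(Counter)
    for path in scanpaths:
        first[path[0]] += 1
        last[path[-1]] += 1
        overall.update(path)
        for prev, nxt in zip(path, path[1:]):
            forward[prev][nxt] += 1   # how often `nxt` follows `prev`
            backward[nxt][prev] += 1  # how often `prev` precedes `nxt`
    return first, last, forward, backward, overall


def draw(weights, fallback, rng):
    """Sample one cluster in proportion to its weight (fallback if no weight is positive)."""
    table = weights if sum(weights.values()) > 0 else fallback
    clusters = sorted(table)
    return rng.choices(clusters, weights=[table[c] for c in clusters], k=1)[0]


def control_sequence(model, length, rng):
    """Generate one control scanpath of the requested length."""
    first, last, forward, backward, overall = model
    seq = [None] * length
    seq[0] = draw(first, overall, rng)
    if length == 1:
        return seq
    seq[-1] = draw(last, overall, rng)
    half = length // 2
    for i in range(1, half):                            # first half, forward in time
        seq[i] = draw(forward[seq[i - 1]], overall, rng)
    for j in range(length - 2, length - half - 1, -1):  # second half, backward in time
        seq[j] = draw(backward[seq[j + 1]], overall, rng)
    if length % 2 == 1:                                 # odd length: combine both directions
        fwd, bwd = forward[seq[half - 1]], backward[seq[half + 1]]
        product = Counter({c: fwd[c] * bwd[c] for c in fwd})
        seq[half] = draw(product, overall, rng)
    return seq


rng = random.Random(0)
novel_paths = [list("AFLBD"), list("ABLD"), list("AFBD")]  # made-up cluster strings
model = fit_markov(novel_paths)
controls = [control_sequence(model, 5, rng) for _ in range(50)]
```

Multiplying raw transition counts for the middle fixation gives the same sampling distribution as multiplying and renormalizing the two probability vectors, because each vector is only rescaled by a constant.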

Controlled scanpath similarity

As described above, within each subject and trial specific scanpath, characters were assigned to each fixation based on the fixation’s cluster location (see Figure 4), with the resulting character string reflecting regions sampled in the order in which they were fixated. To compare fixation sequences, we used a modified version of the SED method whereby one string is converted into another by inserting, deleting, or replacing characters, with the editing cost of each operation set at 1. Note that in the present study, we used a sliding window of three fixations to compare subsets of corresponding novel and repeated viewing fixations across the scanpath. Accordingly, sequences being compared were always equal in length at three fixations long, precluding the need for insertion and deletion operations; only replacement and swapping (this operation was added for the present study) operations were used. Where the same fixation in both strings falls within the same cluster, no operation is necessary. The number of operations required is summed to yield a SED score, reflecting the similarity of the two fixation sequences.
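
A hedged sketch of the window-level SED follows. The text does not specify how the added swapping operation is defined, so this example assumes that a swap exchanges two adjacent characters at a cost of 1 and that each character takes part in at most one operation; `window_sed` is a hypothetical name.

```python
def window_sed(a, b):
    """Edit distance between equal-length windows using replacement and adjacent swap."""
    if len(a) != len(b):
        raise ValueError("windows must have equal length")
    cost = [0] * (len(a) + 1)  # cost[i]: cheapest way to equate the first i characters
    for i in range(1, len(a) + 1):
        # Replacement costs 1; matching clusters cost nothing.
        cost[i] = cost[i - 1] + (a[i - 1] != b[i - 1])
        # Adjacent swap: the last two characters appear in reversed order.
        if i >= 2 and a[i - 1] == b[i - 2] and a[i - 2] == b[i - 1]:
            cost[i] = min(cost[i], cost[i - 2] + 1)
    return cost[-1]


print(window_sed("CJM", "EJM"))  # one replacement
print(window_sed("ABC", "BAC"))  # one swap
```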

Figure 4.

Process for calculating controlled scanpath similarity. Characters are assigned to novel, repeated, and control fixations based on cluster locations. String-edit distance scores are computed for novel and repeated viewing strings and for control and repeated viewing strings. SEDc is an average of the string-edit distance scores for the repeated-viewing string to all 50 control strings. The controlled scanpath similarity score reflects the similarity of the repeated-viewing scanpath to its corresponding novel-viewing scanpath, controlled for saliency. The controlled scanpath similarity score has a baseline of 0.

SED scores were computed for each subject- and trial-specific repeated viewing scanpath to its corresponding novel viewing scanpath (call this SEDi) and for each repeated viewing scanpath to each of the 50 saliency-based control strings of equal length. Scores were then averaged across the 50 control strings, yielding a single saliency-based control SED score (call this SEDc). We then divided the SED score derived from the actual novel and repeated viewing fixations by the SED score derived from the saliency-based control strings (SEDi/SEDc). Finally, we subtracted the resulting value from 1, such that a final score of 0 reflects equality of SEDi and SEDc (SEDi = SEDc), indicating that repeated viewing fixations are no more similar to novel viewing fixations than to a saliency-based control sequence, and a score of 1 reflects complete similarity between repeated and novel viewing fixations (SEDi = 0). Where the repeated viewing sequence is more similar to the control sequence than to the novel viewing sequence (SEDc < SEDi), the final score will be negative. The final value, derived from the equation 1 − (SEDi/SEDc), is the controlled scanpath similarity score; it has a maximum value of 1 and a baseline of 0, with no minimum possible value. The complete process is depicted in Figure 4.
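
As a small worked illustration of the score 1 − (SEDi/SEDc), the sketch below averages a list of control SED scores and applies the formula. The function name is hypothetical, and returning NaN when SEDc is 0 is an assumption, since the text does not say how that case was handled.

```python
from statistics import mean


def controlled_similarity(sed_i, control_seds):
    """Return 1 - SEDi/SEDc, where SEDc is the mean SED to the control strings."""
    sed_c = mean(control_seds)
    if sed_c == 0:
        return float("nan")  # assumption: score is undefined when SEDc is 0
    return 1 - sed_i / sed_c


print(controlled_similarity(0, [2, 2]))     # identical to novel viewing: maximum score
print(controlled_similarity(2, [2, 2]))     # SEDi = SEDc: baseline
print(controlled_similarity(3, [1, 2, 3]))  # closer to control than to novel: negative
```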

Sliding window analysis

Whereas existing measures of scanpath similarity derive a single score reflecting the similarity of two distinct scanpaths, the present similarity analysis used a sliding window approach to compute multiple similarity scores for each image, reflecting the similarity of corresponding novel and repeated search scanpaths at multiple points in time. This method critically enabled us to test our prediction of incomplete scanpath repetition as well as measure how scanpath recapitulation changes as a function of time in the search process. By calculating similarity for subsets of novel and repeated viewing fixations separately, rather than the entire sequence at once, we were able to determine broadly which novel viewing fixations were recapitulated at repeated viewing. In the present study, we used a sliding window of three fixations. For each analysis, the number of window positions was determined using the average number of novel viewing fixations within each age group, subtracting 2 to account for the last 3-fixation window (window length = M nov - 2; OA = 38, YA = 24). For our first analysis, the sliding window was applied to both novel and repeated viewing fixations, such that similarity was computed for every possible pairing of novel and repeated viewing windows (Figure 5A). To further investigate which repeated viewing fixations were driving significant similarity peaks, we conducted the same analysis using a sliding window applied to novel viewing fixations only, with similarity computed for every novel viewing window to a window containing the first three repeated viewing fixations (Figure 5B) or the last three repeated viewing fixations (Figure 5C). The analyses are described in further detail below.
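
The layout of the three sliding-window comparisons can be sketched as below. To keep the example short it scores windows with a simple count of mismatched positions (`mismatches`, a stand-in introduced here) rather than the saliency-controlled similarity used in the actual analysis; the other names are likewise hypothetical, and the repeated scanpath is assumed to contain at least three fixations.

```python
def windows(path, size=3):
    """All contiguous windows of `size` fixations; a path of n fixations yields n - size + 1."""
    return [tuple(path[i:i + size]) for i in range(len(path) - size + 1)]


def mismatches(a, b):
    """Stand-in score: number of positions at which two windows differ."""
    return sum(x != y for x, y in zip(a, b))


def per_novel_window(novel, repeated, score=mismatches, size=3):
    """Scores at each novel viewing window for the three analyses."""
    nov, rep = windows(novel, size), windows(repeated, size)
    pairs = [[score(n, r) for r in rep] for n in nov]
    return {
        "all": [sum(row) / len(row) for row in pairs],  # mean over all repeated windows
        "first": [row[0] for row in pairs],             # first three repeated fixations
        "last": [row[-1] for row in pairs],             # last three repeated fixations
    }


result = per_novel_window(list("ABCJM"), list("ECJM"))
print(result)
```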

Figure 5.

Visualization of sliding window similarity analysis applied to fixations from corresponding novel and repeated viewing scanpaths. Similarity is computed for all novel viewing fixations to: (A) all repeated viewing fixations, (B) the first three repeated viewing fixations, (C) the last three repeated viewing fixations. Panel D demonstrates how SED would be calculated for the comparisons depicted in panels A–C. SED scores (column 2, rows 3–5) reflect the minimum number of editing operations required to equate the 3-fixation sequences in the corresponding windows (indicated by the column and row numbers). For example, the editing cost of equating novel viewing window 3 (C-J-M) to repeated viewing window 3 (E-J-M) is 1, reflecting the single replacement operation required. The bottom row contains the average of SED scores at each novel viewing window corresponding to the analysis depicted in panel A. The highlighted rows contain the SED scores corresponding to the analyses depicted in panel B and panel C, respectively. Note that the scores here reflect only the comparison of novel and repeated viewing strings (SED). In the actual analysis, these similarity scores are controlled for saliency (see Figure 4 for this process).

For our first analysis, we calculated controlled scanpath similarity for every 3-fixation repeated viewing window to every corresponding 3-fixation novel viewing window (Figure 5A). For each trial (excluding those that did not meet the inclusion criteria for this analysis), controlled similarity scores were averaged at each novel viewing window, such that the mean controlled similarity score at each novel viewing window reflected the average similarity of the fixations within that window to all windows in the repeated viewing scanpath. In short, this allowed us to assess the overall fit of the entire repeated viewing scanpath to each novel viewing window. Figure 5D demonstrates how two sequences can be compared using this method.
To examine the relationship between novel and repeated viewing fixation sequences more closely, we conducted two additional analyses focusing on early and late repeated viewing fixations. For our second analysis, we calculated controlled scanpath similarity for the first 3-fixation repeated viewing window to every corresponding 3-fixation novel-viewing window (Figure 5B). For our third analysis, we calculated controlled scanpath similarity for the last 3-fixation repeated viewing window to every corresponding 3-fixation novel viewing window (Figure 5C). These latter analyses did not require any averaging of controlled similarity scores at novel viewing windows since each novel viewing window was only compared to a single repeated viewing window. The method of determining significance was consistent for all analyses. First, for each subject, mean controlled similarity scores were averaged across trials at each novel viewing window, such that each novel viewing window in the final analysis contained a mean similarity score for each subject. Second, for each age group, a distribution of mean controlled similarity scores was generated by randomly sampling from the subject mean similarity scores at each window position 10,000 times. Where the lower boundary of the confidence interval was above 0, similarity was considered to be primarily driven by memory (see previous section for similarity analysis).
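
One way the significance test at a single window position might look is sketched below. It assumes that subject means are resampled with replacement and that percentile confidence intervals are taken from the resulting distribution of group means; neither detail is stated explicitly in the text, and the function name, seed, and example values are made up.

```python
import random


def bootstrap_ci(subject_means, n_boot=10_000, level=0.95, seed=0):
    """Percentile confidence interval for the group mean at one window position."""
    rng = random.Random(seed)
    n = len(subject_means)
    # Resample subjects with replacement and record the mean of each resample.
    boot = sorted(sum(rng.choices(subject_means, k=n)) / n for _ in range(n_boot))
    tail = (1 - level) / 2
    lower = boot[int(tail * n_boot)]
    upper = boot[min(n_boot - 1, int((1 - tail) * n_boot))]
    return lower, upper


lower, upper = bootstrap_ci([0.20, 0.30, 0.25, 0.40], n_boot=2000)
memory_driven = lower > 0  # lower bound above baseline
print(lower, upper, memory_driven)
```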

Similarity search correlations

A central tenet of Scanpath Theory is that recapitulation of novel viewing fixations at subsequent viewing supports memory retrieval. To test whether scanpath similarity predicted memory-guided search performance (search time during repeated trials), we correlated controlled scanpath similarity averaged at the beginning, middle, and end of the scanpath with repeated viewing search time for each subject. Correlation values were averaged across subjects within each age group (Figure 8). To determine significance, we performed two bootstrap procedures. First, for each subject, we randomly sampled with replacement from the subject’s trial-based similarity and search time values and computed the correlation between the variables. This process was repeated 1000 times for each subject, generating a series of subject-specific correlation distributions. Next, for each age group, we randomly selected a sample correlation value from each subject’s correlation distribution and averaged the samples. This process was repeated 10,000 times, generating a distribution of average correlation values for each age group with 95% and 99% confidence intervals for the correlation means. For this analysis, only trials with a minimum of five windows (seven fixations) were included. We defined “beginning” as the first two windows in the novel viewing sequence and “end” as the last two windows. All remaining windows were classified as “middle”. By grouping similarity scores in this way, we were able to determine whether the relationship between scanpath recapitulation and mnemonic performance varied as a function of time in the search process. Finally, to account for differences in novel search time we conducted the same analysis as described above, however we used a relative measure of search time improvement [(novel search time – repeated search time) / novel search time] in place of repeated search time.
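
The two bootstrap stages described above can be sketched as follows. This is an illustrative outline under stated assumptions: Pearson correlation is assumed (the text says only "correlation"), resamples in which either variable has zero variance are redrawn, and all function names and data values are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation, or None when either variable has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def subject_distribution(similarity, search_time, rng, n_boot=1000):
    """Stage 1: resample one subject's trials with replacement and correlate."""
    trials = range(len(similarity))
    out = []
    while len(out) < n_boot:
        pick = rng.choices(trials, k=len(similarity))
        r = pearson([similarity[i] for i in pick], [search_time[i] for i in pick])
        if r is not None:  # assumption: redraw degenerate resamples
            out.append(r)
    return out


def group_distribution(subject_dists, rng, n_boot=10_000):
    """Stage 2: draw one correlation per subject, average, and repeat."""
    return sorted(
        sum(rng.choice(d) for d in subject_dists) / len(subject_dists)
        for _ in range(n_boot)
    )


rng = random.Random(0)
subjects = [  # made-up (similarity, search time) values per trial
    ([0.1, 0.2, 0.3, 0.4, 0.5], [9, 8, 7, 5, 4]),
    ([0.0, 0.3, 0.4, 0.6, 0.9], [12, 10, 9, 7, 3]),
]
dists = [subject_distribution(s, t, rng, n_boot=200) for s, t in subjects]
group = group_distribution(dists, rng, n_boot=500)
ci95 = (group[int(0.025 * len(group))], group[int(0.975 * len(group))])
```

Because `group` is sorted, percentile bounds such as `ci95` can be read off by index; a correlation would be treated as reliable when the interval excludes 0.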

Figure 8.

Group mean correlation values for controlled scanpath similarity, averaged at the beginning, middle, and end of the scanpath, and repeated viewing search time. Correlation values are averaged within each age group. To generate confidence intervals, a distribution of correlation values was created for each subject (by sampling similarity and search scores) and for each age group (by sampling from the subject distributions).

Results

Repetition effects

Target detection performance was high in both the younger and older adults. Younger adults found targets prior to the 45 s time limit on 89 (0.07) and 99 (0.03) % [mean (SD)] of novel and repeated trials, respectively, compared to 68 (0.18) and 90 (0.13) % for older adults. The percent of successful trials (trials in which the target was found prior to the 45 s time limit) differed significantly as a function of repetition set, F(1, 30) = 61.92, p < .001, ηp² = 0.674, and as a function of age, F(1, 30) = 17.706, p < .001, ηp² = 0.371, with a significant interaction between set and age, F(1, 30) = 9.606, p < .01, ηp² = 0.243. A regression of search time on trial number within block was non-significant [novel search: F(1, 614) = .807, p = .369; repeated search: F(1, 614) = .657, p = .418], suggesting that the repeated search benefit was not due to practice effects. As expected, search times were significantly faster on repeated (M OA = 12.16s; M YA = 4.33s) compared to novel (M OA = 25.70s; M YA = 15.14s) search trials [F(1, 30) = 209.838, p < .001, ηp² = 0.875] (Figure 6A). Repeated trials were also characterized by fewer fixations (M OA = 31.62; M YA = 9.48) relative to novel trials [M OA = 63.54; M YA = 33.20; F(1, 30) = 128.359, p < .001, ηp² = 0.811] (Figure 6B), while fixation duration did not significantly differ across sets [novel: M OA = 0.38s; M YA = 0.46s; repeated: M OA = 0.384s; M YA = 0.518s; F(1, 30) = 1.945, ns] (Figure 6C). These results are consistent with the contextual cueing effect. Critically, these results also rule out the possibility of complete scanpath recapitulation at repeated viewing.

Figure 6.

Eye movement measures by age and repetition set. (A) mean search time (s), (B) mean number of fixations, (C) mean fixation duration (s). Error bars: +/− 1 SE.

As expected, younger adults significantly outperformed older adults on all eye movement-based memory measures [search: F(1, 30) = 46.901, p < .001, ηp² = .610; number of fixations: F(1, 30) = 26.425, p < .001, ηp² = .468; fixation duration: F(1, 30) = 7.281, p < .05, ηp² = .195]. Interactions between age and repetition set were non-significant [search: F(1, 30) = 2.637, ns; number of fixations: F(1, 30) = 2.784, ns; fixation duration: F(1, 30) = 1.499, ns].
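
As a quick arithmetic check, the effect sizes reported alongside the significant F statistics in this section are numerically consistent with partial eta squared, computed as (F × df effect) / (F × df effect + df error). The helper below is a hypothetical illustration of that relationship, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported above, all with df = (1, 30).
for f_value in (61.92, 17.706, 9.606, 209.838, 128.359, 46.901, 26.425, 7.281):
    print(f_value, round(partial_eta_squared(f_value, 1, 30), 3))
```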

Similarity effects

To test Scanpath Theory’s prediction of fixation recapitulation during repeated image viewing, we used a novel saliency-controlled scanpath similarity measure (controlled scanpath similarity) to compare fixations during novel and repeated search through the same scene. To test our hypothesis that scanpath recapitulation is incomplete, we used a sliding window of three fixations to examine changes in fixation similarity throughout the search process (not corrected for multiple comparisons). Controlled scanpath similarity scores at each novel viewing window were bootstrapped, yielding 95% and 99% confidence intervals. For the purposes of the present study, we were interested in where the lower boundary of the confidence interval was above 0, indicating that repeated viewing fixations are more similar to their corresponding novel viewing fixations than to a saliency-based control sequence. Where this occurs, fixation recapitulation can be attributed to non-exogenous factors. Given the constraints of the present study, we interpret similarity significantly above baseline to reflect memory guidance. Where the confidence interval includes 0, repeated viewing fixations are equally distant from their corresponding novel viewing fixations as to a saliency-based control scanpath. To maximize the number of trials used in each similarity computation, two graphs were generated for each analysis, one aligned at the start of the scanpath and one aligned at the end. The graphs were cut and merged at the midpoint. Tables A2 and A3 (see Appendix) contain the younger and older adult subject trials contributing to analysis at each novel viewing window position. Results of all three similarity analyses are depicted in Figure 7.
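
The start-aligned and end-aligned merging described above might be implemented along these lines. The sketch assumes that window positions in the first half of the plotted range are indexed from the start of each trial and the remaining positions from the end, with a trial contributing wherever it is long enough; the function name and example scores are invented for illustration.

```python
def aligned_means(trial_scores, n_windows):
    """Mean score and contributing trial count at each plotted window position."""
    half = n_windows // 2
    means, counts = [], []
    for pos in range(n_windows):
        if pos < half:
            # Start-aligned: window `pos` counted from the first window of each trial.
            vals = [t[pos] for t in trial_scores if len(t) > pos]
        else:
            # End-aligned: window counted back from the last window of each trial.
            from_end = n_windows - pos
            vals = [t[-from_end] for t in trial_scores if len(t) >= from_end]
        means.append(sum(vals) / len(vals) if vals else None)
        counts.append(len(vals))
    return means, counts


trials = [[1, 2, 3, 4], [10, 20, 30], [7]]  # per-trial window scores of unequal length
means, counts = aligned_means(trials, n_windows=4)
print(means, counts)
```

Under this scheme the trial counts are mirrored around the midpoint, which matches the symmetric pattern of contributing trials in Table A2.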

Table A2.

YA subject trials included in the similarity analysis at each novel viewing window.

	Subject
Window	401	402	404	405	406	408	409	410	411	412	413	414	415	416	417	418
1	8	14	15	13	16	15	14	19	15	8	16	16	17	14	17	15
2	8	14	14	13	16	13	12	19	15	8	15	15	17	13	17	14
3	8	13	13	13	16	12	12	19	14	8	15	15	17	11	16	13
4	8	12	13	13	15	12	12	19	13	8	15	15	17	11	15	12
5	8	12	11	12	14	12	11	18	12	8	15	15	16	9	13	12
6	8	12	11	10	14	12	11	18	11	8	14	14	16	9	13	12
7	8	11	9	10	12	12	10	18	10	7	11	13	15	9	13	12
8	8	11	9	9	12	11	10	16	10	7	10	12	14	9	13	10
9	8	11	9	9	11	10	10	14	10	5	10	11	14	9	12	9
10	7	11	9	9	11	10	10	13	10	5	8	11	13	7	12	8
11	6	11	9	9	10	10	9	13	10	5	7	11	12	7	10	8
12	5	10	9	9	10	9	9	11	10	4	7	11	12	7	10	8
13	5	10	9	9	10	9	9	11	10	4	7	11	12	7	10	8
14	6	11	9	9	10	10	9	13	10	5	7	11	12	7	10	8
15	7	11	9	9	11	10	10	13	10	5	8	11	13	7	12	8
16	8	11	9	9	11	10	10	14	10	5	10	11	14	9	12	9
17	8	11	9	9	12	11	10	16	10	7	10	12	14	9	13	10
18	8	11	9	10	12	12	10	18	10	7	11	13	15	9	13	12
19	8	12	11	10	14	12	11	18	11	8	14	14	16	9	13	12
20	8	12	11	12	14	12	11	18	12	8	15	15	16	9	13	12
21	8	12	13	13	15	12	12	19	13	8	15	15	17	11	15	12
22	8	13	13	13	16	12	12	19	14	8	15	15	17	11	16	13
23	8	14	14	13	16	13	12	19	15	8	15	15	17	13	17	14
24	8	14	15	13	16	15	14	19	15	8	16	16	17	14	17	15

Table A3.

OA subject trials included in the similarity analysis at each novel viewing window.

	Subject
Window	501	502	504	505	506	508	510	511	512	513	514	515	516	517	518	519
1	15	15	13	13	6	17	14	7	14	12	10	12	17	17	11	14
2	15	15	13	13	6	17	13	7	14	12	10	12	17	17	11	14
3	15	15	13	13	6	17	13	7	14	12	10	12	17	16	11	14
4	15	15	13	13	6	16	12	7	13	12	10	12	17	16	11	14
5	15	15	13	13	6	16	12	7	13	12	10	12	17	16	11	14
6	14	15	12	12	6	16	12	7	9	12	9	12	17	15	11	14
7	14	15	12	12	6	15	12	7	9	12	9	12	15	15	11	14
8	14	14	11	11	6	15	11	6	9	12	9	11	15	12	11	14
9	14	14	11	11	6	14	10	4	8	12	9	11	15	10	11	14
10	14	14	10	11	6	13	10	3	7	12	9	11	14	10	11	14
11	14	14	10	10	6	13	10	3	7	12	9	11	14	10	11	13
12	13	14	9	10	6	12	10	3	7	12	9	10	14	10	11	13
13	13	14	9	10	6	11	10	3	6	12	9	9	14	10	11	13
14	12	14	9	9	6	11	10	3	6	12	9	9	14	10	11	13
15	12	14	9	8	6	11	10	3	5	12	8	9	14	10	11	13
16	12	14	8	8	6	11	10	3	5	12	8	9	14	10	11	12
17	12	14	8	7	6	11	10	3	5	12	6	9	14	10	11	12
18	12	14	8	7	6	11	9	3	4	11	6	9	14	10	10	12
19	12	14	7	7	6	11	9	3	4	11	6	9	14	10	10	12
20	12	14	7	7	6	11	9	3	4	11	6	9	14	10	10	12
21	12	14	8	7	6	11	9	3	4	11	6	9	14	10	10	12
22	12	14	8	7	6	11	10	3	5	12	6	9	14	10	11	12
23	12	14	8	8	6	11	10	3	5	12	8	9	14	10	11	12
24	12	14	9	8	6	11	10	3	5	12	8	9	14	10	11	13
25	12	14	9	9	6	11	10	3	6	12	9	9	14	10	11	13
26	13	14	9	10	6	11	10	3	6	12	9	9	14	10	11	13
27	13	14	9	10	6	12	10	3	7	12	9	10	14	10	11	13
28	14	14	10	10	6	13	10	3	7	12	9	11	14	10	11	13
29	14	14	10	11	6	13	10	3	7	12	9	11	14	10	11	14
30	14	14	11	11	6	14	10	4	8	12	9	11	15	10	11	14
31	14	14	11	11	6	15	11	6	9	12	9	11	15	12	11	14
32	14	15	12	12	6	15	12	7	9	12	9	12	15	15	11	14
33	14	15	12	12	6	16	12	7	9	12	9	12	17	15	11	14
34	15	15	13	13	6	16	12	7	13	12	10	12	17	16	11	14
35	15	15	13	13	6	16	12	7	13	12	10	12	17	16	11	14
36	15	15	13	13	6	17	13	7	14	12	10	12	17	16	11	14
37	15	15	13	13	6	17	13	7	14	12	10	12	17	17	11	14
38	15	15	13	13	6	17	14	7	14	12	10	12	17	17	11	14

Figure 7.

95% and 99% confidence intervals for controlled scanpath similarity for: (A) all 3-fixation repeated viewing windows across all 3-fixation novel viewing windows, averaged at each novel viewing window, (B) first three repeated viewing fixations across a 3-fixation sliding window of novel viewing fixations, (C) last three repeated viewing fixations across a 3-fixation sliding window of novel viewing fixations. The 0 line marks baseline or chance similarity, where repeated viewing fixations are equally distant from novel viewing and saliency-based, control-generated fixations. Window lengths: OA = 38; YA = 24.

Consistent with the predictions of Scanpath Theory, fixation recapitulation was observed that was not due to visual or semantic saliency. As expected, fixation recapitulation was incomplete, with above baseline similarity observed only at the beginning and end of the scanpath. Further confirming our prediction of variable scanpath recapitulation as a function of memory integrity, we observed a large difference in the number of significant windows above baseline in older and younger adults, with young adults’ mean similarity dropping below baseline ahead of older adults (Figure 7). Finally, to test whether scanpath recapitulation during repeated search trials is correlated with memory-guided search performance, we correlated controlled scanpath similarity averaged at the beginning, middle, and end of the scanpath with average repeated viewing search time for each subject (Figure 8). Initial scanpath similarity (beginning windows) was significantly negatively correlated with repeated viewing search time in young adults (correlation: M = −0.131, SD = 0.062, p < .05), suggesting that greater early scanpath recapitulation does in fact predict faster search time at repeated viewing.
Interestingly however, similarity of repeated viewing fixations to fixations made in the middle of the novel viewing scanpath was significantly positively correlated with search time at repeated viewing in the young adults (correlation: M = 0.142, SD = 0.073, p < .05), indicating poor repeated search performance. Similarity at the end of the scanpath did not significantly correlate with repeated search performance (M = 0.056, SD = 0.069, p > .05). Correlations between search time and similarity were non-significant in older adults (beginning: M = −0.042, SD = 0.065, ns; middle: M = −0.053, SD = 0.078, ns; end: M = 0.023, SD = 0.07, ns). To ensure that the relationship between scanpath repetition and search time was not biased by the length of the encoding scanpath, we performed the same analysis using a relative measure of memory-guided search performance collapsed across age groups. Here, only similarity in the middle of the scanpath was significantly correlated with search time improvement (correlation: M = −0.114, SD = 0.055, p < .05). These findings are consistent with previous work suggesting that the repeated viewing scanpath is directly related to relational memory processes. However, it must be noted that these findings are correlational and thus cannot speak to the direction of causation between scanpath repetition and memory.

Discussion

In the present study, we returned to the scanpath model of eye-movement-based memory effects to explore the relationship between relational memory, as expressed by eye movements, and efficient target detection during repeated visual search events. According to Scanpath Theory, visual arrays are stored as a sensory-motor memory sequence of alternating fixations and saccades reflecting image features and the associations between them, respectively (Noton & Stark, 1971a, 1971b). Recapitulation of the encoding scanpath at repeated viewing is thought to facilitate image recognition by reactivating the associated memory trace feature by feature. However, findings of speeded target detection (Brockmole & Henderson, 2006; Chau et al., 2011; Chun & Jiang, 1998, 2003; Peterson & Kramer, 2001; Tseng & Li, 2004) and fewer fixations (Althoff & Cohen, 1999; Brockmole & Henderson, 2006; Peterson & Kramer, 2001; Ryan et al., 2000) during repeated search events suggest that recapitulation of the complete encoding scanpath is not a necessary precursor for memory-guided search behaviour. Accordingly, we hypothesized that if a relationship indeed exists between scanpath recapitulation and memory, repeated viewing fixations should be closer to their corresponding novel viewing fixations than to a saliency-based control scanpath. Moreover, scanpath recapitulation during repeated viewing should be incomplete to allow for repetition-based search efficiency gains. Finally, we predicted that variability in memory integrity, here indexed by age, would be reflected by variability in scanpath repetition. Scanpath recapitulation was indexed by the distance of repeated viewing fixations to novel viewing fixations (SEDi) relative to a saliency-based control scanpath (SEDc), with baseline similarity reflecting equidistance and above-baseline similarity indicating a significant contribution from memory processes. 
Critically, both visual and semantic saliency were incorporated into our control in order to more conservatively approximate the relationship between scanpath repetition and relational memory. Confirming our predictions, and those of Scanpath Theory, we observed recapitulation of novel viewing fixations at repeated viewing that could not be attributed to saliency, defined here as any subject-wide viewing tendencies. These findings suggest that fixations made during repeated search are not directed at random or by regional image saliency, but rather by memory for the target object relative to the surrounding context. This finding is consistent with previous research suggesting that relational memory representations guide fixation recapitulation above and beyond the guidance provided by the image itself (Brockmole & Irwin, 2005; De Graef, 1998, 2005; Einhäuser et al., 2008; Henderson et al., 2007; Oliva et al., 2003; Stirk & Underwood, 2007; Underwood, Mennie, Humphrey, & Underwood, 2008; for review see Henderson, 2003). However, it should be noted that the current results do not speak directly to the causal relationship between scanpath repetition and memory; this matter will be discussed in further detail later. Using string edit distance and related similarity measures, several studies have suggested that scanpath repetition may be a necessary outcome of stimulus repetition (Brandt & Stark, 1997; Choi et al., 1995; Foulsham & Underwood, 2008; Hacisalihzade et al., 1992; Josephson & Holmes, 2002; Privitera & Stark, 2000; Stark & Ellis, 1981), even when the repeated scanpath contains fewer fixations (Myers & Gray, 2010). Critically, however, these measures lack the temporal specificity to examine how serial fixation recapitulation (as proposed by Scanpath Theory) retrieves stored stimulus features and facilitates mnemonic performance over time.
Drawing on Scanpath Theory, we proposed a model of incomplete scanpath reinstatement, whereby only select subsets of the encoding scanpath are recapitulated during repeated search. Importantly, this model allows for both scanpath repetition, conceived as a “serial matching of feature network and pattern” (Noton & Stark, 1971b), and increased search efficiency, indexed by decreased search time and number of fixations. Whereas previous studies have employed similarity analyses to compare complete fixation sequences, the present study is the first to our knowledge to compare fixations across the search process. To examine fixation recapitulation over time, we used a sliding window to compute the controlled scanpath similarity between temporally contiguous subsets of fixations from corresponding novel and repeated viewing scanpaths. Confirming our hypothesis of incomplete scanpath repetition, above-baseline similarity was observed only at the beginning and end of the scanpath, driven primarily by recapitulation of initial and final novel viewing fixations, respectively. Virtually no repeated viewing fixations showed significant above-baseline similarity to fixations made in the middle of novel viewing. These results provide support for Scanpath Theory’s predictions of temporal and spatial fixation recapitulation, while accounting for memory-guided search efficiency gains. Moreover, the present findings suggest that the repeated viewing scanpath might retain its serial quality by preserving some continuous sequences of fixations, while eliminating others. Indeed, in their original proposal of Scanpath Theory, Noton and Stark touch on the notion of selective repetition. Their early experiments revealed that only 25–35% of novel viewing time is actually occupied by the scanpath, with only the “essential fixations at major points of the path” repeated at later viewing (Noton & Stark, 1971b). 
Extending these findings to visual search, results of the present study suggest that when the experimental task is speeded target detection, the “essential fixations” are those that occur early and late in the search process. What makes fixations essential enough to be repeated? This question hinges on the notion that fixations in fact serve a functional role in memory retrieval. While the causal relationship between fixation recapitulation and memory is still unknown, the present results give cause for some speculation on the matter. Assuming, as Scanpath Theory does, that fixations are intimately linked to stimulus features, we can use the observed pattern of preserved and absent fixations in the repeated search scanpath to make inferences about the essential processes underlying search efficiency gains. Adhering to the scanpath framework, with the scanpath representing the entire encoding event and individual fixations representing the smallest units of that event (image features), it follows that series of fixations, or subcomponents of the scanpath may represent larger units of processing or distinct stages in the search process. Here, we propose that fixation repetition early and late in the scanpath subserve the essential functions of relational comparison and target detection, respectively. To date, evidence of scanpath recapitulation during recognition tasks has been primarily restricted to initial novel viewing fixations (Brandt & Stark, 1997; Choi et al., 1995; Noton & Stark, 1971a, 1971b). According to Scanpath Theory, recapitulation of initial fixations reactivates the motor and accompanying perceptual memory traces, allowing for comparison of present perceptual input with stored visual representations (Noton & Stark, 1971a, 1971b). This process of feature-by-feature comparison has been thought to facilitate image recognition and mnemonic performance. 
In line with this proposal, we observed recapitulation of early novel viewing fixations early in repeated viewing that was not due to visual saliency. Moreover, early fixation recapitulation was extended in older adults, a population with documented deficits in relational memory (Naveh-Benjamin, 2000), further suggesting that scanpath repetition reflects memory integrity. Finally, the degree of similarity between early novel and repeated viewing fixations was negatively correlated with repeated search time in young adults, suggesting that recapitulation of initial fixations benefits mnemonic performance. Taken together, these results support a comparison account of early fixation recapitulation whereby the image is serially compared with the memory representation, facilitating recognition and subsequent target detection. In addition to early scanpath recapitulation, we observed repetition of fixations late in the scanpath, a finding that to our knowledge has not previously been reported. Given that the final fixation in the scanpath was eliminated from the similarity analysis, the present finding suggests that fixation recapitulation late in the scanpath extends beyond target detection to the preceding fixations. This finding is consistent with previous research suggesting that active comparison of perceptual input and stored memory representations can influence online processing (Olsen et al., 2012; Ryan & Cohen, 2004). One possible explanation for the present finding is that the location of the target is stored in memory relative to other scene elements such that, having been bound with the target at novel viewing, these fixated elements may serve as landmarks, directing the eyes toward the target. 
Lending support to this interpretation, a study by Olsen, Chiew, Buchsbaum, and Ryan (2014) found that memory indexed by similarity between eye movements during the study and retention of a set of visual objects correlated significantly with memory for relative, but not absolute object locations. Another possible explanation comes from Olson and Chun (2001), who propose the existence of a temporal salience gradient surrounding the target. Accordingly, fixations made just prior to target detection may be prioritized in memory for their temporal proximity to the target, allowing them to cue the target’s location. As the finding of late scanpath similarity is less robust than the finding of early scanpath similarity, these speculations should be interpreted with caution. Notably, while initial and final novel search fixations were recapitulated during repeated search events, fixations made in the middle of the novel search process were not reinstated as part of the memory-guided scanpath. Instead, instances of significant similarity between repeated viewing fixations and middle novel viewing fixations were primarily driven by non-mnemonic factors, as indicated by below baseline scores. While information from regions fixated in the middle of novel search might be stored, the present data suggest that it is not prioritized in memory, nor does it guide repeated viewing. Rather, our analysis suggests that where these regions are refixated at repeated search it is because of their salient properties. For the purposes of the present study, we will refer to the observed pattern of selective fixation recapitulation, in conjunction with findings of decreased scanpath length at repeated viewing, as “scanpath compression”. Here, the term “compression” refers to both the reduction in length of the scanpath and in the informational content it supports. 
The present results provide the first evidence for scanpath compression and suggest that efficient target detection may be facilitated by compression of the novel search sequence into a sequence containing only fixations that are essential to the present task. While initial fixations may provide the critical input for image recognition tasks, the present findings suggest that fixations both early and late in the scanpath are essential for successful repeated search performance. In contrast, fixations made in the middle of novel viewing are not repeated as part of the memory-guided scanpath. We propose that these fixations likely contribute to general image scanning and unsuccessful search and thus are not essential to successful repeated search. Lending support to this interpretation, similarity of repeated viewing fixations to fixations made in the middle of the novel viewing sequence predicted poor search performance in young adults, and this effect persisted when a relative measure of memory-guided search performance was used in place of repeated search time. Based on the results of our similarity analysis, this finding suggests that returning to image regions that may be salient, but provide no mnemonic guidance, increases the time to target detection and decreases search efficiency. Moreover, our correlation results provide novel evidence that recapitulation of some fixations can be detrimental to mnemonic performance. While the primary aim of the present study was to investigate the claims of Scanpath Theory during a memory-guided search task, the present results provide additional insight into the nature of age-related changes in memory function. We were motivated to include an older adult group in the present study to examine differences in scanpath recapitulation as a function of memory integrity. As expected, older adults performed more poorly than young adults, producing more fixations and detecting targets more slowly.
However, there were no observed age differences in the repetition benefit; both younger and older adults’ search performance similarly improved with repetition. This finding contrasts with impaired performance (no search efficiency gains) in an amnesic patient with extensive bilateral MTL damage (Chau et al., 2011) and with evidence of relational memory impairments (Castel & Craik, 2003; Moses et al., 2008; Naveh-Benjamin, 2000; Naveh-Benjamin et al., 2003; Naveh-Benjamin et al., 2007; Ryan et al., 2007; for review see Craik & Rose, 2012; Old & Naveh-Benjamin, 2008) and reduced MTL function (Chalfonte & Johnson, 1996; Dennis et al., 2008; Driscoll et al., 2003; Driscoll et al., 2009; Mitchell & Johnson, 2009; Old & Naveh-Benjamin, 2008) in older adults. Critically, however, young adults detected targets significantly faster than older adults at novel viewing, leaving less room for improvement during repeated search. While older adults did show a search efficiency benefit for repeated images, it is possible that young adults would yield a much greater benefit on a more difficult task. Although the age-invariant repetition effect observed in the present study may be complicated by age differences in novel search efficiency, the presence of search time improvements in both age groups suggests that both younger and older adults can use relational memory representations acquired at novel viewing to promote speeded target detection at repeated viewing. Given that repeated search efficiency may be biased by encoding efficiency, and lacks temporal specificity, scanpath similarity may provide a more sensitive measure of age differences. By comparing younger and older adults’ fixation recapitulation across time, we can see where individual differences in memory integrity have the greatest impact on eye movements. Consistent with our predictions, fixation reinstatement at repeated image viewing varied as a function of age.
Whereas young adults’ first three repeated viewing fixations were significantly similar to the first two novel viewing windows, older adults showed an extended pattern of scanpath repetition, with initial fixations recapitulating approximately the first nine windows in the encoding sequence. Fixation recapitulation late in the scanpath did not significantly differ by age. Given our earlier interpretation, the observed pattern of extended early fixation recapitulation in older adults suggests that the primary age difference in search is in the comparison stage. In line with this proposal, impaired MTL functioning with increased age suggests that hippocampal-mediated comparison mechanisms may likewise be impaired. Whereas young adults can quickly and efficiently compare and identify the image at hand, older adults may require more time to match the present image with the stored memory trace due to poorer representational quality or decreased perceptual processing. In the former case, more perceptual input would be required to compare the perceived image with the impoverished memory representation, whereas in the latter case, increased comparison time may reflect age-related deficits in online perceptual processing. Another possible explanation for the observed age differences in scanpath repetition lies in older adults’ impaired inhibitory processing. Age-related deficits in inhibition (Colcombe et al., 2003; Hasher & Zacks, 1988; Hasher, Zacks, & May, 1999; Kramer, Hahn, Irwin, & Theeuwes, 1999; Ryan, Shen, & Reingold, 2006; Ryan et al., 2007) may result in greater repetition of fixations from the middle of the novel viewing sequence, which are inessential for task performance. Indeed, repetition of these fixations is positively correlated with search time, suggesting that their repetition is related to poor repeated search performance. 
Whereas young adults appear to repeat the minimum number of fixations necessary for identifying the image before moving into the target detection sequence, older adults may continue recapitulating beyond this point due to a failure to inhibit these inessential fixations. However, older adults do not repeat the entire encoding scanpath, suggesting that either the inhibitory response is merely delayed in onset, or that this initial repetition reflects a comparison deficit, or delay, rather than an inhibition deficit. While early scanpath similarity showed significant age-related differences, younger and older adults showed a similar peak in similarity late in the scanpath, although this effect was more robust in young adults. In both age groups, the final three repeated viewing fixations (preceding target detection) were significantly similar to the final three novel viewing fixations, suggesting that both younger and older adults are able to repeat the pre-target fixation sequence at test. This result is consistent with findings of preserved top-down control of visual search in older adults (Kramer et al., 2006; Madden, Whiting, Cabeza, & Huettel, 2004; Madden, Whiting, Spaniol, & Bucur, 2005) and further suggests that the spatial and temporal cues provided by fixations surrounding the target may be resistant to age-related memory loss (at least when the target is highly salient). Taken together, the discussed findings suggest that early scanpath repetition may rely more heavily on a functioning relational memory system than late scanpath repetition, which did not differ by age. Finally, the finding of extended early scanpath repetition in older adults provides novel insight into the nature of the scanpath and its relationship with memory. Critically, the feature-by-feature integration of scanpath and memory trace proposed by Scanpath Theory fails to hold when we consider tasks like repeated visual search, on which the scanpath is shortened.
On the other hand, quantifying scanpath similarity in a single score, as prior analyses have done, suggests that the scanpath, regardless of its length, represents the quality of the memory as a whole. If the scanpath is indeed directly and necessarily linked to the memory trace, older adults should show a decrease in repetition consistent with age-related memory decline. However, if we conceive of the scanpath as a scaffold for memory, we could predict that older adults should show increased similarity to support declining memory processes. Our finding of increased recapitulation in older adults relative to younger adults suggests that scanpath repetition plays a supporting role in memory retrieval by reactivating and reinforcing the memory representation as needed. That age differences were limited to early scanpath similarity further suggests that the retrieval support provided by fixation recapitulation is temporally specific and may be limited to component retrieval processes that show age-related impairments. The discussed findings provide novel evidence that the relationship between scanpath repetition and memory changes over the course of the viewing process and as a function of age-related memory integrity. Taken together with findings from previous tests of Scanpath Theory, the present results suggest that individual fixations, subsets of fixations, and entire scanpaths can provide valuable information about how visual stimuli are encoded, stored, and retrieved. Results of our controlled scanpath similarity analysis using a novel, data-driven saliency control further illustrate the potential utility of similarity analyses for isolating the effect of relational memory on eye movements and behaviour.
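
To make the idea of temporally binned scanpath similarity more concrete, the following is a minimal illustrative sketch in Python. It is not the analysis pipeline used in the present study: the proximity-based similarity score, the 50-pixel radius, the equal-sized windows, and the function names `window_similarity` and `binned_similarity` are all assumptions introduced here for illustration, and the saliency-based baseline comparison is omitted. The sketch simply scores how many repeated-viewing fixations fall near fixations from the early, middle, and late portions of a novel-viewing scanpath, after the final (target) fixation has been dropped.

```python
from math import hypot


def window_similarity(novel, repeated, radius=50.0):
    """Fraction of repeated-viewing fixations that land within `radius`
    pixels of at least one novel-viewing fixation.
    Hypothetical metric, chosen only for illustration."""
    if not novel or not repeated:
        return 0.0
    hits = sum(
        1
        for (rx, ry) in repeated
        if any(hypot(rx - nx, ry - ny) <= radius for (nx, ny) in novel)
    )
    return hits / len(repeated)


def binned_similarity(novel, repeated, n_bins=3, radius=50.0):
    """Split the novel scanpath into `n_bins` consecutive windows
    (early to late) and score the repeated scanpath against each window."""
    size = len(novel) / n_bins
    scores = []
    for b in range(n_bins):
        start, stop = round(b * size), round((b + 1) * size)
        scores.append(window_similarity(novel[start:stop], repeated, radius))
    return scores


# Made-up fixation coordinates (x, y) in pixels; the last fixation in each
# list is taken to be the fixation on the target.
novel = [(100, 100), (120, 110), (400, 300), (420, 310),
         (700, 500), (710, 520), (800, 600)]
repeated = [(105, 98), (705, 515), (800, 600)]

# Drop the final (target) fixation from both scanpaths before scoring.
scores = binned_similarity(novel[:-1], repeated[:-1], n_bins=3)
print(scores)  # [0.5, 0.0, 0.5]
```

In this toy example the repeated scanpath overlaps the early and late windows of the novel scanpath but not the middle window, which is the qualitative pattern described above as scanpath compression. A fuller treatment would compare each score against a baseline such as similarity to other participants' scanpaths on the same image.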

75 in total

1. Eye-movement-based memory effect: a reprocessing effect in face perception.

Authors: R R Althoff; N J Cohen
Journal: J Exp Psychol Learn Mem Cogn Date: 1999-07 Impact factor: 3.051

2. Attention to repeated images on the World-Wide Web: another look at scanpath theory.

Authors: Sheree Josephson; Michael E Holmes
Journal: Behav Res Methods Instrum Comput Date: 2002-11

3. Visual scan adaptation during repeated visual search.

Authors: Christopher W Myers; Wayne D Gray
Journal: J Vis Date: 2010-07-01 Impact factor: 2.240

4. Look here, eye movements play a functional role in memory retrieval.

Authors: Roger Johansson; Mikael Johansson
Journal: Psychol Sci Date: 2013-10-28

5. Memory for items and relationships among items embedded in realistic scenes: disproportionate relational memory impairments in amnesia.

Authors: Deborah E Hannula; Daniel Tranel; John S Allen; Brenda A Kirchhoff; Allison E Nickel; Neal J Cohen
Journal: Neuropsychology Date: 2014-07-28 Impact factor: 3.295

6. The influence of instructions on object memory in a real-world setting.

Authors: Benjamin W Tatler; Sarah L Tatler
Journal: J Vis Date: 2013-02-06 Impact factor: 2.240

7. Shifts in selective visual attention: towards the underlying neural circuitry.

Authors: C Koch; S Ullman
Journal: Hum Neurobiol Date: 1985

8. Cognitive determinants of fixation location during picture viewing.

Authors: G R Loftus; N H Mackworth
Journal: J Exp Psychol Hum Percept Perform Date: 1978-11 Impact factor: 3.332

9. Differential effects of age on item and associative measures of memory: a meta-analysis.

Authors: Susan R Old; Moshe Naveh-Benjamin
Journal: Psychol Aging Date: 2008-03

10. Predicting cognitive state from eye movements.

Authors: John M Henderson; Svetlana V Shinkareva; Jing Wang; Steven G Luke; Jenn Olejarczyk
Journal: PLoS One Date: 2013-05-29 Impact factor: 3.240
