Literature DB >> 32584491

Neurobiological successor features for spatial navigation.

William de Cothi1, Caswell Barry1.   

Abstract

The hippocampus has long been observed to encode a representation of an animal's position in space. Recent evidence suggests that the nature of this representation is somewhat predictive and can be modeled by learning a successor representation (SR) between distinct positions in an environment. However, this discretization of space is subjective making it difficult to formulate predictions about how some environmental manipulations should impact the hippocampal representation. Here, we present a model of place and grid cell firing as a consequence of learning a SR from a basis set of known neurobiological features-boundary vector cells (BVCs). The model describes place cell firing as the successor features of the SR, with grid cells forming a low-dimensional representation of these successor features. We show that the place and grid cells generated using the BVC-SR model provide a good account of biological data for a variety of environmental manipulations, including dimensional stretches, barrier insertions, and the influence of environmental geometry on the hippocampal representation of space.
© 2020 The Authors. Hippocampus published by Wiley Periodicals LLC.

Entities:  

Keywords:  boundary vector cells; grid cells; plant cells; successor features; successor representation

Mesh:

Year:  2020        PMID: 32584491      PMCID: PMC8432165          DOI: 10.1002/hipo.23246

Source DB:  PubMed          Journal:  Hippocampus        ISSN: 1050-9631            Impact factor:   3.753


INTRODUCTION

The hippocampal formation plays a central role in the ability of humans and other mammals to navigate physical space (Morris, Garrud, Rawlins, & O'Keefe, 1982; Scoville & Milner, 1957). Consistent with behavioral findings, electrophysiological studies in rodents have uncovered a range of spatially modulated neurons—yielding important insights into how the brain represents space—including place cells (O'Keefe & Dostrovsky, 1971), grid cells (Hafting, Fyhn, Molden, Moser, & Moser, 2005), head direction cells (Taube, Muller, & Ranck, 1990), and boundary vector cells (BVCs) (Barry et al., 2006; Lever, Burton, Jeewajee, O'Keefe, & Burgess, 2009; Solstad, Boccara, Kropff, Moser, & Moser, 2008). Yet how these neural representations combine to facilitate flexible and efficient goal‐directed navigation, such as that observed in mammals (Etienne & Jeffery, 2004), remains an open question. One way is to approach this problem from the field of reinforcement learning (RL). RL (Sutton & Barto, 2018) seeks to address how an agent should act optimally to maximize expected future reward. Consequently, a quantity often used in RL is the value of a state in the environment which is defined as the expected cumulative reward , exponentially discounted into the future by a discount parameter . This equation can be rewritten by deconstructing value into the long‐run transition statistics and corresponding reward statistics of the environment (Dayan, 1993). Here, the transition statistics, denoted by , is called the successor representation (SR) which represents the discounted expected future occupancy of each state from the current state . The SR encapsulates both the short‐ and long‐term state‐transition dynamics of the environment, with a time‐horizon dictated by the discount parameter Furthermore, changes to the transition and reward structure can be incorporated into the value estimates by adjusting and , respectively. These adjustments can be made experientially using a temporal‐difference learning rule, which uses the difference between predicted outcomes and the actual outcomes to improve the accuracy of the predicted estimate (Sutton, 1988). Thus, the SR allows the value of possible future states to be calculated flexibly and efficiently. Consequently, it has been proposed that the hippocampus encodes a SR of space (Stachenfeld, Botvinick, & Gershman, 2017)—a claim that is further evidenced by the SR providing a good account of experimental observations of both place and grid cells. This formulation of the SR typically involves discretization of the environment into a grid of locations, within which the SR can be learnt by transitioning around the grid of states. However, this fixed grid‐world renders it hard to make predictions about how environmental manipulations, such as dimensional stretches, would immediately affect hippocampal representations. Furthermore, in very large state spaces, estimating the SR for every state becomes an increasingly difficult and costly task. Instead, using a set of features to approximate location would allow generalization across similar states and circumvent this curse of dimensionality. Indeed, it is clear from electrophysiological studies of the neural circuits supporting navigation that the brain does not represent space as a grid of discrete states, but rather uses an array of spatially sensitive neurons. In particular, boundary responsive neurons are found throughout the hippocampal formation, including “border cells” in superficial medial entorhinal cortex (mEC) (Solstad et al., 2008) and BVCs in subiculum (Barry et al., 2006; Hartley, Burgess, Lever, Cacucci, & Keefe, 2000; Lever et al., 2009). Because these neurons effectively provide a representation of the environmental topography surrounding the animal and—in the case of the mEC—are positioned to provide input to the main hippocampal subfields (Zhang et al., 2014), it seems plausible that they might function as an efficient substrate for a SR. Thus the aim of this article is to build and evaluate a biologically plausible SR based on the firing rates of known neurobiological features in the form of BVCs (Barry et al., 2006; Hartley et al., 2000; Lever et al., 2009; Solstad et al., 2008). Not only does this provide an efficient foundation for solving goal‐directed spatial navigation problems, we show it provides an explanation for electrophysiological phenomena currently unaccounted for by the standard SR model (Stachenfeld et al., 2017).

MODEL

We generate a population of BVCs following the specification used in previous iterations of the BVC model (Barry & Burgess, 2007; Grieves, Duvelle, & Dudchenko, 2018; Hartley et al., 2000). That is, the firing of the i th BVC, tuned to preferred distance and angle to a boundary at distance r and direction subtending at an angle is given by:where, In the model, the angular tuning width is constant and radial tuning width increases linearly with the preferred tuning distance: for constants and ξ. Using a set of BVC's, each position or state in the environment corresponds to a vector of BVC firing rates (Figure 1). We use a tilde ~ to indicate variables constructed in the BVC feature space of . By learning a SR among these BVC features, we can use linear function approximation of the value function to learn a set of weights such that:where ⊺ denotes the transpose and is the vector of successor features constructed using the BVCs as basis features. Analogous to the discrete state‐space case where the successor matrix provides a predictive mapping from the current state to the expected future states, the successor matrix provides a predictive mapping from current BVC firing rates to expected future BVC firing rates. Importantly, and can be learnt online using temporal‐difference learning rules: where and are the learning rates for the SR and weight vector , respectively. Because Equation (6) is independent of reward , the model is still able to capture the structure of the environment in the absence of reward () by learning the successor matrix . In this manner, it inherently describes spatial latent learning as described in rodents (Tolman, 1948).
FIGURE 1

Schematic of model. Boundary vector cells (BVCs), which track the agent's allocentric distance and direction from environmental boundaries, are used as basis features for a successor representation (). The agent's behavior is generated using a rodent‐like movement model with the successor matrix being updated incrementally at each 50 Hz time step. Following from previous analyses of the successor matrix—thresholded sums of the BVC features, weighted by rows of the SR matrix, yield unimodal firing fields with characteristics similar to CA1 place cells. Similarly, thresholded eigenvectors of the successor matrix reveal spatially periodic firing patterns similar to medial entorhinal grid cells

Schematic of model. Boundary vector cells (BVCs), which track the agent's allocentric distance and direction from environmental boundaries, are used as basis features for a successor representation (). The agent's behavior is generated using a rodent‐like movement model with the successor matrix being updated incrementally at each 50 Hz time step. Following from previous analyses of the successor matrix—thresholded sums of the BVC features, weighted by rows of the SR matrix, yield unimodal firing fields with characteristics similar to CA1 place cells. Similarly, thresholded eigenvectors of the successor matrix reveal spatially periodic firing patterns similar to medial entorhinal grid cells Consequently, we can learn through experience which BVCs are predictive of others by estimating the SR matrix . More precisely, given the agent is at position with BVC population firing rate vector , represents the expected sum of future population firing rate vectors, exponentially discounted into the future by the parameter . This contrasts with previous implementations of the SR where rows and columns of the matrix correspond to particular states. Here, rows and columns of the SR matrix correspond to particular BVCs instead. Specifically, the element ij can be thought of as a weighting for how much the j th BVC predicts the firing of the i th BVC in the near future. Thus, although BVC firing depends on the environmental boundaries, the SR matrix and consequently successor features are policy dependent meaning they are shaped by behavior. Here, to generate the trajectories used for learning, we utilized a motion model designed to mimic the foraging behavior of rodents (Raudies & Hasselmo, 2012). Trajectories were sampled at a frequency of 50 Hz, and the learning update from Equation (6) was processed at every time point. All of the simulations presented here investigate the learning of successor matrix in the absence of reward (). Similar to the BVC model (Hartley et al., 2000), the firing of each simulated place cell in a given location s is proportional to the thresholded, weighted sum of the BVCs connected to it:where is the cell's threshold and The weights in the sum (Equation [8]) correspond to a row of the SR matrix and refer to the individual contributions that a particular BVC (encoded by that row) will fire in the near future. Thus, assuming homogeneous behavior, sets of BVCs with overlapping fields will typically exhibit mutually strong positive weights, resulting in the formation of place fields at their intersection (Figure 2a). The place cell threshold was set to 80% of the cell's maximum activation.
FIGURE 2

Typical place and grid cells generated by the BVC‐SR and standard SR models. (a) Like rodent CA1 place cells, BVC‐SR place cells (top) in the open field are non‐uniform, irregular, and often conform to the geometry of the environment. In contrast, standard SR place cells (bottom) are characterized by smooth, circular fields. (b) Grid cells in both the BVC‐SR (first row) and SR models (third row) are produced by taking the eigenvectors of the SR matrix. The corresponding spatial autocorrelograms (second and fourth rows) are used to assess the hexagonal periodicity (gridness) of the firing patterns, shown above each spatial autocorrelogram

Typical place and grid cells generated by the BVC‐SR and standard SR models. (a) Like rodent CA1 place cells, BVC‐SR place cells (top) in the open field are non‐uniform, irregular, and often conform to the geometry of the environment. In contrast, standard SR place cells (bottom) are characterized by smooth, circular fields. (b) Grid cells in both the BVC‐SR (first row) and SR models (third row) are produced by taking the eigenvectors of the SR matrix. The corresponding spatial autocorrelograms (second and fourth rows) are used to assess the hexagonal periodicity (gridness) of the firing patterns, shown above each spatial autocorrelogram Grid cells in the model are generated by taking the eigen decomposition of the SR matrix and thus represent a low‐dimensional embedding of the SR. Similar to the place cells, the activity of each simulated grid cell is proportional to a thresholded, weighted sum of BVCs. However, for the grid cells, the weights in the sum correspond to particular eigenvector of the SR matrix , and the firing is thresholded at zero to only permit positive grid cell firing rates. This gives rise to spatially periodic firing fields such as those observed in Figure 2b.

RESULTS

Following Stachenfeld and colleagues (Stachenfeld et al., 2017), we propose that the hippocampus encodes the BVC successor features to facilitate decision making during spatial navigation. Importantly, due to the disassociation of and reward weights in the computation of value (Equation [5]), the model facilitates latent learning via the independent learning of irrespective of whether reward is present. It also provides an efficient platform for goal‐based navigation by simply changing the reward weights . Like real place cells and those generated by the standard SR model (Stachenfeld et al., 2017), place cells simulated with the BVC‐SR model respect the transition statistics of the environment and thus do not extend through environmental boundaries. However, due the nature of the underlying BVC basis features, the simulated place cells also exhibit characteristics of hippocampal place cells that are unaccounted for by the standard SR model. For example, in the standard SR model, place cell firing in a uniformly sampled open field environment tends to be characterized by circular smoothly decaying fields (Stachenfeld et al., 2017). In contrast, BVC‐SR derived place fields—like real place cells and those from the BVC model (Hartley et al., 2000; Muller, Kubie, & Ranck, 1987)—are elongated along environmental boundaries and generally conform to the shape of the enclosing space (Figure 2a). Most importantly, the use of a BVC basis set provides a means to predict how the model will respond to instantaneous changes in the structure of the environment. In Stachenfeld et al. (2017), the states available to an agent were distinct from the environmental features that constrained the allowed transitions. Thus, insertion of a barrier into an environment had no immediate effect on place or grid fields—changes in firing fields would accumulate through subsequent exploration and learning causing to be updated. However, biological results indicate that place cell activity is modulated almost immediately by changes made to the geometry of an animal's environment (Barry et al., 2006; Barry & Burgess, 2007; Hartley et al., 2000; Lever, Burgess, Cacucci, Hartley, & O'Keefe, 2002; O'Keefe & Burgess, 1996). Because BVC activity is defined relative to environmental boundaries, manipulations made to the geometry of an environment produce immediate changes in the activity of place cells without any change to the SR matrix . Thus, similar to the standard BVC model, elongation or compression of one or both dimensions of an open field environment distorts place cell firing in a commensurate fashion (Figure 3a–c), as has been seen in rodents (O'Keefe & Burgess, 1996). As a result, the basic firing properties of BVC‐SR place cells—such as field size—are relatively preserved between manipulations (Figure 3d).
FIGURE 3

BVC‐SR derived place cells deform in response to geometric manipulations made to the environment. Scaling one or both axes of an environment produces commensurate changes in the activity of BVC‐SR place cells (a). Such that firing field size scales proportionally with environment size (b, c), whereas the relative size of place fields is largely preserved between environments and Pearson correlation coefficient shown (d)

BVC‐SR derived place cells deform in response to geometric manipulations made to the environment. Scaling one or both axes of an environment produces commensurate changes in the activity of BVC‐SR place cells (a). Such that firing field size scales proportionally with environment size (b, c), whereas the relative size of place fields is largely preserved between environments and Pearson correlation coefficient shown (d) The introduction of internal barriers into an environment provides a succinct test for geometric theories of spatial firing and has been studied in both experimental and theoretical settings. Indeed, the predictable allocentric responses of biological BVCs to inserted barriers provide some of the most compelling evidence for their existence (Lever et al., 2009; Poulter, Hartley, & Lever, 2018). In CA1 place cells, barrier insertion promotes an almost immediate duplication of place fields (Muller & Kubie, 1987) which may then be then lost or stabilized during subsequent exploration (Barry et al., 2006; Barry & Burgess, 2007). The BVC‐SR model provided a good account of empirical data, exhibiting similar dynamic responses. Barrier insertion caused 23% of place cells (32/160) to immediately form an additional field, one being present on either side of the barrier (Figure 4a). Following further exploration, 19% of these (7/32) gradually lost one of the duplicates—a modification reflecting updates made to resulting from changes in behavior due to the barrier (Figure 4b) (Barry & Burgess, 2007). Upon removal of the barrier, the simulated place cells reverted more or less to their initial tuning fields before barrier insertion, with minor differences due to the updated SR .
FIGURE 4

Insertion of an additional barrier into an environment can induce duplication of BVC‐SR place fields. (a) In 23% of place cells, barrier insertion causes immediate place field duplication. In most cases (81%), the duplicate field persists for the equivalent of 40 min of random foraging (learning update occurs at 50 Hz). (b) In some cases (19%), one of the duplicate fields—not necessarily the new one—is lost during subsequent exploration. Similar results have been observed in vivo (Barry et al., 2006)

Insertion of an additional barrier into an environment can induce duplication of BVC‐SR place fields. (a) In 23% of place cells, barrier insertion causes immediate place field duplication. In most cases (81%), the duplicate field persists for the equivalent of 40 min of random foraging (learning update occurs at 50 Hz). (b) In some cases (19%), one of the duplicate fields—not necessarily the new one—is lost during subsequent exploration. Similar results have been observed in vivo (Barry et al., 2006) Stachenfeld et al. (2017) previously demonstrated that eigen decomposition of the successor matrix produced spatially periodic firing fields resembling mEC grid cells. Examining the eigenvectors of , from the BVC‐SR model, we found that these too resembled the regular firing patterns of grid cells (Figure 2b). Indeed, although there was no difference in the hexagonal regularity of BVC‐SR and standard‐SR eigenvectors (mean gridness ± SD: −0.28 ± 0.35 vs. −0.27 ± 0.60; t[318] = 0.14, p = 0.886), the eigenvectors from the BVC‐SR exhibit less elliptic grid fields (mean field ellipticity ± SD: 0.59 ± 0.23 vs. 0.75 ± 0.25; t[318] = −5.93; p < 0.001; Supplementary Figure 1), and a larger variability in field firing rates (mean coefficient of variability ± SD: 0.48 ± 0.11 vs 0.14 ± 0.11; t[318] = 26.5; p < 0.001; Supplementary Figure 2), similar to that observed in real grid cells (ellipticity: 0.55 ± 0.02 Krupic, Bauza, Burton, Barry, & O'Keefe, 2015; coefficient of variability: 0.58 ± 0.01 Ismakov, Barak, Jeffery, & Derdikman, 2017)—although neither yield exclusively hexagonal patterns. Empirical work has shown that grid‐patterns are modulated by environmental geometry, the regular spatial activity becoming distorted in strongly polarized environments (Derdikman et al., 2009; Krupic et al., 2015; Stensola, Stensola, Moser, & Moser, 2015). Grid‐patterns derived from the standard‐SR eigenvectors also exhibit distortions comparable to those seen experimentally. Thus, we next examined the regularity of BVC‐SR eigenvectors derived from SR matrices trained in square and trapezoid environments. As with rodent data (Krupic et al., 2015) and the standard‐SR model, we found that grid‐patterns in the two halves of the square environment were considerably more regular than those derived from the trapezoid (mean correlation between spatial autocorrelograms ± SD: 0.68 ± 0.18 vs. 0.47 ± 0.15, t[318] = 10.99, p < 0.001; Figure 5b). Furthermore, BVC‐SR eigenvectors that exceeded a shuffled gridness threshold (see supplementary methods)—and hence were classified as grid cells—were more regular in the square than the trapezoid (mean gridness ± SD: 0.37 ± 0.17 vs. 0.10 ± 0.09; t[24] = 4.87, p < 0.001; Figure 5c). In particular, as had previously been noted in rodents (Krupic et al., 2015), the regularity of these “grid cells” was markedly reduced in the narrow end of the trapezoid compared to the broad end (mean gridness ± SD: −0.30 ± 0.19 vs. 0.16 ± 0.23; t[22] = −5.45, p < 0.001; Figure 5d), a difference that did not exist in the two halves of the square environment (mean gridness ± SD: 0.19 ± 0.25 vs. 0.22 ± 0.36; t[26] = −0.28, p = 0.78).
FIGURE 5

BVC‐SR grid‐patterns are influenced by environmental geometry. (a) Eigenvectors of the BVC‐SR can be used to model grid cells firing patterns in a variety of different shaped enclosures (white line indicates division of square and trapezoid into halves of equal area). (b) Grid‐patterns are more similar in the two halves of the square environment than in the two halves of the trapezoid (mean Pearson's correlation between spatial autocorrelograms ± SD: 0.68 ± 0.18 vs. 0.47 ± 0.15, t[317] = 10.99, p < 0.001), similar results have been noted in rodents (Krupic et al., 2015). (c) “Grid cells” (grid‐patterns that exceed a shuffled gridness criteria, see supplementary methods) are more hexagonal in the square environment than the trapezoid (mean gridness ± SD: 0.37 ± 0.17 vs. 0.10 ± 0.09; t[24] = 4.87, p < 0.001), (d) the narrow half of the trapezoid being less regular than the wider end (mean gridness ± SD: −0.30 ± 0.19 vs. 0.16 ± 0.23; t[22] = −5.45, p < 0.001). The axes of “grid cells” are more polarized (less uniform) in a square (e) than circular environment (f) (D KL(Square||Uniform) = 0.17, D KL(Circle||Uniform) = 0.04; Bayes factor = 1.00 × 10−6). (g) The BVC‐SR eigenvector grid patterns are fragmented in a compartmentalized maze and repeat across alternating maze arms as has been observed in rodents (Derdikman et al., 2009). (h) The Pearson's correlation matrix between the grid patterns on different arms of the maze has a checkerboard‐like appearance due to the strong similarity between alternating internal channels of the maze (n = 160 eigenvectors). Again, similar results have been noted empirically (Derdikman et al., 2009)

BVC‐SR grid‐patterns are influenced by environmental geometry. (a) Eigenvectors of the BVC‐SR can be used to model grid cells firing patterns in a variety of different shaped enclosures (white line indicates division of square and trapezoid into halves of equal area). (b) Grid‐patterns are more similar in the two halves of the square environment than in the two halves of the trapezoid (mean Pearson's correlation between spatial autocorrelograms ± SD: 0.68 ± 0.18 vs. 0.47 ± 0.15, t[317] = 10.99, p < 0.001), similar results have been noted in rodents (Krupic et al., 2015). (c) “Grid cells” (grid‐patterns that exceed a shuffled gridness criteria, see supplementary methods) are more hexagonal in the square environment than the trapezoid (mean gridness ± SD: 0.37 ± 0.17 vs. 0.10 ± 0.09; t[24] = 4.87, p < 0.001), (d) the narrow half of the trapezoid being less regular than the wider end (mean gridness ± SD: −0.30 ± 0.19 vs. 0.16 ± 0.23; t[22] = −5.45, p < 0.001). The axes of “grid cells” are more polarized (less uniform) in a square (e) than circular environment (f) (D KL(Square||Uniform) = 0.17, D KL(Circle||Uniform) = 0.04; Bayes factor = 1.00 × 10−6). (g) The BVC‐SR eigenvector grid patterns are fragmented in a compartmentalized maze and repeat across alternating maze arms as has been observed in rodents (Derdikman et al., 2009). (h) The Pearson's correlation matrix between the grid patterns on different arms of the maze has a checkerboard‐like appearance due to the strong similarity between alternating internal channels of the maze (n = 160 eigenvectors). Again, similar results have been noted empirically (Derdikman et al., 2009) Rodent grid‐patterns have been shown to orient relative to straight environmental boundaries—tending to align to the walls of square but not circular environments (Krupic et al., 2015; Stensola et al., 2015). In a similar vein, we saw that firing patterns of simulated grid cells also were more polarized in a square than a circular environment, tending to cluster around specific orientations (Figure 5e and f). To illustrate this, we used the Kullback–Leibler divergence (D KL) to measure the difference between the distribution of grid orientations and a uniform distribution (see Supplementary Methods). We found the grid orientations in the circular environment were much closer to uniform (D KL(Circle||Uniform) = 0.04 vs. D KL(Square||Uniform) =0.17), and significantly better explained by an underlying uniform distribution as opposed to the grid orientations in the square environment (Bayes factor = 1.00 × 106). Finally, the activity of grid cells recorded while a rodent explores a compartmentalized maze have been shown to fragment into repeated submaps for similar compartments traversed in the same direction (Derdikman et al., 2009). We examined the BVC‐SR eigenvector patterns in a similar maze and found that they too fragmented into repeated submaps for alternating internal arms of the maze (Figure 5g). Consequently, the Pearson's correlation matrix between eigenvector patterns on different arms of the maze exhibits a strong checkboard‐like appearance (Figure 5h), exemplifying the repetition of alternated submaps in a manner more similar to the rodent data (Derdikman et al., 2009) than previous implementations of the SR (Stachenfeld et al., 2017).

DISCUSSION

The model presented here links the BVC model of place cell firing with a SR to provide an efficient platform for using RL to navigate space. The work builds upon previous implementations of the SR by replacing the underlying grid of states with the firing rates of known neurobiological features—BVCs, which have been observed in the hippocampal formation (Barry et al., 2006; Lever et al., 2009; Solstad et al., 2008) and can be derived from optic flow (Raudies & Hasselmo, 2012). As a consequence, the place cells generated using the BVC‐SR approach presented here produce more realistic fields that conform to the shape of the environment. Unlike previous SR implementations, the BVC‐SR place fields respond immediately to environmental manipulations such as dimensional stretches and barrier insertions in a similar manner to real place cells. Comparable to previous SR implementations, the eigenvectors of the SR matrix display grid cell like periodicity when projected back onto the BVC state space, with reduced periodicity in polarized enclosures such as trapezoids. Furthermore, likely due to the experiential learning and the natural smoothness of the BVC basis features, the eigenvectors from the BVC‐SR model exhibit more realistic variations among grid fields, resulting in a model of grid cells that is more similar to biological recordings than previous implementations of the SR. This form of eigen decomposition is similar to other dimensionality reduction techniques that have been used to generate grid cells from populations of idealized place cells with a generalized Hebbian learning rule (Dordek, Soudry, Meir, & Derdikman, 2016; Oja, 1982). Previously, low‐dimensional encodings such as these have been shown to accelerate learning and facilitate vector‐based navigation (Banino et al., 2018; Gustafson & Daw, 2011). The model extends upon the BVC model of place cell firing (Barry et al., 2006; Barry & Burgess, 2007; Hartley et al., 2000) by also providing a means of predicting how environmental boundaries might affect the firing of grid cells. Furthermore, although both models produce similar place cells if the agent samples the environment uniformly, the policy dependence of the BVC‐SR model provides a mechanism for estimating how behavioral biases will influence place cell firing. These models both use BVCs as the basis for allocentric place representations in the brain. As a consequence, they would be unable to distinguish between visually identical compartments based on boundary information alone. To achieve this, the models would require some form of additional information about the agent's past trajectory, such as a path integration signal. Theoretical evidence (Bicanski & Burgess, 2018; Byrne, Becker, & Burgess, 2007) suggests that recently discovered egocentric BVCs (Gofman et al., 2019; Hinman, Chapman, & Hasselmo, 2019) could provide the link between the egocentric perception of the environment to an allocentric representation in the hippocampal formation. The focus of this work has centered on the representation of successor features in the hippocampus during the absence of environmental reward. However, a key feature of SR models is their ability to adapt flexibly and efficiently to changes in the reward structure of the environment (Dayan, 1993; Russek, Momennejad, Botvinick, Gershman, & Daw, 2017; Stachenfeld et al., 2017). This is permitted by the independent updating of reward weights (Equation [7]) combined with its immediate effect on the computation of value (Equation [5]). Reward signals analogous to that used in the model have been shown to exist in the orbitofrontal cortex of rodents (Sul, Kim, Huh, Lee, & Jung, 2010), humans (Gottfried, O'Doherty, & Dolan, 2003; Kringelbach, 2005), and non‐human primates (Tremblay & Schultz, 1999). Meanwhile, a candidate area for integrating orbitofrontal reward representations with hippocampal successor features to compute value could be anterior cingulate cortex (Kolling et al., 2016; Shenhav, Botvinick, & Cohen, 2013). Finally, the model relies on a prediction error signal for learning both the reward weights and successor features (Equations (6), (7)]). Although midbrain dopamine neurons have long been considered a source for such a reward prediction error (Schultz, Dayan, & Montague, 1997), mounting evidence suggests they may also provide the sensory prediction error signal necessary for computing successor features with temporal‐difference learning (Chang, Gardner, Di Tillio, & Schoenbaum, 2017; Gardner, Schoenbaum, & Gershman, 2018). Successor features have been used to accelerate learning in tasks where transfer of knowledge is useful, such as virtual and real world navigation tasks (Barreto et al., 2017; Zhang, Springenberg, Boedecker, & Burgard, 2017). Although the successor features used in this paper were built upon known neurobiological spatial neurons, BVCs, the framework itself could be applied to any basis of sensory neurons that are predictive of reward in a task. Thus, the framework could be adapted to use basis features that are receptive to the frequency of auditory cues (Aronov, Nevers, & Tank, 2017), or even the size and shape of birds (Constantinescu, O'Reilly, & Behrens, 2016). In summary, the model describes the formation of place and grid fields in terms the geometric properties and transition statistics of the environment, while providing an efficient platform for goal‐directed spatial navigation. This has particular relevance for the neural underpinnings of spatial navigation, although the framework itself could be applied to other basis sets of sensory features. AppendixS1: Supporting information Click here for additional data file. Supplementary Figure 1 Grid fields generated using eigenvectors from the BVC‐SR model are less elliptic than those from the standard SR model. Lower values indicate more circular fields and larger values indicate more elliptic fields, with a value of 0 indicating a perfect circle. a) Grid fields generated using the BVC‐SR model had significantly lower ellipticity than the standard SR model (mean field ellipticity ± SD: 0.59 ± 0.23 vs. 0.75 ± 0.25; t[318] = −5.93; p < 0.001), and were similar to observations of real grid cells (Krupic et al., 2015). b) Histogram of the grid field ellipticity (N = 160 eigenvectors) Click here for additional data file. Supplementary Figure 2 Grid fields generated using eigenvectors from the BVC‐SR model exhibit more firing rate variability than the standard SR model. Following the method of Ismakov et al., (2017), the peak firing rates of grid fields was used to compute a coefficient of variability for each eigenvector (CV; SD divided by mean). a) The CV for eigenvectors produced by the BVC‐SR model were significantly larger than that observed in the standard SR model (mean CV ± SD: 0.48 ± 0.11 vs 0.14 ± 0.11; t[318] = 26.5; p < 0.001), and similar to that observed in real grid cells (Ismakov et al., 2017). b) Histogram of the CV for each of the models (N = 160 eigenvectors). Click here for additional data file.
  45 in total

Review 1.  The human orbitofrontal cortex: linking reward to hedonic experience.

Authors:  Morten L Kringelbach
Journal:  Nat Rev Neurosci       Date:  2005-09       Impact factor: 34.870

2.  Fragmentation of grid cell maps in a multicompartment environment.

Authors:  Dori Derdikman; Jonathan R Whitlock; Albert Tsao; Marianne Fyhn; Torkel Hafting; May-Britt Moser; Edvard I Moser
Journal:  Nat Neurosci       Date:  2009-09-13       Impact factor: 24.884

3.  Dissociation between Postrhinal Cortex and Downstream Parahippocampal Regions in the Representation of Egocentric Boundaries.

Authors:  Xenia Gofman; Gilad Tocker; Shahaf Weiss; Charlotte N Boccara; Li Lu; May-Britt Moser; Edvard I Moser; Genela Morris; Dori Derdikman
Journal:  Curr Biol       Date:  2019-08-01       Impact factor: 10.834

4.  Encoding predictive reward value in human amygdala and orbitofrontal cortex.

Authors:  Jay A Gottfried; John O'Doherty; Raymond J Dolan
Journal:  Science       Date:  2003-08-22       Impact factor: 47.728

5.  Optogenetic Blockade of Dopamine Transients Prevents Learning Induced by Changes in Reward Features.

Authors:  Chun Yun Chang; Matthew Gardner; Maria Gonzalez Di Tillio; Geoffrey Schoenbaum
Journal:  Curr Biol       Date:  2017-11-02       Impact factor: 10.834

Review 6.  The Neurobiology of Mammalian Navigation.

Authors:  Steven Poulter; Tom Hartley; Colin Lever
Journal:  Curr Biol       Date:  2018-09-10       Impact factor: 10.834

7.  A neural-level model of spatial memory and imagery.

Authors:  Andrej Bicanski; Neil Burgess
Journal:  Elife       Date:  2018-09-04       Impact factor: 8.140

8.  What can the hippocampal representation of environmental geometry tell us about Hebbian learning?

Authors:  Colin Lever; Neil Burgess; Francesca Cacucci; Tom Hartley; John O'Keefe
Journal:  Biol Cybern       Date:  2002-12       Impact factor: 2.086

Review 9.  Remembering the past and imagining the future: a neural model of spatial memory and imagery.

Authors:  Patrick Byrne; Suzanna Becker; Neil Burgess
Journal:  Psychol Rev       Date:  2007-04       Impact factor: 8.934

10.  Neurobiological successor features for spatial navigation.

Authors:  William de Cothi; Caswell Barry
Journal:  Hippocampus       Date:  2020-06-25       Impact factor: 3.753

View more
  9 in total

1.  Cognitive maps of social features enable flexible inference in social networks.

Authors:  Jae-Young Son; Apoorva Bhandari; Oriel FeldmanHall
Journal:  Proc Natl Acad Sci U S A       Date:  2021-09-28       Impact factor: 11.205

2.  Learning Structures: Predictive Representations, Replay, and Generalization.

Authors:  Ida Momennejad
Journal:  Curr Opin Behav Sci       Date:  2020-05-05

Review 3.  The grid code for ordered experience.

Authors:  Jon W Rueckemann; Marielena Sosa; Lisa M Giocomo; Elizabeth A Buffalo
Journal:  Nat Rev Neurosci       Date:  2021-08-27       Impact factor: 38.755

Review 4.  Structuring Knowledge with Cognitive Maps and Cognitive Graphs.

Authors:  Michael Peer; Iva K Brunec; Nora S Newcombe; Russell A Epstein
Journal:  Trends Cogn Sci       Date:  2020-11-26       Impact factor: 20.229

5.  Hippocampal place cells encode global location but not connectivity in a complex space.

Authors:  Éléonore Duvelle; Roddy M Grieves; Anyi Liu; Selim Jedidi-Ayoub; Joanna Holeniewska; Adam Harris; Nils Nyberg; Francesco Donnarumma; Julie M Lefort; Kate J Jeffery; Christopher Summerfield; Giovanni Pezzulo; Hugo J Spiers
Journal:  Curr Biol       Date:  2021-02-12       Impact factor: 10.900

6.  Neural network based successor representations to form cognitive maps of space and language.

Authors:  Paul Stoewer; Christian Schlieker; Achim Schilling; Claus Metzner; Andreas Maier; Patrick Krauss
Journal:  Sci Rep       Date:  2022-07-04       Impact factor: 4.996

7.  A general model of hippocampal and dorsal striatal learning and decision making.

Authors:  Jesse P Geerts; Fabian Chersi; Kimberly L Stachenfeld; Neil Burgess
Journal:  Proc Natl Acad Sci U S A       Date:  2020-11-23       Impact factor: 11.205

8.  Neurobiological successor features for spatial navigation.

Authors:  William de Cothi; Caswell Barry
Journal:  Hippocampus       Date:  2020-06-25       Impact factor: 3.753

9.  Neurophysiological Evidence for Cognitive Map Formation during Sequence Learning.

Authors:  Jennifer Stiso; Christopher W Lynn; Ari E Kahn; Vinitha Rangarajan; Karol P Szymula; Ryan Archer; Andrew Revell; Joel M Stein; Brian Litt; Kathryn A Davis; Timothy H Lucas; Dani S Bassett
Journal:  eNeuro       Date:  2022-03-03
  9 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.