Risk-taking bias in human decision-making is encoded via a right-left brain push-pull system.

Pierre Sacré¹, Matthew S D Kerr², Sandya Subramanian², Zachary Fitzgerald³, Kevin Kahn², Matthew A Johnson⁴, Ernst Niebur⁵, Uri T Eden⁶, Jorge A González-Martínez³, John T Gale⁷, Sridevi V Sarma¹.

Abstract

A person's decisions vary even when options stay the same, like when a gambler changes bets despite constant odds of winning. Internal bias (e.g., emotion) contributes to this variability and is shaped by past outcomes, yet its neurobiology during decision-making is not well understood. To map neural circuits encoding bias, we administered a gambling task to 10 participants implanted with intracerebral depth electrodes in cortical and subcortical structures. We predicted the variability in betting behavior within and across patients by individual bias, which is estimated through a dynamical model of choice. Our analysis further revealed that high-frequency activity increased in the right hemisphere when participants were biased toward risky bets, while it increased in the left hemisphere when participants were biased away from risky bets. Our findings provide electrophysiological evidence that risk-taking bias is a lateralized push-pull neural system governing counterintuitive and highly variable decision-making in humans.

Keywords: human decision-making; neural encoding; risk-taking dynamic bias; stereoelectroencephalography; stochastic dynamic model

Year: 2019 PMID: 30617071 PMCID: PMC6347682 DOI: 10.1073/pnas.1811259115

Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205

Imagine sitting at a poker table in Las Vegas, facing a hand that has low odds of winning. You stare at the stack of chips that just piled up during your recent lucky streak, and the sight of your winnings is just the nudge that you need to make a large bet, despite your bad hand. Such biases during decision-making are ubiquitous in human behaviors (1). They show that how humans respond to environmental stimuli is dynamic—not static. That is, it is influenced by their past experiences, in both adaptive and maladaptive ways. Therefore, we refer to this nudge as “dynamic bias.” The complex interplay between dynamic bias and environmental stimuli to produce behavioral responses is a fundamental aspect of decision-making, yet the neural circuits mediating these processes are largely unknown. This lack of knowledge stems mainly from the gap between the timescale at which neural activity evolves—on the order of milliseconds—and the time resolution of the tools that are currently used to measure proxies of biases and to image the human brain—typically on the order of seconds or minutes. While researchers commonly manipulate environmental stimuli in structured behavioral experiments to study valuation of return and risk during decision-making (refs. 2–5 and references therein), measuring dynamic bias on a trial-by-trial basis is very challenging because bias is an internal state that we cannot directly observe. Several autonomic responses (e.g., skin conductance, heart rate, blood pressure) have been proposed to measure proxies of bias (e.g., emotion), but all suffer from delays on the order of seconds to minutes (ref. 6 and references therein). Identifying the neural substrates of decision-making in humans is also difficult, as measuring electrical activity across multiple structures at the source and at millisecond resolution in humans is not possible in general (ref. 7 and references therein). 
On one hand, prior work in humans has been largely dominated by studies wherein functional magnetic resonance imaging (fMRI) or positron emission tomography (PET) scans are used to measure neural activity in participants during decision-making (8–12). The fMRI and PET scans measure blood flow in the entire brain, which is an indirect measure of brain activity, and both suffer from low temporal resolution, on the order of seconds and minutes, respectively (13, 14). On the other hand, a small number of electroencephalography (EEG) and magnetoencephalography (MEG) studies have been conducted to understand human decision-making (15, 16). While their temporal resolution is high, EEG- or MEG-based approaches measure activity from outside the head and suffer from global summation from different sources (7). To map the neural circuits mediating dynamic bias in human decision-making, we used techniques that allowed us to track dynamic bias and its neural circuits at a relevant timescale and directly at the source. First, we administered a sequential economic decision-making task in which bias fluctuates in both positive and negative directions and can play a role in at least 20% of trials (17). Then, we constructed a stochastic dynamical model to estimate dynamic bias from participants’ responses using maximum-likelihood methods. In addition, we exploited a unique opportunity to record neural activity from humans with medically refractory epilepsy implanted with multiple intracerebral depth electrodes while they performed the decision-making task. Specifically, we used a functional electrophysiological monitoring modality, called stereoelectroencephalography (SEEG), to simultaneously record local field potentials at millisecond resolution from hundreds of sources in cortical and subcortical brain structures. In this paper, we first report the variability in choices across participants but also across trials within participants when they were faced with the same options. 
Then, we explain this variability using participant-specific stochastic dynamical (state space) models of the decision-making process. In particular, we show that an estimated dynamic bias (state variable) predicts when and why participants changed their betting strategies (that is, made different choices under identical task stimuli), and exactly how participants weighed bias with respect to return and risk (that can be directly computed from the task stimuli) on each decision. These findings highlight the importance of incorporating dynamic bias within models of human choice, which has recently been strongly argued by philosophers (18), psychologists (19), behavioral economists (1), and neuroscientists (20). Finally, we map the neural circuits and pathways that modulate with estimated bias during different stages of decision-making (forming preferences, selecting and executing actions, and evaluating outcomes). We find that the structures that encode bias are highly distributed and strikingly lateralized. High-frequency activity increased in the right hemisphere when participants were biased toward risky bets (push), while it increased in the left hemisphere when participants were biased away (pull) from risky bets. Similar push–pull neural control mechanisms have been found to mediate motion via go/no-go pathways in basal ganglia (21, 22), vision via on/off cells in visual cortex (23, 24), and seizure spread via synchronizing/desynchronizing populations (25). Lateralization in the brain has also been observed when encoding approach–avoidance behaviors and positive–negative emotions, and is thought to maximize processing efficiency by minimizing competition between conflicting behaviors (26, 27). As a proof of concept, we also demonstrate that a simple linear regression model has predictive power to decode dynamic bias from this lateralized push–pull system, where the quality of decoding increases with the quality of neural recording coverage. 
Our findings demonstrate—with electrophysiological evidence—that risk-taking bias relies on a distributed lateralized push–pull neural system that governs counterintuitive and highly variable decision-making in humans and involves many areas beyond the widely studied ventromedial and dorsolateral prefrontal cortices (vmPFC and dlPFC).

Results and Discussion

Task Exposes Participants to Scenarios Where Bias Can Play a Large Role.

We administered an economic decision-making task that exposes participants (n = 10) to a sequence of stimuli that differ in the probability distribution of their reward and, therefore, elicit a variety of behaviors that are influenced by stimuli (through notions like return and risk) but also potentially by an internal state (dynamic bias) that evolves over trials. Our task, previously described in refs. 17 and 28–30, is a computerized game analogous to the classic card game of “war” (Fig. 1). In each trial, the player is dealt a single card face up while the computer is dealt a single card face down. After evaluating the exposed card, the player chooses between a low bet ($5) and a high bet ($20) that the exposed card is higher than the hidden card. The player is not allowed to decline to bet. If the player’s card is higher/lower than the computer’s card, the player wins/loses virtual money. If both cards are equal, then no virtual money is won or lost. To simplify the task, the deck was limited to only five different cards, the even cards from 2 through 10 of the spade suit (2, 4, 6, 8, or 10), and each card was drawn randomly with equal probability and with replacement within each trial, allowing the player’s and the computer’s cards to be identical. The rules of the task were carefully explained to each participant. They practiced until they said that they understood the rules of the task and felt comfortable selecting the choice using the manipulandum (around 20 min). See for details.

Fig. 1.

Economic decision-making task, behavioral data, and neural data. (A) Economic decision-making task. On each trial, two cards are drawn (with replacement within each trial) from a deck of five cards, the even cards from 2 through 10 in the spade suit. The player sees the face of one card and the back of the other card. Then, the player has to bet $5 or $20 that the exposed card is higher than the hidden card. The player wins/loses the bet if the exposed card is higher/lower than the hidden card (no win or no loss on a tie). See for details. (B) Return and risk computation. Return is defined as the expected value of reward; risk is defined as the variance of reward. (C) Dynamic bias estimation from behavioral data. Behavioral data (binary decisions) are recorded while participants are playing our gambling task. Dynamic bias is an internal variable that we cannot measure experimentally but that we estimated from the binary decisions using a stochastic dynamical model of the decision-making system. (D) Neural recording. Each participant was implanted with multiple intracerebral depth electrodes (SEEG). This method allows us to simultaneously record local field potentials at millisecond resolution from hundreds of sources in cortical and subcortical brain structures.

The five stimuli (five different cards) differ in the probability distribution of their reward. It is common to compute notions of return and risk from this probability distribution given the player’s card and the decision. Return is defined as the expected value of reward, and risk is defined as the variance of reward (Fig. 1). In this task, betting high is therefore always riskier (higher variance of reward) than betting low. Basic probabilities about the gambling task are provided in . The most profitable strategy (i.e., maximizing return) in our gambling task is a strict function of stimuli: Bet high when dealt the 8 or 10 card, and bet low when dealt the 2 or 4 card. On 6-card trials, the return is equal to zero for both choices, and thus the most profitable strategy is not unique. The decisions on these 6-card trials, but also on any other trial, can therefore be influenced by a dynamic bias that is shaped by past outcomes. For example, the dynamic bias may capture a recent winning streak, which may nudge a player to bet high even on a low card (risk-seeking behavior). Similarly, the dynamic bias may capture a recent losing streak and may nudge a player to bet low even on a high card (risk-averse behavior).
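The return and risk definitions above can be checked numerically. The following is a minimal sketch that assumes only what the task description states (a five-card deck, a hidden card drawn uniformly with replacement, bets of $5 or $20, and no gain or loss on a tie); the function and variable names are illustrative and not taken from the paper.

```python
from fractions import Fraction

DECK = [2, 4, 6, 8, 10]          # even spade cards used in the task
BETS = {"low": 5, "high": 20}    # dollar amounts of the two bets


def reward_distribution(player_card, bet):
    """Map each possible reward to its probability for one trial.

    The hidden card is assumed to be uniform over DECK (drawn with replacement).
    """
    p = Fraction(1, len(DECK))
    dist = {}
    for hidden in DECK:
        if player_card > hidden:
            reward = bet          # win
        elif player_card < hidden:
            reward = -bet         # loss
        else:
            reward = 0            # tie: nothing won or lost
        dist[reward] = dist.get(reward, 0) + p
    return dist


def return_and_risk(player_card, bet):
    """Return (expected reward, variance of reward) as exact fractions."""
    dist = reward_distribution(player_card, bet)
    mean = sum(r * p for r, p in dist.items())
    var = sum((r - mean) ** 2 * p for r, p in dist.items())
    return mean, var


for card in DECK:
    low = return_and_risk(card, BETS["low"])
    high = return_and_risk(card, BETS["high"])
    print(card, [float(v) for v in low], [float(v) for v in high])
```

Under these assumptions, the 6 card gives a return of 0 for both bets, with a variance of 20 for the $5 bet and 320 for the $20 bet, which is why the most profitable strategy is not unique on 6-card trials while the high bet remains the riskier one.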

Decision Strategies Vary Across Participants and Trials.

We first asked the question, “Did participants follow the same decision strategy on each card, and was it the most profitable strategy?” To answer our question, we computed the sample mean responses (bets and reaction times) across trials for each player’s card value and each participant. Participants closely followed the most profitable strategy to maximize return (Fig. 2, Top). They predominantly bet low on 2 and 4 cards and bet high on 8 and 10 cards. On 6-card trials, they switched more often between both betting decisions, with a preference for low bets (mean proportion of high bet, 27.35%), which can be explained by an average risk-averse behavior. Because of the ambiguity on 6-card trials, they also took longer to decide on these trials (Fig. 2, Bottom). Surprisingly, some participants made counterintuitive decisions: bet high on 2 cards or low on 10 cards on some trials, which is the least profitable strategy (see the proportions of high bets that are different from 0 or 1 on 2 cards or 10 cards, respectively, in Fig. 2, Top).

Fig. 2.

Variability of decision strategies across participants and across trials. (A) Mean responses to player’s cards during 30-min sessions: (Top) proportion of high bets per player’s card and (Bottom) reaction time (z-score) per player’s card. The session mean for each participant is represented by a filled circle ( trials per player’s card per participant). The population mean (± SEM) is represented by a black bar ( participants). (B) Variability of responses across participants. The variability on (2, 4, 8, 10)-card trials is defined as the average (across the four card values) of the per-card sample variances of betting decisions on those trials; the variability on 6-card trials is defined as the sample variance of betting decisions on 6-card trials. Here, low bet is 0 and high bet is 1. The measure for each participant is represented by a filled circle. Gray lines (dotted and dashed) show two trends that deviate from a completely static behavior (origin). These lines were fitted by using multiline orthogonal regression, i.e., minimizing the sum of distances between each point and its closest line (see for details). (C) Variability of responses across trials within participants. The variability across trials within participants is defined as the moving average (per player’s card) of the difference between the actual bet and the session-average bet given the player’s card value (with overlapping windows of length with ). Each curve corresponds to the behavior on a different card value (2, blue; 4, red; 6, yellow; 8, purple; 10, green). Some curves are plotted on top of each other. They exhibit (Top) dynamic behavior on no cards (static on all cards), (Middle) dynamic behavior on the 6 card only, and (Bottom) dynamic behavior on all cards; these three types of behaviors correspond in B to the origin, the dotted gray line, and the dashed gray line, respectively. The variability curves for participant 8 are not entirely flat and equal to zero because of two isolated bets that are different from the dominant behavior.

To further investigate how participants’ betting strategies vary during the session, we summarize the variability for each participant by computing two measures for two different sets of trials (Fig. 2): (i) the set of (2, 4, 8, 10)-card trials for which the two choices lead to different returns and (ii) the set of 6-card trials for which both choices lead to the same (zero) return. The variability on (2, 4, 8, 10)-card trials is defined as the average (across the four card values) of the per-card sample variances of betting decisions on those trials, and the variability on 6-card trials is defined as the sample variance of betting decisions on 6-card trials. We identified two trends with different behavioral patterns in the population of participants: one that shows a low level of variability on (2, 4, 8, 10) cards and different degrees of variability on 6 cards (along the dotted gray line), and one that shows different degrees of variability on both (2, 4, 8, 10) cards and on 6 cards (along the dashed gray line).

The next question we asked is “Did participants modulate their betting behavior across trials in a predictable and smooth manner or did they randomly flip their decisions?” To answer this question, we quantified how much the betting strategy of each participant on each card value modulated on a trial-by-trial basis around the average behavior for this card value (see for details). The variability across trials for three representative participants reveals three distinct behaviors (Fig. 2): dynamic on no cards or static on all cards (participant 8), dynamic on 6 cards only (participant 2), and dynamic on all cards (participant 7). These three types of behaviors relate to the 2D mapping of participants according to variability on (2, 4, 8, 10)-card trials and 6-card trials.
The static behavior corresponds to participants close to the origin, and the two different dynamic behaviors correspond to the two trends that we observed. Motivated by the above observations, we hypothesized that decisions are influenced by the player’s card (stimulus) through notions of return and risk, which can be computed at each trial from the current player’s card value, and by a dynamic bias (internal state) that fluctuates smoothly on a trial-by-trial basis. More specifically, counterintuitive and 6-card decisions can be predicted by dynamic bias.
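The two session-level variability measures described above can be sketched as follows. This is an illustrative reading of the definitions in the text (low bet coded as 0, high bet as 1, sample variance per card value); the function name and the convention of returning 0 when a card has fewer than two trials are assumptions, not details from the paper.

```python
from statistics import mean, variance


def variability_measures(cards, bets):
    """Return (variability on non-6 cards, variability on 6 cards).

    cards: player's card value on each trial (2, 4, 6, 8, or 10)
    bets:  decision on each trial, 0 for a low bet and 1 for a high bet
    """
    by_card = {}
    for card, bet in zip(cards, bets):
        by_card.setdefault(card, []).append(bet)

    def sample_variance(values):
        # Assumption: a card seen fewer than twice contributes zero variability.
        return variance(values) if len(values) > 1 else 0.0

    var_six = sample_variance(by_card.get(6, []))
    # Average of the per-card sample variances over the four other card values.
    var_other = mean(sample_variance(by_card.get(c, [])) for c in (2, 4, 8, 10))
    return var_other, var_six


# Toy session: static on 2, 4, 8, 10 cards, but switching on the 6 card.
cards = [2, 2, 6, 6, 6, 6, 10, 10, 4, 4, 8, 8]
bets = [0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1]
print(variability_measures(cards, bets))
```

In the toy session, the first measure is zero and the second is positive, which would place this hypothetical participant along the trend described as dynamic on the 6 card only.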

An Internal State Representing Dynamic Bias Predicts the Variability Across Trials and Across Participants.

We tested our internal state hypothesis by asking the following question, “Can an internal state predict the modulation of betting decisions?” To answer this question, we built a stochastic dynamical (state space) model of the decision-making process for each participant ( and ). Briefly, the dynamical model predicts the decision of each participant on individual trials by integrating the input of the current trial (through notions of return and risk computed from the player’s card) and a state variable (representing dynamic bias). The state variable accumulates evidence from the inputs up to the current trial. For example, a participant might be more likely to take a risk by betting high on a 6-card trial when the state variable reflects a recent “string of good luck.” Similar models have been recently used to model decision-making (29–32). However, either they include only information from the previous trial or they assume that the state variable accumulates information in a deterministic fashion. The framework of a stochastic dynamical model generalizes these models by including a fading effect for all past trials and allowing more flexibility with the introduction of a noise term in the state evolution. This framework allows us to estimate the model parameters as well as the state variable from the observations. We estimated the model parameters and the distribution of the state at each trial by maximizing the likelihood of observing the betting decisions given the set of stimuli. We solved this problem using the expectation–maximization algorithm (33–36). See for details. The dynamical model captures the variability of the behavior across trials and across participants (Fig. 3). First, plotting the probability of a high bet against the internal state for each trial reveals the spectrum of behaviors among our population of participants, ranging from static behavior to dynamic behavior on all cards to dynamic behavior on the 6 card only (Fig. 3 and ). 
Most of the high bets (in red) are associated with a probability of a high bet close to 1 (87% of high bets with ); most of the low bets (in blue) are associated with a probability of a high bet close to 0 (95% of low bets with ), i.e., a probability of a low bet close to 1. In addition, the model captures counterintuitive trial behaviors. For example, low bets on 10-card trials (blue up-triangle in participant 7) are associated with a low negative internal state that biases the probability of a high bet toward smaller values. Similarly, high bets on 2-card trials (red down-triangle in participant 7) are associated with a high positive internal state that biases the probability of a high bet toward larger values. These counterintuitive trials are thus predicted by the estimated bias.
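To make the structure of such a model concrete, the sketch below simulates a generic stochastic state-space model of betting. It is an assumption-laden illustration, not the authors' model: the equations are not reproduced in this text, so the linear state update, the logistic output, the use of high-minus-low differences in return and risk as inputs, the input scalings, and all parameter names (`a`, `b_card`, `b_rpe`, `c_return`, `c_risk`, `noise_sd`) are hypothetical. The expectation–maximization fit described above is not implemented here.

```python
import math
import random

DECK = [2, 4, 6, 8, 10]


def return_risk(card, bet):
    """Expected value and variance of reward for a given card and bet."""
    p_win = sum(c < card for c in DECK) / len(DECK)
    p_lose = sum(c > card for c in DECK) / len(DECK)
    mean = bet * (p_win - p_lose)
    var = bet ** 2 * (p_win + p_lose) - mean ** 2
    return mean, var


def simulate(n_trials, a, b_card, b_rpe, c_return, c_risk, noise_sd, seed=0):
    """Simulate bets from a hypothetical state-space model of dynamic bias."""
    rng = random.Random(seed)
    x = 0.0  # internal state (dynamic bias), assumed to start at zero
    history = []
    for _ in range(n_trials):
        card = rng.choice(DECK)
        ret_lo, risk_lo = return_risk(card, 5)
        ret_hi, risk_hi = return_risk(card, 20)
        # Assumed output equation: logistic of stimulus terms plus the state.
        logit = c_return * (ret_hi - ret_lo) + c_risk * (risk_hi - risk_lo) + x
        p_high = 1.0 / (1.0 + math.exp(-logit))
        high = rng.random() < p_high
        bet = 20 if high else 5
        hidden = rng.choice(DECK)
        reward = bet * ((card > hidden) - (card < hidden))
        # Reward prediction error: outcome minus expected return of the chosen bet.
        rpe = reward - (ret_hi if high else ret_lo)
        history.append((card, x, p_high, int(high)))
        # Assumed state equation: fading memory, scaled inputs, Gaussian noise.
        x = a * x + b_card * (card - 6) / 4 + b_rpe * rpe / 20 + rng.gauss(0.0, noise_sd)
    return history


trials = simulate(200, a=0.8, b_card=0.1, b_rpe=0.3,
                  c_return=0.5, c_risk=-0.005, noise_sd=0.1)
six_card_high = [h for card, _, _, h in trials if card == 6]
print(sum(six_card_high), "high bets on", len(six_card_high), "six-card trials")
```

With a negative `c_risk`, the simulated player leans toward low bets on 6-card trials unless the state is sufficiently positive, which mirrors the qualitative behavior described in the text.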

Fig. 3.

Dynamical model predictions. (A) Overlay of model estimation and observed data for three representative participants: (Top) 8, (Middle) 2, and (Bottom) 7. Each subplot represents the probability of high bet (vertical axis) against the internal state (horizontal axis). Each symbol represents a trial. The player’s card received on each trial is encoded by different symbols ( for 2 card, for 4 card, for 6 card, for 8 card, and for 10 card). The observed betting decision on each trial is encoded by the two colors (blue for low bet, red for high bet). The five gray curves represent the probability of high bets as a function of the internal state for each of the five player’s card values (logistic function induced by the model structure). (B) State trajectory for three representative participants. The relative contribution of the internal state in the output Eq. is quantified by dividing the internal state by the sum of absolute value of the mean of each term in Eq. , i.e., , where is the mean of . We plot the mean of the relative contribution (green solid line) and its 95% confidence bounds (green shaded area). The betting decisions on each trial are overlaid on top of the trajectories. High bets are represented above the trajectory, and low bets are represented below the trajectory. Counterintuitive bets [high on card and low on card] and ambiguous bets (6 card) are highlighted in red (high) and blue (low). All other decisions are represented in gray. (C) Goodness of fit of each model. The goodness of fit is quantified using the deviance and the prediction error (see for definitions). Both statistics show an improvement from the static to the dynamical model for participants with some variability in the data.

In addition, the time evolution of state trajectories across sessions for three representative players reveals exactly for which trials the contribution of bias plays a significant role over return and risk (Fig. 3).
Participant 8 has almost no variability across trials with the same card, and here the state hovers around 0 throughout the session (Fig. 3, Top). Participant 2 has high variability in betting behavior only on 6-card trials. The majority of 6-card bets (, or 64%) are predicted by (Fig. 3, Middle). The state variable varies in a spiky manner, in which nearly each “spike” captures the betting on a 6-card trial. For other trials, the state variable is close to zero. Participant 7 has high variability in betting behavior across all card values. The bets for the majority of 6-card trials (, or 70%) and some of the counterintuitive -card trials (, or 23%) are predicted by the state variable (Fig. 3, Bottom). In particular, low bets on these trials are explained by negative values of , and high bets are explained by positive values of . The state variable varies smoothly to capture as many of these trials as possible. Finally, we quantified the improvement in goodness of fit of the model with and without an internal state for each participant using two statistics that quantify how much the variation of the output can be predicted by the state variable and inputs of the model. The first statistic is the total deviance, and the second statistic is the prediction error (see for definitions). For both statistics, smaller values indicate better model performance. The dynamical model (with internal state) predicted the behavior better than the static model (without internal state) for participants who changed their betting strategies on one or more trial types, i.e., whose behavior was more dynamic (Fig. 3).
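As an illustration of the first goodness-of-fit statistic, the sketch below computes the deviance of binary betting decisions under a set of predicted probabilities. The paper's exact definitions are given in a section not reproduced here, so this uses the standard binomial deviance (minus twice the Bernoulli log-likelihood) as an assumption; the prediction error statistic is not sketched because its definition is not available in this text. The example probabilities are made up for illustration.

```python
import math


def deviance(bets, p_high, eps=1e-12):
    """Binomial deviance: -2 times the Bernoulli log-likelihood.

    bets:   observed decisions, 1 for a high bet and 0 for a low bet
    p_high: predicted probability of a high bet on each trial
    """
    total = 0.0
    for y, p in zip(bets, p_high):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -2.0 * total


# Hypothetical 6-card trials where the bets alternate with a slowly varying bias.
bets = [1, 1, 0, 0, 1]
static_p = [0.5, 0.5, 0.5, 0.5, 0.5]     # a static model cannot separate these trials
dynamic_p = [0.8, 0.7, 0.3, 0.2, 0.7]    # a dynamical model tracks the bias
print(deviance(bets, static_p), deviance(bets, dynamic_p))
```

Smaller values indicate a better fit, so in this toy comparison the predictions that follow the bias yield the lower deviance.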

Model Parameters Reveal Different Types of Gamblers.

The model parameters reveal how the participants update their dynamic bias, and how they weigh dynamic bias with return and risk in their decisions throughout their session (Fig. 4). First, the coefficient in front of the “memory” term () in Eq. shows that memory contributes to our participants’ bias with varying levels of decay (Fig. 4).

Fig. 4.

Behavioral interpretation of model parameters. (A) Memory coefficient. Depending on the value of the parameter , the state accumulates inputs from the previous trial only for (fast decay) or from all past trials for (slow decay). For intermediate values of , the inputs from previous trials influence the state with exponentially decaying weights. (B) Influence of player’s card value and reward prediction error on the state evolution. The signs of parameters and relate to hot-hand fallacy (positive recency) and gambler’s fallacy (negative recency). (C) Influence of return and risk on probability of betting high. The parameters and relate to the session-average variability across participants. Gray lines (dotted and dashed) show two trends among participants in the return–risk parameter space. These lines were fitted by using multiline orthogonal regression like in Fig. 2.

Then, the coefficients in front of the player’s card value () and the reward prediction error () terms in Eq. reveal how each term shapes the dynamic bias (Fig. 4). Positive values for and correspond to a situation in which people tend to predict the same outcome as the last events (positive recency or hot-hand fallacy), while negative values for and correspond to a situation in which people tend to predict the opposite outcome from the last events (negative recency or gambler’s fallacy). Most of our participants either exhibit the hot-hand fallacy for both inputs (positive and ) or gambler’s fallacy for both inputs (negative and ), while few participants exhibit a mixed fallacy ( and of opposite signs).

In addition, the coefficients in Eq. quantify the contribution of return () and risk () to the decision probability (Fig. 4). If is small/large, then the player weighs return more/less, and if is positive/negative, the player is risk-seeking/risk-averse. Most of our participants are risk-averse () (Fig. 4). The striking similarity between Figs. 2 and 4 suggests that the return parameter captures the session-average behavior on (2, 4, 8, 10)-card trials, while the risk parameter captures the session-average behavior on 6-card trials.
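The role of the memory coefficient can be illustrated by unrolling a generic linear state update. The sketch assumes a recursion of the form x_k = a * x_{k-1} + u_{k-1}, where `a` stands for the memory coefficient and `u` for the combined input; both the symbol names and the exact form are assumptions, since the paper's state equation is not reproduced in this text.

```python
def memory_weights(a, n_past):
    """Weight placed on the input from j trials back, for j = 1..n_past.

    Unrolling x_k = a * x_{k-1} + u_{k-1} gives x_k = sum_j a**(j-1) * u_{k-j},
    so the weight on the input from j trials back is a**(j-1).
    """
    return [a ** (j - 1) for j in range(1, n_past + 1)]


print(memory_weights(0, 4))    # previous trial only (fast decay)
print(memory_weights(0.5, 4))  # exponentially decaying weights
print(memory_weights(1, 4))    # all past trials weighted equally (slow decay)
```

Under this assumed form, a coefficient of 0 keeps only the previous trial, a coefficient of 1 accumulates all past trials equally, and intermediate values give the exponentially fading memory described in the Fig. 4 legend.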

Neural Rhythms Encode Dynamic Bias.

Then, we asked the question, “Can we map the neural circuits responsible for encoding dynamic bias in this task?” To identify brain regions whose activity modulates with dynamic bias, we analyzed the neural oscillations in each brain region, time-locked to each task epoch. Neural oscillations are commonly used due to their association with synchronized activities of the underlying neuronal population encoding behavior (ref. 37 and references therein). Specifically, we measured the correlation between the dynamic bias signal and the oscillatory power of the local field potential, across trials, electrodes, and participants, using a cluster-based nonparametric statistical test (see and ref. 38 for details). A positive correlation means that an increase in the oscillatory power of the activity of that brain region was associated with an increase in bias across trials, and therefore an increase in the probability of betting high (i.e., “push” toward risk-seeking behavior). A negative correlation means that an increase in the oscillatory power was associated with a decrease in bias, and therefore a decrease in the probability of betting high (i.e., “pull” away from risk-seeking behavior). Two representative examples show the variety of neural encoding of dynamic bias in terms of the direction of the neural modulation and the dominant frequency band of the neural rhythm (Fig. 5). For each brain region (highlighted in Fig. 5), we mapped the -values of the Spearman correlations between the dynamic bias and the oscillatory power in each time–frequency window for each epoch. Then, we identified clusters, that is, sets of adjacent time–frequency windows that show a significant correlation (surrounded by red lines). These clusters show when during the epoch and in which frequency band the neural modulation with dynamic bias occurs (Fig. 5). 
In addition, we plotted the data from one electrode contact contributing to one cluster for each brain region to provide an additional visual representation of the neural modulation with dynamic bias (Fig. 5). To create these representations, we first binned bias values into five different groups for each participant individually using pentiles (the first pentile being the 20th percentile). Then, we showed (i) the time evolution of the average power in the frequency band defined by the cluster for trials associated with low bias (first bin) and high bias (last bin), and (ii) the distribution of the average power in the time–frequency region defined by the cluster against the binned bias.
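The two basic ingredients of this analysis, a rank correlation between bias and power and the grouping of trials into five bias bins, can be sketched with the standard library alone. This is a simplified illustration: it computes a plain Spearman correlation for one time–frequency window and does not implement the cluster-based nonparametric test or the hierarchical averaging across electrodes and participants. The function names and the toy numbers are hypothetical, and the cut points rely on the default method of `statistics.quantiles`, which may differ from the authors' choice.

```python
from statistics import mean, quantiles


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den


def pentile_bins(bias):
    """Assign each trial a bin index 0..4 using four within-participant cut points."""
    cuts = quantiles(bias, n=5)
    return [sum(b > c for c in cuts) for b in bias]


# Toy data: per-trial bias estimates and band power for one electrode contact.
bias = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5, -0.8, 0.0, 0.6, 1.1]
power = [0.2, 0.5, 0.6, 0.8, 1.1, 1.4, 0.3, 0.7, 0.9, 1.0]
print(spearman(bias, power), pentile_bins(bias))
```

A positive coefficient for a window would correspond to the "push" direction described above, and a negative one to the "pull" direction; trials in bin 0 and bin 4 are the low-bias and high-bias groups used for the time-course plots.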

Fig. 5.

Neural encoding of dynamic bias in brain regions. Top and Bottom represent the neural encoding of dynamic bias in two different brain regions. These regions have been chosen to show different directions of modulation and different frequency bands. (A) The location of the brain region in an MRI slice. Ins, insula; L, left; OFC, orbitofrontal cortex; R, right. (B) Statistic maps in the time–frequency domain at the four epochs of our task: Show Card, Show Bet, Show Deck, and Show Reward. Each pixel in these statistic maps quantifies a (hierarchical) average across electrodes and participants of the Spearman correlation -values between the dynamic bias and the neural oscillatory power (see for details). These statistic maps use the data from all electrode contacts of all participants corresponding to the brain region. Each cluster is defined as a set of adjacent time–frequency windows for which the power shows a significant correlation with dynamic bias (red contour). The vertical white line corresponds to the time-locking epoch. (C) (Left) Time evolution of the average power in the frequency band defined by one cluster for one electrode contact of one participant. For the cluster at the Show Reward epoch in insula, we show data from electrode T’1 in participant 8. For the cluster at the Show Deck epoch in orbitofrontal cortex, we show data from electrode O4 in participant 2. We binned bias values into five different groups for each participant individually using pentiles (the first pentile contains a fifth of the population, so it is equal to the 20th percentile). The two curves correspond to two different groups of trials: the set of trials with low bias (first bin in blue) and the set of trials with high bias (last bin in green). (Right) A distribution of the average power in the time–frequency region defined by the cluster against the binned bias (same cluster, same participant, same electrode as in Left). Error bars represent 1 SEM in all plots.

Fig. 6.

SEEG maps of neural circuits encoding dynamic bias (multiparticipant plots). SEEG maps represent, in MRI slices, the brain regions where the oscillatory power of the SEEG neural activity significantly modulates with dynamic bias at different epochs in the task. (A) Lateralized circuits encode dynamic bias in opposite directions: The activity in the left hemisphere is negatively correlated with bias, while the activity in the right hemisphere is positively correlated with bias. Color encodes the modulation direction of the effect observed in the brain region. (B) Gamma-dominant rhythms encode dynamic bias. Color encodes the dominant frequency band of the effect observed in the brain region. We performed the analysis in the time–frequency domain (without defining frequency bands), but we reported our results using the classical notion of frequency bands, including δ (1 Hz to 4 Hz), θ (4 Hz to 8 Hz), α (8 Hz to 13 Hz), β (13 Hz to 30 Hz), low γ (30 Hz to 70 Hz), and high γ (70 Hz to 150 Hz). We associated each cluster with its dominant frequency band by choosing the frequency band containing the largest number of time–frequency windows from the cluster. In A and B, the tint of the color encodes the significance level (logarithm of the p-value) in each brain region. Gray hatched regions are gray matter brain regions for which we don’t have at least three participants and that are therefore not covered by our analysis. The horizontal white line in each slice indicates the level at which the slice in the other viewpoint is taken.

The brain regions whose activity modulated with the risk-taking dynamic bias are distributed across the whole brain, beyond the vmPFC and dlPFC. Encoding appears first in temporal, limbic, and parietal lobes and later appears in frontal cortex (see the progression through task epochs in Fig. 6). Interestingly, this set of distributed brain regions shows roughly an equal split between positive and negative correlations with bias (Fig. 6), and it is mostly localized in the high-γ frequency band (Fig. 6). Furthermore, the direction of neural modulation for high-γ rhythms is lateralized in the left and right brain hemispheres (Fig. 7). The high-γ activity in right-hemisphere regions shows a positive correlation with dynamic bias (push), while the high-γ activity in left-hemisphere regions shows a negative correlation with dynamic bias (pull) (Fig. 7). In the other frequency bands, we don’t observe this lateralization in the direction of modulation.
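The rule stated in the caption of Fig. 6, assigning each cluster to the band that contains most of its time–frequency windows, can be sketched as below. This is an illustrative sketch only: the names `band_of` and `dominant_band`, the representation of a cluster as (time, frequency) window centers, and the treatment of band edges (lower edge inclusive, upper edge exclusive) are all assumptions.

```python
from collections import Counter

# Classical frequency bands in Hz, as listed in the caption.
# Edge handling (lower inclusive, upper exclusive) is an assumption.
BANDS = [
    ("delta", 1.0, 4.0),
    ("theta", 4.0, 8.0),
    ("alpha", 8.0, 13.0),
    ("beta", 13.0, 30.0),
    ("low gamma", 30.0, 70.0),
    ("high gamma", 70.0, 150.0),
]


def band_of(freq_hz):
    """Return the name of the band containing freq_hz, or None if outside all bands."""
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    return None


def dominant_band(cluster_windows):
    """cluster_windows: iterable of (time_s, freq_hz) centers of the windows in one cluster."""
    counts = Counter(band_of(freq) for _, freq in cluster_windows)
    counts.pop(None, None)  # ignore windows that fall outside every band
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

For example, a cluster with one window at 65 Hz and three windows between 75 Hz and 90 Hz would be labeled high gamma under this rule.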

Fig. 7.

Lateralized push–pull mechanism in high-γ rhythms. (A) Number of brain regions whose activity modulates with bias in the (Left) left hemisphere and (Right) right hemisphere. For each brain hemisphere, the number of brain regions is given separately for modulations in positive (push in green) or negative (pull in red) direction and for different frequency bands. A strong lateralization of the direction of modulation is observed for the high-γ frequency band. (B) The distribution of the average power in the time–frequency region defined by each cluster against the binned bias, for all brain regions with (Left) negative and (Right) positive correlations. Error bars represent 1 SEM in all plots.

Importantly, this result does not rely on a small subpopulation. First, all but one of the participants who have electrodes in the respective hemispheres contribute to the push–pull effect. The one exception is participant 7, who also shows a strong push in the right hemisphere but almost no signal in the left hemisphere. This may be due to the sparse implantation (only two electrodes) in the left hemisphere for this participant. Second, all but two of the participants who have electrodes in the respective hemispheres show an effect in the same direction as what the population analysis reveals: push in right hemisphere, pull in left. The two exceptions show a push effect in the left hemisphere: Participant 1 shows a stronger push than pull in the left hemisphere and no signal in the right hemisphere (no electrode in the right hemisphere); participant 5 shows a stronger push than pull in the left hemisphere and a strong push in the right hemisphere.

We asked a final question: “Could fluctuations in betting behavior be explained by variations in task engagement or arousal?” For example, a participant may appear to be risk-averse when, in reality, the participant is just not paying much attention. We anticipated this potential problem and designed the task to make sure that each participant is paying attention during each trial by forcing the participant to use the manipulandum to move and hold the cursor on the fixation point at the center of the screen before the trial would begin. Furthermore, from the behavioral data, we see that participants use the optimal strategy to maximize the expected reward (a unique optimal choice exists for all cards but the 6 card) on most of the trials, suggesting that participants are maintaining a similar level of attention throughout the task. The deviation from this optimal strategy is explained by the state variable in our dynamical model, which is constructed to follow a very particular evolution. The state evolution equation involves the accumulation of card values (minus 6, the median card value) and the reward prediction error. We don’t believe that a state variable capturing attention would follow this specific structure. Finally, we examined α-band activity in all recorded brain regions, as attention is normally associated with power in this frequency band (39–44). We observed the following. (i) Only a few brain regions showed a modulation of the α-band power with dynamic bias over trials. (ii) Very few brain regions showed a difference in the α-band power between low bets (low risk) and high bets (high risk). Therefore, there is no consistent modulation in the α-band activity across all brain regions that could be caused by a change in attention driving the change in behavior in our data.
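The statement above that a unique optimal choice exists for every card except the 6 card can be checked with a short calculation. The sketch below rests on assumptions that are not stated in this section: the computer's card is taken to be equally likely to be 2, 4, 6, 8, or 10, and the payoff is assumed to be plus the bet on a win, minus the bet on a loss, and zero on a tie. The bet sizes ($5 and $20) are those given in Materials and Methods; the function names are hypothetical.

```python
from fractions import Fraction

DECK = [2, 4, 6, 8, 10]        # assumed card values, each equally likely for the computer
BETS = {"low": 5, "high": 20}  # bet sizes in dollars


def expected_reward(player_card, bet):
    """Expected payoff of a bet given the player's card (assumed win/lose/tie rule)."""
    total = Fraction(0)
    for computer_card in DECK:
        if player_card > computer_card:
            total += bet   # win: gain the bet
        elif player_card < computer_card:
            total -= bet   # loss: lose the bet
        # tie: no gain or loss (assumption)
    return total / len(DECK)


def optimal_bets(player_card):
    """Return the bet name(s) that maximize the expected reward for this card."""
    values = {name: expected_reward(player_card, amount) for name, amount in BETS.items()}
    best = max(values.values())
    return sorted(name for name, value in values.items() if value == best)
```

Under these assumptions, low cards favor the low bet, high cards favor the high bet, and for the 6 card both bets have the same expected reward, so no unique optimum exists there.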

Why Distributed Lateralized Push–Pull High-γ Rhythms?

Our findings suggest the existence of lateralized push–pull high-γ rhythms encoding dynamic bias. But why? In this section, we speculate on the reason why we observed such a mechanism.

Why distributed?

Human cognitive processes involved in a complex task such as decision-making are often associated with widely distributed neural activation patterns, which involve numerous cortical and subcortical regions (45). Even though studies of decision-making under risk have widely focused on vmPFC and dlPFC, it is not surprising that brain regions in the temporal, parietal, and limbic lobes that project to the prefrontal cortex are also involved in the processing of risk-taking dynamic bias. Some literature on the involvement of these brain regions in decision-making tasks is provided in .

Why high-γ rhythms?

High-γ rhythms have been found to correlate with spiking activity (46). In addition, an MEG study showed that high-γ oscillations across distributed networks reliably reconstruct decision-making stages, including processing of sensory input, option evaluation, intention formation, and action execution (15). These findings suggest that dynamic bias may be encoded in the firing rates of individual neurons as well as patterns in which groups of neurons work together.

Why push–pull?

Push–pull control mechanisms have been found to be pervasive throughout neuroscience, in different functions and dysfunctions of neural circuits. For executive motor function, the excitatory and inhibitory pathways of the basal ganglia operate in concert as a push–pull system to control neural activity in the neocortex and brainstem (21, 22, 47). For vision, the majority of cells of layer 4 in the visual cortex have receptive fields built of parallel, adjacent On and Off subregions in which stimuli of the opposite contrast evoke responses of the inverse sign, an arrangement known as push–pull (23, 24). In the epileptic brain, a push–pull interaction between synchronizing and desynchronizing brain regions controls the seizure spread (25). These findings suggest that dynamic bias may be encoded by two systems pushing toward and pulling away from risky decisions.

Why lateralized?

The lateralization of brain functions has often been associated with an enhancement of cognitive capacity and efficiency of the brain at the individual level and with an “evolutionarily stable strategy under social pressures” at the population level (48). For example, the lateralization of the approach–avoidance motivation and positive–negative emotions (valence hypothesis) seems to have an evolutionary benefit, where minimizing competition between two conflicting behaviors enhances processing efficiency (26, 27).

Our results in the context of prior art.

Functional lateralization effects have been observed during gambling in vmPFC, dlPFC, and amygdala, in both lesion and stimulation studies. Tranel et al. (49–52) studied decision-making under uncertainty (using the Iowa gambling task) in participants with lesions in the vmPFC. Initially, they showed that participants with lesions in the right vmPFC showed significantly more decision impairments than those with lesions in the left vmPFC (49). In follow-up studies, they found that men with right-side vmPFC lesions and women with left-side vmPFC lesions had more severe social, emotional, and decision-making impairments than healthy participants (50, 51). Finally, they also showed a similar sex-related asymmetry in unilateral amygdala lesions (52). Knoch et al. (53, 54) studied decision-making under risk (using the Cambridge gambling task) in healthy individuals using low-frequency repetitive transcranial magnetic stimulation (rTMS) of the dlPFC. They showed that increased risk-taking behavior was induced by rTMS to the right dlPFC in comparison with the left dlPFC. In addition, Fecteau et al. (55, 56) also studied decision-making under risk in healthy individuals using concurrent anodal transcranial direct current stimulation (tDCS). They showed that right-anodal/left-cathodal tDCS of the dlPFC decreased risk-taking behavior compared with left-anodal/right-cathodal tDCS or sham stimulation. These studies raise the hypothesis that interhemispheric balance of activity may be critical in decision-making. However, this hypothesis is based on observed differences in overall session behaviors that were associated with lateralized brain function manipulated “once,” either via a lesion or via conditioning of the brain using stimulation (often administered before tasks began). In addition, no neural activity was recorded during these sessions. Rather, neural activity was inherently modulated by lesions or exogenously via noninvasive stimulation modalities that are limited in spatial specificity. Our data may explain prior observations, and further show that the lateralized effects occur dynamically on a trial-by-trial basis. Furthermore, our data show that these effects occur in temporal, limbic, and parietal structures earlier than in prefrontal cortices, which are commonly implicated in decision-making.

Conclusion

The influence of dynamic bias is ubiquitous in human behaviors. Bias has been recognized as a key factor in the field of behavioral economics, first inspired by Herbert A. Simon’s (57) principle of bounded rationality (late 1950s), and recently formalized by Richard H. Thaler et al.’s (1) notion of “nudge.” Neuroscientists studying decision-making have also been interested in bias affecting choice in humans (20), but have lacked the tools to study its role in brain and behavior during sequential decision-making occurring on the order of seconds. Measuring dynamic bias, such as emotion, is difficult to do at this “fast” timescale, and recording electrical activity in relevant brain regions (which span the entire brain, as we discovered here) in humans is even more challenging. We exploited a unique experimental setup and a stochastic dynamical (state-space) modeling framework that enabled us to estimate bias and its trial-by-trial fluctuations and identify neural structures across the entire brain that modulated with bias, while humans gambled virtual money and made decisions on the order of milliseconds to seconds. As demonstrated in this study, a bias signal can be estimated on a trial-by-trial basis from measurable data using a dynamical model. This bias signal predicts the variability in behavioral data and, in particular, explains why a participant implements a less profitable strategy at times. It was therefore used to identify the neural circuits at the root of this variability. We found that the neural circuits that encode dynamic bias during different stages of decision-making are strikingly lateralized. Increased high-frequency activity in the right hemisphere pushed participants to be more risk-seeking, while increased high-frequency activity in the left hemisphere pulled participants away from risky bets. 
Our study demonstrates the importance of incorporating dynamic bias—or other internal states—in models of behavior, combined with high spatial and temporal resolution recordings of the neural activity, and will lead to improvements in the understanding of human behavior and its neural origins.

Materials and Methods

Human Participants.

Ten human participants (seven females and three males; mean age, 36 y) with medically refractory epilepsy, who underwent a surgical procedure in which depth electrodes were implanted for seizure monitoring, performed an economic decision-making task (). Demographics and clinical characteristics of each participant are listed in . We excluded two additional participants who volunteered but failed to complete the experiment.

Limitations.

There are standard concerns in analyzing data from epileptic participants. First, participants are often on medication, which might affect the neurophysiology of the brain. For clinical purposes, participants were kept off their antiseizure medication for their entire stay at Cleveland Clinic, so these effects would be minimized. Second, actual seizures might impact the neurophysiology around the seizure focus. Human epilepsy recordings are taken to localize the seizure focus, so overlap is expected between the seizure focus and the areas recorded.

Ethics Statement.

All experimental protocols were approved by the Cleveland Clinic Institutional Review Board. Experiments and methods were performed in accordance with the guidelines and regulations of the Cleveland Clinic Institutional Review Board. All participants volunteered and provided informed consent in accordance with the guidelines of the Cleveland Clinic Institutional Review Board. Participant criteria required individuals over the age of 18 with the ability to provide informed consent and perform the behavioral task. Besides the behavioral experiments, no alterations were made to the course of clinical care.

Stochastic Dynamical Model of Choice.

We use the following notations to describe our model. At each trial , the player’s card and the computer’s card are denoted by and , respectively. The binary betting decision is denoted by . Here, means that the participant bets high ($20), while means that the participant bets low ($5). The reward is denoted as and is given by . In the following, uppercase letters denote random variables, and lowercase letters denote specific values for these variables. At each trial , we modeled the player’s betting decision as a random variable with a Bernoulli distribution, i.e.,where is the probability of betting high. The probability of betting high on any given trial is assumed to depend on three terms: (i) dynamic bias quantified by an internal state , (ii) return difference quantified by the expected reward, and (iii) risk difference quantified by the variance of reward. The probability of betting high is assumed to follow a logistic model,where and are the model parameters that determine how the probability varies as a function of the inputs (return and risk). This internal state process is modeled by the first-order update equationwhere is an independent normal random input with zero mean and covariance (process noise). The initial state is assumed to be a normal random variable with mean and covariance . The coefficients , , and determine how the participant’s state at trial is related to the participant’s state and inputs (stimuli) at trial . The inputs were chosen based on previous literature and on observations in our behavioral data. The first input, , is the player’s card (minus 6, the median card value) and could represent an effect of luck (29, 58). Specifically, cards larger than 6 and cards smaller than 6 are assumed to have an opposite effect on bias. The second input, , is the reward prediction error, i.e., the difference between the actual reward and the expected reward , and represents an effect of performance feedback (59–61). 
Following the state evolution in Eq. , the state accumulates evidence from the inputs over the session, and its expected value is written as follows:Depending on the value of the parameter , the previous trials (memory) contribute to the current trial with varying levels of decay, between fast decay () and slow decay (). Indeed, if , the state depends only on the inputs at trial ; if , the state depends equally on inputs from all previous trials and the initial condition; if , the state depends on inputs from all previous trials and the initial condition with exponentially decaying weights. Note that only information from previous trials (trials ) is influencing the current state variable (trial ).
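The model described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' implementation or estimation procedure. All symbol names are hypothetical: `x` for the internal state, `a`, `b1`, `b2` for the state coefficients, `beta_return` and `beta_risk` for the logistic parameters, and `sigma` for the process-noise standard deviation. The card deck (2, 4, 6, 8, 10, equally likely), the payoff rule (win or lose the bet, zero on a tie), and the unit coefficient on the state inside the logistic function are also assumptions.

```python
import math
import random

DECK = [2, 4, 6, 8, 10]  # assumed card values, equally likely
LOW, HIGH = 5, 20        # bet sizes in dollars (from the text)


def reward(player, computer, bet):
    # Assumed payoff rule: +bet on a win, -bet on a loss, 0 on a tie.
    return bet * ((player > computer) - (player < computer))


def reward_moments(player, bet):
    """Mean and variance of the reward given the player's card and the bet."""
    outcomes = [reward(player, c, bet) for c in DECK]
    mean = sum(outcomes) / len(outcomes)
    var = sum((r - mean) ** 2 for r in outcomes) / len(outcomes)
    return mean, var


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def p_bet_high(x, player, beta_return, beta_risk):
    """Logistic probability of betting high: state + return difference + risk difference."""
    mean_hi, var_hi = reward_moments(player, HIGH)
    mean_lo, var_lo = reward_moments(player, LOW)
    return sigmoid(x + beta_return * (mean_hi - mean_lo) + beta_risk * (var_hi - var_lo))


def simulate(n_trials, a, b1, b2, beta_return, beta_risk, sigma, x0=0.0, seed=0):
    """Simulate states and Bernoulli betting decisions (1 = high bet, 0 = low bet)."""
    rng = random.Random(seed)
    x, states, bets = x0, [], []
    for _ in range(n_trials):
        states.append(x)
        player, computer = rng.choice(DECK), rng.choice(DECK)
        y = 1 if rng.random() < p_bet_high(x, player, beta_return, beta_risk) else 0
        bets.append(y)
        bet = HIGH if y else LOW
        # Reward prediction error: actual reward minus expected reward.
        rpe = reward(player, computer, bet) - reward_moments(player, bet)[0]
        # First-order update from the current state, card input, and prediction error.
        x = a * x + b1 * (player - 6) + b2 * rpe + rng.gauss(0.0, sigma)
    return states, bets


def expected_state(k, a, b1, b2, x0_mean, u1, u2):
    """Unrolled mean of the state at trial k given inputs u1[j], u2[j] for trials j < k."""
    total = (a ** k) * x0_mean
    for j in range(k):
        total += (a ** (k - 1 - j)) * (b1 * u1[j] + b2 * u2[j])
    return total
```

In `expected_state`, setting `a` to 0 leaves only the inputs from the previous trial, setting it to 1 weights all previous trials and the initial condition equally, and values in between give exponentially decaying weights, which matches the three cases described in the text.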

52 in total

1. Decision-making in a risk-taking task: a PET study.

Authors: Monique Ernst; Karen Bolla; Maria Mouratidis; Carlo Contoreggi; John A Matochik; V Kurian; Jean Lud Cadet; Alane S Kimes; Edythe D London
Journal: Neuropsychopharmacology Date: 2002-05 Impact factor: 7.853

2. A general mechanism for perceptual decision-making in the human brain.

Authors: H R Heekeren; S Marrett; P A Bandettini; L G Ungerleider
Journal: Nature Date: 2004-10-14 Impact factor: 49.962

Review 3. Neural synchrony in brain disorders: relevance for cognitive dysfunctions and pathophysiology.

Authors: Peter J Uhlhaas; Wolf Singer
Journal: Neuron Date: 2006-10-05 Impact factor: 17.173

4. Diminishing risk-taking behavior by modulating activity in the prefrontal cortex: a direct current stimulation study.

Authors: Shirley Fecteau; Daria Knoch; Felipe Fregni; Natasha Sultani; Paulo Boggio; Alvaro Pascual-Leone
Journal: J Neurosci Date: 2007-11-14 Impact factor: 6.167

Review 5. Emotion and decision making: multiple modulatory neural circuits.

Authors: Elizabeth A Phelps; Karolina M Lempert; Peter Sokol-Hessner
Journal: Annu Rev Neurosci Date: 2014-05-29 Impact factor: 12.449

Review 6. Dopamine reward prediction-error signalling: a two-component response.

Authors: Wolfram Schultz
Journal: Nat Rev Neurosci Date: 2016-02-11 Impact factor: 34.870

7. Does gender play a role in functional asymmetry of ventromedial prefrontal cortex?

Authors: Daniel Tranel; Hanna Damasio; Natalie L Denburg; Antoine Bechara
Journal: Brain Date: 2005-09-29 Impact factor: 13.501

Review 8. The neural processes underlying perceptual decision making in humans: recent progress and future directions.

Authors: Simon P Kelly; Redmond G O'Connell
Journal: J Physiol Paris Date: 2014-09-07

Review 9. Neuronal Reward and Decision Signals: From Theories to Data.

Authors: Wolfram Schultz
Journal: Physiol Rev Date: 2015-07 Impact factor: 37.312

10. Anticipatory attentional suppression of visual features indexed by oscillatory alpha-band power increases: a high-density electrical mapping study.

Authors: Adam C Snyder; John J Foxe
Journal: J Neurosci Date: 2010-03-17 Impact factor: 6.709
