Karl Friston, Philipp Schwartenbeck, Thomas FitzGerald, Michael Moutoussis, Timothy Behrens, Raymond J Dolan.
Abstract
This paper considers goal-directed decision-making in terms of embodied or active inference. We associate bounded rationality with approximate Bayesian inference that optimizes a free energy bound on model evidence. Several constructs such as expected utility, exploration or novelty bonuses, softmax choice rules and optimism bias emerge as natural consequences of free energy minimization. Previous accounts of active inference have focused on predictive coding. In this paper, we consider variational Bayes as a scheme that the brain might use for approximate Bayesian inference. This scheme provides formal constraints on the computational anatomy of inference and action, which appear to be remarkably consistent with neuroanatomy. Active inference contextualizes optimal decision theory within embodied inference, where goals become prior beliefs. For example, expected utility theory emerges as a special case of free energy minimization, where the sensitivity or inverse temperature (associated with softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution. Crucially, this sensitivity corresponds to the precision of beliefs about behaviour. The changes in precision during variational updates are remarkably reminiscent of empirical dopaminergic responses, and they may provide a new perspective on the role of dopamine in assimilating reward prediction errors to optimize decision-making.
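To make the role of the sensitivity (inverse temperature) concrete, the following is a minimal sketch of a softmax choice rule in Python. It is an illustration only, not the scheme used in the paper: the function name `softmax_choice`, the two example values and the precision settings are all hypothetical, and precision is passed in as a fixed number here rather than being optimized as the abstract describes.

```python
import numpy as np

def softmax_choice(values, precision):
    """Choice probabilities under a softmax rule.

    values: value assigned to each option (hypothetical numbers).
    precision: sensitivity or inverse temperature; treated as a fixed
               input in this sketch (an assumption for illustration).
    """
    z = precision * np.asarray(values, dtype=float)
    z -= z.max()          # subtract the maximum for numerical stability
    p = np.exp(z)
    return p / p.sum()    # normalize so the probabilities sum to one

values = [1.0, 2.0]       # hypothetical values for reject and accept
low = softmax_choice(values, precision=0.5)
high = softmax_choice(values, precision=4.0)
```

With low precision the two options receive similar probabilities; with high precision the choice becomes nearly deterministic in favour of the more valuable option.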
Keywords: Bayesian inference; active inference; agency; bounded rationality; free energy; utility theory
Year: 2014 PMID: 25267823 PMCID: PMC4186234 DOI: 10.1098/rstb.2013.0481
Source DB: PubMed Journal: Philos Trans R Soc Lond B Biol Sci ISSN: 0962-8436 Impact factor: 6.237
Figure 1. Upper panel: a schematic of a hierarchical generative model with discrete states. The key feature of this model is that it entertains a subset of hidden states called control states. The transitions among one subset depend upon the state occupied in the other. Lower panels: an example of a particular model with two control states: reject (stay) or accept (shift). The control state determines transitions among hidden states that comprise a low offer (first state), a high offer (second state), a no-offer state (third state) and absorbing states that are entered whenever a low (fourth state) or high (fifth state) offer is accepted. The probability of moving from one state to another is unity, unless otherwise specified by the transition probabilities shown in the middle row. The (hazard rate) parameter r controls the rate of offer withdrawal. Note that absorbing states, which re-enter themselves with unit probability, render this Markovian process irreversible. We will use this example in later simulations of choice behaviour.
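The five-state model in this caption can be sketched as a pair of transition matrices, one per control state. This is a hedged illustration in Python, not the matrices from the figure: the caption names only the hazard rate r, so the parameter `q` (the probability that a low offer becomes a high offer), the assumption that a rejected high offer persists with probability 1 - r, and the function name `offer_transitions` are all hypothetical.

```python
import numpy as np

STATES = ["low offer", "high offer", "no offer",
          "accepted low", "accepted high"]

def offer_transitions(r, q):
    """Column-stochastic matrices B[u][next_state, current_state].

    r: hazard rate of offer withdrawal (named in the caption).
    q: probability that a low offer becomes a high offer
       (hypothetical; the caption does not give this value).
    """
    if r < 0 or q < 0 or r + q > 1:
        raise ValueError("need r >= 0, q >= 0 and r + q <= 1")

    # Reject (stay): start from the identity, so the no-offer state and
    # both absorbing states re-enter themselves with unit probability.
    reject = np.eye(5)
    reject[:, 0] = [1 - r - q, q, r, 0, 0]  # low: persists, improves or is withdrawn
    reject[:, 1] = [0, 1 - r, r, 0, 0]      # high: persists or is withdrawn (assumed)

    # Accept (shift): an offer on the table moves to its absorbing state.
    accept = np.eye(5)
    accept[:, 0] = [0, 0, 0, 1, 0]          # low offer -> accepted low
    accept[:, 1] = [0, 0, 0, 0, 1]          # high offer -> accepted high

    return {"reject": reject, "accept": accept}

B = offer_transitions(r=0.1, q=0.2)
state = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # start with a low offer
after_accept = B["accept"] @ state           # all mass on "accepted low"
```

Each column sums to one, and because the last two states map only to themselves under either control state, no sequence of actions can leave them, which is the irreversibility the caption notes.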
Figure 2. This figure illustrates the cognitive and functional anatomy implied by the variational scheme or, more precisely, the mean field assumption implicit in variational updates. Here, we have associated the variational updates of expected states with perception, of future control states (policies) with action selection and, finally, expected precision with evaluation. The updates suggest that the expectations from each subset are passed among the subsets until convergence to an internally consistent (Bayes optimal) solution. In terms of neuronal implementation, this might be likened to the exchange of neuronal signals via extrinsic connections among functionally specialized brain systems. In this (purely iconic) schematic, we have associated perception (inference about the current state of the world) with the prefrontal cortex, while assigning action selection to the basal ganglia. Crucially, precision has been associated with dopaminergic projections from VTA and SN. See main text for a full description of the equations.
Figure 3. This figure shows the results of a simulation of 16 trials, where a low offer was replaced by a high offer on the 11th trial, which was accepted on the subsequent trial. Panel (a) shows the expected states as a function of trials or time, where the states are defined in figure 1. Panel (b) shows the corresponding expectations about control in the future, where the dotted lines are expectations during earlier trials and the full lines correspond to expectations during the final trial. Black corresponds to reject (stay) and grey to accept (shift). Panels (c,d) show the time-dependent changes in expected precision, after convergence on each trial (c) and deconvolved updates after each iteration of the variational updates (d).