When Can Predictive Brains be Truly Bayesian?

Mark Blokpoel, Johan Kwisthout, Iris van Rooij

Year:  2012        PMID: 23162491      PMCID: PMC3491582          DOI: 10.3389/fpsyg.2012.00406

Source DB:  PubMed          Journal:  Front Psychol        ISSN: 1664-1078


“It is thus a major virtue of the hierarchical predictive coding account that it effectively implements a computationally tractable version of the so-called Bayesian Brain Hypothesis.” (Clark, in press)
It seems by now common wisdom that a brain organized according to the principles of hierarchical predictive coding is a brain that is capable of efficiently performing full-blown Bayesian inferences. The idea is not only common, but also of great significance, as it suggests that the hierarchical predictive coding framework may provide a neurally plausible and computationally feasible bridge between theories of neural functioning (Friston, 2005) and theories of cognitive functioning (Chater and Manning, 2006; Baker et al., 2009). But can predictive brains really be the same as Bayesian brains? Or is the claim merely an informal or imprecise shorthand for something which is formally and factually false?
We address these questions by reconsidering the formal specifications of the theory of hierarchical predictive coding, as put forth by Friston (2002, 2005). In the hierarchical predictive coding framework, it is assumed that the brain represents the statistical structure of the world at different levels of abstraction by maintaining different causal models that are organized on different levels of a hierarchy, where each level obtains input from its subordinate level. In a feed-backward chain, predictions are made for the level below. The error between the model’s predicted input and the observed (for the lowest level) or inferred (for higher levels) input at that level is used (a) in a feed-forward chain to estimate the causes at the level above and (b) to reconfigure the causal models for future predictions. Ultimately, the system stabilizes when it has minimized the overall prediction error. Here we will focus on (a) the cause estimation step in the feed-forward chain.
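As a minimal sketch of the cause estimation step (a), consider a single level with one scalar cause v and one scalar input u. The linear generative model u = w·v + noise, the Gaussian prior on v, and every parameter value below are illustrative assumptions, not part of Friston's (2002, 2005) formulation. In this deliberately simple case, gradient ascent driven by the prediction error converges to the most probable cause, which can also be written in closed form for comparison.

```python
def estimate_cause(u, w=2.0, noise_var=0.5, prior_var=1.0, lr=0.05, steps=500):
    """Gradient ascent on log p(v | u) for the assumed toy model
    u = w * v + Gaussian noise, with prior v ~ N(0, prior_var)."""
    v = 0.0  # start at the prior mean
    for _ in range(steps):
        prediction = w * v            # top-down prediction of the input
        error = u - prediction        # prediction error at the level below
        # Gradient of the log posterior: likelihood term plus prior term.
        grad = w * error / noise_var - v / prior_var
        v += lr * grad
    return v


def closed_form_map(u, w=2.0, noise_var=0.5, prior_var=1.0):
    """Exact maximiser of the same log posterior (linear-Gaussian case only)."""
    precision = w * w / noise_var + 1.0 / prior_var
    return (w * u / noise_var) / precision


if __name__ == "__main__":
    print(estimate_cause(3.0), closed_form_map(3.0))
```

With the default values and u = 3.0, both functions return approximately 1.333. The agreement relies on the posterior having a single peak, which is a property of this simple model and not of causal models in general.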
We will argue that the predictive coding framework does not yet satisfactorily specify how this step can be both Bayesian and computationally tractable.
In the Bayesian interpretation of predictive coding (Friston, 2002), estimating the causes comes down to finding the most probable causes v given the input u for that level and the current model parameters θ:
$$v = \arg\max_{v'} \, p(v' \mid u; \theta)$$
Given that v has maximum a posteriori probability (MAP), the idea that predictive coding implements Bayesian inference seems to hinge on this step. The idea that hierarchical predictive coding implements tractable Bayesian inference in turn hinges on the presumed existence of a tractable computational method for estimating v. Given that it is known that computing MAP, whether exactly or approximately, is computationally intractable for arbitrary causal structures (Shimony, 1994; Abdelbar and Hedetniemi, 1998; Kwisthout, 2011), the existence of a tractable method crucially depends on the structural properties of the brain’s causal models (Kwisthout et al., 2011). At present, the hierarchical predictive coding framework does not yet make stringent commitments as to the nature of the causal models that the brain can represent. Hence, contrary to suggestions by Clark (in press), the framework does not yet have the virtue that it effectively implements tractable Bayesian inference. At this point in time three mutually exclusive options remain open: either predictive coding does not implement Bayesian inference, or predictive coding is not tractable, or the theory of hierarchical predictive coding is enriched by specific assumptions about the structure of the brain’s causal models. Assuming that one is committed to the Bayesian Brain Hypothesis, the first two options are out and the third is the only one remaining.
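To make the cost of exact MAP estimation concrete, the following sketch finds the most probable assignment of n binary causes by exhaustive search. The function `toy_log_posterior` is a hypothetical stand-in for an unnormalised log p(v | u; θ) and is not taken from any of the cited models. The point of the sketch is only the count of candidates, which doubles with every added cause; the cited intractability results concern the MAP problem itself rather than this particular naive method.

```python
from itertools import product


def exact_map(n_causes, log_posterior):
    """Exhaustive MAP search over all assignments of n binary causes.

    Returns the best assignment and the number of assignments evaluated.
    """
    best, best_score, evaluated = None, float("-inf"), 0
    for v in product((0, 1), repeat=n_causes):
        evaluated += 1
        score = log_posterior(v)
        if score > best_score:
            best, best_score = v, score
    return best, evaluated


def toy_log_posterior(v):
    """Hypothetical score: neighbouring causes prefer to agree, and the
    first cause is slightly more probable when it equals 1."""
    agreement = sum(1.0 for a, b in zip(v, v[1:]) if a == b)
    return agreement + 0.5 * v[0]


if __name__ == "__main__":
    for n in (4, 10, 16):
        best, evaluated = exact_map(n, toy_log_posterior)
        print(n, evaluated, best)
```

For n = 4, 10, and 16 the search evaluates 16, 1024, and 65536 assignments respectively, which is why structural constraints on the causal model, rather than faster enumeration, are what could make the step tractable.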
Formal analyses expanding on this option are beyond the scope of this commentary (see e.g., Blokpoel et al., 2010; van Rooij et al., 2011), but Table 1 qualitatively sketches the space of causal models that could (or could not) yield tractable Bayesian cause estimation. We will discuss the viability of the options in more detail below.
Table 1

For which types of causal models do there exist methods for cause estimation that are both tractable and Bayesian?

Structure of causal models   Method used for cause estimation   Bayesian   Tractable
Simple                       Heuristic                          Yes        Yes
Simple                       Approximate                        Yes        Yes
Intermediate                 Heuristic                          Maybe      Yes
Intermediate                 Approximate                        Yes        Maybe
Unconstrained                Heuristic                          No         Yes
Unconstrained                Approximate                        Yes        No
To start, causal models could be assumed to be quite simple, e.g., having a high degree of statistical independence among variables. In this case, it may be that heuristic methods, such as those based on gradient ascent (Friston, 2002, p. 13) or a Kalman filter (Rao and Ballard, 1999), yield tractable Bayesian cause estimation. Let’s assume that they do. Then, of course, tractable approximation methods also exist for those simple structures, the heuristics themselves being a case in point. Note, however, that a commitment to such simple causal models may limit the scope of the predictive coding theory to simple or low-level forms of perception and cognition. After all, higher-order causal reasoning, such as occurs, for instance, in Theory of Mind (Kilner et al., 2007), seems to presuppose quite sophisticated causal structures containing complex statistical interdependencies (see Figure 1 for an illustration; cf. Uithol et al., 2011). Complex causal models can allow for rugged probability landscapes of different possible causes, and heuristic methods can get stuck in local optima that may be arbitrarily far off from the true Bayesian (i.e., MAP) solution. For complex causal structures, heuristics are thus not guaranteed to do anything remotely like approximating Bayesian inference.
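The local-optimum problem can be illustrated with a small sketch. The model below is a hypothetical toy, not one proposed in the cited work: three binary causes that are each improbable alone (negative `UNARY` terms) but probable together (positive `PAIRWISE` terms). A greedy hill climber that flips one cause at a time, used here as a discrete stand-in for gradient ascent, stops at the all-zero assignment, whereas exhaustive search finds the true MAP assignment with all causes active.

```python
from itertools import product

# Hypothetical unnormalised log-posterior terms (illustrative values only).
UNARY = [-1.0, -1.0, -1.0]                           # each cause alone is unlikely
PAIRWISE = {(0, 1): 1.5, (0, 2): 1.5, (1, 2): 1.5}   # causes support each other


def log_score(v):
    """Unnormalised log posterior of a binary assignment v."""
    total = sum(b * x for b, x in zip(UNARY, v))
    total += sum(w * v[i] * v[j] for (i, j), w in PAIRWISE.items())
    return total


def hill_climb(start):
    """Greedy local search: flip the single cause that most improves the
    score, and stop when no single flip improves it."""
    current = tuple(start)
    while True:
        neighbours = [
            current[:i] + (1 - current[i],) + current[i + 1:]
            for i in range(len(current))
        ]
        best = max(neighbours, key=log_score)
        if log_score(best) <= log_score(current):
            return current
        current = best


def exact_map():
    """Exhaustive search over all assignments, for comparison."""
    return max(product((0, 1), repeat=len(UNARY)), key=log_score)


if __name__ == "__main__":
    print(hill_climb((0, 0, 0)), exact_map())
```

Started from (0, 0, 0), the hill climber returns (0, 0, 0) with score 0.0, while the exact MAP assignment is (1, 1, 1) with score 1.5. Started from (1, 1, 0) it does reach (1, 1, 1), so the outcome of the heuristic depends on the starting point, which is exactly what interdependencies among causes make possible.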
Figure 1

An illustration of a hierarchy with higher level complex causal models. The illustration builds on the Jekyll and Hyde example used by Kilner et al. (2007). Kilner et al. assumed four different levels and simple mappings between the levels. For example, if at the higher level one infers that the person grasping the scalpel is Dr. Jekyll (or Mr. Hyde) then at the lower level one predicts the intention is to heal (or to hurt). The Figure illustrates that at higher levels of the hierarchy the causal models within a level can become quite complex. Whether one infers that the person is Jekyll or Hyde can depend on a myriad of interconnected variables, such as the present location, the health status of the patient, the weather, and the person’s mood. Note that this complexity cannot be dissolved by decomposing the complex causal model into simple causal models at higher levels of the hierarchy, because complex models cannot generally be so decomposed. So it seems that if one wants to use the hierarchical predictive coding framework to explain high-level cognition, then complex models within levels are required.

Given that the hierarchical predictive coding framework seems to aspire to span all levels of cognitive functioning, it probably does not want to commit to simple causal models. The other extreme, i.e., that the brain’s causal models are structurally unconstrained, is also excluded. As explained above, it follows from known intractability results for approximating MAP (Shimony, 1994; Abdelbar and Hedetniemi, 1998; Kwisthout, 2011) that such a brain cannot implement tractable Bayesian inference. We are thus left with the intermediate option: The causal models represented by the brain can be complex but not arbitrarily so.
Given that the exact nature of this causal complexity will determine whether or not a hierarchical predictive coding architecture can implement tractable Bayesian inference, it seems vital for the viability of the marriage between the predictive coding framework and the Bayesian Brain Hypothesis to identify exactly what this nature is. There is a strong appeal to the Bayesian Brain Hypothesis, as well as to the hypothesis that the brain implements cognition via hierarchical predictive coding. Given that the statistics of the world do not seem to be arbitrarily complex, it is conceivable that the brain has evolved specifically those constraints on its causal models that afford tractable Bayesian inference via hierarchical predictive coding. The open question remaining is what those constraints could possibly be. This question is particularly pressing, yet non-trivial to answer, if the hierarchical predictive coding account aims to apply to all levels of perception and cognition.
References (9 in total)

1. R P Rao; D H Ballard. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci. 1999-01.
2. Karl Friston. Functional integration and inference in the brain. Prog Neurobiol. 2002-10.
3. Karl Friston. A theory of cortical responses. Philos Trans R Soc Lond B Biol Sci. 2005-04-29.
4. Nick Chater; Christopher D Manning. Probabilistic models of language processing and acquisition. Trends Cogn Sci. 2006-06-19.
5. Iris van Rooij; Johan Kwisthout; Mark Blokpoel; Jakub Szymanik; Todd Wareham; Ivan Toni. Intentional communication: computationally easy or difficult? Front Hum Neurosci. 2011-06-30.
6. Johan Kwisthout; Todd Wareham; Iris van Rooij. Bayesian intractability is not an ailment that approximation can cure. Cogn Sci. 2011-05-24.
7. Andy Clark. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav Brain Sci. 2013-05-10.
8. James M Kilner; Karl J Friston; Chris D Frith. Predictive coding: an account of the mirror neuron system. Cogn Process. 2007-04-12.
9. Chris L Baker; Rebecca Saxe; Joshua B Tenenbaum. Action understanding as inverse planning. Cognition. 2009-09-02.