
Inverse Rational Control with Partially Observable Continuous Nonlinear Dynamics.

Minhae Kwon, Saurabh Daptardar, Paul Schrater, Xaq Pitkow.

Abstract

A fundamental question in neuroscience is how the brain creates an internal model of the world to guide actions using sequences of ambiguous sensory information. This is naturally formulated as a reinforcement learning problem under partial observations, where an agent must estimate relevant latent variables in the world from its evidence, anticipate possible future states, and choose actions that optimize total expected reward. This problem can be solved by control theory, which allows us to find the optimal actions for a given system dynamics and objective function. However, animals often appear to behave suboptimally. Why? We hypothesize that animals have their own flawed internal model of the world, and choose actions with the highest expected subjective reward according to that flawed model. We describe this behavior as rational but not optimal. The problem of Inverse Rational Control (IRC) aims to identify which internal model would best explain an agent's actions. Our contribution here generalizes past work on Inverse Rational Control which solved this problem for discrete control in partially observable Markov decision processes. Here we accommodate continuous nonlinear dynamics and continuous actions, and impute sensory observations corrupted by unknown noise that is private to the animal. We first build an optimal Bayesian agent that learns an optimal policy generalized over the entire model space of dynamics and subjective rewards using deep reinforcement learning. Crucially, this allows us to compute a likelihood over models for experimentally observable action trajectories acquired from a suboptimal agent. We then find the model parameters that maximize the likelihood using gradient ascent. Our method successfully recovers the true model of rational agents. This approach provides a foundation for interpreting the behavioral and neural dynamics of animal brains during complex tasks.
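The last step described in the abstract, finding the model parameters that maximize the likelihood of observed actions by gradient ascent, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It makes strong simplifying assumptions: the agent is assumed to observe the state directly (so the belief update and the marginalization over the agent's private observations are omitted), and a hand-written linear policy stands in for the policy learned with deep reinforcement learning. The names `policy_mean`, `simulate`, `fit_theta`, the scalar parameter `theta`, and all noise levels are hypothetical and chosen only for illustration.

```python
import numpy as np

ACTION_NOISE = 0.5   # assumed std of the agent's action noise (illustrative)
PROCESS_NOISE = 0.5  # assumed std of the world's process noise (illustrative)

def policy_mean(state, theta):
    # Hypothetical stand-in for a policy conditioned on the internal model:
    # the agent steers toward zero with a gain set by its parameter theta.
    return -theta * state

def simulate(theta, n_steps=2000, seed=0):
    # Generate a state/action trajectory from an agent whose internal
    # model parameter is theta. The state is fully observed here.
    rng = np.random.default_rng(seed)
    states = np.empty(n_steps)
    actions = np.empty(n_steps)
    x = 1.0
    for t in range(n_steps):
        a = policy_mean(x, theta) + ACTION_NOISE * rng.standard_normal()
        states[t], actions[t] = x, a
        x = x + a + PROCESS_NOISE * rng.standard_normal()
    return states, actions

def mean_log_likelihood(theta, states, actions):
    # Gaussian log-likelihood of the actions under the policy,
    # averaged over time steps and dropping additive constants.
    resid = actions - policy_mean(states, theta)
    return -0.5 * np.mean(resid ** 2) / ACTION_NOISE ** 2

def grad_mean_log_likelihood(theta, states, actions):
    # Analytic derivative of mean_log_likelihood with respect to theta.
    # d(policy_mean)/d(theta) = -state, hence the factor (-states).
    resid = actions - policy_mean(states, theta)
    return np.mean(resid * (-states)) / ACTION_NOISE ** 2

def fit_theta(states, actions, theta0=0.0, lr=0.1, n_iters=500):
    # Plain gradient ascent on the mean log-likelihood.
    theta = theta0
    for _ in range(n_iters):
        theta = theta + lr * grad_mean_log_likelihood(theta, states, actions)
    return theta
```

For example, `states, actions = simulate(0.6)` followed by `fit_theta(states, actions)` should return an estimate close to the generating value of 0.6. In the partially observable setting the abstract describes, the likelihood would additionally have to account for the agent's unobserved sensory observations, which this sketch leaves out.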


Year: 2020; PMID: 34712038; PMCID: PMC8549572

Source DB: PubMed; Journal: Adv Neural Inf Process Syst; ISSN: 1049-5258


