Inverse Belief Dynamics from Ecological Task Behavior

Related Material

Banff International Research Station (BIRS) for Mathematical Innovation and Discovery

Schrater, Paul

Formal Metadata

Title

Inverse Belief Dynamics from Ecological Task Behavior

Series Title

Optimal Neuroethology of Movement and Motor Control (19w5235)

Number of Parts

Author

Schrater, Paul

License

CC Attribution - NonCommercial - NoDerivatives 4.0 International:
You may use, copy, distribute, and make publicly available the work or its content in unchanged form for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they have specified.

Identifiers

10.5446/60663 (DOI)

Publisher

Banff International Research Station (BIRS) for Mathematical Innovation and Discovery

Release Year

2019

Language

English

Content Metadata

Subject Area

Life Sciences / Biology, Mathematics, Engineering

Genre

Workshop/Interactive Format, Lecture

Abstract

Complex ecological behaviors are often driven by an internal model, which integrates sensory information over time and facilitates long-term planning. Inferring this internal model is crucial for interpreting the neural activity of agents and is beneficial for imitation learning. We introduce methods to infer an agent's internal model and dynamic beliefs for dynamic foraging and game-like navigation tasks. We model agents as rational according to their (possibly defective) understanding of the task and of the relevant causal variables that cannot be fully observed. Using a novel gradient-based constrained EM algorithm, we show that it is possible to invert a Partially Observable Markov Decision Process (POMDP) from behavior with unknown transition dynamics, partially unknown observation functions, and parametrically unknown rewards. We allow that the agent may hold wrong assumptions about the task, and our method learns these assumptions from the agent's actions. We validate our method on simulated agents performing suboptimally on a foraging task and successfully recover the agent's actual model. We show how to extend this approach to a larger range of ecological tasks. The result is a method for eliciting trajectories of latent belief states from behavior, which can serve as a powerful tool for interpreting neural activity.
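
To make the idea of inferring an agent's assumptions from its actions concrete, the following is a minimal sketch in Python with NumPy. It is not the gradient-based constrained EM algorithm from the talk. It assumes a hypothetical two-state foraging world (food absent or present), a fixed observation model `O`, a logistic choice rule with an assumed sharpness `beta`, and a single unknown parameter `p_appear` (the agent's assumed probability that food appears), which is scored by a plain grid search over the action likelihood. All names and numbers here are illustrative assumptions, not values from the source.

```python
import numpy as np

def belief_update(b, T, O, obs):
    """One Bayes-filter step: predict with T, then weight by the observation likelihood."""
    predicted = T.T @ b                # T[s, s2] = P(s2 | s)
    posterior = O[:, obs] * predicted  # O[s, o]  = P(o | s)
    return posterior / posterior.sum()

def belief_trajectory(T, O, observations, b0):
    """Beliefs after each observation, starting from b0 (row 0 is b0 itself)."""
    b = np.asarray(b0, dtype=float)
    trajectory = [b]
    for obs in observations:
        b = belief_update(b, T, O, obs)
        trajectory.append(b)
    return np.array(trajectory)

def assumed_transition(p_appear, p_vanish=0.1):
    """Hypothetical internal model: state 0 = no food, state 1 = food."""
    return np.array([[1.0 - p_appear, p_appear],
                     [p_vanish, 1.0 - p_vanish]])

# Assumed observation model: rows are states, columns are observations (0 or 1).
O = np.array([[0.8, 0.2],
              [0.3, 0.7]])

def action_log_likelihood(p_appear, observations, actions, beta=8.0):
    """Log-likelihood of binary actions (1 = harvest) under a logistic rule on the belief in food."""
    T = assumed_transition(p_appear)
    belief_in_food = belief_trajectory(T, O, observations, [1.0, 0.0])[1:, 1]
    p_harvest = 1.0 / (1.0 + np.exp(-beta * (belief_in_food - 0.5)))
    actions = np.asarray(actions)
    return np.sum(np.where(actions == 1, np.log(p_harvest), np.log1p(-p_harvest)))

# A fixed observation sequence, and the actions of a simulated agent that
# wrongly assumes food appears rarely (p_appear = 0.05) and harvests
# whenever its belief in food exceeds 0.5.
observations = [1, 1, 0, 1, 1, 0, 0, 1]
agent_beliefs = belief_trajectory(assumed_transition(0.05), O, observations, [1.0, 0.0])
actions = (agent_beliefs[1:, 1] > 0.5).astype(int)

# Grid search over the assumed parameter (a stand-in for EM, for illustration only).
grid = np.linspace(0.01, 0.8, 80)
scores = [action_log_likelihood(p, observations, actions) for p in grid]
best_p = grid[int(np.argmax(scores))]
print("candidate p_appear with highest action likelihood:", best_p)
```

The same observation sequence yields different belief trajectories under different assumed transition models, which is why the actions carry information about the agent's internal model. With a sequence this short the estimate is coarse, and a realistic setting would also have to infer the observation and reward parameters, as the abstract describes.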