Deconfounded Imitation Learning

11/04/2022
by Risto Vuorio, et al.

Standard imitation learning can fail when the expert demonstrators have different sensory inputs than the imitating agent. This is because partial observability gives rise to hidden confounders in the causal graph. We break down the space of confounded imitation learning problems and identify three settings with different data requirements in which the correct imitation policy can be identified. We then introduce an algorithm for deconfounded imitation learning, which trains an inference model jointly with a latent-conditional policy. At test time, the agent alternates between updating its belief over the latent and acting under that belief. We show in theory and practice that this algorithm converges to the correct interventional policy, solves the confounding issue, and can, under certain assumptions, achieve asymptotically optimal imitation performance.
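
The test-time loop described in the abstract (update the belief over the latent, then act under that belief) can be illustrated with a toy sketch. This is not the authors' implementation. The two-armed setting, the function names (`policy`, `likelihood`, `update_belief`, `act`, `run_episode`), the noise level, and the use of an exact Bayes rule in place of a learned inference model are all assumptions made for illustration.

```python
import random

# Hypothetical toy setting: a hidden latent z in {0, 1} decides which of two
# arms pays off. The expert sees z; the imitating agent does not.
LATENTS = [0, 1]


def policy(z):
    """Latent-conditional policy (assumed already learned): pull arm z."""
    return z


def likelihood(obs, action, z, noise=0.1):
    """P(obs | action, z): a payoff of 1 is likely only if the arm matches z."""
    p_one = 1.0 - noise if action == z else noise
    return p_one if obs == 1 else 1.0 - p_one


def update_belief(belief, action, obs):
    """Bayes update of the belief over z, using the environment's response
    to the agent's own action (a stand-in for a learned inference model)."""
    posterior = [belief[z] * likelihood(obs, action, z) for z in LATENTS]
    total = sum(posterior)
    return [p / total for p in posterior]


def act(belief, rng):
    """Sample a latent from the current belief, then act as if it were true."""
    z = rng.choices(LATENTS, weights=belief)[0]
    return policy(z)


def run_episode(true_z, steps=50, seed=0):
    """Alternate between acting under the belief and updating the belief."""
    rng = random.Random(seed)
    belief = [0.5, 0.5]  # uniform prior over the latent
    for _ in range(steps):
        action = act(belief, rng)
        p_one = 0.9 if action == true_z else 0.1  # matches noise=0.1 above
        obs = 1 if rng.random() < p_one else 0
        belief = update_belief(belief, action, obs)
    return belief
```

In this sketch the belief is updated only from how the environment responds to the agent's own actions, so it concentrates on the true latent as the episode proceeds, and the actions chosen under the belief approach those of the latent-conditional policy.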

