Specification Inference from Demonstrations

Learning from expert demonstrations has received considerable attention in artificial intelligence and machine learning. The goal is to infer the underlying reward function that an agent is optimizing, given a set of observations of the agent's behavior over time in a variety of circumstances, the system state trajectories, and a plant model specifying how the system state evolves under different agent actions. The system is often modeled as a Markov decision process: the next state depends only on the current state and the agent's action, and the agent's choice of action depends only on the current state. While the former is a Markovian assumption on the evolution of the system state, the latter assumes that the target reward function is itself Markovian. In this work, we explore learning a class of non-Markovian reward functions, known in the formal methods literature as specifications. These specifications offer better composition, transferability, and interpretability. We then show that inferring the specification can be done efficiently without unrolling the transition system. We demonstrate the approach on a 2-d grid world example.
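To make the distinction between a Markovian reward and a specification concrete, the following is a minimal sketch and not the paper's method. It assumes a specification can be treated as a Boolean property of a whole trajectory in a 2-d grid world. The cell labels `GOAL` and `LAVA`, the function names, and the two trajectories are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

State = Tuple[int, int]      # (row, col) cell in a 2-d grid world
Trajectory = List[State]

# Hypothetical labelled cells; not taken from the paper.
GOAL: State = (2, 2)
LAVA = {(1, 1)}


def markovian_reward(state: State) -> float:
    """Reward that depends only on the current state."""
    return 1.0 if state == GOAL else 0.0


def spec_reach_goal_avoid_lava(traj: Trajectory) -> bool:
    """Illustrative specification: eventually visit GOAL, never visit LAVA.

    The verdict depends on the entire trajectory, so it cannot be read off
    from the current state alone.
    """
    return GOAL in traj and not any(s in LAVA for s in traj)


safe = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
unsafe = [(0, 0), (1, 1), (1, 2), (2, 2)]

print(spec_reach_goal_avoid_lava(safe))    # True
print(spec_reach_goal_avoid_lava(unsafe))  # False
# Both trajectories end in GOAL, so the final-state reward is identical.
print(markovian_reward(safe[-1]), markovian_reward(unsafe[-1]))  # 1.0 1.0
```

Both trajectories end in the same state and collect the same final-state reward, yet only one satisfies the specification. That history dependence is what makes this kind of reward non-Markovian.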



Imitation from Observation With Bootstrapped Contrastive Learning

Imitation from observation (IfO) is a learning paradigm that consists of...

Task-Guided Inverse Reinforcement Learning Under Partial Information

We study the problem of inverse reinforcement learning (IRL), where the ...

Online Learning of Non-Markovian Reward Models

There are situations in which an agent should receive rewards only after...

Safety-Aware Apprenticeship Learning

Apprenticeship learning (AL) is a class of "learning from demonstrations...

CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem

Inverse reinforcement learning (IRL) is used to infer the reward functio...

Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning

In this work, we consider one-shot imitation learning for object rearran...

Training an Interactive Helper

Developing agents that can quickly adapt their behavior to new tasks rem...
