Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments

by Wilka Carvalho et al.

First-person object-interaction tasks in high-fidelity simulated 3D environments such as the AI2Thor virtual home environment pose significant sample-efficiency challenges for reinforcement learning (RL) agents learning from sparse task rewards. To alleviate these challenges, prior work has provided extensive supervision via a combination of reward shaping, ground-truth object information, and expert demonstrations. In this work, we show that one can learn object-interaction tasks from scratch without such supervision by learning an attentive object-model as an auxiliary task during task learning with an object-centric relational RL agent. Our key insight is that learning an object-model that incorporates object-attention into forward prediction provides a dense learning signal for unsupervised representation learning of both objects and their relationships. This, in turn, enables faster policy learning for an object-centric relational RL agent. We demonstrate our agent by introducing a set of challenging object-interaction tasks in the AI2Thor environment where learning with our attentive object-model is key to strong performance. Specifically, we compare our agent and relational RL agents with alternative auxiliary tasks to a relational RL agent equipped with ground-truth object information, and show that learning with our object-model best closes the performance gap in terms of both learning speed and maximum success rate. Additionally, we find that incorporating object-attention into an object-model's forward predictions is key to learning representations that capture object category and object state.
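To make the core idea concrete, here is a minimal numpy sketch of one step of an attentive forward-prediction model: attention weights over per-object embeddings are computed conditioned on the action, and the attention-weighted summary is used to predict next-step features. This is an illustrative toy, not the authors' architecture; the names `W_q`, `W_out`, and the single-query, linear-predictor setup are assumptions for brevity.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def attentive_forward_prediction(obj_embeds, action_embed, W_q, W_out):
    """One step of a toy attentive object-model.

    obj_embeds:   (n_objects, d) per-object embeddings
    action_embed: (d,) embedding of the agent's action
    W_q, W_out:   (d, d) illustrative projection matrices
    Returns predicted next-step features and the attention weights.
    """
    query = W_q @ action_embed                       # action-conditioned query
    scores = obj_embeds @ query / np.sqrt(len(query))  # scaled dot-product scores
    attn = softmax(scores)                           # (n_objects,) attention over objects
    summary = attn @ obj_embeds                      # attention-weighted object summary
    pred_next = W_out @ summary                      # predicted next-step features
    return pred_next, attn

# Tiny example with random embeddings
rng = np.random.default_rng(0)
d, n = 8, 5
obj = rng.normal(size=(n, d))
act = rng.normal(size=d)
pred, attn = attentive_forward_prediction(
    obj, act, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(pred.shape, attn.shape)
```

In the paper's setting, the prediction error of such a model serves as a dense auxiliary loss: every transition yields a gradient signal for the object representations, even when the sparse task reward is zero.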




