Agent-State Construction with Auxiliary Inputs

by Ruo Yu Tao, et al.

In many, if not all, realistic sequential decision-making tasks, the decision-making agent cannot model the full complexity of the world. The environment is often much larger and more complex than the agent, a setting known as partial observability. In such settings, the agent must leverage more than just the current sensory inputs; it must construct an agent state that summarizes previous interactions with the world. Currently, a popular approach for tackling this problem is to learn the agent-state function via a recurrent network, taking the agent's sensory stream as input. Many impressive reinforcement learning applications have instead relied on environment-specific functions that augment the agent's inputs for history summarization. These augmentations take many forms, from simple approaches like concatenating observations to more complex ones such as uncertainty estimates. Although ubiquitous in the field, these additional inputs, which we term auxiliary inputs, are rarely emphasized, and it is not clear what their role or impact is. In this work we explore this idea further, and relate these auxiliary inputs to prior classic approaches to state construction. We present a series of examples illustrating the different ways of using auxiliary inputs for reinforcement learning. We show that these auxiliary inputs can be used to discriminate between observations that would otherwise be aliased, leading to more expressive features that smoothly interpolate between different states. Finally, we show that this approach is complementary to state-of-the-art methods such as recurrent neural networks and truncated back-propagation through time, and acts as a heuristic that facilitates longer temporal credit assignment, leading to better performance.
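To make the idea concrete, here is a minimal sketch of one classic auxiliary input: an exponentially decaying trace of past observations, concatenated with the current observation to form the agent state. The class name, decay parameter, and update rule here are illustrative assumptions, not the paper's exact formulation; the point is only that two identical observations reached via different histories yield different agent states, resolving aliasing.

```python
import numpy as np

class DecayingTraceAgentState:
    """Illustrative agent-state function: augments the current observation
    with an exponentially decaying trace of past observations, one simple
    form of auxiliary input for disambiguating aliased observations."""

    def __init__(self, obs_dim, decay=0.9):
        self.decay = decay               # trace decay rate (illustrative choice)
        self.trace = np.zeros(obs_dim)   # running summary of past observations

    def reset(self):
        # Clear the trace at the start of an episode.
        self.trace = np.zeros_like(self.trace)

    def update(self, obs):
        # trace <- decay * trace + (1 - decay) * obs
        self.trace = self.decay * self.trace + (1.0 - self.decay) * obs
        # Agent state = current observation concatenated with the trace.
        return np.concatenate([obs, self.trace])

# Two histories ending in the same observation produce different agent states.
state_fn = DecayingTraceAgentState(obs_dim=2)
state_fn.update(np.array([1.0, 0.0]))
s_a = state_fn.update(np.array([0.0, 0.0]))   # history A, current obs [0, 0]

state_fn.reset()
state_fn.update(np.array([0.0, 1.0]))
s_b = state_fn.update(np.array([0.0, 0.0]))   # history B, same current obs
```

Because the trace carries information from earlier steps, `s_a` and `s_b` differ even though the final observation is identical, which is exactly the kind of discrimination between otherwise-aliased observations the abstract describes.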

