Concurrent Meta Reinforcement Learning

by Emilio Parisotto, et al.

State-of-the-art meta reinforcement learning algorithms typically assume the setting of a single agent interacting with its environment in a sequential manner. A negative side-effect of this sequential execution paradigm is that, as the environment becomes more challenging and thus requires more interaction episodes for the meta-learner, the agent must reason over longer and longer time-scales. To combat the difficulty of long time-scale credit assignment, we propose an alternative parallel framework, which we name "Concurrent Meta-Reinforcement Learning" (CMRL), that transforms the temporal credit assignment problem into a multi-agent reinforcement learning one. In this multi-agent setting, a set of parallel agents is executed in the same environment, and each of these "rollout" agents is given the means to communicate with the others. The goal of the communication is to coordinate, in a collaborative manner, the most efficient exploration of the shared task the agents are currently assigned. This coordination represents the meta-learning aspect of the framework, as each agent can be assigned, or can assign itself, a particular section of the current task's state space. The framework contrasts with standard RL methods, which assume that each parallel rollout occurs independently and can therefore waste computation if many of the rollouts end up sampling the same part of the state space. Furthermore, the parallel setting enables us to define several reward sharing functions and auxiliary losses that are non-trivial to apply in the sequential setting. We demonstrate the effectiveness of our proposed CMRL at improving over sequential methods in a variety of challenging tasks.
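The abstract does not specify which reward sharing functions are used, so the following is only a minimal sketch of the general idea under stated assumptions. The function names (`share_none`, `share_mean`, `share_max`, `state_coverage`) and the choice of mean and max sharing are hypothetical illustrations, not the authors' definitions. The sketch shows how per-agent episode returns from parallel rollouts might be combined, and how overlap between rollouts can be measured as the number of distinct states visited.

```python
from typing import List, Set

# All names below are hypothetical; the abstract does not define
# concrete reward sharing functions.

def share_none(returns: List[float]) -> List[float]:
    """Independent rollouts: each agent keeps its own return."""
    return list(returns)

def share_mean(returns: List[float]) -> List[float]:
    """Assumed sharing rule: every agent receives the team average."""
    mean = sum(returns) / len(returns)
    return [mean] * len(returns)

def share_max(returns: List[float]) -> List[float]:
    """Assumed sharing rule: every agent receives the best agent's return."""
    best = max(returns)
    return [best] * len(returns)

def state_coverage(visited: List[Set[int]]) -> int:
    """Number of distinct states visited across all parallel agents."""
    return len(set().union(*visited))

if __name__ == "__main__":
    returns = [1.0, 0.0, 2.0]
    print(share_mean(returns))
    print(share_max(returns))
    # Two rollouts that overlap on state 2 cover fewer distinct states
    # than two rollouts that split the state space between them.
    print(state_coverage([{1, 2}, {2, 3}]))
    print(state_coverage([{1, 2}, {3, 4}]))
```

Under a max-style rule, an agent that explores an unrewarding region is not penalized as long as some other agent finds reward. This is one plausible way to encourage agents to divide the state space rather than resample the same part of it.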




Off-Beat Multi-Agent Reinforcement Learning

We investigate model-free multi-agent reinforcement learning (MARL) in e...

Improving Coordination in Multi-Agent Deep Reinforcement Learning through Memory-driven Communication

Deep reinforcement learning algorithms have recently been used to train ...

Multi-agent Reinforcement Learning Improvement in a Dynamic Environment Using Knowledge Transfer

Cooperative multi-agent systems are being widely used in variety of area...

Agent-State Construction with Auxiliary Inputs

In many, if not every realistic sequential decision-making task, the dec...

Credit Assignment with Meta-Policy Gradient for Multi-Agent Reinforcement Learning

Reward decomposition is a critical problem in centralized training with ...

On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning

The creation and destruction of agents in cooperative multi-agent reinfo...

Analyzing Micro-Founded General Equilibrium Models with Many Agents using Deep Reinforcement Learning

Real economies can be modeled as a sequential imperfect-information game...
