IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks

by   Michael Luo, et al.

The practical usage of reinforcement learning agents is often bottlenecked by the duration of training time. To accelerate training, practitioners often turn to distributed reinforcement learning architectures to parallelize and accelerate the training process. However, modern methods for scalable reinforcement learning (RL) often tradeoff between the throughput of samples that an RL agent can learn from (sample throughput) and the quality of learning from each sample (sample efficiency). In these scalable RL architectures, as one increases sample throughput (i.e. increasing parallelization in IMPALA), sample efficiency drops significantly. To address this, we propose a new distributed reinforcement learning algorithm, IMPACT. IMPACT extends IMPALA with three changes: a target network for stabilizing the surrogate objective, a circular buffer, and truncated importance sampling. In discrete action-space environments, we show that IMPACT attains higher reward and, simultaneously, achieves up to 30 continuous control environments, IMPACT trains faster than existing scalable agents while preserving the sample efficiency of synchronous PPO.


page 1

page 2

page 3

page 4


Sample Efficient Ensemble Learning with Catalyst.RL

We present Catalyst.RL, an open-source PyTorch framework for reproducibl...

High-Throughput Synchronous Deep RL

Deep reinforcement learning (RL) is computationally demanding and requir...

NAPPO: Modular and scalable reinforcement learning in pytorch

Reinforcement learning (RL) has been very successful in recent years but...

Hierarchical Reinforcement Learning with Hindsight

Reinforcement Learning (RL) algorithms can suffer from poor sample effic...

Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning

Increasing the scale of reinforcement learning experiments has allowed r...

Podracer architectures for scalable Reinforcement Learning

Supporting state-of-the-art AI research requires balancing rapid prototy...

Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

Multi-simulator training has contributed to the recent success of Deep R...

Please sign up or login with your details

Forgot password? Click here to reset