An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

12/10/2020
by   Kyunghyun Lee, et al.

Deep reinforcement learning (DRL) algorithms and evolution strategies (ES) have been applied to various tasks with excellent performance. The two have complementary properties: DRL offers good sample efficiency but poor stability, whereas ES offers good stability but poor sample efficiency. Recent attempts to combine these algorithms rely entirely on a synchronous update scheme, which is not ideal for maximizing the benefits of parallelism in ES. An asynchronous update scheme addresses this limitation, offering good time efficiency and diverse policy exploration. In this paper, we introduce Asynchronous Evolution Strategy-Reinforcement Learning (AES-RL), which maximizes the parallel efficiency of ES and integrates it with policy gradient methods. Specifically, we propose 1) a novel framework to merge ES and DRL asynchronously and 2) various asynchronous update methods that take full advantage of asynchronism, ES, and DRL, namely exploration and time efficiency, stability, and sample efficiency, respectively. The proposed framework and update methods are evaluated on continuous control benchmarks, showing superior performance as well as time efficiency compared to previous methods.
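To make the synchronous/asynchronous distinction concrete, the following is a minimal sketch of an asynchronous evolutionary loop, assuming a toy one-dimensional objective and a simple accept-if-better rule on the search mean. It is not the AES-RL update rule, which the abstract does not specify. The names `fitness`, `evaluate`, and `async_es`, and all parameter values, are hypothetical. The point it illustrates is that the mean is updated as soon as any single evaluation returns, rather than after the whole population has been evaluated.

```python
import random
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def fitness(x):
    # Toy objective (assumption): maximum of 0 at x = 3.
    return -(x - 3.0) ** 2


def evaluate(x):
    # Stand-in for a policy rollout; returns the candidate with its score.
    return x, fitness(x)


def async_es(mean=0.0, sigma=0.5, workers=4, budget=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    mean_fit = fitness(mean)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Start one evaluation per worker.
        pending = {
            pool.submit(evaluate, mean + sigma * rng.gauss(0.0, 1.0))
            for _ in range(workers)
        }
        submitted = workers
        while pending:
            # Wake up as soon as ANY worker finishes (no population barrier).
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                x, f = fut.result()
                # Illustrative rule (assumption): move the mean toward a
                # candidate only if it beats the current mean.
                if f > mean_fit:
                    mean += lr * (x - mean)
                    mean_fit = fitness(mean)
                # Immediately hand the free worker a new candidate sampled
                # around the latest mean.
                if submitted < budget:
                    pending.add(
                        pool.submit(evaluate, mean + sigma * rng.gauss(0.0, 1.0))
                    )
                    submitted += 1
    return mean


if __name__ == "__main__":
    print(async_es())
```

In a synchronous scheme, the loop would instead wait for all workers before updating, so the slowest evaluation would set the pace of every generation. Here the evaluations are trivially fast; the benefit would appear only when rollouts take varying amounts of time.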

