ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search

11/06/2018
by Shangtong Zhang, et al.

In this paper, we propose an actor ensemble algorithm, named ACE, for continuous control with a deterministic policy in reinforcement learning. In ACE, we use an actor ensemble (i.e., multiple actors) to search for the global maximum of the critic. Beyond the ensemble perspective, we also formulate ACE within the option framework by extending the option-critic architecture with deterministic intra-option policies, revealing a relationship between ensembles and options. Furthermore, we perform a look-ahead tree search with those actors and a learned value prediction model, yielding a refined value estimate. We demonstrate a significant performance boost of ACE over DDPG and its variants in challenging physical robot simulators.
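The core ensemble idea described above — each actor proposes an action and the critic keeps the proposal with the highest estimated value — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the linear `tanh` actors and the toy quadratic critic are stand-ins for the neural networks ACE actually trains, and the tree-search and option-critic components are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM, N_ACTORS = 4, 2, 5

# Hypothetical linear actors standing in for the paper's actor networks.
actor_weights = [rng.standard_normal((STATE_DIM, ACTION_DIM))
                 for _ in range(N_ACTORS)]

def critic(state, action):
    # Toy critic: prefers actions near a state-dependent target
    # (illustrative only; ACE learns Q with a neural network).
    return -np.sum((action - 0.1 * state[:ACTION_DIM]) ** 2)

def act(state):
    """Ensemble action selection: every actor proposes an action,
    and the critic picks the proposal with the highest value."""
    proposals = [np.tanh(state @ w) for w in actor_weights]
    values = [critic(state, a) for a in proposals]
    return proposals[int(np.argmax(values))]

state = rng.standard_normal(STATE_DIM)
best = act(state)
# By construction, no single actor's proposal scores higher
# than the ensemble's chosen action.
assert all(critic(state, np.tanh(state @ w)) <= critic(state, best)
           for w in actor_weights)
```

The motivation is that a single deterministic actor performs local gradient ascent on the critic and can get stuck in a local maximum; evaluating several actors' proposals under the critic approximates a global search over actions.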
