Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators

by Alexander Herzog, et al.

We describe a system for deep reinforcement learning of robotic manipulation skills applied to a large-scale real-world task: sorting recyclables and trash in office buildings. Real-world deployment of deep RL policies requires not only effective training algorithms, but also the ability to bootstrap real-world training and enable broad generalization. To this end, our system combines scalable deep RL from real-world data with bootstrapping from training in simulation, and incorporates auxiliary inputs from existing computer vision systems to boost generalization to novel objects while retaining the benefits of end-to-end training. We analyze the tradeoffs of different design decisions in our system, and present a large-scale empirical validation that includes training on real-world data gathered over the course of 24 months of experimentation, across a fleet of 23 robots in three office buildings, with a total training set of 9527 hours of robotic experience. Our final validation consists of 4800 evaluation trials across 240 waste station configurations, designed to evaluate in detail the impact of the design decisions in our system, the scaling effects of including more real-world data, and the performance of the method on novel objects. A project website and videos accompany the paper.



