Pay attention! - Robustifying a Deep Visuomotor Policy through Task-Focused Attention

by Pooya Abolghasemi, et al.

Several recent projects have demonstrated the promise of end-to-end learned deep visuomotor policies for robot manipulator control. Despite impressive progress, these systems are known to be vulnerable to physical disturbances, such as accidental or adversarial bumps that make them drop the manipulated object. They also tend to be distracted by visual disturbances, such as objects moving in the robot's field of view, even if the disturbance does not physically prevent the execution of the task. In this paper, we propose a technique for augmenting a deep visuomotor policy trained through demonstrations with task-focused attention. The manipulation task is specified with a natural language text such as "move the red bowl to the left". This allows the attention component to concentrate on the current object that the robot needs to manipulate. We show that even in benign environments, the task-focused attention allows the policy to consistently outperform a variant with no attention mechanism. More importantly, the new policy is significantly more robust: it regularly recovers from severe physical disturbances (such as bumps causing it to drop the object) from which the unmodified policy almost never recovers. In addition, we show that the proposed policy performs correctly in the presence of a wide class of visual disturbances, exhibiting behavior reminiscent of that observed in human selective attention experiments.




Deep Object-Centric Representations for Generalizable Robot Learning

Robotic manipulation in complex open-world scenarios requires both relia...

Attentional Network for Visual Object Detection

We propose augmenting deep neural networks with an attention mechanism f...

Language-Guided Generation of Physically Realistic Robot Motion and Control

We aim to control a robot to physically behave in the real world followi...

Robot Object Retrieval with Contextual Natural Language Queries

Natural language object retrieval is a highly useful yet challenging tas...

Accept Synthetic Objects as Real: End-to-End Training of Attentive Deep Visuomotor Policies for Manipulation in Clutter

Recent research demonstrated that it is feasible to end-to-end train mul...

Learning to Design and Construct Bridge without Blueprint

Autonomous assembly has been a desired functionality of many intelligent...

Conditionally Learn to Pay Attention for Sequential Visual Task

Sequential visual task usually requires to pay attention to its current ...
