Coarse-to-fine Q-attention with Tree Expansion

04/26/2022
by Stephen James, et al.

Coarse-to-fine Q-attention enables sample-efficient robot manipulation by discretizing the translation space in a coarse-to-fine manner, where the resolution gradually increases at each layer in the hierarchy. Although effective, Q-attention suffers from "coarse ambiguity": when the voxelization is very coarse, it is not feasible to distinguish similar-looking objects without first inspecting them at a finer resolution. To combat this, we propose to envision Q-attention as a tree that can be expanded and used to accumulate value estimates across the top-k voxels at each Q-attention depth. When our extension, Q-attention with Tree Expansion (QTE), replaces standard Q-attention in the Attention-driven Robot Manipulation (ARM) system, we are able to accomplish a larger set of tasks, especially those that suffer from "coarse ambiguity". In addition to evaluating our approach across 12 RLBench tasks, we also show that the improved performance carries over to a real-world task involving small objects.
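
The tree-expansion idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `q_fn` is a hypothetical stand-in for the Q-attention network at each depth, the 2x2x2 grids and Q-values are toy numbers, and summing Q-values along a path is an assumed way of "accumulating" value estimates. The sketch only shows why expanding the top-k voxels can recover from a misleading coarse estimate where a greedy argmax cannot.

```python
import numpy as np


def top_k_indices(q_values, k):
    """Return the k flat indices with the highest Q-values, best first."""
    flat = q_values.ravel()
    k = min(k, flat.size)
    idx = np.argpartition(flat, -k)[-k:]
    return idx[np.argsort(flat[idx])[::-1]]


def expand_tree(q_fn, depth, max_depth, k, path=(), total=0.0):
    """Expand the top-k voxels at every depth; return (best_total, best_path).

    q_fn(depth, path) is a hypothetical callable that returns an array of
    Q-values over the voxel grid centred on the last voxel in `path`.
    Accumulation by summation is an assumption made for this sketch.
    """
    q_values = q_fn(depth, path)
    best = (-np.inf, None)
    for flat_idx in top_k_indices(q_values, k):
        voxel = tuple(int(i) for i in np.unravel_index(flat_idx, q_values.shape))
        value = total + float(q_values[voxel])
        new_path = path + (voxel,)
        if depth + 1 == max_depth:
            candidate = (value, new_path)  # leaf: finest resolution reached
        else:
            candidate = expand_tree(q_fn, depth + 1, max_depth, k, new_path, value)
        if candidate[0] > best[0]:
            best = candidate
    return best


def toy_q_fn(depth, path):
    """Toy Q-values: two coarse voxels look almost equally good."""
    q = np.zeros((2, 2, 2))
    if depth == 0:
        q[0, 0, 0] = 0.50  # slightly preferred at the coarse level
        q[1, 1, 1] = 0.49  # actually contains the right object
    elif path[-1] == (1, 1, 1):
        q[0, 1, 0] = 0.90  # a finer look reveals a high-value voxel
    else:
        q[0, 0, 0] = 0.10  # a finer look reveals little value
    return q


# k=1 reproduces a greedy argmax at each depth; k=2 expands two branches.
greedy_total, greedy_path = expand_tree(toy_q_fn, 0, 2, k=1)
tree_total, tree_path = expand_tree(toy_q_fn, 0, 2, k=2)
print("greedy:", greedy_path, greedy_total)
print("tree:  ", tree_path, tree_total)
```

In this toy setup the greedy path commits to the coarse voxel `(0, 0, 0)`, while the expanded tree selects the path through `(1, 1, 1)` because its accumulated value is higher once the finer level is inspected. The cost of expansion grows as k raised to the number of depths, so k is a trade-off between robustness to coarse ambiguity and compute.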


