Anchor DETR: Query Design for Transformer-Based Detector

09/15/2021
by Yingming Wang, et al.

In this paper, we propose a novel query design for transformer-based detectors. In previous transformer-based detectors, the object queries are a set of learned embeddings. However, a learned embedding has no explicit physical meaning, so we cannot explain where it will focus. It is also difficult to optimize, because the prediction slot of each object query does not have a specific mode. In other words, an object query does not focus on a specific region. To solve these problems, our query design bases the object queries on anchor points, which are widely used in CNN-based detectors, so that each object query focuses on the objects near its anchor point. Moreover, our query design can predict multiple objects at one position, which addresses the "one region, multiple objects" difficulty. In addition, we design an attention variant that reduces the memory cost while achieving similar or better performance than the standard attention in DETR. Thanks to the query design and the attention variant, the proposed detector, which we call Anchor DETR, achieves better performance and runs faster than DETR with 10× fewer training epochs. For example, it achieves 44.2 AP at 16 FPS on the MSCOCO dataset when trained for 50 epochs with the ResNet50-DC5 feature. Extensive experiments on the MSCOCO benchmark demonstrate the effectiveness of the proposed methods. Code is available at https://github.com/megvii-model/AnchorDETR.
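The idea of anchor-based queries that can each predict several objects can be illustrated with a minimal NumPy sketch. It is not the released implementation: the function names, the uniform grid of anchor points, the sine/cosine encoding, and the choice to sum a per-pattern embedding with the anchor encoding are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import numpy as np

def grid_anchor_points(rows, cols):
    """Return (rows*cols, 2) anchor points (x, y) on a uniform grid inside (0, 1)."""
    ys = (np.arange(rows) + 0.5) / rows
    xs = (np.arange(cols) + 0.5) / cols
    grid_x, grid_y = np.meshgrid(xs, ys)  # each has shape (rows, cols)
    return np.stack([grid_x.ravel(), grid_y.ravel()], axis=1)

def sine_encode(points, dim, temperature=10000.0):
    """Encode (N, 2) points as (N, dim) sine/cosine features (assumed encoding)."""
    assert dim % 4 == 0, "dim must be divisible by 4"
    n_freq = dim // 4
    freqs = temperature ** (-np.arange(n_freq) / n_freq)    # (n_freq,)
    angles = points[:, :, None] * 2.0 * np.pi * freqs       # (N, 2, n_freq)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(len(points), dim)

def build_queries(anchors, patterns):
    """Combine every pattern with every anchor (hypothetical helper).

    anchors:  (N_a, 2) anchor points
    patterns: (N_p, C) pattern embeddings (learned in a real model)
    returns:  (N_p * N_a, C) queries; row p * N_a + a pairs pattern p with anchor a
    """
    dim = patterns.shape[1]
    pos = sine_encode(anchors, dim)                 # (N_a, C)
    queries = patterns[:, None, :] + pos[None, :, :]  # (N_p, N_a, C)
    return queries.reshape(-1, dim)

# Usage: 6 anchor points and 3 patterns give 18 queries,
# i.e. up to 3 predictions per anchor position.
rng = np.random.default_rng(0)
anchors = grid_anchor_points(2, 3)
patterns = rng.normal(size=(3, 8))  # stand-in for learned embeddings
queries = build_queries(anchors, patterns)
```

Because each query is tied to one anchor point, its positional part is interpretable, and sharing several patterns across all anchors is one way to let a single position emit more than one prediction.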


research
06/13/2022

Featurized Query R-CNN

The query mechanism introduced in the DETR method is changing the paradi...
research
03/30/2022

AdaMixer: A Fast-Converging Query-Based Object Detector

Traditional object detectors employ the dense paradigm of scanning over ...
research
07/18/2022

Conditional DETR V2: Efficient Detection Transformer with Box Queries

In this paper, we are interested in Detection Transformer (DETR), an end...
research
11/18/2022

Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization

This paper deals with the problem of localizing objects in image and vid...
research
02/14/2023

Team DETR: Guide Queries as a Professional Team in Detection Transformers

Recent proposed DETR variants have made tremendous progress in various s...
research
08/18/2023

ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation

Recent sparse detectors with multiple, e.g. six, decoder layers achieve ...
research
07/17/2023

Box-DETR: Understanding and Boxing Conditional Spatial Queries

Conditional spatial queries are recently introduced into DEtection TRans...
