Dual Refinement Network for Single-Shot Object Detection
Object detection methods fall into two categories: two-stage and single-stage detectors. The former are characterized by high detection accuracy, while the latter usually offer fast inference. Hence, it is imperative to combine their merits for a better accuracy vs. speed trade-off. To this end, we propose a dual refinement network (Dual-RefineDet) to boost the performance of the single-stage detector. Inheriting the advantages of the two-stage approach (i.e., two-step regression and accurate features for detection), we conduct anchor refinement and feature offset refinement in anchor-offset detection, where the detection head is composed of deformable convolutions. Moreover, to leverage contextual information for describing objects, we design a multi-deformable head, in which multiple detection paths with different receptive field sizes are devoted to detecting objects. Extensive experiments on the PASCAL VOC datasets show that our method achieves state-of-the-art results and a better accuracy vs. speed trade-off, i.e., 81.3% mAP at 42.3 FPS with a 320×320 input image on the VOC2007 dataset. Code will be made publicly available.
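To make the idea of two-step regression concrete, the following is a minimal sketch of anchor refinement rather than the authors' implementation. It assumes the common SSD-style box parameterization (center x, center y, width, height) with variance scaling of 0.1 and 0.2, which the abstract does not specify; the function names `decode` and `two_step_decode` are hypothetical. Feature offset refinement with deformable convolutions is not covered here.

```python
import math

# Assumed SSD-style variances; the abstract does not state the parameterization.
VARIANCES = (0.1, 0.2)


def decode(box, offsets, variances=VARIANCES):
    """Apply predicted (dx, dy, dw, dh) offsets to a (cx, cy, w, h) box."""
    cx, cy, w, h = box
    dx, dy, dw, dh = offsets
    return (
        cx + dx * variances[0] * w,        # shift center x, scaled by box width
        cy + dy * variances[0] * h,        # shift center y, scaled by box height
        w * math.exp(dw * variances[1]),   # rescale width
        h * math.exp(dh * variances[1]),   # rescale height
    )


def two_step_decode(anchor, first_offsets, second_offsets):
    """Two-step regression: refine the anchor, then regress from the refined anchor."""
    refined_anchor = decode(anchor, first_offsets)   # step 1: anchor refinement
    return decode(refined_anchor, second_offsets)    # step 2: final box regression


if __name__ == "__main__":
    anchor = (0.5, 0.5, 0.2, 0.2)
    print(two_step_decode(anchor, (1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.5, 0.0)))
```

The second step starts from the refined anchor instead of the hand-designed one, so its offsets are expressed relative to a box that is already closer to the object. This is the property that two-stage detectors exploit and that a refinement-based single-stage detector aims to reuse.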