ARUBA: An Architecture-Agnostic Balanced Loss for Aerial Object Detection

by Rebbapragada V C Sairam, et al.

Deep neural networks tend to reproduce the bias of their training dataset. In object detection, this bias takes the form of various imbalances, such as class, background-foreground, and object size. In this paper, we define the size of an object as the number of pixels it covers in an image, and size imbalance as the over-representation of certain object sizes in a dataset. We aim to address the problem of size imbalance in drone-based aerial image datasets. Existing methods for solving size imbalance rely on architectural changes that use multiple scales of images or feature maps to detect objects of different sizes. We, on the other hand, propose a novel ARchitectUre-agnostic BAlanced Loss (ARUBA) that can be applied as a plugin on top of any object detection model. It follows a neighborhood-driven approach inspired by the ordinality of object size. We evaluate the effectiveness of our approach through comprehensive experiments on the aerial datasets HRSC2016, DOTAv1.0, DOTAv1.5, and VisDrone, and obtain consistent improvements in performance.
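
The definitions of object size and size imbalance above can be illustrated with a minimal sketch. It is not the ARUBA loss, which the abstract does not specify. The sketch assumes integer axis-aligned boxes whose area stands in for the pixel count, and power-of-two size bins; the function names `object_size`, `size_histogram`, and `imbalance_ratio` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def object_size(box):
    """Approximate an object's pixel count by its box area.

    `box` is (x_min, y_min, x_max, y_max) in integer pixel coordinates.
    Using box area instead of a segmentation mask is an assumption.
    """
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


def size_histogram(boxes):
    """Count objects per size bin, where bin k holds sizes in [2**k, 2**(k+1)).

    Power-of-two bins are an illustrative choice, not taken from the paper.
    """
    counts = Counter()
    for box in boxes:
        size = object_size(box)
        if size > 0:
            # bit_length() - 1 equals floor(log2(size)) for positive ints.
            counts[size.bit_length() - 1] += 1
    return dict(counts)


def imbalance_ratio(hist):
    """Ratio of the most populated to the least populated non-empty bin."""
    return max(hist.values()) / min(hist.values())


if __name__ == "__main__":
    boxes = [(0, 0, 4, 4), (0, 0, 5, 4), (0, 0, 3, 5), (10, 10, 110, 110)]
    hist = size_histogram(boxes)
    print(hist, imbalance_ratio(hist))
```

A ratio well above 1 indicates that some size ranges are over-represented relative to others, which is the kind of imbalance the abstract describes.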




