Deep Active Learning with Contrastive Learning Under Realistic Data Pool Assumptions

03/25/2023
by Jihyo Kim, et al.

Active learning aims to identify the most informative samples in an unlabeled data pool so that a model can reach the desired accuracy rapidly. This is especially beneficial for deep neural networks, which generally require a huge number of labeled samples to achieve high performance. Most existing active learning methods have been evaluated in an ideal setting in which the unlabeled data pool contains only samples relevant to the target task, i.e., in-distribution samples. A data pool gathered from the wild, however, is likely to include samples that are entirely irrelevant to the target task and/or too ambiguous for even an oracle to assign a single class label. We argue that it is more realistic to assume an unlabeled data pool consisting of samples from various distributions. In this work, we introduce new active learning benchmarks that include ambiguous and task-irrelevant out-of-distribution samples as well as in-distribution samples. We also propose an active learning method designed to preferentially acquire informative in-distribution samples. The proposed method leverages both the labeled and unlabeled data pools and selects samples from clusters in a feature space constructed via contrastive learning. Experimental results demonstrate that the proposed method requires a lower annotation budget than existing active learning methods to reach the same level of accuracy.
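The selection step described in the abstract (clustering a contrastively learned feature space and drawing queries from clusters that look in-distribution) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function name `select_queries`, the k-means clustering, and the cosine-similarity cluster scoring are stand-ins, since the abstract does not specify the paper's exact acquisition rule.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_queries(unlabeled_feats, labeled_feats, budget, n_clusters=10, seed=0):
    """Hypothetical acquisition step: cluster contrastive features of the
    unlabeled pool, rank clusters by similarity to the labeled
    (in-distribution) pool, and draw the query batch from the most
    ID-like clusters first."""
    # L2-normalize so dot products act as cosine similarities, as is
    # common for contrastively learned embeddings.
    u = unlabeled_feats / np.linalg.norm(unlabeled_feats, axis=1, keepdims=True)
    l = labeled_feats / np.linalg.norm(labeled_feats, axis=1, keepdims=True)

    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(u)
    centers = km.cluster_centers_ / np.linalg.norm(
        km.cluster_centers_, axis=1, keepdims=True)

    # Score each cluster by its centroid's max cosine similarity to any
    # labeled sample; low-scoring clusters are treated as likely
    # OOD or ambiguous and are visited last.
    cluster_scores = (centers @ l.T).max(axis=1)

    selected = []
    for c in np.argsort(-cluster_scores):  # most ID-like clusters first
        members = np.where(km.labels_ == c)[0]
        # Within a cluster, prefer samples far from the centroid as a
        # crude informativeness proxy (the paper may use another criterion).
        dists = 1.0 - u[members] @ centers[c]
        members = members[np.argsort(-dists)]
        take = min(budget - len(selected), len(members))
        selected.extend(members[:take].tolist())
        if len(selected) >= budget:
            break
    return np.array(selected)

# Toy usage with random vectors standing in for encoder features.
rng = np.random.default_rng(0)
queries = select_queries(rng.normal(size=(1000, 128)),
                         rng.normal(size=(100, 128)), budget=32)
print(queries.shape)  # (32,)
```

In practice the features would come from a contrastive encoder such as SimCLR rather than random vectors, and the cluster-scoring and within-cluster ranking rules would follow the paper's actual criteria.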

Related research

07/04/2022 · Pareto Optimization for Active Learning under Out-of-Distribution Data Scenarios
Pool-based Active Learning (AL) has achieved great success in minimizing...

01/12/2023 · Forgetful Active Learning with Switch Events: Efficient Sampling for Out-of-Distribution Data
This paper considers deep out-of-distribution active learning. In practi...

12/20/2022 · Temporal Output Discrepancy for Loss Estimation-based Active Learning
While deep learning succeeds in a wide range of tasks, it highly depends...

07/04/2020 · Deep Active Learning via Open Set Recognition
In many applications, data is easy to acquire but expensive and time con...

04/10/2020 · State-Relabeling Adversarial Active Learning
Active learning is to design label-efficient algorithms by sampling the ...

10/10/2019 · Active Learning with Importance Sampling
We consider an active learning setting where the algorithm has access to...

11/01/2022 · Batch Active Learning from the Perspective of Sparse Approximation
Active learning enables efficient model training by leveraging interacti...
