Rethinking Re-Sampling in Imbalanced Semi-Supervised Learning

06/01/2021
by Ju He, et al.
Semi-Supervised Learning (SSL) has shown a strong ability to exploit unlabeled data when labeled data is scarce. However, most SSL algorithms assume that the class distributions are balanced in both the training and test sets. In this work, we consider the problem of SSL on class-imbalanced data, which better reflects real-world situations but has received only limited attention so far. In particular, we decouple the training of the representation and the classifier, and systematically investigate the effects of different data re-sampling techniques both when training the whole network, including the classifier, and when fine-tuning the feature extractor only. We find that data re-sampling is critically important for learning a good classifier, as it increases the accuracy of the pseudo-labels, in particular for the minority classes in the unlabeled data. Interestingly, we find that accurate pseudo-labels do not help when training the feature extractor; on the contrary, data re-sampling harms the training of the feature extractor. This finding runs counter to the general intuition that wrong pseudo-labels always harm model performance in SSL. Based on these findings, we suggest rethinking the current paradigm of using a single data re-sampling strategy and develop a simple yet highly effective Bi-Sampling (BiS) strategy for SSL on class-imbalanced data. BiS implements two different re-sampling strategies for training the feature extractor and the classifier and integrates this decoupled training into an end-to-end framework... Code will be released at https://github.com/TACJu/Bi-Sampling.
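
The idea of using one sampling strategy per component can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the two strategies BiS uses, so the sketch assumes plain uniform sampling for the feature extractor and inverse-frequency, class-balanced sampling for the classifier. The helper names `class_balanced_weights` and `draw_indices` and the toy label set are hypothetical.

```python
import random
from collections import Counter


def class_balanced_weights(labels):
    """Per-sample weights inversely proportional to class frequency (assumed strategy)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def draw_indices(labels, n, balanced, rng):
    """Draw n sample indices with replacement.

    balanced=False: uniform over samples, so the class imbalance is preserved.
    balanced=True: every class is equally likely in expectation.
    """
    weights = class_balanced_weights(labels) if balanced else None
    return rng.choices(range(len(labels)), weights=weights, k=n)


# Toy imbalanced (pseudo-)labeled set: 90 samples of class 0, 10 of class 1.
labels = [0] * 90 + [1] * 10
rng = random.Random(0)

# Hypothetical split: no re-sampling for the feature extractor,
# class-balanced re-sampling for the classifier.
extractor_batch = draw_indices(labels, 1000, balanced=False, rng=rng)
classifier_batch = draw_indices(labels, 1000, balanced=True, rng=rng)

extractor_counts = Counter(labels[i] for i in extractor_batch)
classifier_counts = Counter(labels[i] for i in classifier_batch)
print("feature extractor draws:", dict(extractor_counts))
print("classifier draws:", dict(classifier_counts))
```

In this sketch the feature-extractor draws keep roughly the original 9:1 class ratio, while the classifier draws come out close to balanced. An actual training loop would feed each stream to the corresponding part of the network.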

research
07/17/2020

Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning

While semi-supervised learning (SSL) has proven to be a promising way fo...
research
06/10/2021

Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning

The capability of the traditional semi-supervised learning (SSL) methods...
research
07/28/2022

Learning to Adapt Classifier for Imbalanced Semi-supervised Learning

Pseudo-labeling has proven to be a promising semi-supervised learning (S...
research
01/18/2023

Semi-Supervised Semantic Segmentation via Gentle Teaching Assistant

Semi-Supervised Semantic Segmentation aims at training the segmentation ...
research
12/08/2021

CoSSL: Co-Learning of Representation and Classifier for Imbalanced Semi-Supervised Learning

In this paper, we propose a novel co-learning framework (CoSSL) with dec...
research
08/07/2019

Unsupervised Feature Learning in Remote Sensing

The need for labeled data is among the most common and well-known practi...
research
05/01/2021

Semi-supervised Long-tailed Recognition using Alternate Sampling

Main challenges in long-tailed recognition come from the imbalanced data...