Label Noise Types and Their Effects on Deep Learning

by   Görkem Algan, et al.

The recent success of deep learning is mostly due to the availability of big datasets with clean annotations. However, gathering a cleanly annotated dataset is not always feasible due to practical challenges. As a result, label noise is a common problem in datasets, and numerous methods to train deep neural networks in the presence of noisy labels are proposed in the literature. These methods commonly use benchmark datasets with synthetic label noise on the training set. However, there are multiple types of label noise, and each of them has its own characteristic impact on learning. Since each work generates a different kind of label noise, it is problematic to test and compare those algorithms in the literature fairly. In this work, we provide a detailed analysis of the effects of different kinds of label noise on learning. Moreover, we propose a generic framework to generate feature-dependent label noise, which we show to be the most challenging case for learning. Our proposed method aims to emphasize similarities among data instances by sparsely distributing them in the feature domain. By this approach, samples that are more likely to be mislabeled are detected from their softmax probabilities, and their labels are flipped to the corresponding class. The proposed method can be applied to any clean dataset to synthesize feature-dependent noisy labels. For the ease of other researchers to test their algorithms with noisy labels, we share corrupted labels for the most commonly used benchmark datasets. Our code and generated noisy synthetic labels are available online.


page 1

page 2

page 3

page 4


A Realistic Simulation Framework for Learning with Label Noise

We propose a simulation framework for generating realistic instance-depe...

Generation and Analysis of Feature-Dependent Pseudo Noise for Training Deep Neural Networks

Training Deep neural networks (DNNs) on noisy labeled datasets is a chal...

Towards the Identifiability in Noisy Label Learning: A Multinomial Mixture Approach

Learning from noisy labels plays an important role in the deep learning ...

Do We Really Need Gold Samples for Sample Weighting Under Label Noise?

Learning with labels noise has gained significant traction recently due ...

Neighborhood Collective Estimation for Noisy Label Identification and Correction

Learning with noisy labels (LNL) aims at designing strategies to improve...

Learning to Combat Noisy Labels via Classification Margins

A deep neural network trained on noisy labels is known to quickly lose i...

Analysing the Noise Model Error for Realistic Noisy Label Data

Distant and weak supervision allow to obtain large amounts of labeled tr...

Please sign up or login with your details

Forgot password? Click here to reset