Over-Fit: Noisy-Label Detection based on the Overfitted Model Property

by Seulki Park, et al.

As the need to handle noisy labels in massive datasets grows, learning with noisy labels has received much attention in recent years. A promising line of recent work selects clean training data by finding small-loss instances before a deep neural network overfits the noisy-label data. However, preventing overfitting is challenging. In this paper, we propose a novel noisy-label detection algorithm that exploits the property of overfitting on individual data points. To this end, we present two novel criteria that statistically measure how abnormally each training sample affects the model and clean validation data. Using these criteria, our iterative algorithm alternately removes noisy-label samples and retrains the model until performance stops improving. In experiments on multiple benchmark datasets, we demonstrate the validity of our algorithm and show that it outperforms state-of-the-art methods when the exact noise rates are not given. Furthermore, we show that our method can not only be extended to a real-world video dataset but can also be viewed as a regularization method for problems caused by overfitting.
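The alternating remove-and-retrain loop described in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the paper's method: the abstract does not define its two criteria, so `placeholder_scores` is a hypothetical stand-in (a distance margin that is z-scored to flag statistical outliers), and a nearest-centroid classifier in NumPy stands in for the deep neural network. All function names and the `z_thresh` and `max_rounds` parameters are assumptions made for illustration.

```python
import numpy as np


def fit_centroids(X, y, n_classes):
    """Stand-in 'model': one mean vector per class (assumes no class is empty)."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])


def sq_dists(centroids, X):
    """Squared Euclidean distance from every sample to every centroid."""
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)


def accuracy(centroids, X, y):
    """Fraction of samples whose nearest centroid matches the label."""
    return float((sq_dists(centroids, X).argmin(axis=1) == y).mean())


def placeholder_scores(centroids, X, y):
    """Hypothetical criterion (NOT the paper's): distance to the labeled
    class centroid minus distance to the nearest other centroid.
    Larger values mean the given label looks less plausible."""
    d = sq_dists(centroids, X)
    rows = np.arange(len(y))
    own = d[rows, y]
    d[rows, y] = np.inf  # mask out the labeled class
    return own - d.min(axis=1)


def iterative_clean(X, y, X_val, y_val, n_classes, z_thresh=2.0, max_rounds=10):
    """Alternately remove flagged samples and retrain; stop as soon as
    accuracy on the clean validation set no longer improves."""
    keep = np.ones(len(y), dtype=bool)
    model = fit_centroids(X, y, n_classes)
    best_acc = accuracy(model, X_val, y_val)
    for _ in range(max_rounds):
        scores = placeholder_scores(model, X[keep], y[keep])
        z = (scores - scores.mean()) / (scores.std() + 1e-12)
        flagged = z > z_thresh  # statistically abnormal samples
        if not flagged.any():
            break
        candidate = keep.copy()
        candidate[np.flatnonzero(keep)[flagged]] = False
        new_model = fit_centroids(X[candidate], y[candidate], n_classes)
        new_acc = accuracy(new_model, X_val, y_val)
        if new_acc <= best_acc:  # no further improvement: discard this round
            break
        keep, model, best_acc = candidate, new_model, new_acc
    return keep, model, best_acc
```

Calling `iterative_clean(X, y_noisy, X_val, y_val, n_classes=2)` returns a boolean mask of retained samples, the final centroids, and the best validation accuracy. Because a removal round is accepted only when validation accuracy strictly improves, the returned accuracy is never below that of the model trained on all samples.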




Tripartite: Tackle Noisy Labels by a More Precise Partition

Samples in large-scale datasets may be mislabeled due to various reasons...

Label Noise-Robust Learning using a Confidence-Based Sieving Strategy

In learning tasks with label noise, boosting model robustness against ov...

Unleashing the Potential of Regularization Strategies in Learning with Noisy Labels

In recent years, research on learning with noisy labels has focused on d...

Neighborhood Collective Estimation for Noisy Label Identification and Correction

Learning with noisy labels (LNL) aims at designing strategies to improve...

Truncate-Split-Contrast: A Framework for Learning from Mislabeled Videos

Learning with noisy label (LNL) is a classic problem that has been exten...

Learning to Aggregate and Refine Noisy Labels for Visual Sentiment Analysis

Visual sentiment analysis has received increasing attention in recent ye...

Improving Generalization of Deep Fault Detection Models in the Presence of Mislabeled Data

Mislabeled samples are ubiquitous in real-world datasets as rule-based o...
