Learning in Imperfect Environment: Multi-Label Classification with Long-Tailed Distribution and Partial Labels

by Wenqiao Zhang, et al.

Conventional multi-label classification (MLC) methods assume that all samples are fully labeled and identically distributed. This assumption is unrealistic for large-scale MLC data, which has a long-tailed (LT) class distribution and partial labels (PL). To address the problem, we introduce a novel task, Partial labeling and Long-Tailed Multi-Label Classification (PLT-MLC), which jointly considers these two imperfect learning environments. Not surprisingly, we find that most LT-MLC and PL-MLC approaches fail to solve PLT-MLC and show significant performance degradation on the two proposed PLT-MLC benchmarks. Therefore, we propose an end-to-end learning framework: COrrection → ModificatIon → balanCe. Our bootstrapping philosophy is to correct missing labels (Correction) whenever the prediction confidence exceeds a class-aware threshold, and to learn from these recalled labels during training. We next propose a novel multi-focal modifier loss that simultaneously addresses head-tail imbalance and positive-negative imbalance, adaptively modifying the attention paid to different samples (Modification) under the LT class distribution. In addition, we develop a balanced training strategy that distills the model's learning effect from head and tail samples, and we design a balanced classifier (Balance) conditioned on the head and tail learning effect to maintain stable performance for all samples. Our experimental study shows that the proposed framework significantly outperforms general MLC, LT-MLC and PL-MLC methods in terms of effectiveness and robustness on our newly created PLT-MLC datasets.
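As a rough illustration of the Correction idea described in the abstract, the sketch below fills in a missing label as positive when the model's predicted probability exceeds a per-class threshold. It is not the authors' implementation: the function name `correct_missing_labels`, the use of `None` to mark an unobserved label, and the example threshold values are all assumptions made for this illustration, and the abstract does not say how the class-aware thresholds are chosen.

```python
from typing import List, Optional

Label = Optional[int]  # 1 = positive, 0 = negative, None = unobserved (assumed encoding)


def correct_missing_labels(
    probs: List[List[float]],
    labels: List[List[Label]],
    thresholds: List[float],
) -> List[List[Label]]:
    """Hypothetical helper: recall missing labels using class-aware thresholds.

    probs[i][c]   -- predicted probability of class c for sample i
    labels[i][c]  -- observed label, or None when the annotation is missing
    thresholds[c] -- confidence threshold for class c (assumed to be given)
    """
    corrected: List[List[Label]] = []
    for p_row, y_row in zip(probs, labels):
        new_row: List[Label] = []
        for p, y, tau in zip(p_row, y_row, thresholds):
            if y is None and p > tau:
                new_row.append(1)  # confident prediction: recall the label as positive
            else:
                new_row.append(y)  # keep observed labels and low-confidence unknowns
        corrected.append(new_row)
    return corrected


# Illustrative values only: a stricter threshold for a frequent (head) class
# and a looser one for a rare (tail) class.
probs = [[0.95, 0.40, 0.70], [0.10, 0.85, 0.65]]
labels = [[1, None, None], [None, None, 0]]
thresholds = [0.9, 0.8, 0.6]
recalled = correct_missing_labels(probs, labels, thresholds)
```

In this sketch, observed labels are never overwritten, and unobserved labels whose confidence stays below the threshold remain unknown rather than being treated as negatives. How such entries are handled in the loss is a separate design choice that the abstract does not specify.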




