Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data

by Bing Yu, et al.

The scarcity of class-labeled data is a ubiquitous bottleneck in a wide range of machine learning problems. While abundant unlabeled data normally exist and provide a potential solution, they are extremely challenging to exploit. In this paper, we address this problem by simultaneously leveraging Positive-Unlabeled (PU) classification and conditional generation with extra unlabeled data, both of which aim to make full use of agnostic unlabeled data to improve classification and generation performance. In particular, we present a novel training framework that jointly targets PU classification and conditional generation when exposed to extra data, especially out-of-distribution unlabeled data, by exploring the interplay between them: 1) enhancing the performance of PU classifiers with the assistance of a novel Conditional Generative Adversarial Network (CGAN) that is robust to noisy labels, and 2) leveraging extra data with predicted labels from a PU classifier to help the generation. Our key contribution is a Classifier-Noise-Invariant Conditional GAN (CNI-CGAN) that can learn the clean data distribution from noisy labels predicted by a PU classifier. Theoretically, we prove the optimal condition of CNI-CGAN; experimentally, we conduct extensive evaluations on diverse datasets, verifying simultaneous improvements in both classification and generation.
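As a toy illustration of why labels predicted by an imperfect classifier can still carry recoverable information about the clean classes, the sketch below pushes a clean class distribution through a label-noise channel and then inverts it. This is not the CNI-CGAN objective from the paper. The confusion matrix, the class prior, and both function names are assumptions made up for this example, and the matrix is assumed to be known and invertible.

```python
import numpy as np


def noisy_label_distribution(clean_prior, confusion):
    """Push a clean class distribution through a label-noise channel.

    confusion[i, j] is assumed to be P(predicted label j | true label i),
    so every row sums to 1.
    """
    return clean_prior @ confusion


def recover_clean_distribution(noisy_dist, confusion):
    """Invert the noise channel (requires an invertible confusion matrix)."""
    # noisy = confusion.T @ clean, so solve that linear system for clean.
    return np.linalg.solve(confusion.T, noisy_dist)


# Hypothetical numbers: two classes, a classifier that errs on both.
clean_prior = np.array([0.3, 0.7])
confusion = np.array([[0.9, 0.1],
                      [0.2, 0.8]])

noisy = noisy_label_distribution(clean_prior, confusion)
recovered = recover_clean_distribution(noisy, confusion)

print("noisy label distribution:", noisy)
print("recovered clean distribution:", recovered)
```

In practice the confusion matrix of a PU classifier is not known in advance and would itself have to be estimated, which is one reason a noise-robust training objective is needed rather than a direct inversion like this one.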




A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels

Positive-unlabeled learning refers to the process of training a binary c...

Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data

Label-noise or curated unlabeled data is used to compensate for the assu...

ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning

Continual learning usually assumes the incoming data are fully labeled, ...

Score-based Conditional Generation with Fewer Labeled Data by Self-calibrating Classifier Guidance

Score-based Generative Models (SGMs) are a popular family of deep genera...

Learning Classifiers on Positive and Unlabeled Data with Policy Gradient

Existing algorithms aiming to learn a binary classifier from positive (P...

Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation

Generating unlabeled data has been recently shown to help address the fe...

On Information Regularization

We formulate a principle for classification with the knowledge of the ma...
