On Information Regularization

10/19/2012
by Adrian Corduneanu, et al.

We formulate a principle for classification with knowledge of the marginal distribution over the data points (unlabeled data). The principle is cast in terms of Tikhonov-style regularization, where the regularization penalty articulates how the marginal density should constrain otherwise unrestricted conditional distributions. Specifically, the penalty suppresses any information introduced between the examples and labels beyond what is provided by the available labeled examples. The work extends Szummer and Jaakkola's information regularization (NIPS 2002) to multiple dimensions, providing a regularizer independent of the covering of the space used in the derivation. We show in addition how the information regularizer can be used as a measure of complexity of the classification task with unlabeled data and prove a relevant sample-complexity bound. We illustrate the regularization principle in practice by restricting the class of conditional distributions to logistic regression models and constructing the regularization penalty from a finite set of unlabeled examples.
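
The paper derives its regularizer from a covering of the input space and shows the result is independent of that covering. The snippet below is only a minimal NumPy sketch of the flavor of such a penalty, assuming regions are supplied as index sets over the unlabeled points (e.g. k-nearest-neighbor balls) and that the conditionals come from a binary logistic regression model. The function names (info_regularizer, objective) and this exact weighting scheme are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), elementwise."""
    p = np.clip(p, eps, 1 - eps)
    q = np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def info_regularizer(probs, regions, n_total):
    """Approximate local label-information penalty (illustrative, not the paper's exact form).

    probs   : predicted P(y=1 | x) for each unlabeled point
    regions : list of index arrays, each a local region covering some points
    n_total : total number of unlabeled points

    Within each region the label information is measured as the mean KL divergence
    between point-wise conditionals and the region-averaged conditional; regions are
    weighted by their empirical probability mass |Q| / n_total.
    """
    penalty = 0.0
    for idx in regions:
        p_local = probs[idx]
        p_bar = p_local.mean()                       # region-averaged conditional
        local_info = kl_bernoulli(p_local, p_bar).mean()
        penalty += (len(idx) / n_total) * local_info
    return penalty

def objective(w, b, X_lab, y_lab, X_unlab, regions, lam):
    """Labeled cross-entropy plus lam times the information penalty on unlabeled data."""
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    p_lab = sigmoid(X_lab @ w + b)
    nll = -np.mean(y_lab * np.log(p_lab + 1e-12)
                   + (1 - y_lab) * np.log(1 - p_lab + 1e-12))

    p_unlab = sigmoid(X_unlab @ w + b)
    return nll + lam * info_regularizer(p_unlab, regions, len(X_unlab))
```

Minimizing this objective over (w, b), for instance with a generic gradient-based optimizer, keeps the conditionals close to constant inside high-density regions while still fitting the labeled examples.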

Related Research

- Universum Prescription: Regularization using Unlabeled Data (11/11/2015)
  This paper shows that simply prescribing "none of the above" labels to u...
- The information-theoretic value of unlabeled data in semi-supervised learning (01/16/2019)
  We quantify the separation between the numbers of labeled examples requi...
- Semi-supervised logistic discrimination via labeled data and unlabeled data from different sampling distributions (08/26/2011)
  This article addresses the problem of classification method based on bot...
- Functional Regularization for Representation Learning: A Unified Theoretical Perspective (08/06/2020)
  Unsupervised and self-supervised learning approaches have become a cruci...
- Granular conditional entropy-based attribute reduction for partially labeled data with proxy labels (01/23/2021)
  Attribute reduction is one of the most important research topics in the ...
- Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data (06/14/2020)
  The scarcity of class-labeled data is a ubiquitous bottleneck in a wide ...
- High-dimensional Penalty Selection via Minimum Description Length Principle (04/26/2018)
  We tackle the problem of penalty selection of regularization on the basi...
