Mutual Information Learned Regressor: an Information-theoretic Viewpoint of Training Regression Systems

11/23/2022
by Jirong Yi, et al.

As one of the central tasks in machine learning, regression has broad applications across many fields. A common practice for solving regression problems is mean square error (MSE) minimization, or its regularized variants, which require prior knowledge about the models. Recently, Yi et al. proposed a mutual information based supervised learning framework that introduces a label entropy regularization requiring no such prior knowledge. When applied to classification tasks and solved via a stochastic gradient descent (SGD) optimization algorithm, their approach achieved significant improvements over the commonly used cross entropy loss and its variants. However, they did not provide a theoretical convergence analysis of the SGD algorithm for the proposed formulation. Moreover, extending the framework to regression tasks is nontrivial because the label can have an infinite support set. In this paper, we investigate regression under the mutual information based supervised learning framework. We first argue that the MSE minimization approach is equivalent to a conditional entropy learning problem, and then propose a mutual information learning formulation for solving regression problems via a reparameterization technique. For the proposed formulation, we give a convergence analysis of the SGD algorithm used to solve it in practice. Finally, we consider a multi-output regression data model, for which we derive a generalization performance lower bound in terms of the mutual information associated with the underlying data distribution. The result shows that high dimensionality can be a blessing rather than a curse, with the transition governed by a threshold. We hope our work will serve as a starting point for further research on mutual information based regression.
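To make the first claim concrete, the standard maximum-likelihood argument below shows why MSE minimization can be read as conditional entropy learning. The homoscedastic Gaussian observation model, and the symbols f_theta, sigma, and d, are assumptions of this sketch, not necessarily the paper's exact setting.

```latex
% Minimal sketch (assumed here): a Gaussian observation model
% y = f_\theta(x) + \epsilon with \epsilon \sim \mathcal{N}(0, \sigma^2 I_d).
\begin{align}
  -\log p_\theta(y \mid x)
    &= \frac{1}{2\sigma^2}\,\lVert y - f_\theta(x) \rVert_2^2
       + \frac{d}{2}\log\!\left(2\pi\sigma^2\right), \\
  \mathbb{E}_{p(x,y)}\!\left[-\log p_\theta(y \mid x)\right]
    &= \frac{1}{2\sigma^2}\,\mathrm{MSE}(\theta) + \mathrm{const}
       \;\ge\; H(Y \mid X),
\end{align}
% with equality when p_\theta(y \mid x) matches the true conditional.
% Since I(X; Y) = H(Y) - H(Y \mid X), adding a label entropy term turns
% conditional entropy minimization into mutual information maximization.
```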
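As a rough illustration of what a mutual information learning objective for regression might look like in practice, the PyTorch sketch below pairs the MSE term (a surrogate for H(Y|X) under the Gaussian model above) with a batch-level Gaussian estimate of the label entropy H(Y). The names `mi_regression_loss` and `gaussian_entropy` and the choice of entropy estimator are hypothetical choices for this sketch, not the paper's reparameterization-based formulation.

```python
import torch

def gaussian_entropy(y: torch.Tensor) -> torch.Tensor:
    """Entropy of a Gaussian fit to a 1-D batch: H = 0.5 * log(2*pi*e*var)."""
    var = y.var(unbiased=True) + 1e-8  # small floor for numerical stability
    return 0.5 * torch.log(2 * torch.pi * torch.e * var)

def mi_regression_loss(y_pred: torch.Tensor, y_true: torch.Tensor,
                       sigma: float = 1.0) -> torch.Tensor:
    # H(Y|X) surrogate: Gaussian NLL, i.e. MSE / (2*sigma^2) up to a constant.
    cond_entropy = ((y_true - y_pred) ** 2).mean() / (2 * sigma ** 2)
    # Batch estimate of the label entropy from the predictions; subtracting it
    # rewards spread-out predictions, so minimizing the total loss maximizes
    # a crude surrogate of I(X; Y) = H(Y) - H(Y|X).
    label_entropy = gaussian_entropy(y_pred.flatten())
    return cond_entropy - label_entropy

# Minimal usage example with a linear model on synthetic data.
model = torch.nn.Linear(10, 1)
x = torch.randn(128, 10)
y = x.sum(dim=1, keepdim=True)
loss = mi_regression_loss(model(x), y)
loss.backward()
```

A Gaussian fit is the simplest differentiable entropy estimator for a batch of scalar labels; kernel or binning estimators would be natural alternatives when the label distribution is clearly non-Gaussian.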

