A Minimax Approach to Supervised Learning

06/07/2016
by   Farzan Farnia, et al.
0

Given a task of predicting Y from X, a loss function L, and a set of probability distributions Γ on (X,Y), what is the optimal decision rule minimizing the worst-case expected loss over Γ? In this paper, we address this question by introducing a generalization of the principle of maximum entropy. Applying this principle to sets of distributions with marginal on X constrained to be the empirical marginal from the data, we develop a general minimax approach for supervised learning problems. While for some loss functions such as squared-error and log loss, the minimax approach rederives well-knwon regression models, for the 0-1 loss it results in a new linear classifier which we call the maximum entropy machine. The maximum entropy machine minimizes the worst-case 0-1 loss over the structured set of distribution, and by our numerical experiments can outperform other well-known linear classifiers such as SVM. We also prove a bound on the generalization worst-case error in the minimax approach.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset