Improved scalability under heavy tails, without strong convexity

06/02/2020
by Matthew J. Holland, et al.

Real-world data is laden with outlying values. The challenge for machine learning is that the learner typically has no prior knowledge of whether the feedback it receives (losses, gradients, etc.) will be heavy-tailed or not. In this work, we study a simple algorithmic strategy that can be leveraged when both losses and gradients can be heavy-tailed. The core technique introduces a simple robust validation sub-routine, which is used to boost the confidence of inexpensive gradient-based sub-processes. Compared with recent robust gradient descent methods from the literature, the dependence on dimension (in both risk bounds and computational cost) is substantially improved, without relying upon strong convexity or expensive per-step robustification. Empirically, we also show that under heavy-tailed losses, the proposed procedure cannot simply be replaced with naive cross-validation. Taken together, we obtain a scalable method with transparent guarantees, which performs well without prior knowledge of how "convenient" the feedback it receives will be.
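For intuition, the following is a minimal sketch of the general "cheap sub-processes plus robust validation" pattern the abstract describes, not the authors' exact algorithm: run several inexpensive gradient-based sub-processes on disjoint data splits, then select among the resulting candidates using a robust (here, median-of-means) estimate of validation loss instead of a naive empirical average. The helpers run_sgd and loss_fn, as well as the split sizes and block counts, are illustrative assumptions.

```python
# Illustrative sketch only: generic divide-and-conquer with robust validation,
# not the paper's exact procedure. run_sgd and loss_fn are assumed user-supplied.
import numpy as np

def median_of_means(values, n_blocks=5, seed=0):
    """Robust mean estimate: shuffle, split into blocks,
    average within each block, return the median of block means."""
    values = np.asarray(values, dtype=float)
    perm = np.random.default_rng(seed).permutation(len(values))
    blocks = np.array_split(values[perm], n_blocks)
    return float(np.median([b.mean() for b in blocks]))

def robust_select(data, run_sgd, loss_fn, n_candidates=5, n_blocks=5, seed=1):
    """Run inexpensive sub-processes on disjoint splits, then 'boost the
    confidence' by keeping the candidate with the smallest robust
    validation score."""
    idx = np.random.default_rng(seed).permutation(len(data))
    half = len(idx) // 2
    train_idx, val_idx = idx[:half], idx[half:]
    # One cheap gradient-based run per training split.
    candidates = [run_sgd(data[s]) for s in np.array_split(train_idx, n_candidates)]
    # Score each candidate on held-out data with a median-of-means estimate.
    scores = [
        median_of_means([loss_fn(c, z) for z in data[val_idx]], n_blocks)
        for c in candidates
    ]
    return candidates[int(np.argmin(scores))]
```

The point of the median-of-means step is that it concentrates around the true risk even when losses have only finite variance, which matches the abstract's observation that a naive cross-validation average cannot play this role under heavy-tailed losses.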


Related research

12/14/2020 · Better scalability under potentially heavy-tailed feedback
We study scalable alternatives to robust gradient descent (RGD) techniqu...

06/01/2020 · Better scalability under potentially heavy-tailed gradients
We study a scalable alternative to robust gradient descent (RGD) techniq...

05/24/2021 · Robust learning with anytime-guaranteed feedback
Under data distributions which may be heavy-tailed, many stochastic grad...

01/27/2023 · Robust variance-regularized risk minimization with concomitant scaling
Under losses which are potentially heavy-tailed, we consider the task of...

06/01/2017 · Efficient learning with robust gradient descent
Minimizing the empirical risk is a popular training strategy, but for le...

02/20/2021 · On Proximal Policy Optimization's Heavy-tailed Gradients
Modern policy gradient algorithms, notably Proximal Policy Optimization ...

02/10/2023 · Long-Tailed Partial Label Learning via Dynamic Rebalancing
Real-world data usually couples the label ambiguity and heavy imbalance,...
