AGRO: Adversarial Discovery of Error-prone groups for Robust Optimization

by Bhargavi Paranjape, et al.

Models trained via empirical risk minimization (ERM) are known to rely on spurious correlations between labels and task-independent input features, resulting in poor generalization under distributional shift. Group distributionally robust optimization (G-DRO) can alleviate this problem by minimizing the worst-case loss over a set of pre-defined groups of training data. G-DRO successfully improves performance on the worst group, where the correlation does not hold. However, G-DRO assumes that the spurious correlations and associated worst groups are known in advance, making it challenging to apply to new tasks with potentially multiple unknown spurious correlations. We propose AGRO – Adversarial Group discovery for Distributionally Robust Optimization – an end-to-end approach that jointly identifies error-prone groups and improves accuracy on them. AGRO equips G-DRO with an adversarial slicing model to find a group assignment for training examples which maximizes worst-case loss over the discovered groups. On the WILDS benchmark, AGRO results in 8% higher model performance on average on known worst-groups, compared to prior group discovery approaches used with G-DRO. AGRO also improves out-of-distribution performance on SST2, QQP, and MS-COCO – datasets where potential spurious correlations are as yet uncharacterized. Human evaluation of AGRO groups shows that they contain well-defined, yet previously unstudied spurious correlations that lead to model errors.
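The worst-case group objective described in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the abstract does not give the exact form of the objective, so the sketch assumes soft group assignments (one row per example, summing to 1) and a per-group average loss. The names `group_losses` and `worst_group_loss`, and the toy numbers, are hypothetical.

```python
import numpy as np

def group_losses(losses, assignments):
    """Average per-example loss within each group.

    losses:      shape (n,), per-example losses from the task model.
    assignments: shape (n, k), soft group memberships (assumed form).
    """
    mass = assignments.sum(axis=0)               # total membership per group
    weighted = assignments.T @ losses            # membership-weighted loss sum
    return weighted / np.maximum(mass, 1e-12)    # guard against empty groups

def worst_group_loss(losses, assignments):
    """Quantity a G-DRO-style learner minimizes and an adversary maximizes."""
    return float(group_losses(losses, assignments).max())

# Toy data: two easy examples and two hard ones.
losses = np.array([0.1, 0.2, 2.0, 1.8])

# Assignment A isolates the hard examples in their own group.
by_error = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
# Assignment B mixes easy and hard examples in both groups.
mixed = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

worst_by_error = worst_group_loss(losses, by_error)
worst_mixed = worst_group_loss(losses, mixed)
```

Under these assumptions, an adversary that chooses the grouping to maximize the worst-group loss prefers the assignment that concentrates high-loss examples in one group, which is the intuition behind pairing G-DRO with a group discovery model.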




