Focus on the Common Good: Group Distributional Robustness Follows

by Vihari Piratla, et al.

We consider the problem of training a classification model with group-annotated training data. Recent work has established that, when there is distribution shift across groups, models trained with the standard empirical risk minimization (ERM) objective perform poorly on minority groups, and that the group distributionally robust optimization (Group-DRO) objective is a better alternative. The starting point of this paper is the observation that, although Group-DRO outperforms ERM on minority groups for some benchmark datasets, there are several other datasets where it performs much worse than ERM. Inspired by ideas from the closely related problem of domain generalization, this paper proposes a new and simple algorithm that explicitly encourages learning of features shared across groups. The key insight behind the proposed algorithm is that while Group-DRO focuses on the groups with the worst regularized loss, focusing instead on the groups whose training signal improves performance even on other groups can lead to learning of shared/common features, thereby improving minority-group performance beyond what Group-DRO achieves. Empirically, we show that the proposed algorithm matches or exceeds strong contemporary baselines, including ERM and Group-DRO, on standard benchmarks, both on minority groups and across all groups. Theoretically, we show that the proposed algorithm is a descent method and finds first-order stationary points of smooth nonconvex functions.
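The contrast between the two weighting philosophies can be illustrated on a toy problem. The sketch below is not the paper's exact update rule: it compares a Group-DRO-style rule that exponentially up-weights high-loss groups against a hypothetical "common good" rule that up-weights groups whose gradient also points downhill for the other groups. The function names, the cross-group alignment heuristic, and all hyperparameters are our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: two groups of linear-regression data that share the same
# underlying predictor, so a "common" feature direction exists.
# All names below are illustrative, not the paper's notation.
d = 5
w_true = np.ones(d)
groups = []
for _ in range(2):
    X = rng.normal(size=(50, d))
    y = X @ w_true + 0.1 * rng.normal(size=50)
    groups.append((X, y))

def group_loss_and_grad(w, X, y):
    """Mean squared error on one group and its gradient w.r.t. w."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2), X.T @ r / len(y)

def group_dro_weights(losses, grads=None, eta=1.0):
    """Group-DRO-style weights: exponentially up-weight high-loss groups."""
    z = np.asarray(losses, dtype=float)
    z = np.exp(eta * (z - z.max()))  # subtract max for numerical stability
    return z / z.sum()

def common_good_weights(losses, grads=None, eta=1.0):
    """Hypothetical 'common good' weights: up-weight groups whose gradient
    also reduces the *other* groups' losses (cross-group alignment)."""
    G = np.stack(grads)
    align = np.array([sum(G[g] @ G[h] for h in range(len(G)) if h != g)
                      for g in range(len(G))])
    z = np.exp(eta * (align - align.max()))
    return z / z.sum()

def train(weight_fn, steps=200, lr=0.05):
    """Gradient descent on the weighted sum of per-group losses."""
    w = np.zeros(d)
    for _ in range(steps):
        losses, grads = zip(*(group_loss_and_grad(w, X, y) for X, y in groups))
        alpha = weight_fn(losses, grads)
        w = w - lr * sum(a * g for a, g in zip(alpha, grads))
    final_losses = [group_loss_and_grad(w, X, y)[0] for X, y in groups]
    return w, final_losses

w_dro, losses_dro = train(group_dro_weights)
w_cg, losses_cg = train(common_good_weights)
print("worst-group loss, DRO-style weighting:  ", max(losses_dro))
print("worst-group loss, common-good weighting:", max(losses_cg))
```

On this easy toy both rules converge; the difference the paper targets shows up when a group's worst regularized loss is driven by noise or group-specific quirks, in which case chasing that loss (DRO-style) can pull the model away from features useful to all groups.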



