ReMix: Calibrated Resampling for Class Imbalance in Deep learning
Class imbalance is a problem of significant importance in applied deep learning where trained models are exploited for decision support and automated decisions in critical areas such as health and medicine, transportation, and finance. The challenge of learning deep models from imbalanced training data remains high, and the state-of-the-art solutions are typically data dependent and primarily focused on image data. Real-world imbalanced classification problems, however, are much more diverse thus necessitating a general solution that can be applied to tabular, image and text data. In this paper, we propose ReMix, a training technique that leverages batch resampling, instance mixing and soft-labels to enable the induction of robust deep models for imbalanced learning. Our results show that dense nets and CNNs trained with ReMix generally outperform the alternatives according to the g-mean and are better calibrated according to the balanced Brier score.
READ FULL TEXT