How Does Disagreement Benefit Co-teaching?
Learning with noisy labels is one of the most important problems in weakly-supervised learning. Classical approaches focus on adding regularization or estimating the noise transition matrix. However, the former permanently introduces a regularization bias, while the latter is hard to estimate accurately. In this paper, following the novel path of training on small-loss samples, we propose a robust learning paradigm called Co-teaching+. This paradigm naturally bridges the "Update by Disagreement" strategy with Co-teaching, which trains two deep neural networks, and thus consists of a disagreement-update step and a cross-update step. In the disagreement-update step, the two networks first predict on all data and feed forward only the data on which their predictions disagree. Then, in the cross-update step, each network selects its small-loss samples from these disagreement data, but back-propagates the small-loss samples selected by its peer network to update its own parameters. Empirical results on noisy versions of MNIST, CIFAR-10, and NEWS demonstrate that Co-teaching+ is far superior to state-of-the-art methods in the robustness of the trained deep models.
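To make the two steps concrete, below is a minimal PyTorch-style sketch of one Co-teaching+ mini-batch update, written under stated assumptions rather than as the paper's exact implementation: the function name `coteaching_plus_step`, the networks `net1`/`net2` with optimizers `opt1`/`opt2`, and the `keep_ratio` hyperparameter (the fraction of small-loss samples retained) are all illustrative names not taken from the paper.

```python
# A minimal sketch of one Co-teaching+ mini-batch update (PyTorch assumed).
# All names (net1, net2, opt1, opt2, keep_ratio) are illustrative, not from the paper.
import torch
import torch.nn.functional as F

def coteaching_plus_step(net1, net2, opt1, opt2, x, y, keep_ratio=0.8):
    # Disagreement-update step: both networks predict on the whole batch,
    # and only the samples where their predictions disagree are kept.
    with torch.no_grad():
        pred1 = net1(x).argmax(dim=1)
        pred2 = net2(x).argmax(dim=1)
        disagree = (pred1 != pred2).nonzero(as_tuple=True)[0]
    if disagree.numel() == 0:
        return  # no disagreement data in this batch
    x_d, y_d = x[disagree], y[disagree]

    # Cross-update step: each network ranks the disagreement data by its
    # own per-sample loss and keeps the small-loss fraction.
    with torch.no_grad():
        loss1 = F.cross_entropy(net1(x_d), y_d, reduction='none')
        loss2 = F.cross_entropy(net2(x_d), y_d, reduction='none')
    k = max(1, int(keep_ratio * len(y_d)))
    idx1 = torch.argsort(loss1)[:k]  # small-loss samples chosen by net1
    idx2 = torch.argsort(loss2)[:k]  # small-loss samples chosen by net2

    # Each network back-propagates the samples selected by its peer
    # and updates its own parameters.
    opt1.zero_grad()
    F.cross_entropy(net1(x_d[idx2]), y_d[idx2]).backward()
    opt1.step()

    opt2.zero_grad()
    F.cross_entropy(net2(x_d[idx1]), y_d[idx1]).backward()
    opt2.step()
```

The cross-exchange of small-loss samples is the design point this sketch tries to highlight: because each network updates on data filtered by its peer, the two networks are less likely to converge to the same mistakes, which is the usual rationale for letting disagreement drive the selection.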