ConvNets and ImageNet Beyond Accuracy: Explanations, Bias Detection, Adversarial Examples and Model Criticism
ConvNets and Imagenet have driven the recent success of deep learning for image classification. However, the marked slowdown in performance improvement, the recent studies on the lack of robustness of neural networks to adversarial examples and their tendency to exhibit undesirable biases (e.g racial biases) questioned the reliability and the sustained development of these methods. This work investigates these questions from the perspective of the end-user by using human subject studies and explanations. We experimentally demonstrate that the accuracy and robustness of ConvNets measured on Imagenet are underestimated. We show that explanations can mitigate the impact of misclassified adversarial examples from the perspective of the end-user and we introduce a novel tool for uncovering the undesirable biases learned by a model. These contributions also show that explanations are a promising tool for improving our understanding of ConvNets' predictions and for designing more reliable models
READ FULL TEXT