Towards a Measure of Individual Fairness for Deep Learning
Deep learning has produced major advances in artificial intelligence, but trained neural networks often reflect and amplify bias in their training data, and thus produce unfair predictions. We propose a novel measure of individual fairness, called prediction sensitivity, that approximates the extent to which a particular prediction depends on a protected attribute. We show how to compute prediction sensitivity using the standard automatic differentiation capabilities of modern deep learning frameworks, and present preliminary empirical results suggesting that prediction sensitivity may be effective for measuring bias in individual predictions.
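To make the idea concrete, the following is a minimal sketch in JAX of how such a quantity could be obtained with automatic differentiation. It rests on assumptions not stated in the abstract: prediction sensitivity is taken here to be the absolute partial derivative of the model's output with respect to the protected-attribute input, and the toy logistic model, the names `predict` and `prediction_sensitivity`, and the feature index `PROTECTED_IDX` are all hypothetical. The paper's exact definition may differ.

```python
import jax
import jax.numpy as jnp

# Assumption: feature index 2 holds the protected attribute (hypothetical).
PROTECTED_IDX = 2


def predict(params, x):
    """Toy logistic model standing in for a trained network (hypothetical)."""
    w, b = params
    return jax.nn.sigmoid(jnp.dot(w, x) + b)


def prediction_sensitivity(params, x, protected_idx=PROTECTED_IDX):
    """Assumed definition: |d prediction / d protected attribute| at input x."""
    # jax.grad differentiates the scalar prediction w.r.t. argument 1 (the input).
    grad_x = jax.grad(predict, argnums=1)(params, x)
    return jnp.abs(grad_x[protected_idx])


# Illustrative parameters and a single input; not taken from the paper.
params = (jnp.array([0.5, -1.0, 2.0]), 0.0)
x = jnp.array([1.0, 0.5, 1.0])
sens = prediction_sensitivity(params, x)
```

Under this assumed definition, a model whose output ignores the protected attribute yields a sensitivity of zero for every input, while larger values indicate that a small change in the protected attribute would shift that particular prediction more strongly.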