Mitigating Source Bias for Fairer Weak Supervision

03/30/2023
by Changho Shin, et al.

Weak supervision overcomes the label bottleneck, enabling efficient development of training sets. Millions of models trained on such datasets have been deployed in the real world and interact with users on a daily basis. However, the techniques that make weak supervision attractive – such as integrating any source of signal to estimate unknown labels – also ensure that the pseudolabels it produces are highly biased. Surprisingly, given everyday use and the potential for increased bias, weak supervision has not been studied from the point of view of fairness. This work begins such a study. Our departure point is the observation that even when a fair model can be built from a dataset with access to ground-truth labels, the corresponding dataset labeled via weak supervision can be arbitrarily unfair. Fortunately, not all is lost: we propose and empirically validate a model for source unfairness in weak supervision, then introduce a simple counterfactual fairness-based technique that can mitigate these biases. Theoretically, we show that it is possible for our approach to simultaneously improve both accuracy and fairness metrics – in contrast to standard fairness approaches that suffer from tradeoffs. Empirically, we show that our technique improves accuracy on weak supervision baselines by as much as 32

research
06/06/2022

Training Subset Selection for Weak Supervision

Existing weak supervision approaches use all the data covered by weak si...
research
05/15/2019

Passage Ranking with Weak Supervision

In this paper, we propose a weak supervision framework for neural rankin...
research
05/11/2022

Weak Supervision with Incremental Source Accuracy Estimation

Motivated by the desire to generate labels for real-time data we develop...
research
12/07/2021

Universalizing Weak Supervision

Weak supervision (WS) frameworks are a popular way to bypass hand-labeli...
research
07/27/2022

Learning Hyper Label Model for Programmatic Weak Supervision

To reduce the human annotation efforts, the programmatic weak supervisio...
research
05/10/2022

Don't Throw it Away! The Utility of Unlabeled Data in Fair Decision Making

Decision making algorithms, in practice, are often trained on data that ...
