Systematic Offensive Stereotyping (SOS) Bias in Language Models

08/21/2023
by   Fatma Elsafoury, et al.

Research has shown that language models (LMs) are socially biased, but toxicity and offensive stereotyping in LMs remain understudied. In this paper, we investigate systematic offensive stereotype (SOS) bias in LMs. We propose a method to measure it, validate the resulting bias scores, and examine how effective debiasing methods from the literature are at removing it. Finally, we investigate the impact of SOS bias in LMs on their performance and their fairness on the task of hate speech detection. Our results suggest that all of the inspected LMs are SOS biased, and that their SOS bias reflects the hate that the inspected marginalized groups experience online. The results also indicate that removing SOS bias with a popular debiasing method from the literature leads to worse SOS bias scores. Finally, we find no strong evidence that SOS bias in LMs affects their performance on hate speech detection, but there is evidence that it affects their fairness.
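The abstract does not specify how fairness on hate speech detection is measured. As an illustration only, and not necessarily the metric used in this paper, one common way to quantify (un)fairness of a hate speech classifier is to compare false positive rates across posts mentioning different identity groups. A minimal sketch with synthetic data:

```python
# Illustrative sketch only: per-group false-positive-rate (FPR) gap,
# a common fairness measure for hate speech classifiers. The group
# names, labels, and predictions below are synthetic.

def fpr(labels, preds):
    """False positive rate: fraction of non-hateful posts flagged as hateful."""
    negatives = [(l, p) for l, p in zip(labels, preds) if l == 0]
    if not negatives:
        return 0.0
    return sum(p for _, p in negatives) / len(negatives)

# Hypothetical predictions for posts mentioning two identity groups
# (label/prediction 1 = hateful, 0 = non-hateful).
groups = {
    "group_a": {"labels": [0, 0, 0, 1], "preds": [1, 0, 0, 1]},
    "group_b": {"labels": [0, 0, 0, 1], "preds": [1, 1, 1, 1]},
}

rates = {g: fpr(d["labels"], d["preds"]) for g, d in groups.items()}
fpr_gap = max(rates.values()) - min(rates.values())
# A large gap means non-hateful posts mentioning one group are flagged
# far more often than those mentioning another, an unfairness signal.
```

A classifier inheriting SOS bias from its underlying LM would be expected to show a larger FPR gap of this kind, which is one way evidence of an impact on fairness could manifest.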


Related research

- On Bias and Fairness in NLP: How to have a fairer text classification? (05/22/2023): In this paper, we provide a holistic analysis of the different sources o...
- On the Independence of Association Bias and Empirical Fairness in Language Models (04/20/2023): The societal impact of pre-trained language models has prompted research...
- Measuring Fairness with Biased Rulers: A Survey on Quantifying Biases in Pretrained Language Models (12/14/2021): An increasing awareness of biased patterns in natural language processin...
- The Social Structure of Consensus in Scientific Review (02/05/2018): Personal connections between creators and evaluators of scientific works...
- A Mathematical Justification for Exponentially Distributed NLOS Bias (05/02/2019): In the past few decades, the localization literature has seen many model...
- Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection (08/31/2023): This paper is a summary of the work in my PhD thesis. In which, I invest...
- Keeping Up with the Language Models: Robustness-Bias Interplay in NLI Data and Models (05/22/2023): Auditing unwanted social bias in language models (LMs) is inherently har...
