This Prompt is Measuring <MASK>: Evaluating Bias Evaluation in Language Models

Bias research in NLP seeks to analyse models for social biases, thus helping NLP practitioners uncover, measure, and mitigate social harms. We analyse the body of work that uses prompts and templates to assess bias in language models. We draw on a measurement modelling framework to create a taxonomy of attributes that capture what a bias test aims to measure and how that measurement is carried out. By applying this taxonomy to 90 bias tests, we illustrate qualitatively and quantitatively that core aspects of bias test conceptualisations and operationalisations are frequently unstated or ambiguous, carry implicit assumptions, or are mismatched. Our analysis illuminates the scope of possible bias types the field is able to measure, and reveals types that are as yet under-researched. We offer guidance to enable the community to explore a wider section of the possible bias space, and to better close the gap between desired outcomes and experimental design, both for bias and for evaluating language models more broadly.


Speciesist Language and Nonhuman Animal Bias in English Masked Language Models

Various existing studies have analyzed what social biases are inherited ...

"I'm sorry to hear that": finding bias in language models with a holistic descriptor dataset

As language models grow in popularity, their biases across all possible ...

Language-Agnostic Bias Detection in Language Models

Pretrained language models (PLMs) are key components in NLP, but they co...

Intersectional Inquiry, on the Ground and in the Algorithm

This article makes two key contributions to methodological debates in au...

AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models

Social bias in Pretrained Language Models (PLMs) affects text generation...

Unmasking the Mask – Evaluating Social Biases in Masked Language Models

Masked Language Models (MLMs) have shown superior performances in numero...

Taught by the Internet, Exploring Bias in OpenAIs GPT3

This research delves into the current literature on bias in Natural Lang...
