When the Majority is Wrong: Leveraging Annotator Disagreement for Subjective Tasks

by Eve Fleisig et al.

Though majority vote among annotators is typically used as the ground truth label in natural language processing, annotator disagreement on tasks such as hate speech detection may reflect genuine differences of opinion among groups, not noise. A crucial problem in hate speech detection is therefore whether a statement is offensive to the demographic group that it targets, which may constitute a small fraction of the annotator pool. We construct a model that predicts individual annotator ratings on potentially offensive text and combines this information with the predicted target group of the text to model the opinions of target group members. We show gains across a range of metrics, including raising performance over the baseline by 22% when predicting individual annotators' ratings and by 33% when predicting variance among annotators, which provides a method for measuring model uncertainty downstream. We find that annotators' ratings can be predicted from their demographic information and opinions on online content, without the need to track identifying annotator IDs that link each annotator to their ratings. We also find that non-invasive survey questions about annotators' online experiences help to maximize privacy and minimize unnecessary collection of demographic information when predicting annotators' opinions.
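The core idea of aggregating by predicted target group, rather than by majority vote, can be illustrated with a minimal sketch. This is a hypothetical toy aggregation, not the paper's actual model: it assumes we already have per-annotator predicted ratings and each annotator's demographic group memberships, and compares the majority vote against the mean rating among annotators belonging to the text's predicted target group.

```python
def majority_vote(ratings, threshold=0.5):
    """Binarize each rating at `threshold`, then take the majority label."""
    offensive_votes = sum(r >= threshold for r in ratings)
    return offensive_votes / len(ratings) >= 0.5

def target_group_rating(ratings, annotator_groups, target_group):
    """Mean rating among annotators who belong to the targeted group;
    falls back to the overall mean if no annotator is a member."""
    member_ratings = [r for r, groups in zip(ratings, annotator_groups)
                      if target_group in groups]
    if not member_ratings:
        return sum(ratings) / len(ratings)
    return sum(member_ratings) / len(member_ratings)

# Toy example: five annotators rate one text on [0, 1]; only the last two
# belong to hypothetical group "B", which the text is predicted to target.
ratings = [0.2, 0.1, 0.3, 0.9, 0.8]
groups = [{"A"}, {"A"}, {"A"}, {"B"}, {"B"}]

print(majority_vote(ratings))                     # majority finds it inoffensive
print(target_group_rating(ratings, groups, "B"))  # targeted group rates it offensive
```

Here the majority vote masks the targeted group's opinion, which is the situation the paper's title refers to: the two group-B annotators rate the text as clearly offensive, but they are outvoted three to two.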


Aligning Language Models to User Opinions

Whose Opinions Do Language Models Reflect?

Same Same, But Different: Conditional Multi-Task Learning for Demographic-Specific Toxicity Detection

Whose Opinions Matter? Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection

Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models

Measuring Asymmetric Opinions on Online Social Interrelationship with Language and Network Features
