Automatically Neutralizing Subjective Bias in Text

by Reid Pryzant, et al.

Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity - introducing attitudes via framing, presupposing truth, and casting doubt - remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view ("neutralizing" biased text). We also offer the first parallel corpus of biased language. The corpus contains 180,000 sentence pairs and originates from Wikipedia edits that removed various framings, presuppositions, and attitudes from biased sentences. Last, we propose two strong encoder-decoder baselines for the task. A straightforward yet opaque CONCURRENT system uses a BERT encoder to identify subjective words as part of the generation process. An interpretable and controllable MODULAR algorithm separates these steps, using (1) a BERT-based classifier to identify problematic words and (2) a novel join embedding through which the classifier can edit the hidden states of the encoder. Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias.
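The MODULAR design described above can be illustrated with a small sketch. The abstract states only that a classifier identifies problematic words and that a join embedding lets it edit the encoder's hidden states. The additive form used here (each hidden state shifted by a shared vector, scaled by that token's bias probability) is an assumption made for illustration, as are the names `join_embedding`, `hidden_states`, `bias_probs`, and `join_vector` and all of the toy numbers.

```python
from typing import List


def join_embedding(
    hidden_states: List[List[float]],
    bias_probs: List[float],
    join_vector: List[float],
) -> List[List[float]]:
    """Shift each encoder hidden state by join_vector scaled by its bias probability.

    Assumed form (not confirmed by the abstract): h'_i = h_i + p_i * v,
    where p_i is the classifier's probability that token i is subjective.
    """
    if len(hidden_states) != len(bias_probs):
        raise ValueError("need exactly one probability per token")
    edited = []
    for h, p in zip(hidden_states, bias_probs):
        if len(h) != len(join_vector):
            raise ValueError("join_vector must match the hidden size")
        edited.append([h_k + p * v_k for h_k, v_k in zip(h, join_vector)])
    return edited


# Toy example: three tokens, hidden size 2 (hypothetical values).
hidden = [[0.5, -1.0], [1.0, 2.0], [0.0, 0.0]]
probs = [0.0, 1.0, 0.5]   # the classifier flags the second token most strongly
v = [2.0, -2.0]           # stand-in for a learned join vector

edited = join_embedding(hidden, probs, v)
```

Under this assumed form, a token with probability 0 passes through unchanged, while flagged tokens are moved in a shared direction that a decoder could learn to treat as an edit signal. Because the probabilities are an explicit input, a user could in principle override them, which is one way to read the abstract's description of the approach as interpretable and controllable.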




