Detecting and Mitigating Indirect Stereotypes in Word Embeddings

by Erin George, et al.

Societal biases in the usage of words, including harmful stereotypes, are frequently learned by common word embedding methods. These biases manifest not only between a word and an explicit marker of its stereotype, but also between words that share related stereotypes. This latter phenomenon, sometimes called "indirect bias," has resisted prior attempts at debiasing. In this paper, we propose a novel method called Biased Indirect Relationship Modification (BIRM) to mitigate indirect bias in distributional word embeddings by modifying biased relationships between words before embeddings are learned. This is done by considering how the co-occurrence probability of a given pair of words changes in the presence of words marking an attribute of bias, and using this to average out the effect of the bias attribute. To evaluate this method, we perform a series of common tests and demonstrate that measures of bias in the word embeddings are reduced in exchange for a minor reduction in the semantic quality of the embeddings. In addition, we conduct novel tests for measuring indirect stereotypes by extending the Word Embedding Association Test (WEAT) with new test sets for indirect binary gender stereotypes. With these tests, we demonstrate the presence of more subtle stereotypes not addressed by previous work. The proposed method is able to reduce the presence of some of these new stereotypes, serving as a crucial next step towards non-stereotyped word embeddings.
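For readers unfamiliar with WEAT, the following is a minimal sketch of the standard WEAT effect size, not the extended indirect-stereotype test sets proposed in the paper. It assumes two sets of target word vectors (X, Y) and two sets of attribute word vectors (A, B); the function names and the 2-dimensional toy vectors are hypothetical and chosen only for illustration.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Difference of mean associations of X and Y, scaled by the
    sample standard deviation of associations over all target words."""
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

# Hypothetical toy vectors (not real embeddings): X leans toward A, Y toward B.
A = [np.array([1.0, 0.0]), np.array([0.9, 0.1])]
B = [np.array([0.0, 1.0]), np.array([0.1, 0.9])]
X = [np.array([1.0, 0.2]), np.array([0.8, 0.1])]
Y = [np.array([0.2, 1.0]), np.array([0.1, 0.7])]

effect = weat_effect_size(X, Y, A, B)          # positive: X is closer to A than Y is
effect_swapped = weat_effect_size(X, Y, B, A)  # swapping attributes flips the sign
```

A positive effect size indicates that the X targets are more strongly associated with attribute set A than the Y targets are; a value near zero indicates no measured difference in association.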




