Gender Artifacts in Visual Datasets

by Nicole Meister, et al.

Gender biases are known to exist within large-scale visual datasets and can be reflected or even amplified in downstream models. Many prior works have proposed methods for mitigating gender biases, often by attempting to remove gender expression information from images. To understand the feasibility and practicality of these approaches, we investigate what gender artifacts exist within large-scale visual datasets. We define a gender artifact as a visual cue that is correlated with gender, focusing specifically on those cues that are learnable by a modern image classifier and have an interpretable human corollary. Through our analyses, we find that gender artifacts are ubiquitous in the COCO and OpenImages datasets, occurring everywhere from low-level information (e.g., the mean value of the color channels) to the higher-level composition of the image (e.g., pose and location of people). Given the prevalence of gender artifacts, we claim that attempts to remove gender artifacts from such datasets are largely infeasible. Instead, the responsibility lies with researchers and practitioners to be aware that the distribution of images within datasets is highly gendered and hence develop methods which are robust to these distributional shifts across groups.
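To make the definition above concrete, the following is a minimal sketch of how one low-level cue named in the abstract, the mean value of the color channels, could be measured against a binary label. It is not the authors' method: the function names `mean_channel_features` and `channel_label_correlation` are hypothetical, the input is a hand-made toy array rather than COCO or OpenImages data, and a plain Pearson correlation stands in for the image classifier the abstract refers to.

```python
import numpy as np


def mean_channel_features(images):
    """Return an (N, 3) array holding the mean R, G, B value of each image.

    `images` is assumed to have shape (N, H, W, 3).
    """
    images = np.asarray(images, dtype=np.float64)
    # Average over the two spatial axes, keeping one value per channel.
    return images.mean(axis=(1, 2))


def channel_label_correlation(features, labels):
    """Pearson correlation between each channel mean and a binary label."""
    labels = np.asarray(labels, dtype=np.float64)
    return np.array([
        np.corrcoef(features[:, c], labels)[0, 1]
        for c in range(features.shape[1])
    ])


# Toy input (illustrative only, not data from the paper): four 2x2 images
# whose channels are filled with a single constant value each.
red = [10, 20, 30, 40]
green = [1, 2, 2, 1]
blue = [40, 30, 20, 10]
images = np.zeros((4, 2, 2, 3))
for i in range(4):
    images[i, ..., 0] = red[i]
    images[i, ..., 1] = green[i]
    images[i, ..., 2] = blue[i]

labels = [0, 0, 1, 1]  # hypothetical binary annotation per image

features = mean_channel_features(images)
corr = channel_label_correlation(features, labels)
print(features)
print(corr)
```

In this toy setup the red channel mean rises with the label, the blue one falls, and the green one is unrelated to it. On a real dataset, a consistently nonzero correlation of this kind would be one hint that even coarse color statistics carry label information.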



