American == White in Multimodal Language-and-Image AI

by Robert Wolfe, et al.

Three state-of-the-art language-and-image AI models, CLIP, SLIP, and BLIP, are evaluated for evidence of a bias previously observed in social and experimental psychology: equating American identity with being White. Embedding association tests (EATs) using standardized images of self-identified Asian, Black, Latina/o, and White individuals from the Chicago Face Database (CFD) reveal that White individuals are more associated with collective in-group words than are Asian, Black, or Latina/o individuals. In assessments of three core aspects of American identity reported by social psychologists, single-category EATs reveal that images of White individuals are more associated with patriotism and with being born in America, but that, consistent with prior findings in psychology, White individuals are associated with being less likely to treat people of all races and backgrounds equally. Three downstream machine learning tasks demonstrate biases associating American with White. In a visual question answering task using BLIP, 97% of White individuals are identified as American, compared to only 3% of Asian individuals; asked in what state the individual depicted lives in, the model responds China for 53% of Asian individuals. In an image captioning task, BLIP remarks upon the race of Asian individuals as much as 36% of the time, but does not do so for White individuals. Finally, provided with an initialization image from the CFD and the text "an American person," a synthetic image generator (VQGAN) using the text-based guidance of CLIP lightens the skin tone of individuals of all races (by as much as 35%). The results indicate that biases equating American identity with being White are learned by language-and-image AI, and propagate to downstream applications of such models.
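The EAT effect size used in the abstract follows the WEAT formulation of Caliskan et al. (2017): each target embedding's differential association with two attribute sets is computed via mean cosine similarity, and the group difference is normalized by the pooled standard deviation. A minimal sketch, assuming the embeddings (e.g., CLIP image embeddings for the target groups and CLIP text embeddings for the attribute words) are already available as NumPy arrays; the function names here are illustrative, not the authors' code:

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # s(w, A, B): mean similarity of w to attribute set A minus to attribute set B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def eat_effect_size(X, Y, A, B):
    # Effect size d: difference in mean association between target groups
    # X and Y, normalized by the sample std. dev. over all targets.
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)
```

A positive d indicates that group X is more associated with attribute set A (e.g., collective in-group words) than group Y is; the single-category variant replaces the A-vs-B contrast with similarity to one attribute set.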




