A Causal View of Entity Bias in (Large) Language Models

by Fei Wang, et al.

Entity bias widely affects pretrained (large) language models, causing them to rely excessively on (biased) parametric knowledge and make unfaithful predictions. Although causality-inspired methods have shown great potential to mitigate entity bias, it is hard to precisely estimate the parameters of the underlying causal models in practice. The rise of black-box LLMs makes the situation even worse, because their parameters are inaccessible and their logits are uncalibrated. To address these problems, we propose a specific structured causal model (SCM) whose parameters are comparatively easier to estimate. Building upon this SCM, we propose causal intervention techniques to mitigate entity bias in both white-box and black-box settings. The proposed causal intervention perturbs the original entity with neighboring entities. This intervention reduces the specific biasing information pertaining to the original entity while still preserving sufficient common predictive information from similar entities. When evaluated on the relation extraction task, our training-time intervention significantly improves the F1 score of RoBERTa by 5.7 points on EntRED, in which spurious shortcuts between entities and labels are removed. Meanwhile, our in-context intervention effectively reduces the knowledge conflicts between parametric knowledge and contextual knowledge in GPT-3.5 and improves the F1 score by 9.14 points on a challenging test set derived from Re-TACRED.
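To make the idea of perturbing an entity with neighboring entities concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes entities have vector embeddings, that "neighboring" means highest cosine similarity, and that the perturbation is a random convex combination of the k nearest neighbors; the abstract does not specify any of these details. The embedding values and the helper names (`EMBEDDINGS`, `nearest_neighbors`, `perturb`) are hypothetical.

```python
import math
import random

# Toy 2-D entity embeddings; the values are invented for illustration only.
EMBEDDINGS = {
    "Microsoft": [0.9, 0.1],
    "Google": [0.8, 0.2],
    "Apple": [0.85, 0.05],
    "Paris": [0.1, 0.9],
    "Berlin": [0.2, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(entity, k):
    """Return the k entities most similar to `entity`, excluding itself."""
    others = [name for name in EMBEDDINGS if name != entity]
    others.sort(
        key=lambda name: cosine(EMBEDDINGS[entity], EMBEDDINGS[name]),
        reverse=True,
    )
    return others[:k]


def perturb(entity, k, rng):
    """Replace the entity embedding with a random convex combination
    of its k nearest neighbors (an assumed form of the perturbation)."""
    neighbors = nearest_neighbors(entity, k)
    raw = [rng.random() + 1e-9 for _ in neighbors]
    total = sum(raw)
    weights = [w / total for w in raw]  # non-negative, sum to 1
    dim = len(EMBEDDINGS[entity])
    return [
        sum(w * EMBEDDINGS[n][d] for w, n in zip(weights, neighbors))
        for d in range(dim)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    print(nearest_neighbors("Microsoft", 2))
    print(perturb("Microsoft", 2, rng))
```

In this toy setup the perturbed vector stays within the region spanned by similar companies, so entity-specific detail is blurred while type-level information is kept. Whether the original entity itself should be included among the combined points is a design choice this sketch leaves out.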


Element Intervention for Open Relation Extraction

Open relation extraction aims to cluster relation instances referring to...

Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis

Recent literature focuses on utilizing the entity information in the sen...

Causality-aware Concept Extraction based on Knowledge-guided Prompting

Concepts benefit natural language understanding but are far from complet...

Causal Reasoning of Entities and Events in Procedural Texts

Entities and events have long been regarded as the crux of machine reaso...

Predicting Document Coverage for Relation Extraction

This paper presents a new task of predicting the coverage of a text docu...

LNN-EL: A Neuro-Symbolic Approach to Short-text Entity Linking

Entity linking (EL), the task of disambiguating mentions in text by link...

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

We introduce Inference-Time Intervention (ITI), a technique designed to ...
