EntRED: Benchmarking Relation Extraction with Fewer Shortcuts

by Yiwei Wang, et al.

Entity names play an important role in relation extraction (RE) and often influence model performance. As a result, the entity names in a benchmark's test set significantly influence the evaluation of RE models. In this work, we find that standard RE benchmark datasets contain a large portion of incorrect entity annotations, have low entity name diversity, and are prone to shortcuts from entity names to ground-truth relations. These issues keep the standard benchmarks far from reflecting real-world scenarios. We therefore present EntRED, a challenging RE benchmark with fewer shortcuts and higher entity diversity. To build EntRED, we propose ERIC, an end-to-end entity replacement pipeline based on causal inference (CI). ERIC performs type-constrained replacements on entities to reduce the shortcuts from entity bias to ground-truth relations. It applies CI in two ways: 1) identifying the instances that need entity replacements, and 2) determining the candidate entities for replacements. We apply ERIC to TACRED to produce EntRED. EntRED evaluates whether an RE model can correctly extract relations from the text instead of relying on entity bias. Empirical results reveal that even strong RE models suffer a significant performance drop on EntRED, indicating that they memorize entity name patterns instead of reasoning from the textual context. We release ERIC's source code and the EntRED benchmark at https://github.com/wangywUST/ENTRED.
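To make the idea of a type-constrained replacement concrete, the following is a minimal sketch and not the ERIC pipeline itself: it omits both causal-inference steps (selecting which instances to edit and which candidates to allow) and simply swaps each entity for a different name of the same type. The record fields (`tokens`, `subj_start`, and so on), the function name, and the candidate lists are hypothetical and chosen only for illustration.

```python
import random


def type_constrained_replace(instance, candidates_by_type, rng):
    """Swap subject and object for other names of the same entity type.

    Illustrative only: field names are assumed, spans are inclusive and
    must not overlap, and the relation label is left unchanged.
    """
    tokens = list(instance["tokens"])
    spans = {
        role: (instance[f"{role}_start"], instance[f"{role}_end"],
               instance[f"{role}_type"])
        for role in ("subj", "obj")
    }
    new_spans = {}
    # Edit the right-most span first so the left span's indices stay valid.
    for role in sorted(spans, key=lambda r: spans[r][0], reverse=True):
        start, end, etype = spans[role]
        old_name = tokens[start:end + 1]
        # Type constraint: only names of the same type, excluding the original.
        pool = [c for c in candidates_by_type[etype] if c != old_name]
        new_name = rng.choice(pool)
        tokens[start:end + 1] = new_name
        delta = len(new_name) - len(old_name)
        # Spans already placed lie to the right, so shift them by delta.
        new_spans = {r: (s + delta, e + delta) for r, (s, e) in new_spans.items()}
        new_spans[role] = (start, start + len(new_name) - 1)

    out = dict(instance, tokens=tokens)
    for role, (s, e) in new_spans.items():
        out[f"{role}_start"], out[f"{role}_end"] = s, e
    return out


# Hypothetical example record and candidate names.
instance = {
    "tokens": ["Bill", "Gates", "founded", "Microsoft", "."],
    "subj_start": 3, "subj_end": 3, "subj_type": "ORGANIZATION",
    "obj_start": 0, "obj_end": 1, "obj_type": "PERSON",
    "relation": "org:founded_by",
}
candidates_by_type = {
    "PERSON": [["Bill", "Gates"], ["Ada", "Lovelace", "King"]],
    "ORGANIZATION": [["Microsoft"], ["Acme", "Corp"]],
}
new_instance = type_constrained_replace(instance, candidates_by_type, random.Random(0))
print(" ".join(new_instance["tokens"]))
```

Because the relation label is kept while the names change, a model that has memorized a link between a particular name and a relation loses that cue and has to rely on the surrounding context.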




