TempEL: Linking Dynamically Evolving and Newly Emerging Entities

by Klim Zaporojets, et al.

In our continuously evolving world, entities change over time, and new entities that previously did not exist or were unknown appear. We study how this evolutionary scenario affects performance on a well-established entity linking (EL) task. For that study, we introduce TempEL, an entity linking dataset that consists of time-stratified English Wikipedia snapshots from 2013 to 2022, from which we collect both anchor mentions of entities and the descriptions of these target entities. By capturing such temporal aspects, TempEL contrasts with existing entity linking datasets, which are composed of fixed mentions linked to a single static version of a target Knowledge Base (e.g., Wikipedia 2010 for CoNLL-AIDA). Indeed, for each of our collected temporal snapshots, TempEL contains links to entities that are continual, i.e., occur in all of the years, as well as completely new entities that appear for the first time at some point. This makes it possible to quantify the performance of current state-of-the-art EL models for: (i) entities that are subject to changes over time in their Knowledge Base descriptions as well as in their mentions' contexts, and (ii) newly created entities that did not previously exist (e.g., at the time the EL model was trained). Our experimental results show that, in terms of temporal performance degradation, (i) continual entities suffer a decrease of up to 3.1% in accuracy, while (ii) for new entities this accuracy drop is up to 17.9%. This highlights the challenge of the introduced TempEL dataset and opens new research prospects in the area of time-evolving entity disambiguation.
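
The distinction between continual and new entities can be illustrated with a minimal sketch. It assumes each yearly snapshot is available as a plain set of entity identifiers; the function name `split_entities` and the toy entity names are hypothetical and do not reflect the actual TempEL construction pipeline, which may apply additional filtering.

```python
from typing import Dict, Set, Tuple


def split_entities(
    snapshots: Dict[int, Set[str]],
) -> Tuple[Set[str], Dict[int, Set[str]]]:
    """Split entities into continual ones and per-year new ones.

    Assumption: `snapshots` maps a year to the set of entity ids
    present in that year's snapshot (hypothetical input format).
    """
    years = sorted(snapshots)
    if not years:
        return set(), {}

    # Continual entities occur in every snapshot.
    continual = set.intersection(*(snapshots[y] for y in years))

    # New entities appear for the first time in a given year,
    # i.e. they are absent from all earlier snapshots.
    new_by_year: Dict[int, Set[str]] = {}
    seen = set(snapshots[years[0]])
    for year in years[1:]:
        new_by_year[year] = snapshots[year] - seen
        seen |= snapshots[year]

    return continual, new_by_year


# Toy example with made-up entity ids.
snapshots = {
    2013: {"A", "B"},
    2014: {"A", "B", "C"},
    2015: {"A", "C", "D"},
}
continual, new_by_year = split_entities(snapshots)
```

In this toy example, only `A` is continual, while `C` and `D` count as new in 2014 and 2015 respectively. The first snapshot has no "new" entities by construction, since there is no earlier year to compare against.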



DaMuEL: A Large Multilingual Dataset for Entity Linking

We present DaMuEL, a large Multilingual Dataset for Entity Linking conta...

EDIN: An End-to-end Benchmark and Pipeline for Unknown Entity Discovery and Indexing

Existing work on Entity Linking mostly assumes that the reference knowle...

Early Discovery of Disappearing Entities in Microblogs

We make decisions by reacting to changes in the real world, in particula...

DESCGEN: A Distantly Supervised Dataset for Generating Abstractive Entity Descriptions

Short textual descriptions of entities provide summaries of their key at...

Learning Entity Linking Features for Emerging Entities

Entity linking (EL) is the process of linking entity mentions appearing ...

Model-based annotation of coreference

Humans do not make inferences over texts, but over models of what texts ...

Early Discovery of Emerging Entities in Microblogs

Keeping up to date on emerging entities that appear every day is indispe...
