COMETA: A Corpus for Medical Entity Linking in the Social Media

10/07/2020
by Marco Basaldella, et al.

Whilst there has been growing progress in Entity Linking (EL) for general language, existing datasets fail to address the complex nature of health terminology in layman's language. Meanwhile, there is a growing need for applications that can understand the public's voice in the health domain. To address this we introduce a new corpus called COMETA, consisting of 20k English biomedical entity mentions from Reddit expert-annotated with links to SNOMED CT, a widely-used medical knowledge graph. Our corpus satisfies a combination of desirable properties, from scale and coverage to diversity and quality, that to the best of our knowledge has not been met by any of the existing resources in the field. Through benchmark experiments on 20 EL baselines from string- to neural-based models we shed light on the ability of these systems to perform complex inference on entities and concepts under 2 challenging evaluation scenarios. Our experimental results on COMETA illustrate that no golden bullet exists and even the best mainstream techniques still have a significant performance gap to fill, while the best solution relies on combining different views of data.

research
11/21/2019

LATTE: Latent Type Modeling for Biomedical Entity Linking

Entity linking is the task of linking mentions of named entities in natu...
research
04/22/2020

ParsEL 1.0: Unsupervised Entity Linking in Persian Social Media Texts

In recent years, social media data has exponentially increased, which ca...
research
01/04/2021

Reddit Entity Linking Dataset

We introduce and make publicly available an entity linking dataset from ...
research
09/06/2021

BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks

Biomedical entity linking is the task of linking entity mentions in a bi...
research
07/24/2019

Linking Physicians to Medical Research Results via Knowledge Graph Embeddings and Twitter

Informing professionals about the latest research results in their field...
research
08/28/2023

Biomedical Entity Linking with Triple-aware Pre-Training

Linking biomedical entities is an essential aspect in biomedical natural...
research
04/18/2023

A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese

Textual health records of cancer patients are usually protracted and hig...
