Recovering Patient Journeys: A Corpus of Biomedical Entities and Relations on Twitter (BEAR)

by Amelie Wührl, et al.

Text mining and information extraction for the medical domain have focused on scientific text generated by researchers. However, their direct access to individual patient experiences or patient-doctor interactions can be limited. Information provided on social media, e.g., by patients and their relatives, complements the knowledge in scientific text. It reflects the patient's journey and their subjective perspective on the process of developing symptoms, being diagnosed and offered a treatment, being cured or learning to live with a medical condition. The value of this type of data is therefore twofold: Firstly, it offers direct access to people's perspectives. Secondly, it might cover information that is not available elsewhere, including self-treatment or self-diagnoses. Named entity recognition and relation extraction are methods to structure information that is available in unstructured text. However, existing medical social media corpora have focused on a comparatively small set of entities and relations and on particular domains, rather than putting the patient at the center of the analysis. With this paper, we contribute a corpus with a rich set of annotation layers, following the motivation to uncover and model patients' journeys and experiences in more detail. We label 14 entity classes (incl. environmental factors, diagnostics, biochemical processes, patients' quality-of-life descriptions, pathogens, medical conditions, and treatments) and 20 relation classes (e.g., prevents, influences, interactions, causes), most of which have not been considered before for social media data. The publicly available dataset consists of 2,100 tweets with approx. 6,000 entity and 3,000 relation annotations. In a corpus analysis, we find that over 80% of documents contain relevant entities. Over 50% of tweets express relations which we consider essential for uncovering patients' narratives about their journeys.



Named Entities in Medical Case Reports: Corpus and Experiments

We present a new corpus comprising annotations of medical entities in ca...

To What Extent are Name Variants Used as Named Entities in Turkish Tweets?

Social media texts differ from regular texts in various aspects. One of ...

Let's Make It Personal, A Challenge in Personalizing Medical Inter-Human Communication

Current AI approaches have frequently been used to help personalize many...

An Entity-based Claim Extraction Pipeline for Real-world Biomedical Fact-checking

Existing fact-checking models for biomedical claims are typically traine...

A Silver Standard Corpus of Human Phenotype-Gene Relations

Human phenotype-gene relations are fundamental to fully understand the o...

Towards User Friendly Medication Mapping Using Entity-Boosted Two-Tower Neural Network

Recent advancements in medical entity linking have been applied in the a...
