TWEET-FID: An Annotated Dataset for Multiple Foodborne Illness Detection Tasks

05/22/2022
by   Ruofan Hu, et al.

Foodborne illness is a serious but preventable public health problem, with delays in detecting the associated outbreaks resulting in productivity loss, expensive recalls, public safety hazards, and even loss of life. While social media is a promising source for identifying unreported foodborne illnesses, there is a dearth of labeled datasets for developing effective outbreak detection models. To accelerate the development of machine learning-based models for foodborne outbreak detection, we present TWEET-FID (TWEET-Foodborne Illness Detection), the first publicly available annotated dataset for multiple foodborne illness incident detection tasks. TWEET-FID, collected from Twitter, is annotated with three facets: tweet class, entity type, and slot type, with labels produced by experts as well as by crowdsourced workers. We introduce several domain tasks leveraging these three facets: text relevance classification (TRC), entity mention detection (EMD), and slot filling (SF). We describe the end-to-end methodology for dataset design, creation, and labeling to support model development for these tasks. We provide a comprehensive set of results for these tasks, obtained with state-of-the-art single- and multi-task deep learning methods on the TWEET-FID dataset. This dataset opens opportunities for future research in foodborne outbreak detection.
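To make the three annotation facets concrete, the following is a minimal sketch of how one annotated tweet might be represented, with a tweet-level label for TRC and token-level tags for EMD and SF. The abstract does not specify the label sets, the tagging scheme, or the file format, so everything here is an assumption made for illustration: the `AnnotatedTweet` class, the BIO tagging scheme, the `FOOD` and `SYMPTOM` label names, and the example tweet are all hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AnnotatedTweet:
    """Hypothetical container for the three facets named in the abstract."""
    tokens: List[str]
    relevant: bool          # tweet class, the target of TRC
    entity_tags: List[str]  # one tag per token, the target of EMD
    slot_tags: List[str]    # one tag per token, the target of SF


def extract_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, text) spans from BIO tags (assumed scheme)."""
    spans: List[Tuple[str, str]] = []
    label = None
    words: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; close any span that was open.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open span.
            words.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes the open span.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


# Invented example, not a record from TWEET-FID.
example = AnnotatedTweet(
    tokens=["Got", "food", "poisoning", "from", "the", "tacos", "yesterday"],
    relevant=True,
    entity_tags=["O", "B-SYMPTOM", "I-SYMPTOM", "O", "O", "B-FOOD", "O"],
    slot_tags=["O", "B-SYMPTOM", "I-SYMPTOM", "O", "O", "B-FOOD", "O"],
)

print(example.relevant)
print(extract_spans(example.tokens, example.entity_tags))
print(extract_spans(example.tokens, example.slot_tags))
```

Keeping the tweet-level label and the two token-level tag sequences on the same record is one way to feed both single-task and multi-task models from a single source; how the released dataset actually stores these facets should be checked against its documentation.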


