The Extended Dawid-Skene Model: Fusing Information from Multiple Data Schemas

06/04/2019
by Michael P. J. Camilleri, et al.

While label fusion from multiple noisy annotations is a well-understood concept in data wrangling (tackled, for example, by the Dawid-Skene (DS) model), we consider the extended problem of learning when the labels themselves are not consistently annotated with the same schema. We show that even if annotators use disparate, albeit related, label-sets, we can still draw inferences for the underlying full label-set. We propose the Inter-Schema AdapteR (ISAR) to translate the fully-specified label-set to the one used by each annotator, enabling learning under such heterogeneous schemas without the need to re-annotate the data. We apply our method to a mouse behavioural dataset, achieving significant gains over DS in out-of-sample log-likelihood (from -3.40 to -2.39) and F1-score (from 0.785 to 0.864).
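To make the idea of fusing labels across schemas concrete, the following is a minimal sketch and not the authors' ISAR formulation, which the abstract does not spell out. It assumes a deterministic many-to-one mapping from the full label-set to each annotator's schema, a DS-style confusion table per annotator defined over the full label-set, and known parameters (no learning step). The label names, the probabilities, and the helpers `adapt` and `posterior` are all hypothetical.

```python
# Hypothetical full label-set and prior over the latent true label.
PRIOR = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}


def noisy_rows(labels, p_correct=0.8):
    """Build an illustrative DS-style confusion table: row z is P(u | z)."""
    p_wrong = (1.0 - p_correct) / (len(labels) - 1)
    return {z: {u: (p_correct if u == z else p_wrong) for u in labels}
            for z in labels}


def adapt(full_dist, schema):
    """Assumed adapter: collapse a distribution over full labels onto an
    annotator's schema by summing the mass of labels that map together."""
    out = {}
    for full_label, p in full_dist.items():
        coarse = schema[full_label]
        out[coarse] = out.get(coarse, 0.0) + p
    return out


def posterior(prior, confusions, schemas, observations):
    """P(z | observations) for one sample, assuming annotators are
    conditionally independent given the true label z."""
    unnorm = {}
    for z, pz in prior.items():
        likelihood = pz
        for annotator, observed in observations.items():
            emitted = adapt(confusions[annotator][z], schemas[annotator])
            likelihood *= emitted.get(observed, 0.0)
        unnorm[z] = likelihood
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("observations have zero probability under the model")
    return {z: v / total for z, v in unnorm.items()}


labels = list(PRIOR)
confusions = {"ann1": noisy_rows(labels), "ann2": noisy_rows(labels)}
schemas = {
    "ann1": {"a": "a", "b": "b", "c": "c"},          # uses the full schema
    "ann2": {"a": "a", "b": "other", "c": "other"},  # merges b and c
}

# ann1 reports "b"; ann2 can only report the coarser "other".
post = posterior(PRIOR, confusions, schemas, {"ann1": "b", "ann2": "other"})
print(max(post, key=post.get), post)
```

In this toy setting the coarse annotation from `ann2` cannot separate `b` from `c`, but it still lowers the probability of `a`, which is the sense in which related but disparate schemas can inform inference over the full label-set.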


