Attention-Aware Answers of the Crowd

by Jingzheng Tu, et al.

Crowdsourcing is a relatively economical and efficient way to collect annotations from the crowd through online platforms. Answers collected from workers with different levels of expertise may be noisy and unreliable, so the quality of the annotated data has to be actively maintained. Various solutions have been proposed to obtain high-quality annotations. However, they all assume that a worker's label quality is stable over time, that is, always at the same level whenever the worker performs a task. In practice, workers' attention level changes over time, and ignoring this change can reduce the reliability of the annotations. In this paper, we focus on a novel and realistic crowdsourcing scenario involving attention-aware annotations. We propose a new probabilistic model that takes workers' attention into account when estimating label quality. Expectation propagation is adopted for efficient Bayesian inference in our model, and a generalized Expectation Maximization algorithm is derived to estimate both the ground truth of all tasks and the attention-dependent label quality of each individual crowd worker. In addition, the number of tasks best suited to a worker is estimated from the changes in that worker's attention. Experiments against related methods on three real-world datasets and one semi-simulated dataset demonstrate that our method quantifies the relationship between workers' attention and label quality on the given tasks, and improves the aggregated labels.
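
To make the idea of attention-dependent label quality concrete, the following is a minimal sketch and not the model proposed in the paper. It swaps in a plain one-coin EM aggregator for binary labels (in the spirit of Dawid-Skene) instead of the paper's expectation propagation and generalized EM. The function name `em_aggregate`, the `position` field (index of a task within a worker's session, used here as a crude stand-in for attention), and the `bucket_size` parameter are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def em_aggregate(answers, n_iter=20, bucket_size=2):
    """Illustrative one-coin EM for binary labels (not the paper's model).

    answers: list of (worker, task, position, label), label in {0, 1}.
    position: index of the task within the worker's session; a hypothetical
        proxy for attention. Accuracy is estimated separately for each
        (worker, position // bucket_size) pair, so it may change over time.
    Returns (post, acc): P(truth = 1) per task, accuracy per (worker, bucket).
    """
    tasks = sorted({t for _, t, _, _ in answers})

    # Initialise P(truth = 1) with the fraction of 1-votes on each task.
    votes = defaultdict(list)
    for _, t, _, y in answers:
        votes[t].append(y)
    post = {t: sum(v) / len(v) for t, v in votes.items()}

    acc = {}
    for _ in range(n_iter):
        # M-step: expected share of correct answers per (worker, bucket).
        # Add-one smoothing keeps every accuracy strictly inside (0, 1).
        num = defaultdict(lambda: 1.0)
        den = defaultdict(lambda: 2.0)
        for w, t, pos, y in answers:
            key = (w, pos // bucket_size)
            num[key] += post[t] if y == 1 else 1.0 - post[t]
            den[key] += 1.0
        acc = {k: num[k] / den[k] for k in den}

        # E-step: posterior of each task's true label under a uniform prior.
        log1 = defaultdict(float)  # log-likelihood if the truth is 1
        log0 = defaultdict(float)  # log-likelihood if the truth is 0
        for w, t, pos, y in answers:
            a = acc[(w, pos // bucket_size)]
            log1[t] += math.log(a if y == 1 else 1.0 - a)
            log0[t] += math.log(a if y == 0 else 1.0 - a)
        post = {t: 1.0 / (1.0 + math.exp(log0[t] - log1[t])) for t in tasks}

    return post, acc
```

Calling `em_aggregate` on a list of `(worker, task, position, label)` tuples returns the posterior probability that each task's true label is 1, together with one accuracy estimate per worker and position bucket. A worker whose late-session bucket scores lower than the early one is, under this simplified proxy, a worker whose label quality drops as the session goes on.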



Open-Set Crowdsourcing using Multiple-Source Transfer Learning

We raise and define a new crowdsourcing scenario, open set crowdsourcing...

Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations

Crowd-sourcing is a cheap and popular means of creating training and eva...

Finding the Ground-Truth from Multiple Labellers: Why Parameters of the Task Matter

Employing multiple workers to label data for machine learning models has...

Toward Effective Automated Content Analysis via Crowdsourcing

Many computer scientists use the aggregated answers of online workers to...

Learning From Noisy Singly-labeled Data

Supervised learning depends on annotated examples, which are taken to be...

Confident in the Crowd: Bayesian Inference to Improve Data Labelling in Crowdsourcing

With the increased interest in machine learning and big data problems, t...

Active Multi-Label Crowd Consensus

Crowdsourcing is an economic and efficient strategy aimed at collecting ...
