Cost-effective Variational Active Entity Resolution

11/20/2020
by Alex Bogatu, et al.

Accurately identifying different representations of the same real-world entity is an integral part of data cleaning, and many methods have been proposed to accomplish it. The challenges of this entity resolution task that demand so much research attention are often rooted in the task-specificity and user-dependence of the process. Adopting deep learning techniques has the potential to lessen these challenges. In this paper, we set out to devise an entity resolution method that builds on the robustness conferred by deep autoencoders to reduce human-involvement costs. Specifically, we reduce the cost of training deep entity resolution models by performing unsupervised representation learning. This unveils a transferability property of the resulting model that can further reduce the cost of applying the approach to new datasets by means of transfer learning. Finally, we reduce the cost of labelling training data through an active learning approach that builds on the properties conferred by the use of deep autoencoders. Empirical evaluation confirms that the approach meets these cost-reduction goals while achieving effectiveness comparable to that of state-of-the-art alternatives.

research
06/17/2019

Low-resource Deep Entity Resolution with Transfer and Active Learning

Entity resolution (ER) is the task of identifying different representati...
research
08/23/2022

FlexER: Flexible Entity Resolution for Multiple Intents

Entity resolution, a longstanding problem of data cleaning and integrati...
research
06/27/2018

Data Efficient Lithography Modeling with Transfer Learning and Active Data Selection

Lithography simulation is one of the key steps in physical verification,...
research
05/31/2018

Improving Machine-based Entity Resolution with Limited Human Effort: A Risk Perspective

Pure machine-based solutions usually struggle in the challenging classif...
research
01/06/2021

Attention-based Convolutional Autoencoders for 3D-Variational Data Assimilation

We propose a new 'Bi-Reduced Space' approach to solving 3D Variational D...
research
12/23/2020

Active Deep Learning on Entity Resolution by Risk Sampling

While the state-of-the-art performance on entity resolution (ER) has bee...
research
10/12/2022

Frustratingly Simple Entity Tracking with Effective Use of Multi-Task Learning Models

We present SET, a frustratingly Simple-yet-effective approach for Entity...
