Better Sampling of Negatives for Distantly Supervised Named Entity Recognition

by Lu Xu et al.

Distantly supervised named entity recognition (DS-NER) exploits automatically labeled training data in place of human annotations. Such distantly annotated datasets are noisy and contain a considerable number of false negatives. A recent approach uses weighted sampling to select a subset of negative samples for training, but it requires a good classifier to assign the weights. In this paper, we propose a simple approach that selects for training the top negative samples, i.e., those with the highest similarity to all positive samples. Our method achieves consistent performance improvements on four distantly supervised NER datasets. Our analysis also shows that it is critical to differentiate true negatives from false negatives.
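The selection step described in the abstract, ranking candidate negative spans by their similarity to the positive spans and keeping the top ones, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the use of mean cosine similarity as the scoring rule, and the assumption that spans are already embedded by some encoder are all choices made here for concreteness.

```python
import numpy as np

def select_top_negatives(pos_emb, neg_emb, k):
    """Select the k candidate negative spans whose embeddings are,
    on average, most similar (by cosine similarity) to the positive spans.

    pos_emb: (P, d) array of positive-span embeddings
    neg_emb: (N, d) array of candidate negative-span embeddings
    Returns the indices of the k selected negatives.
    """
    # L2-normalize so that dot products are cosine similarities
    pos = pos_emb / np.linalg.norm(pos_emb, axis=1, keepdims=True)
    neg = neg_emb / np.linalg.norm(neg_emb, axis=1, keepdims=True)
    # (N, P) similarity matrix, averaged over all positives
    scores = (neg @ pos.T).mean(axis=1)
    # indices of the k highest-scoring negatives
    return np.argsort(-scores)[:k]
```

The selected subset would then replace the full set of negatives when training the NER classifier, the intuition being that negatives resembling true entities are the informative ones to keep.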


Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

We study the problem of training named entity recognition (NER) models u...

Named Entity Recognition with Partially Annotated Training Data

Supervised machine learning assumes the availability of fully-labeled da...

Rethinking Negative Sampling for Unlabeled Entity Problem in Named Entity Recognition

In many situations (e.g., distant supervision), unlabeled entity problem...

An Effective, Performant Named Entity Recognition System for Noisy Business Telephone Conversation Transcripts

We present a simple yet effective method to train a named entity recogni...

A Noise-Robust Loss for Unlabeled Entity Problem in Named Entity Recognition

Named Entity Recognition (NER) is an important task in natural language ...

Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp

Performance of neural models for named entity recognition degrades over ...

Building a Massive Corpus for Named Entity Recognition using Free Open Data Sources

With the recent progress in machine learning, boosted by techniques such...
