ANEA: Distant Supervision for Low-Resource Named Entity Recognition

02/25/2021
by Michael A. Hedderich, et al.

Distant supervision allows obtaining labeled training corpora for low-resource settings where only limited hand-annotated data exists. However, to be used effectively, the distant supervision must be easy to obtain. In this work, we present ANEA, a tool to automatically annotate named entities in text based on entity lists. It spans the whole pipeline from obtaining the lists to analyzing the errors of the distant supervision. A tuning step allows the user to improve the automatic annotation with their linguistic insights without having to manually label or check all tokens. In six low-resource scenarios, we show that the F1-score can be increased by an average of 18 points through distantly supervised data obtained by ANEA.
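To make the idea of annotating text from entity lists concrete, here is a minimal sketch of list-based distant supervision. It is not ANEA's implementation: the function name `annotate`, the lowercased longest-match strategy, and the toy entity lists are all assumptions chosen for illustration, and the abstract does not describe how ANEA itself matches tokens or applies its tuning step.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical type: each label maps to a set of lowercased token tuples.
EntityLists = Dict[str, Set[Tuple[str, ...]]]


def annotate(tokens: List[str], entity_lists: EntityLists) -> List[str]:
    """Assign a label to each token by longest match against the entity lists.

    Tokens that match no list entry receive the label "O".
    """
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Longest entry across all lists bounds the span length we need to try.
    max_len = max(
        (len(entry) for names in entity_lists.values() for entry in names),
        default=0,
    )
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest span first so "New York" wins over a shorter entry.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(lowered[i:i + n])
            for label, names in entity_lists.items():
                if span in names:
                    labels[i:i + n] = [label] * n
                    i += n
                    matched = True
                    break
            if matched:
                break
        if not matched:
            i += 1
    return labels


# Toy entity lists (illustrative only, not taken from the paper).
entity_lists = {
    "PER": {("ada", "lovelace")},
    "LOC": {("london",), ("new", "york")},
}
tokens = "Ada Lovelace moved from London to New York".split()
labels = annotate(tokens, entity_lists)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Labels produced this way are noisy: names missing from the lists stay unlabeled, and ambiguous words can be labeled wrongly, which is why an error-analysis and tuning step of the kind the abstract mentions matters.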
