When Specialization Helps: Using Pooled Contextualized Embeddings to Detect Chemical and Biomedical Entities in Spanish

10/08/2019
by   Manuel Stoeckel, et al.

The recognition of pharmacological substances, compounds and proteins is an essential preliminary step for the recognition of relations between chemicals and other biomedically relevant units. In this paper, we describe an approach to Task 1 of the PharmaCoNER Challenge, which involves the recognition of mentions of chemicals and drugs in Spanish medical texts. We train a state-of-the-art BiLSTM-CRF sequence tagger with stacked Pooled Contextualized Embeddings, word embeddings and sub-word embeddings using the open-source framework FLAIR. We present a new corpus composed of articles and papers from Spanish health science journals, termed the Spanish Health Corpus, and use it to train domain-specific embeddings which we incorporate in our model training. We achieve a result of 89.76 and are able to improve this result to 90.52.
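To make the idea behind Pooled Contextualized Embeddings and stacking more concrete, the following is a minimal sketch in plain Python. It is not the authors' code and does not use FLAIR's API: the class `PooledEmbeddingMemory`, the helper `stack`, and the two-dimensional toy vectors are all hypothetical names and values chosen for illustration. The sketch assumes the usual formulation of pooled embeddings, in which every contextual vector seen for a word is kept in a memory, the memory is pooled (mean, min or max), and the pooled vector is concatenated to the current contextual vector.

```python
from collections import defaultdict
from typing import Dict, List

Vector = List[float]


class PooledEmbeddingMemory:
    """Toy memory that pools every contextual vector seen so far for a word.

    Hypothetical illustration only; this is not part of FLAIR.
    """

    def __init__(self, pooling: str = "mean") -> None:
        if pooling not in {"mean", "min", "max"}:
            raise ValueError(f"unsupported pooling: {pooling}")
        self.pooling = pooling
        # word -> list of all contextual vectors observed for that word
        self.memory: Dict[str, List[Vector]] = defaultdict(list)

    def _pool(self, vectors: List[Vector]) -> Vector:
        # Pool dimension-wise across all stored vectors of one word.
        columns = list(zip(*vectors))
        if self.pooling == "mean":
            return [sum(col) / len(col) for col in columns]
        if self.pooling == "min":
            return [min(col) for col in columns]
        return [max(col) for col in columns]

    def embed(self, word: str, contextual: Vector) -> Vector:
        """Store the new contextual vector, then return it concatenated
        with the pooled vector over everything stored for this word."""
        self.memory[word].append(list(contextual))
        pooled = self._pool(self.memory[word])
        return list(contextual) + pooled


def stack(*embeddings: Vector) -> Vector:
    """Stack embeddings of one token by simple concatenation."""
    stacked: Vector = []
    for embedding in embeddings:
        stacked.extend(embedding)
    return stacked


if __name__ == "__main__":
    memory = PooledEmbeddingMemory(pooling="mean")
    # The same word in two different contexts (toy contextual vectors).
    first = memory.embed("paracetamol", [1.0, 0.0])
    second = memory.embed("paracetamol", [3.0, 2.0])
    # A toy static word vector stacked with the pooled contextual one.
    token_vector = stack(second, [0.5])
    print(first, second, token_vector)
```

In this sketch, the second occurrence of a word receives a representation that also reflects its first occurrence, which is the intuition behind pooling for rare entity names such as drug mentions. The stacked vector is what a sequence tagger such as a BiLSTM-CRF would take as its per-token input.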

research
04/03/2019

Evaluating KGR10 Polish word embeddings in the recognition of temporal expressions using BiLSTM-CRF

The article introduces a new set of Polish word embeddings, built using ...
research
10/24/2020

Word Embeddings for Chemical Patent Natural Language Processing

We evaluate chemical patent word embeddings against known biomedical emb...
research
10/06/2022

Domain-Specific Word Embeddings with Structure Prediction

Complementary to finding good general word embeddings, an important ques...
research
08/25/2018

Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

We compare the use of LSTM-based and CNN-based character-level word embe...
research
11/03/2020

BioNerFlair: biomedical named entity recognition using flair embedding and sequence tagger

Motivation: The proliferation of Biomedical research articles has made t...
research
06/29/2017

Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition

Background. Previous state-of-the-art systems on Drug Name Recognition (...
research
04/27/2019

Enabling Open-World Specification Mining via Unsupervised Learning

Many programming tasks require using both domain-specific code and well-...
