Knowledge-Base Enriched Word Embeddings for Biomedical Domain

02/20/2021
by Kishlay Jha, et al.

Word embeddings have been shown to be adept at capturing the semantic and syntactic regularities of natural language text, and as a result these representations have proved useful in a wide variety of downstream content analysis tasks. Commonly, word embedding techniques derive the distributed representation of words from local context information. However, such approaches ignore the rich explicit information present in knowledge bases. This is problematic, as it can lead to poor representations for words with insufficient local context, such as domain-specific words. The problem is more pronounced in domains such as biomedicine, where domain-specific words are relatively frequent. To this end, in this project, we propose a new word-embedding-based model for the biomedical domain that jointly leverages information from available corpora and domain knowledge to generate knowledge-base-powered embeddings. Unlike existing approaches, the proposed methodology is simple yet adept at accurately capturing the precise knowledge available in domain resources. Experimental results on biomedical concept similarity and relatedness tasks validate the effectiveness of the proposed approach.

research
10/27/2022

Leveraging knowledge graphs to update scientific word embeddings using latent semantic imputation

The most interesting words in scientific texts will often be novel or ra...
research
08/07/2023

Vocab-Expander: A System for Creating Domain-Specific Vocabularies Based on Word Embeddings

In this paper, we propose Vocab-Expander at https://vocab-expander.com, ...
research
10/20/2020

UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus

Contextual word embedding models, such as BioBERT and Bio_ClinicalBERT, ...
research
09/21/2017

Learning Domain-Specific Word Embeddings from Sparse Cybersecurity Texts

Word embedding is a Natural Language Processing (NLP) technique that aut...
research
08/14/2018

Syntree2Vec - An algorithm to augment syntactic hierarchy into word embeddings

Word embeddings aims to map sense of the words into a lower dimensional ...
research
06/07/2017

Insights into Analogy Completion from the Biomedical Domain

Analogy completion has been a popular task in recent years for evaluatin...
research
12/05/2017

AWE-CM Vectors: Augmenting Word Embeddings with a Clinical Metathesaurus

In recent years, word embeddings have been surprisingly effective at cap...
