Jointly Embedding Entities and Text with Distant Supervision

07/09/2018
by Denis Newman-Griffis, et al.

Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications. However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora. We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of mappings between entities and surface forms. We learn embeddings from open-domain and biomedical corpora, and compare against prior methods that rely on human-annotated text or large knowledge graph structure. Our embeddings capture entity similarity and relatedness better than prior work, both on existing biomedical datasets and on a new Wikipedia-based dataset that we release to the community. Results on analogy completion and entity sense disambiguation indicate that entities and words capture complementary information that can be effectively combined for downstream use.
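
To make the distant-supervision idea concrete, the sketch below shows one way a list of entity-to-surface-form mappings could be matched against unannotated text to produce (candidate entities, context words) pairs for embedding training. It is an illustrative assumption, not the authors' implementation: the function names `build_index` and `distant_annotations`, the entity IDs `E1` and `E2`, the exact-match strategy, and the `window` parameter are all hypothetical.

```python
from collections import defaultdict


def build_index(surface_forms):
    """Map each surface form (a lowercase token tuple) to its candidate entities.

    `surface_forms` is a hypothetical dict: entity ID -> list of strings.
    """
    index = defaultdict(set)
    for entity, forms in surface_forms.items():
        for form in forms:
            index[tuple(form.lower().split())].add(entity)
    return index


def distant_annotations(tokens, index, window=2):
    """Yield (candidate_entities, context_words) for every exact surface-form match.

    No disambiguation is attempted: an ambiguous surface form yields all of
    its candidate entities, which is what makes the supervision "distant".
    """
    lowered = [t.lower() for t in tokens]
    max_len = max((len(form) for form in index), default=0)
    for start in range(len(lowered)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(lowered):
                break
            span = tuple(lowered[start:end])
            if span in index:
                # Context is the words on either side of the matched span.
                context = lowered[max(0, start - window):start] + lowered[end:end + window]
                yield sorted(index[span]), context


# Placeholder mapping: "cold" is ambiguous between two made-up entities.
surface_forms = {"E1": ["cold", "common cold"], "E2": ["cold"]}
index = build_index(surface_forms)
pairs = list(distant_annotations("She caught a common cold last week".split(), index))
```

Each resulting pair could then be fed to a context-prediction objective alongside ordinary word-context pairs; how candidate entities are weighted during training is left open here, since the abstract does not specify it.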


