Extracting a Knowledge Base of Mechanisms from COVID-19 Papers

by Aida Amini, et al.

The urgency of mitigating COVID-19 has spawned a large and diverse body of scientific literature that is challenging for researchers to navigate. This explosion of information has stimulated interest in automated tools to help identify useful knowledge. We pursue methods for extracting diverse forms of mechanism relations from the natural language of scientific papers. We seek to identify concepts in COVID-19 and related literature that represent activities, functions, associations and causal relations, ranging from cellular processes to economic impacts. We formulate a broad, coarse-grained schema targeting mechanism relations between open, free-form entities. Our approach strikes a balance between expressivity and breadth that supports generalization across diverse concepts. We curate a dataset of scientific papers annotated according to our novel schema. Using an information extraction model trained on this new corpus, we construct a knowledge base (KB) of 2M mechanism relations, which we make publicly available. Our model extracts relations with an F1 score at least twice that of baselines such as open IE or related scientific IE systems. We conduct experiments examining the ability of our system to retrieve relevant information on viral mechanisms of action, and on applications of AI to COVID-19 research. In both cases, our system identifies relevant information from our automatically constructed knowledge base with high precision.
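
As a rough illustration of what a knowledge base of mechanism relations between free-form entities could look like in code, the sketch below stores each relation as a pair of text spans and supports simple substring lookup. It is not the authors' implementation: the names `MechanismRelation`, `MechanismKB`, `search`, and the `source` field are hypothetical, and the three entries are invented placeholders rather than records from the released KB.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MechanismRelation:
    """One mechanism relation between two free-form entity spans (hypothetical layout)."""
    subject: str  # free-form text span, e.g. an activity, method, or agent
    obj: str      # free-form text span the subject acts on or is associated with
    source: str   # hypothetical provenance field (paper identifier)


class MechanismKB:
    """Toy in-memory store with case-insensitive substring search (an assumption)."""

    def __init__(self) -> None:
        self._relations: List[MechanismRelation] = []

    def add(self, relation: MechanismRelation) -> None:
        self._relations.append(relation)

    def search(self, subject_query: str = "", object_query: str = "") -> List[MechanismRelation]:
        # An empty query string matches every relation on that side.
        s, o = subject_query.lower(), object_query.lower()
        return [
            r for r in self._relations
            if s in r.subject.lower() and o in r.obj.lower()
        ]


# Invented placeholder entries, for illustration only.
kb = MechanismKB()
kb.add(MechanismRelation("spike protein", "binding to host cell receptor", "toy-paper-1"))
kb.add(MechanismRelation("deep learning", "diagnosis from CT scans", "toy-paper-2"))
kb.add(MechanismRelation("lockdown measures", "economic activity", "toy-paper-3"))

ai_hits = kb.search(subject_query="deep learning")
for hit in ai_hits:
    print(f"{hit.subject} -> {hit.obj} ({hit.source})")
```

Keeping both arguments as unconstrained strings mirrors the open, free-form entities described above; a real system would likely replace substring matching with a stronger retrieval method.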




