Comparative Performance Evaluation of Large Language Models for Extracting Molecular Interactions and Pathway Knowledge

07/17/2023
by Gilchan Park, et al.

Understanding protein interactions and pathway knowledge is crucial for unraveling the complexities of living systems and investigating the underlying mechanisms of biological functions and complex diseases. While existing databases provide curated biological data from literature and other sources, they are often incomplete and their maintenance is labor-intensive, necessitating alternative approaches. In this study, we propose harnessing the capabilities of large language models to address these issues by automatically extracting such knowledge from the relevant scientific literature. Toward this goal, we investigate the effectiveness of different large language models on tasks that involve recognizing protein interactions, pathways, and gene regulatory relations. We thoroughly evaluate the performance of various models, highlight the significant findings, and discuss both the future opportunities and the remaining challenges associated with this approach. The code and data are available at: https://github.com/boxorange/BioIE-LLM
