Improving Biomedical Information Retrieval with Neural Retrievers

by Man Luo, et al.

Information retrieval (IR) is essential in search engines and dialogue systems, as well as in natural language processing tasks such as open-domain question answering. IR serves an important function in the biomedical domain, where the content and sources of scientific knowledge may evolve rapidly. Although neural retrievers have surpassed traditional IR approaches such as TF-IDF and BM25 in standard open-domain question answering tasks, they still fall short in the biomedical domain. In this paper, we seek to improve IR in the biomedical domain using neural retrievers (NR), and we pursue this goal with a three-pronged approach. First, to tackle the relative lack of data in the biomedical domain, we propose a template-based question generation method that can be leveraged to train neural retriever models. Second, we develop two novel pre-training tasks that are closely aligned with the downstream task of information retrieval. Third, we introduce the “Poly-DPR” model, which encodes each context into multiple context vectors. Extensive experiments and analysis on the BioASQ challenge suggest that our proposed method leads to large gains over existing neural approaches and beats BM25 in the small-corpus setting. We show that BM25 and our method can complement each other, and that a simple hybrid model leads to further gains in the large-corpus setting.
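To make two of the ideas above concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a context represented by several vectors is scored by the maximum inner product with the question vector, and that a hybrid score is a weighted sum of min-max normalized sparse and dense scores. The abstract specifies neither choice, so both are assumptions. The function names, the `weight` parameter, and the toy vectors and BM25 scores are hypothetical.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def multi_vector_score(question_vec: Sequence[float],
                       context_vecs: Sequence[Sequence[float]]) -> float:
    """Score one context that is represented by several vectors.

    Assumption: the context score is the best match among its vectors.
    """
    return max(dot(question_vec, c) for c in context_vecs)


def min_max(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1]; all-equal inputs map to 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(sparse_scores: Sequence[float],
                  dense_scores: Sequence[float],
                  weight: float = 0.5) -> List[float]:
    """Combine sparse (e.g. BM25) and dense scores for the same candidates.

    Assumption: a weighted sum of normalized scores; `weight` is hypothetical.
    """
    sparse = min_max(sparse_scores)
    dense = min_max(dense_scores)
    return [weight * s + (1.0 - weight) * d for s, d in zip(sparse, dense)]


# Toy example: one question, two contexts with two vectors each.
question = [1.0, 0.0]
contexts = [
    [[0.2, 0.9], [0.8, 0.1]],
    [[0.1, 0.3], [0.3, 0.2]],
]
dense = [multi_vector_score(question, c) for c in contexts]
bm25 = [1.5, 4.0]  # made-up sparse scores for the same two contexts
combined = hybrid_scores(bm25, dense)
```

In this toy case the dense scorer prefers the first context and the made-up sparse scores prefer the second, so the combined ranking depends on `weight`. This is one simple way that two retrievers with different strengths could complement each other.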




