When do Generative Query and Document Expansions Fail? A Comprehensive Study Across Methods, Retrievers, and Datasets

09/15/2023
by Orion Weller, et al.
Using large language models (LMs) for query or document expansion can improve generalization in information retrieval. However, it is unknown whether these techniques are universally beneficial or only effective in specific settings, such as for particular retrieval models, dataset domains, or query types. To answer this, we conduct the first comprehensive analysis of LM-based expansion. We find a strong negative correlation between retriever performance and gains from expansion: expansion improves scores for weaker models, but generally harms stronger models. We show that this trend holds across a set of eleven expansion techniques, twelve datasets with diverse distribution shifts, and twenty-four retrieval models. Through qualitative error analysis, we hypothesize that although expansions provide extra information (potentially improving recall), they also add noise that makes it difficult to distinguish among the top relevant documents (thus introducing false positives). Our results suggest the following recipe: use expansions for weaker models or when the target dataset differs significantly from the training corpus in format; otherwise, avoid expansions to keep the relevance signal clear.
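
To make the headline claim concrete, the following is a minimal sketch of how one might check for a negative correlation between a retriever's baseline score and its gain from expansion. It is not the authors' analysis code. The function names (`pearson`, `expansion_gains`) and the toy scores are assumptions made purely for illustration; the numbers are invented placeholders and are not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def expansion_gains(baseline, expanded):
    """Per-retriever change in score when expansion is applied."""
    return [e - b for b, e in zip(baseline, expanded)]


# Invented placeholder scores for four hypothetical retrievers,
# ordered from weakest to strongest. NOT data from the paper.
toy_baseline = [0.20, 0.35, 0.50, 0.65]
toy_expanded = [0.30, 0.40, 0.50, 0.60]

gains = expansion_gains(toy_baseline, toy_expanded)
r = pearson(toy_baseline, gains)
print(f"correlation between baseline score and expansion gain: {r:.2f}")
```

In this toy setup the weakest retriever gains the most and the strongest one loses, so the coefficient comes out strongly negative. With real evaluation scores, a value near zero would instead indicate that expansion gains do not depend on retriever strength.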
