CONVERSER: Few-Shot Conversational Dense Retrieval with Synthetic Data Generation

09/13/2023
by Chao-Wei Huang, et al.

Conversational search provides a natural interface for information retrieval (IR). Recent approaches have demonstrated promising results in applying dense retrieval to conversational IR. However, training dense retrievers requires large amounts of in-domain paired data. This hinders the development of conversational dense retrievers, as abundant in-domain conversations are expensive to collect. In this paper, we propose CONVERSER, a framework for training conversational dense retrievers with at most 6 examples of in-domain dialogues. Specifically, we utilize the in-context learning capability of large language models to generate conversational queries given a passage in the retrieval corpus. Experimental results on conversational retrieval benchmarks OR-QuAC and TREC CAsT 19 show that the proposed CONVERSER achieves comparable performance to fully-supervised models, demonstrating the effectiveness of our proposed framework in few-shot conversational dense retrieval. All source code and generated datasets are available at https://github.com/MiuLab/CONVERSER
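
The generation step described in the abstract (prompting a large language model with a handful of in-domain dialogues so that it writes conversational queries for a given passage) can be illustrated with a minimal sketch. This is not the authors' implementation; the released code is at the repository linked above. The prompt layout (`Passage:` / `Q1:` labels), the ` [SEP] ` history separator, the helper names, and the `llm` callable are all assumptions made for illustration. Only the cap of 6 examples comes from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MAX_EXAMPLES = 6  # the abstract states "at most 6 examples of in-domain dialogues"


@dataclass
class Example:
    passage: str
    queries: List[str]  # conversational queries about the passage, in turn order


def build_prompt(examples: List[Example], passage: str) -> str:
    """Format few-shot demonstrations followed by the target passage.

    The layout is a hypothetical one, not the format used in the paper.
    """
    if len(examples) > MAX_EXAMPLES:
        raise ValueError(f"expected at most {MAX_EXAMPLES} examples")
    blocks = []
    for ex in examples:
        turns = "\n".join(f"Q{i}: {q}" for i, q in enumerate(ex.queries, 1))
        blocks.append(f"Passage: {ex.passage}\n{turns}")
    # Leave the prompt open at "Q1:" so the model continues with the first query.
    blocks.append(f"Passage: {passage}\nQ1:")
    return "\n\n".join(blocks)


def parse_queries(completion: str) -> List[str]:
    """Split a completion that continues after 'Q1:' into individual queries."""
    queries = []
    for line in ("Q1:" + completion).splitlines():
        label, sep, text = line.partition(":")
        label = label.strip()
        if sep and label.startswith("Q") and label[1:].isdigit() and text.strip():
            queries.append(text.strip())
    return queries


def make_training_pairs(queries: List[str], passage: str) -> List[Tuple[str, str]]:
    """Pair each turn (history plus current query) with the source passage.

    Joining turns with ' [SEP] ' is an assumed convention for this sketch.
    """
    return [(" [SEP] ".join(queries[: i + 1]), passage) for i in range(len(queries))]


def generate_pairs(
    examples: List[Example], passage: str, llm: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Run one passage through the (caller-supplied) language model."""
    completion = llm(build_prompt(examples, passage))
    return make_training_pairs(parse_queries(completion), passage)
```

Here `llm` is any function that maps a prompt string to a completion string, so the sketch stays independent of a particular model or API. The resulting (query with history, passage) pairs are the kind of synthetic paired data a dense retriever could then be trained on.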


research
05/10/2021

Few-Shot Conversational Dense Retrieval

Dense retrieval (DR) has the potential to resolve the query understandin...
research
03/12/2023

Large Language Models Know Your Contextual Search Intent: A Prompting Framework for Conversational Search

In this paper, we present a prompting framework called LLMCS that levera...
research
05/12/2023

Knowledge Refinement via Interaction Between Search Engines and Large Language Models

Information retrieval (IR) plays a crucial role in locating relevant res...
research
01/29/2023

HeroNet: A Hybrid Retrieval-Generation Network for Conversational Bots

Using natural language, Conversational Bot offers unprecedented ways to ...
research
06/17/2023

Typo-Robust Representation Learning for Dense Retrieval

Dense retrieval is a basic building block of information retrieval appli...
research
07/10/2023

InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval

Recent work has explored Large Language Models (LLMs) to overcome the la...
research
08/12/2022

Automated Utterance Labeling of Conversations Using Natural Language Processing

Conversational data is essential in psychology because it can help resea...
