Automated Query Reformulation for Efficient Search based on Query Logs From Stack Overflow

by Kaibo Cao, et al.

As a popular Q&A site for programming, Stack Overflow is a treasure trove for developers. However, the sheer number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. Two gaps lead to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. As a result, developers have to constantly reformulate their queries, for example by correcting misspelled words or restricting the query to a particular programming language or platform. Because query reformulation is tedious for developers, especially novices, we propose an automated software-specific query reformulation approach based on deep learning. Using query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus that pairs original queries with their reformulated versions. Our approach trains a Transformer model that automatically generates candidate reformulated queries given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines, and achieves a 5.6 ExactMatch and a 4.8
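The abstract names ExactMatch as an evaluation metric but does not define it here. A minimal sketch, assuming the metric counts a hit whenever any generated candidate equals the logged reformulated query, might look like the following. The `normalize` step, the function names, and the sample queries are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization step)."""
    return " ".join(query.lower().split())


def exact_match(candidates: Sequence[Sequence[str]], references: Sequence[str]) -> float:
    """Fraction of queries whose candidate list contains the reference reformulation."""
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalize(ref) in {normalize(c) for c in cands}
        for cands, ref in zip(candidates, references)
    )
    return hits / len(references)


# Hypothetical data, invented for illustration only.
references = ["python sort list of dicts", "java nullpointerexception string"]
candidates = [
    ["python sort list of dicts", "sort list python"],  # contains the reference
    ["java null pointer", "java string null"],          # does not
]
score = exact_match(candidates, references)
print(score)
```

Passing a list of candidates per query lets the same function cover both a single best guess and a top-k setting. Whether the paper scores it this way is not stated in the abstract.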


∙ 07/26/2022

Using clarification questions to improve software developers' Web search

Context: Recent research indicates that Web queries written by software ...
∙ 07/23/2018

Effective Reformulation of Query for Code Search using Crowdsourced Knowledge and Extra-Large Data Analytics

Software developers frequently issue generic natural language queries fo...
∙ 02/20/2022

SOTitle: A Transformer-based Post Title Generation Approach for Stack Overflow

On Stack Overflow, developers can not only browse question posts to solv...
∙ 04/04/2020

Towards Query Logs for Privacy Studies: On Deriving Search Queries from Questions

Translating verbose information needs into crisp search queries is a phe...
∙ 08/24/2022

Diverse Title Generation for Stack Overflow Posts with Multiple Sampling Enhanced Transformer

Stack Overflow is one of the most popular programming communities where ...
∙ 06/28/2021

Revelio: ML-Generated Debugging Queries for Distributed Systems

A major difficulty in debugging distributed systems lies in manually det...
∙ 07/01/2023

Self-Supervised Query Reformulation for Code Search

Automatic query reformulation is a widely utilized technology for enrich...
