Know What I don't Know: Handling Ambiguous and Unanswerable Questions for Text-to-SQL

12/17/2022
by Bing Wang, et al.

Text-to-SQL is the task of converting a natural language question into its corresponding SQL query in the context of relational tables. Existing text-to-SQL parsers generate a "plausible" SQL query for any user question, even when the question is ambiguous or unanswerable, and therefore fail to handle such problematic questions correctly. To formalize this problem, we conduct a preliminary study of the ambiguous and unanswerable cases observed in text-to-SQL and summarize them into 6 feature categories. For each category, we identify the underlying causes and propose requirements for handling ambiguous and unanswerable questions. Following this study, we propose a simple yet effective counterfactual example generation approach that automatically generates ambiguous and unanswerable text-to-SQL examples. Furthermore, we propose DTE (Detecting-Then-Explaining), a weakly supervised model for error detection, localization, and explanation. Experimental results show that our model achieves the best results on both real-world and generated examples compared with various baselines. We will release data and code for future research.
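
The abstract does not spell out how the counterfactual examples are built. As a rough illustration only, the sketch below assumes one plausible reading: starting from an answerable example, dropping a column that the gold SQL needs yields an unanswerable example, and adding a look-alike column yields an ambiguous one. The `Example` class, its fields, the two helper functions, and the sample table are all hypothetical and are not taken from the paper or its released code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """Hypothetical container for one text-to-SQL example (not the paper's format)."""
    question: str
    columns: List[str]       # column names of the table
    sql_columns: List[str]   # columns the gold SQL query refers to
    label: str = "answerable"
    problem_column: str = ""  # column that explains why the example is problematic


def make_unanswerable(ex: Example, column: str) -> Example:
    """Assumed edit: drop a column the gold SQL needs, so no valid query remains."""
    if column not in ex.sql_columns:
        raise ValueError(f"{column!r} is not used by the gold SQL")
    remaining = [c for c in ex.columns if c != column]
    return Example(ex.question, remaining, ex.sql_columns, "unanswerable", column)


def make_ambiguous(ex: Example, column: str, lookalike: str) -> Example:
    """Assumed edit: add a second column the question could equally refer to."""
    if column not in ex.sql_columns:
        raise ValueError(f"{column!r} is not used by the gold SQL")
    if lookalike in ex.columns:
        raise ValueError(f"{lookalike!r} already exists in the table")
    idx = ex.columns.index(column)
    # Place the look-alike right after the column it competes with.
    columns = ex.columns[: idx + 1] + [lookalike] + ex.columns[idx + 1 :]
    return Example(ex.question, columns, ex.sql_columns, "ambiguous", column)


# Made-up sample data for illustration.
original = Example(
    question="What is the average salary of employees in each department?",
    columns=["name", "department", "salary"],
    sql_columns=["department", "salary"],
)
unanswerable = make_unanswerable(original, "salary")
ambiguous = make_ambiguous(original, "salary", "base_salary")

print(unanswerable.label, unanswerable.columns)
print(ambiguous.label, ambiguous.columns)
```

In this sketch the `problem_column` field records which column caused the problem. That is one way a generated example could carry a weak signal for localization and explanation alongside the detection label, though whether the paper's data takes this form is an assumption.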


