AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL

by Guanming Xiong, et al.

This study investigates the task of knowledge-based question generation (KBQG). Conventional KBQG approaches generate questions from fact triples in a knowledge graph, which cannot express complex SPARQL operations such as aggregation and comparison. Moreover, because annotating large-scale SPARQL-question pairs is costly, KBQG from SPARQL in low-resource scenarios urgently needs to be explored. Generative pre-trained language models (PLMs) such as T5 and BART, which are typically trained in a natural language (NL)-to-NL paradigm, have recently proven effective for low-resource generation; how to use them effectively to generate NL questions from non-NL SPARQL remains a challenge. To address these challenges, we propose AutoQGS, an auto-prompt approach for low-resource KBQG from SPARQL. First, we propose generating questions directly from SPARQL for the KBQG task in order to handle complex operations. Second, we propose an auto-prompter trained on large-scale unsupervised data to rephrase SPARQL into an NL description, smoothing the low-resource transformation from non-NL SPARQL to NL questions with PLMs. Experimental results on WebQuestionsSP, ComplexWebQuestions 1.1, and PathQuestions show that our model achieves state-of-the-art performance, especially in low-resource settings. Furthermore, a corpus of 330k factoid complex question-SPARQL pairs is generated for further KBQG research.
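
To make the idea of rephrasing SPARQL into an NL description more concrete, the following is a minimal sketch and not the AutoQGS auto-prompter itself: the paper's prompter is trained on large-scale unsupervised data, whereas this toy uses hand-written rules. The `ex:` relations and entity, the phrase tables, and the function name `describe_sparql` are all hypothetical and chosen only for illustration. The example query uses an aggregation (`COUNT`) and a comparison (`FILTER`), the kind of operations that a single fact triple cannot express.

```python
import re

# Hypothetical relation-to-phrase table (illustrative only; the paper's
# auto-prompter learns this kind of rephrasing instead of using rules).
RELATION_PHRASES = {
    "ex:directed_by": "is directed by",
    "ex:release_year": "has release year",
}
OPERATOR_PHRASES = {
    ">": "greater than", "<": "less than", "=": "equal to",
    ">=": "at least", "<=": "at most",
}

COUNT_RE = re.compile(r"COUNT\s*\(\s*(?:DISTINCT\s+)?(\?\w+)\s*\)", re.IGNORECASE)
FILTER_RE = re.compile(
    r"FILTER\s*\(\s*(\?\w+)\s*(>=|<=|>|<|=)\s*([^\s)]+)\s*\)", re.IGNORECASE
)


def describe_sparql(query: str) -> str:
    """Rephrase a tiny SPARQL subset (one pattern per line) as rough NL."""
    head, _, rest = query.partition("{")
    body = rest.rsplit("}", 1)[0]

    # Aggregation: detect COUNT in the SELECT clause.
    count = COUNT_RE.search(head)
    if count:
        opening = f"count the values of {count.group(1)}"
    else:
        opening = "find " + ", ".join(re.findall(r"\?\w+", head))

    clauses = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        # Comparison: FILTER (?var op value)
        flt = FILTER_RE.match(line)
        if flt:
            var, op, value = flt.groups()
            clauses.append(f"{var} is {OPERATOR_PHRASES[op]} {value}")
            continue
        # Plain triple pattern: subject relation object .
        tokens = line.rstrip(".").split()
        if len(tokens) == 3:
            subj, rel, obj = tokens
            clauses.append(f"{subj} {RELATION_PHRASES.get(rel, rel)} {obj}")
    return opening + " such that " + " and ".join(clauses)


# Hypothetical query with an aggregation and a comparison.
EXAMPLE_QUERY = """SELECT (COUNT(?film) AS ?n) WHERE {
  ?film ex:directed_by ex:some_director .
  ?film ex:release_year ?year .
  FILTER (?year > 2000)
}"""

if __name__ == "__main__":
    print(describe_sparql(EXAMPLE_QUERY))
```

In a pipeline of the kind the abstract describes, a description like this would serve as the NL-side input to a PLM such as T5 or BART, so that the model performs an NL-to-NL transformation rather than reading raw SPARQL; how AutoQGS actually formats that input is not specified here.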



PSG: Prompt-based Sequence Generation for Acronym Extraction

Acronym extraction aims to find acronyms (i.e., short-forms) and their m...

ETC-NLG: End-to-end Topic-Conditioned Natural Language Generation

Plug-and-play language models (PPLMs) enable topic-conditioned natural l...

A Knowledge-enhanced Two-stage Generative Framework for Medical Dialogue Information Extraction

This paper focuses on term-status pair extraction from medical dialogues...

Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey

Dense retrieval (DR) approaches based on powerful pre-trained language m...

Evaluating Prompt-based Question Answering for Object Prediction in the Open Research Knowledge Graph

There have been many recent investigations into prompt-based training of...

Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study

Large generative language models such as GPT-2 are well-known for their ...

Detecting Suicide Risk in Online Counseling Services: A Study in a Low-Resource Language

With the increased awareness of situations of mental crisis and their so...
