Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval

by Tim Hartill, et al.

When provided with sufficient explanatory context, smaller Language Models have been shown to exhibit strong reasoning ability on challenging short-answer question-answering tasks where the questions are unseen in training. We evaluate two methods for further improvement in this setting. Both methods focus on combining rationales generated by a larger Language Model with longer contexts created by a multi-hop dense retrieval system. The first method (RR) involves training a Rationale Ranking model to score both generated rationales and retrieved contexts with respect to relevance and truthfulness. We then use the scores to derive combined contexts from both knowledge sources using several combination strategies. For the second method (RATD), we train a smaller Reasoning model on retrieval-augmented training datasets such that it becomes proficient at utilising relevant information from longer text sequences that may be only partially evidential and frequently contain many irrelevant sentences. Generally, we find that both methods are effective, but that the RATD method is more straightforward to apply and produces the strongest results in the unseen setting on which we focus. Our single best Reasoning model, using only 440 million parameters, materially improves upon strong comparable prior baselines for unseen evaluation datasets (StrategyQA 58.9 → 61.7 acc., CommonsenseQA 63.6 → 72.7 acc., ARC-DA 31.6 → 52.1 F1, IIRC 25.5 → 27.3 F1), and a version utilising our prior knowledge of each type of question in selecting a context combination strategy does even better. Our proposed models also generally outperform direct prompts against much larger models (BLOOM 175B and StableVicuna 13B) in both few-shot chain-of-thought and few-shot answer-only settings.
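The RR method's combination step can be pictured as a simple score-gated merge of the two knowledge sources. The following is a minimal illustrative sketch, not the paper's exact procedure: the function name, the single threshold, and the concatenation fallback are all assumptions introduced here for clarity.

```python
def combine_context(rationale: str, rationale_score: float,
                    retrieved: str, retrieved_score: float,
                    threshold: float = 0.5) -> str:
    """One hypothetical combination strategy: keep each knowledge source
    (LLM-generated rationale, dense-retrieved context) only if its
    Rationale Ranking score clears a threshold, then concatenate the
    survivors. Names and threshold are illustrative assumptions."""
    parts = []
    if rationale_score >= threshold:
        parts.append(rationale)
    if retrieved_score >= threshold:
        parts.append(retrieved)
    # If neither source clears the threshold, fall back to the
    # higher-scoring one so the Reasoning model always gets some context.
    if not parts:
        parts.append(rationale if rationale_score >= retrieved_score
                     else retrieved)
    return " ".join(parts)
```

Under this sketch, a question whose generated rationale scores 0.9 but whose retrieved context scores 0.2 would be answered from the rationale alone; when both score highly, the Reasoning model sees the concatenation of the two.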
