STaR: Bootstrapping Reasoning With Reasoning

03/28/2022
by   Eric Zelikman, et al.
Generating step-by-step "chain-of-thought" rationales improves language model performance on complex reasoning tasks like mathematics or commonsense question-answering. However, inducing language model rationale generation currently requires either constructing massive rationale datasets or sacrificing accuracy by using only few-shot inference. We propose a technique to iteratively leverage a small number of rationale examples and a large dataset without rationales, to bootstrap the ability to perform successively more complex reasoning. This technique, the "Self-Taught Reasoner" (STaR), relies on a simple loop: generate rationales to answer many questions, prompted with a few rationale examples; if the generated answers are wrong, try again to generate a rationale given the correct answer; fine-tune on all the rationales that ultimately yielded correct answers; repeat. We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30× larger state-of-the-art language model on CommonsenseQA. Thus, STaR lets a model improve itself by learning from its own generated reasoning.
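The loop described above can be summarized in code. The following is a minimal sketch of the STaR procedure, assuming a hypothetical model interface (the callables `model_generate`, `model_finetune`, and the helper `extract_answer` are illustrative stand-ins, not the authors' implementation):

```python
# Minimal sketch of the STaR loop: generate rationales, rationalize failures
# with the correct answer as a hint, fine-tune on rationales that led to
# correct answers, and repeat. Model/helper callables are hypothetical.
from typing import Callable, List, Tuple


def star_loop(
    model_generate: Callable[[str], str],                     # prompt -> rationale + answer text
    model_finetune: Callable[[List[Tuple[str, str]]], None],  # fine-tune on (question, rationale+answer) pairs
    extract_answer: Callable[[str], str],                     # pull the final answer out of a generation
    few_shot_prompt: str,                                     # a handful of worked rationale examples
    dataset: List[Tuple[str, str]],                           # (question, gold_answer) pairs, no rationales
    n_iterations: int = 5,
) -> None:
    for _ in range(n_iterations):
        finetune_set: List[Tuple[str, str]] = []
        for question, gold_answer in dataset:
            # 1. Generate a rationale and answer, prompted with a few rationale examples.
            generation = model_generate(few_shot_prompt + question)
            if extract_answer(generation) == gold_answer:
                finetune_set.append((question, generation))
                continue
            # 2. If the answer was wrong, try again with the correct answer
            #    provided as a hint ("rationalization").
            hinted = model_generate(few_shot_prompt + question + f"\n(correct answer: {gold_answer})")
            if extract_answer(hinted) == gold_answer:
                finetune_set.append((question, hinted))
        # 3. Fine-tune on all rationales that ultimately yielded correct answers, then repeat.
        model_finetune(finetune_set)
```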

