Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

08/03/2023
by Zheng Yuan, et al.

Mathematical reasoning is a challenging task for large language models (LLMs), yet the scaling relationship between reasoning performance and LLM capacity remains under-explored. In this paper, we investigate how pre-training loss, supervised data amount, and augmented data amount influence the reasoning performance of a supervised LLM. We find that pre-training loss is a better indicator of a model's performance than its parameter count. We apply supervised fine-tuning (SFT) with different amounts of supervised data and empirically find a log-linear relation between data amount and model performance; better models improve less when the supervised dataset is enlarged. To augment data samples for improving model performance without any human effort, we propose Rejection sampling Fine-Tuning (RFT), which uses supervised models to generate and collect correct reasoning paths as augmented fine-tuning datasets. We find that RFT improves mathematical reasoning performance more when the augmented samples contain more distinct reasoning paths, and that it brings larger improvements for less performant LLMs. Furthermore, combining rejection samples from multiple models pushes LLaMA-7B to an accuracy of 49.3 on GSM8K, significantly outperforming its supervised fine-tuning (SFT) accuracy of 35.9.
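
The RFT step described above is simple to sketch in code. The snippet below is a minimal illustration under stated assumptions, not the authors' released implementation: `sample_reasoning_paths` is a hypothetical stand-in for whatever sampling interface the supervised model sits behind, and final answers are assumed to follow the GSM8K-style "#### <number>" format. The core loop samples several reasoning paths per question at non-zero temperature, rejects paths whose final answer does not match the reference, and deduplicates so the augmented fine-tuning set keeps only distinct correct paths.

```python
import re
from typing import Callable, Dict, List, Optional


def extract_answer(path: str) -> Optional[str]:
    """Pull the final numeric answer from a generated reasoning path.

    Assumes GSM8K-style outputs ending in '#### <number>'; adapt the
    pattern to your own prompt format.
    """
    match = re.search(r"####\s*(-?[\d,\.]+)", path)
    return match.group(1).replace(",", "") if match else None


def collect_rft_data(
    questions: List[Dict[str, str]],  # each item: {"question": ..., "answer": ...}
    sample_reasoning_paths: Callable[[str, int], List[str]],  # hypothetical sampler
    num_samples: int = 16,  # paths sampled per question (temperature > 0)
) -> List[Dict[str, str]]:
    """Collect correct, distinct reasoning paths as augmented SFT data."""
    augmented = []
    for item in questions:
        gold = item["answer"]
        seen = set()
        for path in sample_reasoning_paths(item["question"], num_samples):
            # Rejection step: keep only paths that reach the reference answer.
            if extract_answer(path) != gold:
                continue
            # Keep distinct paths only; whitespace normalization is a crude
            # stand-in for a stricter reasoning-path equivalence check.
            key = " ".join(path.split())
            if key in seen:
                continue
            seen.add(key)
            augmented.append({"question": item["question"], "reasoning": path})
    return augmented
```

In the setting the abstract describes, the resulting pairs are appended to the original supervised data for another round of fine-tuning; combining rejection samples from multiple models amounts to running the same loop once per model and merging the outputs.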

