Revealing the structure of language model capabilities

06/14/2023
by Ryan Burnell, et al.

Building a theoretical understanding of the capabilities of large language models (LLMs) is vital for our ability to predict and explain the behavior of these systems. Here, we investigate the structure of LLM capabilities by extracting latent capabilities from patterns of individual differences across a varied population of LLMs. Using a combination of Bayesian and frequentist factor analysis, we analyzed data from 29 different LLMs across 27 cognitive tasks. We found evidence that LLM capabilities are not monolithic. Instead, they are better explained by three well-delineated factors that represent reasoning, comprehension and core language modeling. Moreover, we found that these three factors can explain a high proportion of the variance in model performance. These results reveal a consistent structure in the capabilities of different LLMs and demonstrate the multifaceted nature of these capabilities. We also found that the three abilities show different relationships to model properties such as model size and instruction tuning. These patterns help refine our understanding of scaling laws and indicate that changes to a model that improve one ability might simultaneously impair others. Based on these findings, we suggest that benchmarks could be streamlined by focusing on tasks that tap into each broad model ability.
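To make the idea of extracting latent capabilities from a models-by-tasks score matrix concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the abstract describes Bayesian and frequentist factor analysis, whereas this sketch swaps in a simpler principal-component extraction on the task correlation matrix. The score matrix is synthetic, generated only to match the 29-model, 27-task shape, and the function name `extract_factors` and all parameter values are illustrative assumptions.

```python
import numpy as np


def extract_factors(scores, n_factors=3):
    """Principal-component factor extraction (a stand-in for full factor analysis).

    scores: array of shape (n_models, n_tasks), one row per model.
    Returns (loadings, explained), where loadings has shape
    (n_tasks, n_factors) and explained holds the proportion of total
    standardized variance captured by each retained factor.
    """
    # Correlate tasks with each other across the population of models.
    corr = np.corrcoef(scores, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_factors]
    top_vals = np.clip(eigvals[order], 0.0, None)
    # Scale eigenvectors so each column holds task-factor correlations.
    loadings = eigvecs[:, order] * np.sqrt(top_vals)
    # The trace of a correlation matrix equals the number of tasks.
    explained = top_vals / corr.shape[0]
    return loadings, explained


# Synthetic data only (an assumption for illustration, not the paper's data):
# 29 "models", 27 "tasks", each task driven by one of three latent abilities.
rng = np.random.default_rng(0)
n_models, n_tasks, n_latent = 29, 27, 3
abilities = rng.normal(size=(n_models, n_latent))
true_loadings = np.zeros((n_tasks, n_latent))
for task in range(n_tasks):
    true_loadings[task, task % n_latent] = 0.9
noise = 0.3 * rng.normal(size=(n_models, n_tasks))
scores = abilities @ true_loadings.T + noise

loadings, explained = extract_factors(scores, n_factors=3)
print("loadings shape:", loadings.shape)
print("variance explained by three factors:", explained.sum())
```

In a real analysis the number of factors would be chosen by model comparison rather than fixed in advance, and the loadings would usually be rotated before interpreting them as abilities.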


