Language Models are Pragmatic Speakers

by Khanh Nguyen, et al.
Princeton University

How do language models "think"? This paper formulates a probabilistic cognitive model called the bounded pragmatic speaker, which can characterize the operation of different variants of language models. In particular, we show that large language models fine-tuned with reinforcement learning from human feedback (Ouyang et al., 2022) implement a model of thought that conceptually resembles a fast-and-slow model (Kahneman, 2011). We discuss the limitations of reinforcement learning from human feedback as a fast-and-slow model of thought and propose directions for extending this framework. Overall, our work demonstrates that viewing language models through the lens of probabilistic cognitive modeling can offer valuable insights for understanding, evaluating, and developing them.




