Solving and Generating NPR Sunday Puzzles with Large Language Models

by Jingmiao Zhao, et al.
Wellesley College

We explore the ability of large language models to solve and generate puzzles from the NPR Sunday Puzzle game show using PUZZLEQA, a dataset comprising 15 years of on-air puzzles. We evaluate four large language models using PUZZLEQA, in both multiple choice and free response formats, and explore two prompt engineering techniques to improve free response performance: chain-of-thought reasoning and prompt summarization. We find that state-of-the-art large language models can solve many PUZZLEQA puzzles: the best model, GPT-3.5, achieves 50.2%. However, in our puzzle generation experiment, we find no evidence that models can generate puzzles: GPT-3.5 generates puzzles with answers that do not conform to the generated rules. Puzzle generation remains a challenging task for future work.




