Large Language Models are Zero-Shot Rankers for Recommender Systems

by Yupeng Hou, et al.

Recently, large language models (LLMs) (e.g., GPT-4) have demonstrated impressive general-purpose task-solving abilities, including the potential to approach recommendation tasks. Along this line of research, this work investigates the capacity of LLMs to act as the ranking model in recommender systems. To conduct our empirical study, we first formalize the recommendation problem as a conditional ranking task, treating sequential interaction histories as conditions and the items retrieved by the candidate generation model as candidates. We adopt a specific prompting approach to solve the ranking task with LLMs: we carefully design the prompting template to include the sequential interaction history, the candidate items, and the ranking instruction. We conduct extensive experiments on two widely used datasets for recommender systems and derive several key findings on the use of LLMs in recommender systems. We show that LLMs have promising zero-shot ranking abilities, and are even competitive with or better than conventional recommendation models on candidates retrieved by multiple candidate generators. We also demonstrate that LLMs struggle to perceive the order of historical interactions and can be affected by biases such as position bias, although these issues can be alleviated via specially designed prompting and bootstrapping strategies. The code to reproduce this work is available at
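The prompting setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the prompt wording, the function names `build_ranking_prompt` and `bootstrap_rank`, the `rank_fn` callable (a stand-in for an LLM call plus output parsing), and the Borda-style score aggregation are all assumptions made for illustration. The abstract only states that the template combines the interaction history, the candidate items, and a ranking instruction, and that bootstrapping helps against position bias.

```python
import random
from collections import defaultdict


def build_ranking_prompt(history, candidates):
    """Combine the three parts named in the abstract: the sequential
    interaction history, the candidate items, and a ranking instruction.
    The exact wording here is an assumption, not the paper's template."""
    history_lines = [f"{i}. {title}" for i, title in enumerate(history, start=1)]
    candidate_lines = [f"- {title}" for title in candidates]
    return (
        "I have interacted with the following items, in chronological order:\n"
        + "\n".join(history_lines)
        + "\n\nCandidate items:\n"
        + "\n".join(candidate_lines)
        + "\n\nRank the candidate items by how likely I am to interact with "
        "each one next. List every candidate exactly once."
    )


def bootstrap_rank(history, candidates, rank_fn, rounds=3, seed=0):
    """Rank several times with the candidates shuffled each round, then merge
    the rankings so that no item benefits from its position in the prompt.

    rank_fn(prompt, candidates) is a hypothetical callable that returns the
    candidates as an ordered list (in practice: call an LLM and parse its
    reply). The Borda-style point scheme below is an assumed aggregation rule.
    """
    rng = random.Random(seed)
    scores = defaultdict(float)
    for _ in range(rounds):
        shuffled = list(candidates)
        rng.shuffle(shuffled)
        ranked = rank_fn(build_ranking_prompt(history, shuffled), shuffled)
        for position, item in enumerate(ranked):
            # Items ranked earlier in a round receive more points.
            scores[item] += len(candidates) - position
    return sorted(candidates, key=lambda item: -scores[item])
```

In use, `rank_fn` would wrap whichever LLM client is available. Keeping it as a parameter lets the shuffling and merging logic be exercised with a deterministic stub before any model is involved.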



PALR: Personalization Aware LLMs for Recommendation

Large language models (LLMs) have recently received significant attentio...

Evaluating ChatGPT as a Recommender System: A Rigorous Approach

Recent popularity surrounds large AI language models due to their impres...

Recommendation as Instruction Following: A Large Language Model Empowered Recommendation Approach

In the past decades, recommender systems have attracted much attention i...

AutoDebias: Learning to Debias for Recommendation

Recommender systems rely on user behavior data like ratings and clicks t...

Towards Personalized Prompt-Model Retrieval for Generative Recommendation

Recommender Systems are built to retrieve relevant items to satisfy user...

Investigating the Robustness of Sequential Recommender Systems Against Training Data Perturbations: an Empirical Study

Sequential Recommender Systems (SRSs) have been widely used to model use...

How Can Recommender Systems Benefit from Large Language Models: A Survey

Recommender systems (RS) play important roles to match users' informatio...
