M6-Rec: Generative Pretrained Language Models are Open-Ended Recommender Systems

by   Zeyu Cui, et al.

Industrial recommender systems have been growing increasingly complex, may involve diverse domains such as e-commerce products and user-generated contents, and can comprise a myriad of tasks such as retrieval, ranking, explanation generation, and even AI-assisted content production. The mainstream approach so far is to develop individual algorithms for each domain and each task. In this paper, we explore the possibility of developing a unified foundation model to support open-ended domains and tasks in an industrial recommender system, which may reduce the demand on downstream settings' data and can minimize the carbon footprint by avoiding training a separate model from scratch for every task. Deriving a unified foundation is challenging due to (i) the potentially unlimited set of downstream domains and tasks, and (ii) the real-world systems' emphasis on computational efficiency. We thus build our foundation upon M6, an existing large-scale industrial pretrained language model similar to GPT-3 and T5, and leverage M6's pretrained ability for sample-efficient downstream adaptation, by representing user behavior data as plain texts and converting the tasks to either language understanding or generation. To deal with a tight hardware budget, we propose an improved version of prompt tuning that outperforms fine-tuning with negligible 1% task-specific parameters, and employ techniques such as late interaction, early exiting, parameter sharing, and pruning to further reduce the inference time and the model size. We demonstrate the foundation model's versatility on a wide range of tasks such as retrieval, ranking, zero-shot recommendation, explanation generation, personalized content creation, and conversational recommendation, and manage to deploy it on both cloud servers and mobile devices.


page 1

page 2

page 3

page 4


Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning

Recent studies have proposed unified user modeling frameworks that lever...

Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination

Large-scale pretrained language models have made significant advances in...

ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation

With large language models (LLMs) achieving remarkable breakthroughs in ...

Encoder-Agnostic Adaptation for Conditional Language Generation

Large pretrained language models have changed the way researchers approa...

On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence

Large pre-trained models, also known as foundation models (FMs), are tra...

AnyPredict: Foundation Model for Tabular Prediction

Foundation models are pre-trained on massive data to perform well across...

MV-HAN: A Hybrid Attentive Networks based Multi-View Learning Model for Large-scale Contents Recommendation

Industrial recommender systems usually employ multi-source data to impro...

Please sign up or login with your details

Forgot password? Click here to reset