Status-quo policy gradient in Multi-Agent Reinforcement Learning

11/23/2021
by   Pinkesh Badjatiya, et al.

Individual rationality, which involves maximizing expected individual returns, does not always lead to high-utility individual or group outcomes in multi-agent problems. For instance, in multi-agent social dilemmas, Reinforcement Learning (RL) agents trained to maximize individual rewards converge to a low-utility, mutually harmful equilibrium. In contrast, humans evolve useful strategies in such social dilemmas. Inspired by ideas from human psychology that attribute this behavior to the status-quo bias, we present a status-quo loss (SQLoss) and a corresponding policy gradient algorithm that incorporates this bias into an RL agent. We demonstrate that agents trained with SQLoss learn high-utility policies in several social dilemma matrix games (Prisoner's Dilemma, the matrix variant of Stag Hunt, and the Chicken Game). We show that SQLoss outperforms existing state-of-the-art methods at obtaining high-utility policies in visual-input non-matrix games (Coin Game and the visual-input variant of Stag Hunt) using pre-trained cooperation and defection oracles. Finally, we show that SQLoss extends to a 4-agent setting by demonstrating the emergence of cooperative behavior in the popular Braess' paradox.
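
To make the social-dilemma claim concrete, the following minimal sketch checks a one-shot Prisoner's Dilemma by brute force. The payoff values and the helper names (`PAYOFFS`, `best_response_row`, `best_response_col`, `is_nash`) are illustrative assumptions, not taken from the paper, and the code does not implement SQLoss; it only shows why purely self-interested best responses end in mutual defection even though mutual cooperation pays both players more.

```python
# Illustrative one-shot Prisoner's Dilemma. The payoff numbers are assumed
# for this sketch and are NOT taken from the paper.
C, D = 0, 1  # cooperate, defect

# PAYOFFS[(row_action, col_action)] = (row_payoff, col_payoff)
PAYOFFS = {
    (C, C): (-1, -1),
    (C, D): (-3, 0),
    (D, C): (0, -3),
    (D, D): (-2, -2),
}


def best_response_row(col_action):
    """Row player's payoff-maximizing action against a fixed column action."""
    return max((C, D), key=lambda a: PAYOFFS[(a, col_action)][0])


def best_response_col(row_action):
    """Column player's payoff-maximizing action against a fixed row action."""
    return max((C, D), key=lambda a: PAYOFFS[(row_action, a)][1])


def is_nash(row_action, col_action):
    """True if neither player can gain by deviating unilaterally."""
    return (best_response_row(col_action) == row_action
            and best_response_col(row_action) == col_action)


if __name__ == "__main__":
    for r in (C, D):
        for c in (C, D):
            print((r, c), PAYOFFS[(r, c)], "Nash" if is_nash(r, c) else "")
```

Under these assumed payoffs, defection is each player's best response to either opponent action, so mutual defection is the only equilibrium even though both players would be better off cooperating.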

research
01/15/2020

Inducing Cooperative behaviour in Sequential-Social dilemmas through Multi-Agent Reinforcement Learning using Status-Quo Loss

In social dilemma situations, individual rationality leads to sub-optima...
research
06/14/2023

Mediated Multi-Agent Reinforcement Learning

The majority of Multi-Agent Reinforcement Learning (MARL) literature equ...
research
01/15/2020

Inducing Cooperation in Multi-Agent Games Through Status-Quo Loss

Social dilemma situations bring out the conflict between individual and ...
research
09/13/2017

Learning with Opponent-Learning Awareness

Multi-agent settings are quickly gathering importance in machine learnin...
research
08/22/2022

Get It in Writing: Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL

Multi-agent reinforcement learning (MARL) is a powerful tool for trainin...
research
10/20/2021

Adversarial Socialbot Learning via Multi-Agent Deep Hierarchical Reinforcement Learning

Socialbots are software-driven user accounts on social platforms, acting...
research
12/29/2020

Multi-Principal Assistance Games: Definition and Collegial Mechanisms

We introduce the concept of a multi-principal assistance game (MPAG), an...
