Option-critic in cooperative multi-agent systems

11/28/2019
by Jhelum Chakravorty et al.

In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems using the options framework (Sutton et al., 1999) and provide a model-free algorithm for this problem. First, we address the planning problem for the decentralized POMDP that models the multi-agent system by introducing a common information approach: we use common beliefs and broadcasting to solve an equivalent centralized POMDP. Then, we propose the Distributed Option Critic (DOC) algorithm, motivated by the work of Bacon et al. (2017) in the single-agent setting. Our approach uses centralized option evaluation and decentralized intra-option improvement. We theoretically analyze the asymptotic convergence of DOC and validate its performance in grid-world environments, where we implement DOC using a deep neural network. Our experiments show that DOC performs competitively with state-of-the-art algorithms and that it scales as the number of agents increases.
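For readers unfamiliar with the options framework, the following is a minimal single-agent sketch of what an option is: an intra-option policy paired with a termination rule, executed in call-and-return fashion. It is not the DOC algorithm and involves no learning or common beliefs. The corridor environment, the `Option` class, and the names `step`, `run_option`, `go_right`, and `go_left` are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Option:
    """An option: an intra-option policy plus a termination rule (illustrative only)."""
    name: str
    policy: Callable[[int], int]         # state -> action
    termination: Callable[[int], float]  # state -> probability of terminating


def step(state: int, action: int) -> int:
    """Hypothetical 1-D corridor with cells 0..4; action is -1 or +1."""
    return max(0, min(4, state + action))


def run_option(state: int, option: Option, rng: random.Random,
               max_steps: int = 20) -> List[int]:
    """Follow the option's policy until its termination rule fires."""
    trajectory = [state]
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        trajectory.append(state)
        # Sample termination in the newly reached state.
        if rng.random() < option.termination(state):
            break
    return trajectory


# Two hand-written options: walk to the right end, walk to the left end.
go_right = Option("go_right", policy=lambda s: +1,
                  termination=lambda s: 1.0 if s == 4 else 0.0)
go_left = Option("go_left", policy=lambda s: -1,
                 termination=lambda s: 1.0 if s == 0 else 0.0)

rng = random.Random(0)
path_right = run_option(0, go_right, rng)              # visits 0, 1, 2, 3, 4
path_left = run_option(path_right[-1], go_left, rng)   # visits 4, 3, 2, 1, 0
```

In an option-critic method, the hand-written `policy` and `termination` functions would instead be parameterized and learned; in the multi-agent setting described above, each agent would hold its own options while option evaluation is centralized.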


