Residual Q-Networks for Value Function Factorizing in Multi-Agent Reinforcement Learning

by Rafael Pina et al.

Multi-Agent Reinforcement Learning (MARL) is useful in many problems that require the cooperation and coordination of multiple agents. Learning optimal policies using reinforcement learning in a multi-agent setting can be very difficult as the number of agents increases. Recent solutions such as Value Decomposition Networks (VDN), QMIX, QTRAN and QPLEX adhere to the centralized training and decentralized execution scheme and perform factorization of the joint action-value function. However, these methods still struggle as environmental complexity increases, and at times fail to converge in a stable manner. We propose a novel concept of Residual Q-Networks (RQNs) for MARL, which learns to transform the individual Q-value trajectories in a way that preserves the Individual-Global-Max (IGM) criterion, but is more robust in factorizing action-value functions. The RQN acts as an auxiliary network that accelerates convergence and becomes obsolete as the agents reach the training objectives. The performance of the proposed method is compared against several state-of-the-art techniques, such as QPLEX, QMIX, QTRAN and VDN, in a range of multi-agent cooperative tasks. The results illustrate that the proposed method generally converges faster, with increased stability, and shows robust performance in a wider family of environments. The improvements are more prominent in environments with severe penalties for non-cooperative behaviours, and especially in the absence of complete state information during training.
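The factorization and IGM ideas referenced in the abstract can be illustrated with a minimal sketch. The example below is not the paper's method; it shows VDN-style additive mixing for two hypothetical agents with small discrete action spaces, and checks that the IGM property (the joint greedy action equals the tuple of individually greedy actions) holds by construction under summation:

```python
import numpy as np

# Hypothetical per-agent Q-values Q_i(a_i): two agents, three actions each.
# These would normally come from each agent's individual Q-network.
q1 = np.array([0.2, 1.5, 0.3])
q2 = np.array([0.7, 0.1, 2.0])

# VDN factorizes the joint action-value as a sum of individual Q-values:
#   Q_tot(a1, a2) = Q_1(a1) + Q_2(a2)
q_tot = q1[:, None] + q2[None, :]  # shape (3, 3), one entry per joint action

# IGM (Individual-Global-Max) criterion: the argmax of the joint value
# must coincide with the per-agent argmaxes, so decentralized greedy
# execution recovers the centralized greedy joint action.
joint_greedy = np.unravel_index(np.argmax(q_tot), q_tot.shape)
individual_greedy = (int(np.argmax(q1)), int(np.argmax(q2)))
print(joint_greedy == individual_greedy)
```

Additive mixing satisfies IGM trivially, which is why VDN is the simplest baseline; QMIX relaxes it to any monotonic mixing of individual Q-values, and the paper's RQNs target cases where such factorizations become unstable.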




QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning

We explore value-based solutions for multi-agent reinforcement learning ...

Disentangling Successor Features for Coordination in Multi-agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) is a promising framework for s...

UNMAS: Multi-Agent Reinforcement Learning for Unshaped Cooperative Scenarios

Multi-agent reinforcement learning methods such as VDN, QMIX, and QTRAN ...

Local Advantage Networks for Cooperative Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) enables us to create adaptive ...

Rethinking Individual Global Max in Cooperative Multi-Agent Reinforcement Learning

In cooperative multi-agent reinforcement learning, centralized training ...

Multi-agent Reinforcement Learning with Sparse Interactions by Negotiation and Knowledge Transfer

Reinforcement learning has significant applications for multi-agent syst...

Value Function Factorisation with Hypergraph Convolution for Cooperative Multi-agent Reinforcement Learning

Cooperation between agents in a multi-agent system (MAS) has become a ho...
