A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

by Unnat Jain, et al.

Autonomous agents must learn to collaborate. It is not scalable to develop a new centralized agent every time a task's difficulty outpaces a single agent's abilities. While multi-agent collaboration research has flourished in gridworld-like environments, relatively little work has considered visually rich domains. Addressing this, we introduce the novel task FurnMove in which agents work together to move a piece of furniture through a living room to a goal. Unlike existing tasks, FurnMove requires agents to coordinate at every timestep. We identify two challenges when training agents to complete FurnMove: existing decentralized action sampling procedures do not permit expressive joint action policies and, in tasks requiring close coordination, the number of failed actions dominates successful actions. To confront these challenges we introduce SYNC-policies (synchronize your actions coherently) and CORDIAL (coordination loss). Using SYNC-policies and CORDIAL, our agents achieve a 58% completion rate on FurnMove, an impressive absolute gain of 25 percentage points over competitive decentralized baselines. Our dataset, code, and pretrained models are available at https://unnat.github.io/cordial-sync.
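
The first challenge, that independently sampled marginal policies cannot express every joint action policy, can be illustrated with a toy example. The sketch below is not taken from the paper's code: the two-agent, two-action setup and the helper names (`product_of_marginals`, `mixture_of_products`, `agreement_probability`) are assumptions made for illustration. It compares a product of marginals with a mixture of such products, which is one way to represent a correlated joint policy; the abstract does not specify how SYNC-policies are constructed, so this should not be read as the authors' method.

```python
from itertools import product


def product_of_marginals(p1, p2):
    """Joint distribution when each agent samples independently from its marginal."""
    return {(a, b): p1[a] * p2[b]
            for a, b in product(range(len(p1)), range(len(p2)))}


def mixture_of_products(weights, components):
    """Joint distribution formed as a weighted sum of products of marginals.

    Hypothetical helper: both agents are assumed to agree on which component
    to use (for example via shared randomness) before sampling independently.
    """
    joint = {}
    for w, (p1, p2) in zip(weights, components):
        for key, prob in product_of_marginals(p1, p2).items():
            joint[key] = joint.get(key, 0.0) + w * prob
    return joint


def agreement_probability(joint):
    """Probability that both agents pick the same action index."""
    return sum(prob for (a, b), prob in joint.items() if a == b)


# Toy task: the agents succeed only if they choose the same action.
uniform = [0.5, 0.5]
independent = product_of_marginals(uniform, uniform)

# Two deterministic components, each chosen with probability 0.5.
coordinated = mixture_of_products(
    [0.5, 0.5],
    [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])],
)

print(agreement_probability(independent))  # 0.5
print(agreement_probability(coordinated))  # 1.0
```

In both cases each agent's marginal over actions is uniform, yet only the mixture places all probability on matching actions. With independent sampling from uniform marginals, half of the joint actions are mismatched, which mirrors the kind of failed actions the abstract describes for tasks requiring close coordination.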



Scalable, Decentralized Multi-Agent Reinforcement Learning Methods Inspired by Stigmergy and Ant Colonies

Bolstering multi-agent learning algorithms to tackle complex coordinatio...

Entropy Enhanced Multi-Agent Coordination Based on Hierarchical Graph Learning for Continuous Action Space

In most existing studies on large-scale multi-agent coordination, the co...

Intrinsically-Motivated Goal-Conditioned Reinforcement Learning in Multi-Agent Environments

How can a population of reinforcement learning agents autonomously learn...

PRIMAL2: Pathfinding via Reinforcement and Imitation Multi-Agent Learning – Lifelong

Multi-agent path finding (MAPF) is an indispensable component of large-s...

Cooperation without Coordination: Hierarchical Predictive Planning for Decentralized Multiagent Navigation

Decentralized multiagent planning raises many challenges, such as adapti...

Learning Multi-agent Implicit Communication Through Actions: A Case Study in Contract Bridge, a Collaborative Imperfect-Information Game

In situations where explicit communication is limited, a human collabora...
