Value Variance Minimization for Learning Approximate Equilibrium in Aggregation Systems

03/16/2020
by Tanvi Verma, et al.

Aggregation systems have been extremely successful at matching resources (e.g., taxis, food, bikes, shopping items) to customer demand. In such systems, a central entity (e.g., Uber, Food Panda, Ofo) aggregates supply (e.g., drivers, delivery personnel) and matches demand to supply on a continuous basis, making sequential decisions. Because the central entity aims to maximize its own profit, the interests of individual suppliers are sacrificed, which creates an incentive for individuals to leave the system. In this paper, we consider the problem of learning approximate equilibrium solutions (win-win solutions) in aggregation systems, so that individuals have an incentive to remain in the system. Unfortunately, such systems have thousands of agents and must account for demand uncertainty, and the underlying problem is a (Partially Observable) Stochastic Game. Given the significant complexity of learning or planning in a stochastic game, we make three key contributions: (a) to exploit the infinitesimally small contribution of each agent and the anonymity of interactions (rewards and transitions depend on agent counts), we represent the problem as a Multi-Agent Reinforcement Learning (MARL) problem that builds on insights from the non-atomic congestion game model; (b) we provide a novel variance reduction mechanism, which exploits the infinitesimally small contribution of each agent, for moving the joint solution towards a Nash Equilibrium; and (c) we provide detailed results on three different domains to demonstrate the utility of our approach in comparison to state-of-the-art methods.
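To make the link between anonymity and value variance concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a hypothetical one-shot toy game in which each agent picks a zone and the zone's demand is split equally among the agents there, so an agent's reward depends only on agent counts. The zone names, demand figures, and the helpers `agent_rewards` and `value_variance` are all invented for illustration. In this toy setting, equal payoffs across the zones in use is a necessary condition for equilibrium, so a lower variance of agent values indicates a joint choice closer to one.

```python
from collections import Counter
from statistics import pvariance


def agent_rewards(choices, demand):
    """Reward of each agent when a zone's demand is shared equally.

    Anonymity: the reward depends only on how many agents chose a zone,
    not on which agents they are. (Hypothetical reward model.)
    """
    counts = Counter(choices)
    return [demand[zone] / counts[zone] for zone in choices]


def value_variance(choices, demand):
    """Population variance of the agents' rewards under a joint choice."""
    return pvariance(agent_rewards(choices, demand))


# Hypothetical demand per zone and two joint choices of 100 agents.
demand = {"A": 60.0, "B": 30.0, "C": 10.0}
skewed = ["A"] * 80 + ["B"] * 10 + ["C"] * 10    # zone A is overcrowded
balanced = ["A"] * 60 + ["B"] * 30 + ["C"] * 10  # supply mirrors demand

print("skewed variance:  ", value_variance(skewed, demand))
print("balanced variance:", value_variance(balanced, demand))
```

In the balanced allocation every agent earns the same amount, so no one gains by switching zones and the variance is zero. In the skewed allocation, agents in zone A earn less than those in zone B, so the variance is positive. Note that zero variance alone is not sufficient: if all agents crowd into one zone, payoffs are equal but any agent would gain by moving to an unused zone.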


research
12/27/2022

Learning Individual Policies in Large Multi-agent Systems through Local Variance Minimization

In multi-agent systems with large number of agents, typically the contri...
research
03/27/2018

Entropy Controlled Non-Stationarity for Improving Performance of Independent Learners in Anonymous MARL Settings

With the advent of sequential matching (of supply and demand) systems (u...
research
05/27/2023

Reinforcement Learning With Reward Machines in Stochastic Games

We investigate multi-agent reinforcement learning for stochastic games w...
research
05/30/2022

GLDQN: Explicitly Parameterized Quantile Reinforcement Learning for Waste Reduction

We study the problem of restocking a grocery store's inventory with peri...
research
01/30/2019

Coordinating the Crowd: Inducing Desirable Equilibria in Non-Cooperative Systems

Many real-world systems such as taxi systems, traffic networks and smart...
research
03/02/2019

A Cooperative Multi-Agent Reinforcement Learning Framework for Resource Balancing in Complex Logistics Network

Resource balancing within complex transportation networks is one of the ...
research
10/16/2012

Toward Large-Scale Agent Guidance in an Urban Taxi Service

Empty taxi cruising represents a wastage of resources in the context of ...
