Teal: Learning-Accelerated Optimization of Traffic Engineering

by Zhiying Xu, et al.

In the last decade, global cloud wide-area networks (WANs) have grown 10× in size due to the deployment of new network sites and datacenters, making it challenging for commercial optimization engines to solve the network traffic engineering (TE) problem within the temporal budget of a few minutes. In this work, we show that carefully designed deep learning models are key to accelerating the running time of intra-WAN TE systems for large deployments, since deep learning is massively parallel and benefits from the wealth of historical traffic allocation data from production WANs. However, off-the-shelf deep learning methods fail to perform well on the TE task since they ignore the effects of network connectivity on flow allocations. They also face a tractability challenge posed by the large problem scale of TE optimization. Moreover, neural networks do not have mechanisms to readily enforce hard constraints on model outputs (e.g., link capacity constraints). We tackle these challenges by designing a deep learning-based TE system, Teal. First, Teal leverages graph neural networks (GNNs) to faithfully capture connectivity and model network flows. Second, Teal devises a multi-agent reinforcement learning (RL) algorithm to process individual demands independently in parallel to lower the problem scale. Finally, Teal reduces link capacity violations and improves solution quality using the alternating direction method of multipliers (ADMM). We evaluate Teal on traffic matrices of a global commercial cloud provider and find that Teal computes near-optimal traffic allocations with a 59× speedup over state-of-the-art TE systems on a WAN topology of over 1,500 nodes.
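
To make the link capacity constraint mentioned in the abstract concrete, the following is a minimal sketch of a path-based TE allocation check. It is not Teal's implementation: the three-node topology, the demand, the candidate paths, the split fractions, and the function names `link_loads` and `capacity_violations` are all illustrative assumptions. It only shows how per-demand split ratios turn into link loads, and how an allocation can oversubscribe a link.

```python
# Toy path-based TE instance. Every name and number here is an
# illustrative assumption, not taken from the Teal paper.

def link_loads(demands, paths, splits):
    """Sum the traffic that each link carries under a given allocation.

    demands: {demand_id: volume}
    paths:   {demand_id: [path, ...]}, each path a list of (u, v) links
    splits:  {demand_id: [fraction of the demand sent on each path]}
    """
    loads = {}
    for demand_id, volume in demands.items():
        for path, fraction in zip(paths[demand_id], splits[demand_id]):
            for link in path:
                loads[link] = loads.get(link, 0.0) + volume * fraction
    return loads


def capacity_violations(loads, capacity):
    """Return the overload (load minus capacity) of each oversubscribed link."""
    return {
        link: load - capacity[link]
        for link, load in loads.items()
        if load > capacity[link]
    }


# Hypothetical topology: a direct link A-C and a two-hop detour via B.
capacity = {("A", "B"): 10.0, ("B", "C"): 10.0, ("A", "C"): 5.0}
demands = {"A->C": 12.0}
paths = {"A->C": [[("A", "C")], [("A", "B"), ("B", "C")]]}

# An even split, such as an unconstrained model output might produce.
splits = {"A->C": [0.5, 0.5]}

loads = link_loads(demands, paths, splits)
violations = capacity_violations(loads, capacity)
print(loads)
print(violations)
```

In this toy instance the even split overloads the direct link, while shifting more of the demand onto the detour would remove the violation. Correcting allocations of this kind is the role the abstract assigns to ADMM.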



CFR-RL: Traffic Engineering with Reinforcement Learning in SDN

Traditional Traffic Engineering (TE) solutions can achieve the optimal o...

Scaling Graph-based Deep Learning models to larger networks

Graph Neural Networks (GNN) have shown a strong potential to be integrat...

MAGNNETO: A Graph Neural Network-based Multi-Agent system for Traffic Engineering

Current trends in networking propose the use of Machine Learning (ML) fo...

The World as a Graph: Improving El Niño Forecasts with Graph Neural Networks

Deep learning-based models have recently outperformed state-of-the-art s...

Graph Neural Modeling of Network Flows

Network flow problems, which involve distributing traffic over a network...

Scalable Traffic Signal Controls using Fog-Cloud Based Multiagent Reinforcement Learning

Optimizing traffic signal control (TSC) at intersections continues to po...

Structured Nonnegative Matrix Factorization for Traffic Flow Estimation of Large Cloud Networks

Network traffic matrix estimation is an ill-posed linear inverse problem...
