Distributed Graph Neural Network Training: A Survey

by   Yingxia Shao, et al.

Graph neural networks (GNNs) are a type of deep learning models that learning over graphs, and have been successfully applied in many domains. Despite the effectiveness of GNNs, it is still challenging for GNNs to efficiently scale to large graphs. As a remedy, distributed computing becomes a promising solution of training large-scale GNNs, since it is able to provide abundant computing resources. However, the dependency of graph structure increases the difficulty of achieving high-efficiency distributed GNN training, which suffers from the massive communication and workload imbalance. In recent years, many efforts have been made on distributed GNN training, and an array of training algorithms and systems have been proposed. Yet, there is a lack of systematic review on the optimization techniques from graph processing to distributed execution. In this survey, we analyze three major challenges in distributed GNN training that are massive feature communication, the loss of model accuracy and workload imbalance. Then we introduce a new taxonomy for the optimization techniques in distributed GNN training that address the above challenges. The new taxonomy classifies existing techniques into four categories that are GNN data partition, GNN batch generation, GNN execution model, and GNN communication protocol.We carefully discuss the techniques in each category. In the end, we summarize existing distributed GNN systems for multi-GPUs, GPU-clusters and CPU-clusters, respectively, and give a discussion about the future direction on scalable GNNs.


A Comprehensive Survey on Distributed Training of Graph Neural Networks

Graph neural networks (GNNs) have been demonstrated to be a powerful alg...

Staleness-Alleviated Distributed GNN Training via Online Dynamic-Embedding Prediction

Despite the recent success of Graph Neural Networks (GNNs), it remains c...

An Experimental Comparison of Partitioning Strategies for Distributed Graph Neural Network Training

Recently, graph neural networks (GNNs) have gained much attention as a g...

The Evolution of Distributed Systems for Graph Neural Networks and their Origin in Graph Processing and Deep Learning: A Survey

Graph Neural Networks (GNNs) are an emerging research field. This specia...

Characterizing and Understanding Distributed GNN Training on GPUs

Graph neural network (GNN) has been demonstrated to be a powerful model ...

Communication-Efficient Distributed Deep Learning: Survey, Evaluation, and Challenges

In recent years, distributed deep learning techniques are widely deploye...

Please sign up or login with your details

Forgot password? Click here to reset