Communication-Free Distributed GNN Training with Vertex Cut

08/06/2023
by Kaidi Cao, et al.

Training Graph Neural Networks (GNNs) on real-world graphs with billions of nodes and edges is challenging, primarily because of the substantial memory needed to store the graph and its intermediate node and edge features; speeding up training is therefore a pressing need. A common approach is to divide the graph into many smaller subgraphs, which are then distributed across multiple GPUs in one or more machines and processed in parallel. However, existing distributed methods require frequent and substantial cross-GPU communication, leading to significant time overhead and progressively diminishing scalability. Here, we introduce CoFree-GNN, a novel distributed GNN training framework that significantly speeds up training by eliminating cross-GPU communication. The framework uses Vertex Cut partitioning: rather than partitioning the graph by cutting the edges between partitions, Vertex Cut partitions the edges and duplicates the node information to preserve the graph structure. To maintain high model accuracy, the framework incorporates a reweighting mechanism that corrects for the distorted graph distribution introduced by the duplicated nodes. We also propose a modified DropEdge technique to further speed up training. Through an extensive set of experiments on real-world networks, we demonstrate that CoFree-GNN speeds up GNN training by up to 10 times over existing state-of-the-art approaches.
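To make these ideas concrete, here is a minimal Python sketch of the three ingredients the abstract names: vertex-cut edge partitioning with node duplication, an inverse-replication reweighting of duplicated nodes, and DropEdge-style edge sampling. The function names, the hash-based edge assignment rule, and the 1/replication-count weights are illustrative assumptions for this sketch, not CoFree-GNN's actual implementation.

```python
from collections import defaultdict
import random

def vertex_cut_partition(edges, num_parts):
    """Assign each edge to exactly one partition and duplicate both
    endpoint nodes into every partition that uses them (vertex cut)."""
    part_edges = defaultdict(list)  # partition id -> edges it owns
    part_nodes = defaultdict(set)   # partition id -> node copies it holds
    for u, v in edges:
        p = hash((u, v)) % num_parts  # placeholder assignment rule
        part_edges[p].append((u, v))
        part_nodes[p].update((u, v))
    return part_edges, part_nodes

def replication_weights(part_nodes):
    """Weight each node by 1 / (number of partitions holding a copy), an
    assumed correction so duplicated nodes are not over-counted in the loss."""
    counts = defaultdict(int)
    for nodes in part_nodes.values():
        for n in nodes:
            counts[n] += 1
    return {n: 1.0 / c for n, c in counts.items()}

def drop_edges(edges, drop_rate, rng=random):
    """DropEdge-style sampling: keep each edge with prob 1 - drop_rate."""
    return [e for e in edges if rng.random() >= drop_rate]

# Toy usage: a 5-node ring plus a chord, split across 2 workers.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]
part_edges, part_nodes = vertex_cut_partition(edges, num_parts=2)
weights = replication_weights(part_nodes)
# Each worker would then train on its own subgraph with no cross-GPU
# traffic, scaling per-node loss terms by weights[node] and resampling
# its local edges with drop_edges(...) each epoch.
```

In the full system, each partition would also carry feature copies for its duplicated nodes, which is the memory-for-communication trade-off the abstract describes: every worker can run its forward and backward passes on a self-contained subgraph without waiting on its neighbors.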


