Can Transformer and GNN Help Each Other?

by Peiyan Zhang, et al.

Although the Transformer has achieved great success in natural language processing and computer vision, it has difficulty generalizing to medium- and large-scale graph data for two important reasons: (i) high complexity, and (ii) failure to capture the complex and entangled structural information. In graph representation learning, Graph Neural Networks (GNNs) can fuse the graph structure and node attributes but have limited receptive fields. We therefore ask whether Transformers and GNNs can be combined to help each other. In this paper, we propose a new model named TransGNN, in which Transformer layers and GNN layers are used alternately to improve each other. Specifically, to expand the receptive field and disentangle information aggregation from edges, we propose using the Transformer to aggregate information from more relevant nodes, improving the message passing of GNNs. Besides, to capture the graph structure information, we utilize positional encoding and use the GNN layer to fuse the structure into node attributes, which improves the Transformer on graph data. We also propose sampling the most relevant nodes for the Transformer, together with two efficient sample-update strategies to lower the complexity. Finally, we theoretically prove that TransGNN is more expressive than GNNs, at the cost of only extra linear complexity. Experiments on eight datasets corroborate the effectiveness of TransGNN on node and graph classification tasks.
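The alternating design described above can be illustrated with a minimal sketch: a global self-attention layer (every node attends to every other node, expanding the receptive field) followed by a message-passing layer (fusing graph structure into node attributes). This is an assumed, simplified illustration in NumPy, not the authors' implementation; the function names and parameter shapes are hypothetical, and TransGNN's positional encoding and node-sampling strategies are omitted.

```python
import numpy as np

def gnn_layer(X, A, W):
    # Message passing: mean-aggregate neighbor features (with self-loops),
    # then apply a linear projection and a nonlinearity.
    A_hat = A + np.eye(A.shape[0])
    deg = A_hat.sum(axis=1, keepdims=True)
    return np.tanh((A_hat / deg) @ X @ W)

def attention_layer(X, Wq, Wk, Wv):
    # Global self-attention: aggregation is decoupled from graph edges,
    # so each node can attend to any other node.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ V

def trans_gnn_block(X, A, params):
    # One alternating block: attention first (long-range mixing),
    # then message passing (structure-aware fusion), as in the abstract.
    X = attention_layer(X, params["Wq"], params["Wk"], params["Wv"])
    return gnn_layer(X, A, params["W"])

rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.normal(size=(n, d))                      # node attributes
A = (rng.random((n, n)) < 0.4).astype(float)     # random adjacency
A = np.maximum(A, A.T)                           # make it undirected
params = {k: 0.1 * rng.normal(size=(d, d)) for k in ("Wq", "Wk", "Wv", "W")}

out = trans_gnn_block(X, A, params)
print(out.shape)  # (5, 8)
```

Stacking several such blocks alternates the two layer types, so the GNN's structural bias and the Transformer's global receptive field refine each other's representations layer by layer.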
