MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning

by Cameron Diao et al.

Molecular representation learning is crucial for molecular property prediction, where graph neural networks (GNNs) serve as an effective solution due to their structure-modeling capabilities. Since labeled data is often scarce and expensive to obtain, generalizing across the extensive molecular space is a great challenge for GNNs. Recently, the training paradigm of "pre-train, fine-tune" has been leveraged to improve the generalization capabilities of GNNs. It uses self-supervised information to pre-train the GNN, and then performs fine-tuning to optimize the downstream task with just a few labels. However, pre-training does not always yield statistically significant improvement, especially for self-supervised learning with random structural masking. In fact, molecular structure is characterized by motif subgraphs, which occur frequently and influence molecular properties. To leverage these task-related motifs, we propose a novel paradigm of "pre-train, prompt, fine-tune" for molecular representation learning, named molecule continuous prompt tuning (MolCPT). MolCPT defines a motif prompting function that uses the pre-trained model to project the standalone input into an expressive prompt. The prompt effectively augments the molecular graph with meaningful motifs in the continuous representation space; this provides more structural patterns to aid the downstream classifier in identifying molecular properties. Extensive experiments on several benchmark datasets show that MolCPT efficiently generalizes pre-trained GNNs for molecular property prediction, with or without a few fine-tuning steps.
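The "pre-train, prompt, fine-tune" idea described above can be sketched in a few lines. The following is a minimal, purely illustrative example (not the authors' implementation): a frozen stand-in for a pre-trained encoder produces a molecule embedding, and learnable continuous prompt vectors, one per motif detected in the molecule, are pooled and fused with that embedding before a downstream classifier would consume it. All names (`pretrained_encoder`, `motif_prompts`, the toy molecule format) are hypothetical.

```python
import numpy as np

EMB_DIM = 8
rng = np.random.default_rng(0)

def pretrained_encoder(molecule_graph):
    """Stand-in for a frozen pre-trained GNN: a deterministic toy embedding
    built from atom counts (hypothetical, for illustration only)."""
    h = np.zeros(EMB_DIM)
    for atom in molecule_graph["atoms"]:
        h[hash(atom) % EMB_DIM] += 1.0
    return h / max(len(molecule_graph["atoms"]), 1)

# Continuous prompts: one trainable vector per known motif.
# In practice these would be optimized during prompt tuning.
motif_prompts = {
    "benzene_ring": rng.normal(size=EMB_DIM),
    "carbonyl": rng.normal(size=EMB_DIM),
}

def prompted_embedding(molecule_graph):
    """Fuse the frozen molecule embedding with pooled motif prompts,
    augmenting the representation in continuous space."""
    h = pretrained_encoder(molecule_graph)
    present = [motif_prompts[m] for m in molecule_graph["motifs"]
               if m in motif_prompts]
    if present:
        h = h + np.mean(present, axis=0)
    return h

mol = {"atoms": ["C", "C", "O"], "motifs": ["carbonyl"]}
z = prompted_embedding(mol)
print(z.shape)  # (8,)
```

Because only the prompt vectors (and optionally a lightweight classifier head) are trained while the encoder stays frozen, this style of tuning touches far fewer parameters than full fine-tuning, which is what makes the "with or without a few fine-tuning steps" regime plausible.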


