Fragment-based molecular generative model with high generalization ability and synthetic accessibility

by Seonghwan Seo, et al.

Deep generative models are attracting great attention for designing molecules with desired properties. Most existing models generate molecules by sequentially adding atoms, which often yields molecules that correlate weakly with the target properties and have low synthetic accessibility. Molecular fragments such as functional groups are more closely related to molecular properties and synthetic accessibility than atoms are. Here, we propose a fragment-based molecular generative model that designs new molecules with target properties by sequentially adding molecular fragments to any given starting molecule. A key feature of our model is its high generalization ability in terms of both property control and fragment types. The former is made possible by learning the contribution of individual fragments to the target properties in an auto-regressive manner. For the latter, we use a deep neural network that takes the embedding vectors of two molecules as input and predicts their bonding probability. The high synthetic accessibility of the generated molecules is considered implicitly by preparing the fragment library with the BRICS decomposition method. We show that the model can generate molecules while simultaneously controlling multiple target properties at a high success rate. It also works equally well with unseen fragments, even in the property range where training data is rare, verifying its high generalization ability. As a practical application, we demonstrate that the model can generate potential inhibitors with high binding affinities, in terms of docking score, against the 3CL protease of SARS-CoV-2.
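The sequential fragment-addition idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the embeddings are hand-written lists rather than learned vectors, the scorer is a single logistic unit standing in for the deep neural network, and every name (`bonding_probability`, `select_fragment`, `generate`, the fragment labels) is hypothetical. It only shows why scoring on embedding vectors alone lets a fragment that was never seen before be added to the library without changing the scorer.

```python
import math
import random


def bonding_probability(mol_vec, frag_vec, weights, bias):
    """Toy stand-in for the bonding-probability network (an assumption).

    Applies a logistic function to a weighted elementwise product of the
    two embedding vectors.
    """
    z = sum(w * m * f for w, m, f in zip(weights, mol_vec, frag_vec)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def select_fragment(mol_vec, library, weights, bias, rng):
    """Sample one fragment name in proportion to its bonding probability."""
    names = list(library)
    probs = [bonding_probability(mol_vec, library[n], weights, bias) for n in names]
    return rng.choices(names, weights=probs, k=1)[0]


def generate(start_vec, library, weights, bias, steps, seed=0):
    """Add `steps` fragments one at a time to a starting embedding."""
    rng = random.Random(seed)
    mol_vec = list(start_vec)
    added = []
    for _ in range(steps):
        name = select_fragment(mol_vec, library, weights, bias, rng)
        added.append(name)
        # Hypothetical update rule: average the current embedding with the
        # chosen fragment's embedding to represent the grown molecule.
        mol_vec = [(m + f) / 2 for m, f in zip(mol_vec, library[name])]
    return added, mol_vec


# Hypothetical fragment library: label -> hand-made 2-dimensional embedding.
library = {"frag_a": [1.0, 0.0], "frag_b": [0.0, 1.0]}
weights, bias = [1.0, 1.0], 0.0

added, final_vec = generate([1.0, 0.0], library, weights, bias, steps=3)

# An "unseen" fragment only needs an embedding; the scorer is unchanged.
library["frag_new"] = [0.5, 0.5]
added_with_new, _ = generate([1.0, 0.0], library, weights, bias, steps=3)
```

Because the scorer never looks up a fragment by identity, extending `library` is enough to make a new fragment selectable, which mirrors the fragment-type generalization described in the abstract. Property conditioning and the BRICS-based library preparation are left out of this sketch.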




