Partial Product Aware Machine Learning on DNA-Encoded Libraries

05/16/2022
by Polina Binder, et al.

DNA-encoded libraries (DELs) are used for rapid, large-scale screening of small molecules against a protein target. These combinatorial libraries are built through several cycles of chemistry and DNA ligation, producing large sets of DNA-tagged molecules. Machine learning models trained on DEL data have been shown to be effective at predicting molecules of interest that are dissimilar from those in the original DEL. Machine learning approaches to chemical property prediction rely on the assumption that the property of interest is linked to a single chemical structure. In the context of DELs, this is equivalent to assuming that every chemical reaction fully yields the desired product. In practice, however, multi-step chemical synthesis sometimes generates partial molecules, so each unique DNA tag in a DEL corresponds to a set of possible molecules. Here, we leverage reaction yield data to enumerate the set of possible molecules corresponding to a given DNA tag. We demonstrate that training a custom graph neural network (GNN) on this richer dataset improves accuracy and generalization performance.
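To make the idea of one DNA tag mapping to a set of possible molecules concrete, the following is a minimal sketch and not the authors' enumeration procedure. It assumes, purely for illustration, that each synthesis cycle succeeds independently with a probability equal to its yield, and that a failed cycle simply leaves its building block out while later cycles still proceed. The function name `enumerate_products`, the building-block labels, and the yield values are all hypothetical.

```python
from itertools import product


def enumerate_products(building_blocks, yields):
    """Map each possible (partial) product to its probability.

    Assumption (illustrative only): cycles succeed independently with
    probability equal to their yield, and a failed cycle omits its
    building block without blocking later cycles.
    """
    if len(building_blocks) != len(yields):
        raise ValueError("need exactly one yield per building block")

    outcomes = {}
    # Each mask records which cycles succeeded (True) or failed (False).
    for mask in product([True, False], repeat=len(building_blocks)):
        prob = 1.0
        for succeeded, y in zip(mask, yields):
            prob *= y if succeeded else 1.0 - y
        if prob == 0.0:
            continue  # skip outcomes that cannot occur
        # A product is represented by the blocks that were incorporated.
        kept = tuple(b for b, ok in zip(building_blocks, mask) if ok)
        outcomes[kept] = outcomes.get(kept, 0.0) + prob
    return outcomes


# Hypothetical three-cycle tag with made-up yields.
products = enumerate_products(["A1", "B7", "C3"], [0.9, 0.5, 0.8])
for blocks, prob in sorted(products.items(), key=lambda kv: -kv[1]):
    print(blocks, round(prob, 3))
```

Under these assumptions the probabilities over all enumerated products sum to one, so they could serve as weights when a single tag's measured signal is attributed to several candidate structures. A real pipeline would replace the label tuples with actual reaction products and would need a chemistry-specific model of what a failed step leaves behind.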

