NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Summary: | We present NEBULA, the first latent 3D generative model for scalable
generation of large molecular libraries around a seed compound of interest.
Such libraries are crucial for scientific discovery, but it remains challenging
to generate large numbers of high-quality samples efficiently. 3D-voxel-based
methods have recently shown great promise for generating high-quality samples
de novo from random noise (Pinheiro et al., 2023). However, sampling in
3D-voxel space is computationally expensive, which makes library generation
prohibitively slow. Here, we instead perform neural empirical Bayes sampling
(Saremi & Hyvärinen, 2019) in the learned latent space of a vector-quantized
variational autoencoder. NEBULA generates large molecular libraries nearly an
order of magnitude faster than existing methods without sacrificing sample
quality. Moreover, NEBULA generalizes better to unseen drug-like molecules, as
demonstrated on two public datasets and multiple recently released drugs. We
expect the approach herein to be highly enabling for machine learning-based
drug discovery. The code is available at
https://github.com/prescient-design/nebula |
---|---|
DOI: | 10.48550/arxiv.2407.03428 |
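
The summary mentions neural empirical Bayes sampling around a seed compound in a latent space. The following is a minimal sketch of that general idea (noise the seed, "walk" with Langevin MCMC on the noise-smoothed density, then "jump" back with a single denoising step), not the NEBULA implementation. Everything here is an assumption made for illustration: the latent prior is a toy isotropic Gaussian so that the smoothed score is known in closed form (a real system would use a learned denoiser), the walk uses plain overdamped Langevin dynamics, and all function and parameter names (`smoothed_score`, `walk`, `jump`, `sample_around_seed`, `prior_var`, `step_size`) are hypothetical.

```python
import random


def smoothed_score(y, mu, prior_var, sigma):
    """Score of the smoothed density N(mu, prior_var + sigma^2) at y.

    Toy stand-in (assumption) for the score implied by a learned denoiser.
    """
    total_var = prior_var + sigma ** 2
    return [-(yi - mi) / total_var for yi, mi in zip(y, mu)]


def walk(y, mu, prior_var, sigma, steps, step_size, rng):
    """Walk: overdamped Langevin MCMC on the smoothed density p(y)."""
    noise_scale = (2.0 * step_size) ** 0.5
    for _ in range(steps):
        g = smoothed_score(y, mu, prior_var, sigma)
        y = [yi + step_size * gi + noise_scale * rng.gauss(0.0, 1.0)
             for yi, gi in zip(y, g)]
    return y


def jump(y, mu, prior_var, sigma):
    """Jump: empirical Bayes estimate x_hat = y + sigma^2 * score(y)."""
    g = smoothed_score(y, mu, prior_var, sigma)
    return [yi + sigma ** 2 * gi for yi, gi in zip(y, g)]


def sample_around_seed(seed_latent, mu, prior_var, sigma, steps, step_size, rng):
    """Noise a seed latent, walk in the noisy space, then denoise once."""
    y0 = [zi + sigma * rng.gauss(0.0, 1.0) for zi in seed_latent]
    y = walk(y0, mu, prior_var, sigma, steps, step_size, rng)
    return jump(y, mu, prior_var, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    seed_latent = [0.5, -0.2, 1.0]   # hypothetical encoded seed compound
    mu = [0.0, 0.0, 0.0]             # toy prior mean (assumption)
    library = [
        sample_around_seed(seed_latent, mu, 1.0, 0.5, 10, 0.01, rng)
        for _ in range(5)
    ]
    for latent in library:
        print([round(v, 3) for v in latent])
```

In a latent-space setup such as the one the summary describes, each returned vector would then be passed through a decoder to obtain a molecule; that step is omitted here because it depends on the trained model.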