Artificial intelligence–enabled virtual screening of ultra-large chemical libraries with deep docking

With the recent explosion of chemical libraries beyond a billion molecules, more efficient virtual screening approaches are needed. The Deep Docking (DD) platform enables up to 100-fold acceleration of structure-based virtual screening by docking only a subset of a chemical library, iteratively sync...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Nature protocols 2022-03, Vol.17 (3), p.672-697
Hauptverfasser: Gentile, Francesco, Yaacoub, Jean Charle, Gleave, James, Fernandez, Michael, Ton, Anh-Tien, Ban, Fuqiang, Stern, Abraham, Cherkasov, Artem
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:With the recent explosion of chemical libraries beyond a billion molecules, more efficient virtual screening approaches are needed. The Deep Docking (DD) platform enables up to 100-fold acceleration of structure-based virtual screening by docking only a subset of a chemical library, iteratively synchronized with a ligand-based prediction of the remaining docking scores. This method results in hundreds- to thousands-fold virtual hit enrichment (without significant loss of potential drug candidates) and hence enables the screening of billion molecule–sized chemical libraries without using extraordinary computational resources. Herein, we present and discuss the generalized DD protocol that has been proven successful in various computer-aided drug discovery (CADD) campaigns and can be applied in conjunction with any conventional docking program. The protocol encompasses eight consecutive stages: molecular library preparation, receptor preparation, random sampling of a library, ligand preparation, molecular docking, model training, model inference and the residual docking. The standard DD workflow enables iterative application of stages 3–7 with continuous augmentation of the training set, and the number of such iterations can be adjusted by the user. A predefined recall value allows for control of the percentage of top-scoring molecules that are retained by DD and can be adjusted to control the library size reduction. The procedure takes 1–2 weeks (depending on the available resources) and can be completely automated on computing clusters managed by job schedulers. This open-source protocol, at https://github.com/jamesgleave/DD_protocol , can be readily deployed by CADD researchers and can significantly accelerate the effective exploration of ultra-large portions of a chemical space. Screening chemical databases by computational docking is prohibitively time consuming when the databases are very large. Deep docking is a deep-learning approach aimed at reducing the number of compounds that need to be docked.
ISSN:1754-2189
1750-2799
DOI:10.1038/s41596-021-00659-2