GARLig: A Fully Automated Tool for Subset Selection of Large Fragment Spaces via a Self-Adaptive Genetic Algorithm

Bibliographic details
Published in: Journal of Chemical Information and Modeling, 2010-09, Vol. 50 (9), pp. 1644-1659
Main authors: Pfeffer, Patrick; Fober, Thomas; Hüllermeier, Eyke; Klebe, Gerhard
Format: Article
Language: English
Description
Abstract: In combinatorial chemistry, molecules are assembled according to combinatorial principles by linking suitable reagents or by decorating a given scaffold with appropriate substituents from a large chemical space of starting materials. The number of possible combinations often greatly exceeds what an in-depth in silico approach can handle, let alone what can be synthesized experimentally. Therefore, powerful tools to efficiently enumerate large chemical spaces are required. Genetic algorithms, which mimic Darwinian evolution, can provide such tools. GARLig (genetic algorithm using reagents to compose ligands) has been developed to perform subset selection in large chemical compound spaces subject to target-specific 3D-scoring criteria. GARLig uses different scoring schemes, such as AutoDock4 Score, GOLDScore, and DrugScoreCSD, as fitness functions. Its genetic parameters have been optimized to characterize combinatorial libraries with respect to binding to various targets of pharmaceutical interest. A large tripeptide library of 20³ (8000) members was used to profile amino acid frequencies in putative substrates for trypsin, thrombin, factor Xa, and plasmin. A peptidomimetic scaffold assembled from a selection of 25³ building blocks was used to test the performance of the evolutionary algorithm in suggesting potent inhibitors of the enzyme cathepsin D. In a final case study, the program was used to characterize and rank a combinatorial drug-like library comprising 33 750 potential thrombin inhibitors. These case studies demonstrate that GARLig finds experimentally confirmed potent leads while processing a significantly smaller subset of the fully enumerated combinatorial library. Furthermore, the amino acid profiles computed by the genetic algorithm match the amino acid frequencies observed when peptide libraries are screened in substrate cleavage assays.
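Illustration: the subset-selection idea in the abstract can be sketched with a small genetic algorithm over a tripeptide-sized space (20 choices at each of 3 positions, i.e. 8000 combinations). This is a minimal sketch under stated assumptions and not GARLig's actual implementation: the function names, the population settings, and the way the mutation rate is self-adapted are all hypothetical, and `toy_score` is an invented stand-in for a docking-based fitness function such as those named in the abstract.

```python
import random


def toy_score(combo, weights):
    """Hypothetical stand-in for a 3D docking score (higher is better)."""
    return sum(weights[pos][reagent] for pos, reagent in enumerate(combo))


def evolve(n_positions=3, n_reagents=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    # Assumed toy landscape: one random weight per reagent and position.
    weights = [[rng.random() for _ in range(n_reagents)]
               for _ in range(n_positions)]
    evaluated = {}  # cache of every combination scored so far

    def fitness(combo):
        if combo not in evaluated:
            evaluated[combo] = toy_score(combo, weights)
        return evaluated[combo]

    # An individual is (reagent indices, its own mutation rate).
    pop = [(tuple(rng.randrange(n_reagents) for _ in range(n_positions)), 0.3)
           for _ in range(pop_size)]

    for _ in range(generations):
        # Truncation selection: the better half survives unchanged.
        pop.sort(key=lambda ind: fitness(ind[0]), reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            (a, rate_a), (b, rate_b) = rng.sample(parents, 2)
            # Uniform crossover: each position comes from one parent.
            genes = [rng.choice(pair) for pair in zip(a, b)]
            # Self-adaptation (assumed scheme): the child inherits the mean
            # parental rate, randomly scaled down or up, then clamped.
            rate = (rate_a + rate_b) / 2 * rng.choice((0.8, 1.25))
            rate = min(0.9, max(0.05, rate))
            genes = [rng.randrange(n_reagents) if rng.random() < rate else g
                     for g in genes]
            children.append((tuple(genes), rate))
        pop = parents + children

    best = max(evaluated, key=evaluated.get)
    return best, evaluated[best], len(evaluated), n_reagents ** n_positions


if __name__ == "__main__":
    best, score, n_scored, n_total = evolve()
    print(f"best combination {best} with score {score:.3f}")
    print(f"scored {n_scored} of {n_total} possible combinations")
```

With these settings the sketch can score at most 30 + 40 × 15 = 630 distinct combinations, so only a fraction of the 8000-member space is ever evaluated. That mirrors the claim in the abstract about processing a subset of the enumerated library, though on a toy landscape and not with real docking scores.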
ISSN: 1549-9596, 1549-960X
DOI: 10.1021/ci9003305