Six global and local QSPR models of aqueous solubility at pH = 7.4 based on structural similarity and physicochemical descriptors
Main authors: | , , , , |
---|---|
Format: | Dataset |
Language: | eng |
Summary: | Aqueous solubility at pH = 7.4 is an important property for medicinal chemists because this is the pH of physiological media. This work applies three global modelling methods (support vector machine (SVM), random forest (RF) and multiple linear regression (MLR)) and three local quantitative structure–property relationship (QSPR) models (regression corrected by nearest neighbours (RCNN), arithmetic mean property (AMP) and local regression property (LoReP)) to construct stable QSPRs with a clear mechanistic interpretation. The data set contained experimental aqueous solubility values at pH = 7.4 for 387 chemicals (349 in the training set and 38 in the test set, including 16 of the authors' own measurements). The initial descriptor pool contained 210 physicochemical descriptors calculated with the HYBOT, DRAGON, SYBYL and VolSurf+ programs. Six QSPRs with good statistics were obtained, based on the fundamentals of aqueous solubility and optimization of the descriptor space. These models have an RMSE close to the experimental error (0.70) and are amenable to physical interpretation. The QSPR models developed in this study may be useful for medicinal chemists: the global MLR, RF and SVM models for examining common factors that influence solubility, and the local RCNN, AMP and LoReP models for optimizing aqueous solubility in small sets of related chemicals. |
DOI: | 10.6084/m9.figshare.5395372 |
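The local-model idea named in the summary can be illustrated with a minimal sketch of a nearest-neighbour mean predictor, in the spirit of the arithmetic mean property (AMP) approach. The record does not specify how structural similarity is computed or how many neighbours are used, so the Tanimoto similarity on feature sets, the value of `k`, and all feature names and solubility values below are hypothetical and chosen only for illustration.

```python
import math

def tanimoto(a, b):
    """Tanimoto similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def amp_predict(query, training, k=3):
    """Predict a property as the arithmetic mean over the k training
    compounds most similar to the query (assumed reading of AMP)."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]),
                    reverse=True)
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical toy training set: (feature set, solubility value).
training = [
    ({"a", "b", "c"}, -3.0),
    ({"a", "b", "d"}, -3.4),
    ({"a", "e"}, -2.0),
    ({"x", "y"}, -5.0),
]

# Hypothetical query compound sharing features "a" and "b".
query = {"a", "b"}
prediction = amp_predict(query, training, k=2)
print(prediction)
```

With `k=2` the two most similar training compounds determine the prediction, which is why such a model is suited to small sets of closely related chemicals rather than to global trends.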