QSAR–old and new directions
Format: | Book chapter |
---|---|
Language: | English |
Online access: | Full text |
Summary: | Regression analysis has recently faced increasing doubt concerning its predictivity. A series of studies has questioned the reliability of the underlying approach, which can yield deceptive models: correlations are significant for the training data, yet results for external test sets are disappointing. The performance of QSAR (quantitative structure-activity relationship) predictions depends on a series of issues, comprising the choice of descriptors, the compound set, the mathematical methods, the quality of the experimental data, and ultimately common sense. A further problem concerns the interpretability of descriptors. The vast number of computable molecular features makes preselection mandatory, particularly for use in neural networks and support vector regression. Corresponding strategies comprise principal component analysis and the removal of collinear descriptors. The latter approach can lead to highly specific variables being preferred over more generally applicable and more meaningful descriptors. Examples are provided where the resulting models are questionable despite seemingly sound statistical proof. Therefore, selection criteria and general guidelines are discussed that facilitate the choice of interpretable descriptors, e.g. for lipophilicity and hydrogen-bonding capacity. Reasons for errors and outliers in prediction models are summarized with respect to cross-validation methods such as leave-one-out. Furthermore, some case studies are discussed that make use of support vector regression, an emerging technique in QSAR. |
---|---|
ISSN: | 1472-0965 1472-0973 |
DOI: | 10.1039/B812893F |
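
The summary names two procedures, removal of collinear descriptors and leave-one-out cross-validation, without showing them. The following is a minimal sketch of both, not the chapter's own method. The descriptor names (`logP`, `fragment_count`, `hbd`), all numeric values, the 0.9 correlation threshold, and the helper functions are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_collinear(descriptors, threshold=0.9):
    """Greedy filter: keep a descriptor only if its |r| with every
    already kept descriptor is below the threshold (assumed value)."""
    kept = []
    for name in descriptors:  # iterates in insertion order
        if all(abs(pearson(descriptors[name], descriptors[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def loo_q2(x, y):
    """Leave-one-out q^2 = 1 - PRESS / SS_total for a one-descriptor model."""
    press = 0.0
    for i in range(len(x)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    my = sum(y) / len(y)
    ss_total = sum((b - my) ** 2 for b in y)
    return 1.0 - press / ss_total


# Hypothetical toy data: six compounds, three descriptors, one activity.
descriptors = {
    "logP": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "fragment_count": [2.1, 4.1, 6.1, 8.1, 10.1, 12.1],  # collinear with logP
    "hbd": [2.0, 0.0, 3.0, 1.0, 0.0, 2.0],
}
activity = [0.9, 2.1, 2.9, 4.2, 4.8, 6.1]

kept = drop_collinear(descriptors)
q2 = loo_q2(descriptors["logP"], activity)
print("kept descriptors:", kept)
print("leave-one-out q2 for logP:", round(q2, 3))
```

In this sketch the greedy filter keeps whichever member of a collinear pair it meets first, so listing `fragment_count` before `logP` would retain the more specific variable instead. That order dependence is one way the preference for less interpretable descriptors mentioned in the summary can arise.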