Chance factors in studies of quantitative structure-activity relationships

Bibliographic details
Published in: Journal of Medicinal Chemistry, 1979-10, Vol. 22 (10), p. 1238-1244
Main authors: Topliss, John G.; Edwards, Robert P.
Format: Article
Language: English
Description
Abstract: Multiple regression analysis is a basic statistical tool used for QSAR studies in drug design. However, there is a risk of arriving at fortuitous correlations when too many variables are screened relative to the number of available observations. In this regard, a critical distinction must be made between the number of variables screened for possible correlation and the number that actually appear in the regression equation. Using a modified Fortran stepwise multiple-regression analysis program, simulated QSAR studies employing random numbers were run for many different combinations of screened variables and observations. Under certain conditions, a substantial incidence of correlations with high r² values was found, although the overall degree of chance correlation noted was less than that reported in a previous study. Analysis of the results has provided a basis for making judgements concerning the level of risk of encountering chance correlations for a wide range of combinations of observations and screened variables in QSAR studies using multiple-regression analysis. For illustrative purposes, some examples involving published QSAR studies have been considered, and the reported correlations are shown to be less significant than originally presented because of the influence of unrecognized chance factors.
ISSN: 0022-2623, 1520-4804
DOI: 10.1021/jm00196a017
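
The kind of random-number experiment the abstract describes can be illustrated with a minimal sketch. This is not the authors' modified Fortran program: it assumes a simple greedy forward selection on r² in place of their stepwise procedure, uniform random numbers for both the screened variables and the activity values, and NumPy for the least-squares fits. The names r_squared, forward_select and chance_r2 are hypothetical and chosen only for this illustration.

```python
import numpy as np

def r_squared(X, y):
    """r^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def forward_select(X, y, max_terms):
    """Greedy forward selection (an assumption, not the paper's stepwise rule):
    at each step add the screened variable that raises r^2 the most."""
    chosen, best = [], 0.0
    for _ in range(min(max_terms, X.shape[1])):
        candidates = [j for j in range(X.shape[1]) if j not in chosen]
        scores = [r_squared(X[:, chosen + [j]], y) for j in candidates]
        k = int(np.argmax(scores))
        chosen.append(candidates[k])
        best = scores[k]
    return chosen, best

def chance_r2(n_obs, n_screened, max_terms, n_trials=100, seed=0):
    """r^2 reached by pure chance in each of n_trials simulated 'QSAR studies'
    in which both the variables and the activities are random numbers."""
    rng = np.random.default_rng(seed)
    out = np.empty(n_trials)
    for t in range(n_trials):
        X = rng.random((n_obs, n_screened))  # screened variables, no real signal
        y = rng.random(n_obs)                # 'activities', also random
        out[t] = forward_select(X, y, max_terms)[1]
    return out

if __name__ == "__main__":
    # Same number of observations and of terms in the final equation;
    # only the number of variables screened differs.
    few = chance_r2(n_obs=15, n_screened=3, max_terms=3)
    many = chance_r2(n_obs=15, n_screened=20, max_terms=3)
    print("mean chance r^2, 3 screened: ", few.mean())
    print("mean chance r^2, 20 screened:", many.mean())
```

Both runs end with at most three variables in the regression equation, so any difference in the chance r² values comes from the number of variables screened, which is the distinction the abstract emphasizes.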