Megavariate analysis of environmental QSAR data. Part I--a basic framework founded on principal component analysis (PCA), partial least squares (PLS), and statistical molecular design (SMD)

This paper introduces principal component analysis (PCA), partial least squares projections to latent structures (PLS), and statistical molecular design (SMD) as useful tools in deriving multi- and megavariate quantitative structure-activity relationship (QSAR) models. Two QSAR data sets from the fi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Molecular diversity 2006-05, Vol.10 (2), p.169-186
Hauptverfasser: Eriksson, Lennart, Andersson, Patrik L, Johansson, Erik, Tysklind, Mats
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper introduces principal component analysis (PCA), partial least squares projections to latent structures (PLS), and statistical molecular design (SMD) as useful tools in deriving multi- and megavariate quantitative structure-activity relationship (QSAR) models. Two QSAR data sets from the fields of environmental toxicology and environmental chemistry are worked out in detail, showing the benefits of PCA, PLS and SMD. PCA is useful when overviewing a data set and exploring relationships among compounds and relationships among variables. PLS is the regression extension of PCA and is used for establishing QSARs. SMD is essential for selecting informative training and test sets of compounds for QSAR calibration and validation.
ISSN:1381-1991
1573-501X
DOI:10.1007/s11030-006-9024-6