Interpretation of QSAR Models Based on Random Forest Methods


Bibliographic details
Published in: Molecular informatics 2011-06, Vol.30 (6-7), p.593-603
Main authors: Kuz'min, Victor E., Polishchuk, Pavel G., Artemenko, Anatoly G., Andronati, Sergey A.
Format: Article
Language: English
Description
Summary: A new algorithm for the interpretation of Random Forest models has been developed. It allows the contribution of each descriptor to the calculated property value to be computed. When a molecular structure is described by the simplex representation, contributions of individual atoms can be calculated, which makes it possible to estimate the influence of individual molecular fragments on the investigated property. Such information can be used for the design of new compounds with a predefined property value. The proposed measure of descriptor contributions is not an alternative to Breiman's variable importance; rather, it characterizes the contribution of a particular explanatory variable to the calculated response value.
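The summary does not spell out how a descriptor's contribution is computed. One common way to attribute a tree-ensemble prediction to its input variables is to follow the decision path in each tree and credit every change in the node mean to the descriptor that was split on, then average over the trees. The sketch below assumes that path-based attribution; it is an illustration, not necessarily the authors' exact procedure. The `Node` class, the function names and the two-tree toy forest are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One node of a regression tree (hypothetical minimal structure)."""
    value: float                   # mean response of the training samples in this node
    feature: Optional[int] = None  # descriptor index used for the split; None for a leaf
    threshold: float = 0.0         # go left if x[feature] <= threshold, else right
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_contributions(root: Node, x: List[float], n_features: int) -> Tuple[float, List[float]]:
    """Return (root mean, per-descriptor contributions) for one tree and one sample."""
    contrib = [0.0] * n_features
    node = root
    while node.feature is not None:
        child = node.left if x[node.feature] <= node.threshold else node.right
        # Credit the change in node mean to the descriptor used for this split.
        contrib[node.feature] += child.value - node.value
        node = child
    return root.value, contrib


def forest_contributions(forest: List[Node], x: List[float], n_features: int) -> Tuple[float, List[float]]:
    """Average the root means and the per-descriptor contributions over all trees."""
    bias = 0.0
    total = [0.0] * n_features
    for root in forest:
        tree_bias, tree_contrib = tree_contributions(root, x, n_features)
        bias += tree_bias
        total = [t + c for t, c in zip(total, tree_contrib)]
    n_trees = len(forest)
    return bias / n_trees, [t / n_trees for t in total]


# Toy forest with two descriptors (values chosen by hand, not real data).
tree_a = Node(5.0, feature=0, threshold=1.0,
              left=Node(3.0),
              right=Node(7.0, feature=1, threshold=0.5,
                         left=Node(6.0), right=Node(8.0)))
tree_b = Node(5.0, feature=1, threshold=0.5,
              left=Node(4.0), right=Node(6.0))
forest = [tree_a, tree_b]

x = [2.0, 1.0]
bias, contributions = forest_contributions(forest, x, n_features=2)
prediction = bias + sum(contributions)
print(bias, contributions, prediction)
```

By construction, the averaged root mean plus the sum of the contributions equals the forest prediction for the sample, so each contribution describes one compound's predicted value rather than a global ranking of variables. If each descriptor could be mapped back to the atoms it involves, as the summary states for the simplex representation, per-descriptor values of this kind could in principle be redistributed onto atoms and fragments; that mapping is not shown here.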
ISSN: 1868-1743, 1868-1751
DOI: 10.1002/minf.201000173