Classification of a large anticancer data set by adaptive fuzzy partition

Bibliographic details
Published in:	Journal of computer-aided molecular design 2004-07, Vol.18 (7-9), p.577-586
Main authors:	Piclin, Nadège; Pintore, Marco; Wechman, Christophe; Chrétien, Jacques R
Format:	Article
Language:	English
Subjects:	Algorithms; Antineoplastic Agents - classification; Database Management Systems; Fuzzy Logic; Structure-Activity Relationship Studies
Online access:	Full text

Description
Abstract:	An Adaptive Fuzzy Partition (AFP) algorithm, derived from fuzzy logic concepts, was used to classify an anticancer data set of about 1300 compounds subdivided into eight mechanisms of action. AFP classification builds relationships between molecular descriptors and bioactivities by dynamically dividing the descriptor hyperspace into a set of fuzzy subspaces. These subspaces are described by simple linguistic rules, from which scores ranging between 0 and 1 can be derived. These scores define, for each compound, its degree of membership in each of the mechanisms analyzed. Particular attention was devoted to developing structure-activity relationships of real practical utility. To that end, well-defined and widely accepted protocols were used to validate the models by assessing their robustness and predictive ability. More specifically, after the most relevant descriptors were selected with the help of a genetic algorithm, a training set of 640 compounds was isolated by a rational procedure based on Self-Organizing Maps. The resulting AFP model was then validated with a validation set and, above all, with cross-validation and Y-randomization procedures. Good validation scores of about 80% were obtained, underlining the robustness of the model. Moreover, the predictive ability was evaluated with 374 test compounds that had not been used to establish the model, and 77% of them were predicted correctly.
ISSN:	0920-654X; 1573-4951
DOI:	10.1007/s10822-004-4076-0
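
Illustrative note: the abstract describes fuzzy subspaces, linguistic rules, and per-mechanism membership scores between 0 and 1. The sketch below shows one simple way such scores can arise. It is not the authors' AFP algorithm: it uses a fixed partition instead of an adaptive one, and the two fuzzy sets per descriptor ("low", "high"), the use of `min` as the fuzzy AND, the descriptor scaling to [0, 1], the function names, and the toy data are all assumptions made for illustration.

```python
from itertools import product

def memberships(x):
    """Degrees of membership of a descriptor value (scaled to [0, 1])
    in two overlapping fuzzy sets. Assumed shapes, not from the paper."""
    x = min(max(x, 0.0), 1.0)
    return {"low": 1.0 - x, "high": x}

def firing_strengths(descriptors):
    """Map each fuzzy subspace (one label per descriptor) to the degree
    to which a compound belongs to it, using min as the fuzzy AND."""
    per_axis = [memberships(x) for x in descriptors]
    strengths = {}
    for labels in product(("low", "high"), repeat=len(descriptors)):
        strengths[labels] = min(axis[lab] for axis, lab in zip(per_axis, labels))
    return strengths

def fit_rules(training_set, classes):
    """For each subspace, accumulate the firing strength contributed by
    each class and normalize, giving one weighted rule per subspace."""
    totals = {}
    for descriptors, label in training_set:
        for subspace, s in firing_strengths(descriptors).items():
            totals.setdefault(subspace, {c: 0.0 for c in classes})[label] += s
    rules = {}
    for subspace, by_class in totals.items():
        total = sum(by_class.values())
        rules[subspace] = {c: (v / total if total else 0.0)
                           for c, v in by_class.items()}
    return rules

def class_scores(rules, descriptors, classes):
    """Combine all rules into a score in [0, 1] per class (scores sum to 1)."""
    scores = {c: 0.0 for c in classes}
    for subspace, s in firing_strengths(descriptors).items():
        for c, w in rules.get(subspace, {}).items():
            scores[c] += s * w
    total = sum(scores.values())
    return {c: (v / total if total else 0.0) for c, v in scores.items()}

# Toy data (hypothetical): two descriptors, two mechanisms "A" and "B".
classes = ["A", "B"]
training_set = [
    ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
    ((0.9, 0.8), "B"), ((0.8, 0.9), "B"),
]
rules = fit_rules(training_set, classes)
scores = class_scores(rules, (0.15, 0.15), classes)
predicted = max(scores, key=scores.get)
```

Each rule reads like a linguistic statement, for example "if descriptor 1 is low and descriptor 2 is low, then mechanism A with weight w". A compound is assigned to the mechanism with the highest score, while the remaining scores show how strongly it also resembles the other mechanisms.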