System identification of fuzzy Cartesian granule feature models using genetic programming


Bibliographic details
Main authors: Baldwin, James F., Martin, Trevor P., Shanahan, James G.
Format: Conference proceedings
Language: English
Online access: Full text
Description
Summary: A Cartesian granule feature is a multidimensional feature formed over the cross product of words drawn from the linguistic partitions of the constituent input features. Systems can be quite naturally described in terms of Cartesian granule features incorporated into additive models (if-then rules with weighted antecedents), where each Cartesian granule feature focuses on modelling the interactions of a subset of input variables. This can often lead to models that reduce, if not eliminate, decomposition error, while enhancing the model's generalisation powers and transparency. Within a machine learning context, the system identification of good, parsimonious additive Cartesian granule feature models is an exponential search problem. In this paper we present the G_DACG constructive induction algorithm as a means of automatically identifying additive Cartesian granule feature models from example data. G_DACG combines the powerful optimisation capabilities of genetic programming with a rather novel and cheap fitness function, which relies on the semantic separation of concepts expressed in terms of Cartesian granule fuzzy sets, in identifying these additive models. G_DACG helps avoid many of the problems of traditional approaches to system identification that arise from feature selection and feature abstraction, such as local minima. G_DACG has been applied in the system identification of additive Cartesian granule feature models on a variety of artificial and real-world problems. Here we present a sample of those results, including those for the benchmark Pima Diabetes problem. A classification accuracy of 79.7% was achieved on this dataset, outperforming previous bests of 78% (generally from black-box modelling approaches such as neural nets and oblique decision trees).
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/BFb0095073
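
Illustrative sketch: the summary defines a Cartesian granule feature as a feature formed over the cross product of words from the linguistic partitions of its input features. The Python below is a minimal sketch of that idea only; it is not the G_DACG algorithm and is not taken from the paper. The function names, the evenly spaced triangular partitions, the example words, and the use of the product to combine memberships are all assumptions made for illustration.

```python
from itertools import product


def triangular_partition(lo, hi, words):
    """Map each word to a triangular membership function on [lo, hi].

    Assumption: evenly spaced triangular fuzzy sets, one per word
    (needs at least two words).
    """
    step = (hi - lo) / (len(words) - 1)

    def make(center):
        def mu(x):
            return max(0.0, 1.0 - abs(x - center) / step)
        return mu

    return {word: make(lo + i * step) for i, word in enumerate(words)}


def cartesian_granule_memberships(point, partitions):
    """Membership of `point` in every granule of the cross product.

    Assumption: memberships are combined with the product; other
    conjunctions (for example min) could be used instead.
    """
    memberships = {}
    for combo in product(*(p.items() for p in partitions)):
        label = ".".join(word for word, _ in combo)
        degree = 1.0
        for x, (_, mu) in zip(point, combo):
            degree *= mu(x)
        memberships[label] = degree
    return memberships


# Two hypothetical input features with their linguistic partitions.
size = triangular_partition(0.0, 10.0, ["small", "medium", "large"])
weight = triangular_partition(0.0, 100.0, ["light", "heavy"])

# Fuzzy description of one data point over the 3 x 2 = 6 granules.
granules = cartesian_granule_memberships((2.5, 25.0), [size, weight])
```

Each key of `granules` is one Cartesian granule such as `small.light`, and the value is the degree to which the point belongs to it. A feature over more input variables has more granules (the product of the partition sizes), which gives a sense of why choosing which variables to combine, and how finely to partition them, becomes a large search problem.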