Knowledge reduction for decision tables with attribute value taxonomies

Bibliographic details
Published in: Knowledge-based systems 2014-01, Vol.56, p.68-78
Main authors: Ye, Mingquan; Wu, Xindong; Hu, Xuegang; Hu, Donghui
Format: Article
Language: English
Description
Summary:
• We present an attribute-generalization reduct for decision tables with AVTs.
• We analyze relationships between the attribute reduct and the generalization reduct.
• We develop a heuristic algorithm AGR-SCE to find the generalization reduct.
• The generalization reduct can objectively control the generalization process.
• The generalization reduct can avoid over-generalization or under-generalization.

Attribute reduction and attribute generalization are two basic methods for obtaining simple representations of knowledge. Attribute reduction can only reduce the number of attributes and is thus unsuitable for attributes with hierarchical domains. Attribute generalization can transform raw attribute domains into a coarser granularity by exploiting attribute value taxonomies (AVTs). Because the choice of how far up its taxonomy an attribute should be generalized is typically quite subjective, generalization can easily result in over-generalization or under-generalization. This paper investigates knowledge reduction for decision tables with AVTs, which can objectively control the generalization process and construct a reduced data set with fewer attributes and smaller attribute domains. Specifically, we use Shannon's conditional entropy to measure classification capability under generalization and propose a novel concept for knowledge reduction, the attribute-generalization reduct, which objectively generalizes attributes to the highest possible levels while keeping the same classification capability as the raw data. We analyze the major relationships between the attribute reduct and the attribute-generalization reduct, prove that finding a minimal attribute-generalization reduct is an NP-hard problem, and develop a heuristic algorithm for attribute-generalization reduction, namely AGR-SCE. Empirical studies demonstrate that our algorithm achieves better classification performance and helps compute smaller rule sets with better generalized knowledge than the attribute reduction method.
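
The idea of using Shannon's conditional entropy to decide how far an attribute may be generalized can be illustrated with a small sketch. It is not the AGR-SCE algorithm from the paper. The toy decision table, the taxonomy `PARENT_OF`, and the helper names `conditional_entropy` and `generalize` are all assumptions made up for this example. The sketch only shows that climbing one level of a taxonomy may leave H(D | C) unchanged, while climbing a further level may increase it, which would be over-generalization.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, cond_idx, dec_idx):
    """Shannon conditional entropy H(D | C) of the decision column
    given the condition columns, estimated from row frequencies."""
    n = len(rows)
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[i] for i in cond_idx)
        groups[key][row[dec_idx]] += 1
    h = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for c in counts.values():
            # p(block) * p(decision | block) * log2 p(decision | block)
            h -= (size / n) * (c / size) * log2(c / size)
    return h


def generalize(rows, attr_idx, parent_of):
    """Replace each value of one attribute by its parent in the taxonomy.
    Values without a parent (taxonomy roots) are left unchanged."""
    out = []
    for row in rows:
        values = list(row)
        values[attr_idx] = parent_of.get(values[attr_idx], values[attr_idx])
        out.append(tuple(values))
    return out


# Hypothetical attribute value taxonomy: city -> country -> continent.
PARENT_OF = {
    "Paris": "France", "Lyon": "France",
    "Berlin": "Germany", "Munich": "Germany",
    "France": "Europe", "Germany": "Europe",
}

# Hypothetical decision table: (city, decision).
raw = [("Paris", "fr"), ("Lyon", "fr"), ("Berlin", "de"), ("Munich", "de")]

by_country = generalize(raw, 0, PARENT_OF)
by_continent = generalize(by_country, 0, PARENT_OF)

for name, table in [("city", raw), ("country", by_country), ("continent", by_continent)]:
    print(name, conditional_entropy(table, [0], 1))
```

In this toy table the entropy stays at 0 when cities are generalized to countries, so the country level keeps the classification capability of the raw data. Generalizing further to the continent level raises the entropy to 1 bit, so under this criterion the attribute would stop at the country level.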
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2013.10.022