Efficient discovery of contrast subspaces for object explanation and characterization

Bibliographic details
Published in: Knowledge and information systems 2016-04, Vol.47 (1), p.99-129
Main authors: Duan, Lei, Tang, Guanting, Pei, Jian, Bailey, James, Dong, Guozhu, Nguyen, Vinh, Campbell, Akiko, Tang, Changjie
Format: Article
Language: English
Description
Summary: We tackle the novel problem of mining contrast subspaces. Given a set of multidimensional objects in two classes C+ and C- and a query object o, we want to find the top-k subspaces that maximize the ratio of likelihood of o in C+ against that in C-. Such subspaces are very useful for characterizing an object and explaining how it differs between two classes. We demonstrate that this problem has important applications, and, at the same time, is very challenging, being MAX SNP-hard. We present CSMiner, a mining method that uses kernel density estimation in conjunction with various pruning techniques. We experimentally investigate the performance of CSMiner on a range of data sets, evaluating its efficiency, effectiveness, and stability and demonstrating it is substantially faster than a baseline method.
ISSN: 0219-1377, 0219-3116
DOI: 10.1007/s10115-015-0835-6
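
Illustrative sketch: the problem stated in the summary (rank subspaces by the ratio of the likelihood of o in C+ to that in C-) can be written as a brute-force search. The code below is not CSMiner and is not taken from the article. It is a minimal sketch under stated assumptions: a Gaussian product kernel with one fixed bandwidth for every subspace, exhaustive enumeration with no pruning, and a small constant `eps` to avoid division by zero. The function names `kde_likelihood` and `top_k_contrast_subspaces` are hypothetical.

```python
import math
from itertools import combinations


def kde_likelihood(query, points, dims, bandwidth=1.0):
    """Gaussian kernel density estimate of `query`, using only the dimensions in `dims`.

    Assumption: one fixed bandwidth shared by all dimensions and subspaces.
    """
    d = len(dims)
    norm = len(points) * (bandwidth * math.sqrt(2.0 * math.pi)) ** d
    total = 0.0
    for p in points:
        # Squared Euclidean distance restricted to the chosen subspace.
        sq_dist = sum((query[i] - p[i]) ** 2 for i in dims)
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / norm


def top_k_contrast_subspaces(query, pos, neg, k=3, bandwidth=1.0, eps=1e-12):
    """Return the k (ratio, dims) pairs with the largest likelihood ratio.

    Enumerates every non-empty subspace, so the cost grows exponentially
    with the number of dimensions; this is a naive baseline only.
    """
    n_dims = len(query)
    scored = []
    for size in range(1, n_dims + 1):
        for dims in combinations(range(n_dims), size):
            l_pos = kde_likelihood(query, pos, dims, bandwidth)
            l_neg = kde_likelihood(query, neg, dims, bandwidth)
            # eps is an illustrative guard against a zero denominator.
            scored.append((l_pos / (l_neg + eps), dims))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Calling `top_k_contrast_subspaces(o, C_plus, C_minus, k)` with `o` as a tuple of numbers and each class as a list of such tuples returns the highest-ratio subspaces as tuples of dimension indices. The exponential enumeration in this sketch is the cost that pruning techniques, such as those the summary attributes to CSMiner, aim to reduce.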