Efficient discovery of contrast subspaces for object explanation and characterization
Published in: | Knowledge and information systems 2016-04, Vol.47 (1), p.99-129 |
---|---|
Format: | Article |
Language: | English |
Abstract: | We tackle the novel problem of mining contrast subspaces. Given a set of multidimensional objects in two classes $C_+$ and $C_-$ and a query object $o$, we want to find the top-$k$ subspaces that maximize the ratio of likelihood of $o$ in $C_+$ against that in $C_-$. Such subspaces are very useful for characterizing an object and explaining how it differs between two classes. We demonstrate that this problem has important applications, and, at the same time, is very challenging, being MAX SNP-hard. We present CSMiner, a mining method that uses kernel density estimation in conjunction with various pruning techniques. We experimentally investigate the performance of CSMiner on a range of data sets, evaluating its efficiency, effectiveness, and stability and demonstrating it is substantially faster than a baseline method. |
---|---|
ISSN: | 0219-1377 0219-3116 |
DOI: | 10.1007/s10115-015-0835-6 |
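
The problem stated in the abstract can be illustrated with a small brute-force sketch. This is not CSMiner: it enumerates every subspace and applies none of the pruning techniques the abstract mentions, so it is only practical for a handful of dimensions. The function names, the fixed Gaussian bandwidth, and the small `eps` guard against division by zero are all assumptions made for this illustration and do not come from the paper.

```python
import itertools
import math


def kde_likelihood(point, data, dims, bandwidth=1.0):
    """Gaussian kernel density estimate of `point` in `data`, using only `dims`.

    Assumption: a single fixed bandwidth shared by all dimensions.
    """
    norm = len(data) * (math.sqrt(2 * math.pi) * bandwidth) ** len(dims)
    total = 0.0
    for row in data:
        sq_dist = sum((point[j] - row[j]) ** 2 for j in dims)
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / norm


def top_k_contrast_subspaces(query, pos, neg, k=3, bandwidth=1.0, eps=1e-12):
    """Return the k subspaces with the largest likelihood ratio for `query`.

    Hypothetical helper: exhaustive search over all non-empty subspaces,
    scoring each by likelihood in `pos` divided by likelihood in `neg`.
    """
    n_dims = len(query)
    scored = []
    for size in range(1, n_dims + 1):
        for dims in itertools.combinations(range(n_dims), size):
            l_pos = kde_likelihood(query, pos, dims, bandwidth)
            l_neg = kde_likelihood(query, neg, dims, bandwidth)
            # eps (an assumption) avoids division by zero when l_neg underflows
            scored.append((l_pos / (l_neg + eps), dims))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy data: only dimension 0 separates the two classes.
    pos = [(0.0, 1.0, 1.0), (0.2, 0.0, 2.0), (-0.1, 2.0, 0.0), (0.1, 1.5, 1.5)]
    neg = [(5.0, 1.0, 1.0), (5.2, 0.0, 2.0), (4.9, 2.0, 0.0), (5.1, 1.5, 1.5)]
    for ratio, dims in top_k_contrast_subspaces((0.0, 1.0, 1.0), pos, neg, k=2):
        print(dims, ratio)
```

In the toy data, the two classes are identical in dimensions 1 and 2, so every highly ranked subspace contains dimension 0. The number of subspaces grows exponentially with the dimensionality, which is why an exhaustive search like this one would serve only as a baseline.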