Symbolic Distance Measurements Based on Characteristic Subspaces

Bibliographic details
Main author: Ludl, Marcus-Christopher
Format: Conference proceedings
Language: English
Description
Summary: We introduce the subspace difference metric, a novel heterogeneous distance metric for calculating distances between points with both continuous and (unordered) categorical attributes. Our approach is based on computing and comparing characteristic subspaces (i.e. contexts) for each of the symbols and can be viewed as a generalization of the well-known value difference metric. As one possible extension, we propose a linearization of the computed symbolic distances by multidimensional scaling, thereby mapping a set of symbols onto the interval [0, 1]. Thus, even algorithms originally designed for continuous attributes (e.g. clustering algorithms such as k-means) can be applied to datasets containing discrete attributes without adapting the algorithm itself. Finally, we evaluate the proposed metric and the linearization in quantitative and qualitative settings and demonstrate their applicability in clustering domains.
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-540-39804-2_29
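
The summary describes two steps: a symbolic distance that generalizes the value difference metric, and a linearization of those distances onto [0, 1] by multidimensional scaling. The record does not define the subspace difference metric itself, so the sketch below uses the classic value difference metric as a stand-in, followed by one-dimensional classical MDS computed with power iteration. The function names `vdm_distances` and `linearize`, the exponent `q`, and the toy data are assumptions made for illustration only and are not taken from the paper.

```python
from collections import Counter, defaultdict


def vdm_distances(values, labels, q=1):
    """Classic value difference metric between the symbols of one attribute.

    d(a, b) = sum over classes c of |P(c | a) - P(c | b)| ** q
    (stand-in for the paper's metric, which this record does not define).
    """
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    classes = sorted(set(labels))
    symbols = sorted(counts)
    probs = {
        s: [counts[s][c] / sum(counts[s].values()) for c in classes]
        for s in symbols
    }
    dist = {
        (a, b): sum(abs(pa - pb) ** q for pa, pb in zip(probs[a], probs[b]))
        for a in symbols
        for b in symbols
    }
    return symbols, dist


def linearize(symbols, dist, iters=200):
    """Map symbols onto [0, 1] with 1-D classical MDS (power iteration)."""
    n = len(symbols)
    d2 = [[dist[(a, b)] ** 2 for b in symbols] for a in symbols]
    row = [sum(r) / n for r in d2]
    tot = sum(row) / n
    # Double centering: B = -1/2 * (D^2 - row mean - column mean + grand mean)
    B = [
        [-0.5 * (d2[i][j] - row[i] - row[j] + tot) for j in range(n)]
        for i in range(n)
    ]
    x = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        y = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0:  # all symbols coincide
            break
        x = [v / norm for v in y]
    lo, hi = min(x), max(x)
    if hi == lo:
        return {s: 0.0 for s in symbols}
    # Min-max rescaling puts the dominant MDS coordinate on [0, 1]
    return {s: (v - lo) / (hi - lo) for s, v in zip(symbols, x)}


# Toy data (invented): one categorical attribute and a class label per row
colors = ["red", "red", "blue", "blue", "green", "green"]
classes = ["A", "A", "B", "B", "A", "B"]

symbols, dist = vdm_distances(colors, classes)
positions = linearize(symbols, dist)
print(positions)
```

Once each symbol has a position on [0, 1], the categorical column can be replaced by those numbers and passed unchanged to an algorithm such as k-means. The orientation of the axis (which end is 0 and which is 1) is arbitrary in MDS, so only the relative spacing of the symbols is meaningful.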