Symbolic Distance Measurements Based on Characteristic Subspaces
Main author: | |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | We introduce the subspace difference metric, a novel heterogeneous distance metric for calculating distances between points with both continuous and (unordered) categorical attributes. Our approach is based on computing and comparing characteristic subspaces (i.e. contexts) for each of the symbols and can be viewed as a generalization of the well-known value difference metric.
Subsequently, as one possible extension, we propose a linearization of the computed symbolic distances by multidimensional scaling, thereby mapping a set of symbols onto the interval [0, 1]. Thus, even algorithms originally designed for continuous attributes (e.g. clustering algorithms such as k-means) can be applied to datasets containing discrete attributes without adapting the algorithm itself.
Finally, we evaluate the proposed metric and the linearization in quantitative and qualitative settings and demonstrate their applicability in clustering domains. |
---|---|
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-540-39804-2_29 |
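
The summary names two building blocks that can be illustrated in a few lines: the value difference metric (the special case the paper generalizes) and a one-dimensional multidimensional scaling step that places symbols on [0, 1]. The sketch below is not the authors' subspace difference metric. It assumes the common class-conditional form of the value difference metric, sum over classes of |P(c|a) - P(c|b)|^q, and a classical MDS computed by power iteration; the function names `value_difference` and `linearize`, the parameter `q`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def value_difference(values, labels, q=1):
    """Pairwise value-difference distances between the symbols of one
    categorical attribute, based on class-conditional frequencies."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    classes = sorted(set(labels))
    symbols = sorted(counts)
    # P(class | symbol) for every symbol
    probs = {
        s: [counts[s][c] / sum(counts[s].values()) for c in classes]
        for s in symbols
    }
    dist = {
        (a, b): sum(abs(pa - pb) ** q for pa, pb in zip(probs[a], probs[b]))
        for a in symbols
        for b in symbols
    }
    return symbols, dist


def linearize(symbols, dist, iters=200):
    """Assumed 1-D classical MDS: double-center the squared distances,
    take the dominant eigenvector by power iteration, rescale to [0, 1].
    The orientation (which end is 0) is arbitrary."""
    n = len(symbols)
    d2 = [[dist[a, b] ** 2 for b in symbols] for a in symbols]
    row = [sum(r) / n for r in d2]
    tot = sum(row) / n
    b = [
        [-0.5 * (d2[i][j] - row[i] - row[j] + tot) for j in range(n)]
        for i in range(n)
    ]
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    lo, hi = min(v), max(v)
    span = hi - lo
    return {s: ((x - lo) / span if span else 0.0) for s, x in zip(symbols, v)}


# Toy data: symbol "a" always has class 0, "c" always class 1, "b" is mixed.
values = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
symbols, dist = value_difference(values, labels)
positions = linearize(symbols, dist)
```

With the symbols replaced by their values in `positions`, an algorithm written for continuous attributes can consume the column unchanged. Power iteration finds the eigenvalue of largest magnitude, so this shortcut is only reliable when the distances embed reasonably well on a line, as they do in the two-class toy data.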