Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction
Dimensionality reduction is crucial both for visualization and preprocessing high dimensional data for machine learning. We introduce a novel method based on a hierarchy built on 1-nearest neighbor graphs in the original space which is used to preserve the grouping properties of the data distributio...
Gespeichert in:
Hauptverfasser: | , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Dimensionality reduction is crucial both for visualization and preprocessing
high dimensional data for machine learning. We introduce a novel method based
on a hierarchy built on 1-nearest neighbor graphs in the original space which
is used to preserve the grouping properties of the data distribution on
multiple levels. The core of the proposal is an optimization-free projection
that is competitive with the latest versions of t-SNE and UMAP in performance
and visualization quality while being an order of magnitude faster in run-time.
Furthermore, its interpretable mechanics, the ability to project new data, and
the natural separation of data clusters in visualizations make it a general
purpose unsupervised dimension reduction technique. In the paper, we argue
about the soundness of the proposed method and evaluate it on a diverse
collection of datasets with sizes varying from 1K to 11M samples and dimensions
from 28 to 16K. We perform comparisons with other state-of-the-art methods on
multiple metrics and target dimensions highlighting its efficiency and
performance. Code is available at https://github.com/koulakis/h-nne |
---|---|
DOI: | 10.48550/arxiv.2203.12997 |