Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction
Dimensionality reduction is crucial both for visualization and preprocessing high dimensional data for machine learning. We introduce a novel method based on a hierarchy built on 1-nearest neighbor graphs in the original space which is used to preserve the grouping properties of the data distributio...
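Main authors: | Sarfraz, M. Saquib; Koulakis, Marios; Seibold, Constantin; Stiefelhagen, Rainer |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Data Structures and Algorithms; Computer Science - Graphics |
Online access: | Order full text |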
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Sarfraz, M. Saquib; Koulakis, Marios; Seibold, Constantin; Stiefelhagen, Rainer |
description | Dimensionality reduction is crucial both for visualization and for
preprocessing high-dimensional data for machine learning. We introduce a novel
method based on a hierarchy built on 1-nearest neighbor graphs in the original
space, which is used to preserve the grouping properties of the data
distribution on multiple levels. The core of the proposal is an
optimization-free projection that is competitive with the latest versions of
t-SNE and UMAP in performance and visualization quality while being an order of
magnitude faster in runtime. Furthermore, its interpretable mechanics, the
ability to project new data, and the natural separation of data clusters in
visualizations make it a general-purpose unsupervised dimension reduction
technique. In the paper, we argue for the soundness of the proposed method and
evaluate it on a diverse collection of datasets with sizes varying from 1K to
11M samples and dimensions from 28 to 16K. We compare it with other
state-of-the-art methods on multiple metrics and target dimensions,
highlighting its efficiency and performance. Code is available at
https://github.com/koulakis/h-nne |
doi_str_mv | 10.48550/arxiv.2203.12997 |
format | Article |
date | 2022-03-24 |
rights | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
oa | free_for_read |
linktorsrc | https://arxiv.org/abs/2203.12997 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2203.12997 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2203_12997 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Data Structures and Algorithms; Computer Science - Graphics |
title | Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T22%3A55%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hierarchical%20Nearest%20Neighbor%20Graph%20Embedding%20for%20Efficient%20Dimensionality%20Reduction&rft.au=Sarfraz,%20M.%20Saquib&rft.date=2022-03-24&rft_id=info:doi/10.48550/arxiv.2203.12997&rft_dat=%3Carxiv_GOX%3E2203_12997%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
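
The abstract describes a hierarchy built on 1-nearest neighbor graphs. The sketch below illustrates that general idea only and is not the authors' implementation (see the linked repository for that). It assumes that each level groups points by the connected components of their 1-nearest neighbor graph and that each group is then represented by its centroid on the next level; the centroid step and the function names `one_nn_partition` and `build_hierarchy` are assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def one_nn_partition(points):
    """Group points by the connected components of their 1-NN graph."""
    n = len(points)
    # k=2 because the closest hit for each point is normally the point itself.
    _, idx = cKDTree(points).query(points, k=2)
    nearest = idx[:, 1]
    # One edge per point: point i -> its nearest neighbor.
    graph = coo_matrix((np.ones(n), (np.arange(n), nearest)), shape=(n, n))
    n_groups, labels = connected_components(graph, directed=False)
    return n_groups, labels


def build_hierarchy(points):
    """Return one label array per level, from finest to coarsest.

    Assumption for this sketch: each group is summarized by its centroid,
    and the centroids become the points of the next level.
    """
    current = np.asarray(points, dtype=float)
    levels = []
    while len(current) > 1:
        n_groups, labels = one_nn_partition(current)
        if n_groups == len(current):
            # No merging happened (possible with duplicate points); stop.
            break
        levels.append(labels)
        if n_groups == 1:
            break
        # Average the members of each group to get the next level's points.
        sums = np.zeros((n_groups, current.shape[1]))
        np.add.at(sums, labels, current)
        counts = np.bincount(labels, minlength=n_groups)
        current = sums / counts[:, None]
    return levels


if __name__ == "__main__":
    # Two well-separated groups of three points each.
    demo = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    for level, labels in enumerate(build_hierarchy(demo)):
        print(level, labels)
```

Because every point links to its nearest neighbor, each component contains at least two points when all points are distinct, so the number of groups at least halves per level and the loop needs no tuning parameters.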