Revisiting Dimensionality Reduction Techniques for Visual Cluster Analysis: An Empirical Study

Published in: IEEE transactions on visualization and computer graphics 2022-01, Vol.28 (1), p.529-539
Main authors: Xia, Jiazhi; Zhang, Yuchen; Song, Jie; Chen, Yang; Wang, Yunhai; Liu, Shixia
Format: Article
Language: eng
container_end_page 539
container_issue 1
container_start_page 529
container_title IEEE transactions on visualization and computer graphics
container_volume 28
creator Xia, Jiazhi
Zhang, Yuchen
Song, Jie
Chen, Yang
Wang, Yunhai
Liu, Shixia
description Dimensionality Reduction (DR) techniques can generate 2D projections and enable visual exploration of cluster structures of high-dimensional datasets. However, different DR techniques yield different patterns, which significantly affects the performance of visual cluster analysis tasks. We present the results of a user study that investigates the influence of different DR techniques on visual cluster analysis. Our study focuses on the two property types of greatest concern, namely linearity and locality, and evaluates twelve representative DR techniques that cover these properties. Four controlled experiments were conducted to evaluate how the DR techniques facilitate the tasks of 1) cluster identification, 2) membership identification, 3) distance comparison, and 4) density comparison, respectively. We also evaluated users' subjective preference for the DR techniques regarding the quality of projected clusters. The results show that: 1) Non-linear and Local techniques are preferred in cluster identification and membership identification; 2) Linear techniques perform better than non-linear techniques in density comparison; 3) UMAP (Uniform Manifold Approximation and Projection) and t-SNE (t-Distributed Stochastic Neighbor Embedding) perform the best in cluster identification and membership identification; 4) NMF (Nonnegative Matrix Factorization) has competitive performance in distance comparison; 5) t-SNLE (t-Distributed Stochastic Neighbor Linear Embedding) has competitive performance in density comparison.
doi_str_mv 10.1109/TVCG.2021.3114694
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_2578150474</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9552226</ieee_id><sourcerecordid>2578150474</sourcerecordid><originalsourceid>FETCH-LOGICAL-c397t-1ca4816bff4f03e8f566dc387409fb073891bb8dcd248267620fdc65243ac1e23</originalsourceid><addsrcrecordid>eNpdkEtLxDAQgIMovn-ACFLw4qVrJu94k_UJgqCrR0ubJhrpY01aYf-9WXb14GlmmG-GmQ-hI8ATAKzPZ6_T2wnBBCYUgAnNNtAuaAY55lhsphxLmRNBxA7ai_ETY2BM6W20QxlXEgPfRW9P9ttHP_juPbvyre2i77uy8cMie7L1aIZUZjNrPjr_NdqYuT5krz6OZZNNmzEONmSXiV9EHy9Sll23cx-8Se3nYawXB2jLlU20h-u4j15urmfTu_zh8fZ-evmQG6rlkIMpmQJROcccplY5LkRtqJIMa1dhSZWGqlK1qQlTREhBsKuN4ITR0oAldB-drfbOQ7-8cyhaH41tmrKz_RgLwqUCjplkCT39h372Y0g_JEoApUITrhIFK8qEPsZgXTEPvi3DogBcLOUXS_nFUn6xlp9mTtabx6q19d_Er-0EHK8Ab639a2vOCSGC_gDNcYgj</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2613369258</pqid></control><display><type>article</type><title>Revisiting Dimensionality Reduction Techniques for Visual Cluster Analysis: An Empirical Study</title><source>IEEE Electronic Library (IEL)</source><creator>Xia, Jiazhi ; Zhang, Yuchen ; Song, Jie ; Chen, Yang ; Wang, Yunhai ; Liu, Shixia</creator><creatorcontrib>Xia, Jiazhi ; Zhang, Yuchen ; Song, Jie ; Chen, Yang ; Wang, Yunhai ; Liu, Shixia</creatorcontrib><description>Dimensionality Reduction (DR) techniques can generate 2D projections and enable visual exploration of cluster structures of high-dimensional datasets. However, different DR techniques would yield various patterns, which significantly affect the performance of visual cluster analysis tasks. We present the results of a user study that investigates the influence of different DR techniques on visual cluster analysis. 
Our study focuses on the most concerned property types, namely the linearity and locality, and evaluates twelve representative DR techniques that cover the concerned properties. Four controlled experiments were conducted to evaluate how the DR techniques facilitate the tasks of 1) cluster identification, 2) membership identification, 3) distance comparison, and 4) density comparison, respectively. We also evaluated users' subjective preference of the DR techniques regarding the quality of projected clusters. The results show that: 1) Non-linear and Local techniques are preferred in cluster identification and membership identification; 2) Linear techniques perform better than non-linear techniques in density comparison; 3) UMAP (Uniform Manifold Approximation and Projection) and t-SNE (t-Distributed Stochastic Neighbor Embedding) perform the best in cluster identification and membership identification; 4) NMF (Nonnegative Matrix Factorization) has competitive performance in distance comparison; 5) t-SNLE (t-Distributed Stochastic Neighbor Linear Embedding) has competitive performance in density comparison.</description><identifier>ISSN: 1077-2626</identifier><identifier>EISSN: 1941-0506</identifier><identifier>DOI: 10.1109/TVCG.2021.3114694</identifier><identifier>PMID: 34587015</identifier><identifier>CODEN: ITVGEA</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Cluster analysis ; Density ; Dimensionality reduction ; Embedding ; Empirical analysis ; Identification ; Linearity ; Manifolds ; Measurement ; perception-based evaluation ; Principal component analysis ; Reduction ; Task analysis ; visual cluster analysis ; Visual perception ; Visual tasks ; Visualization</subject><ispartof>IEEE transactions on visualization and computer graphics, 2022-01, Vol.28 (1), p.529-539</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c397t-1ca4816bff4f03e8f566dc387409fb073891bb8dcd248267620fdc65243ac1e23</citedby><cites>FETCH-LOGICAL-c397t-1ca4816bff4f03e8f566dc387409fb073891bb8dcd248267620fdc65243ac1e23</cites><orcidid>0000-0003-4629-6268</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9552226$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>315,781,785,797,27926,27927,54760</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9552226$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34587015$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Xia, Jiazhi</creatorcontrib><creatorcontrib>Zhang, Yuchen</creatorcontrib><creatorcontrib>Song, Jie</creatorcontrib><creatorcontrib>Chen, Yang</creatorcontrib><creatorcontrib>Wang, Yunhai</creatorcontrib><creatorcontrib>Liu, Shixia</creatorcontrib><title>Revisiting Dimensionality Reduction Techniques for Visual Cluster Analysis: An Empirical Study</title><title>IEEE transactions on visualization and computer graphics</title><addtitle>TVCG</addtitle><addtitle>IEEE Trans Vis Comput Graph</addtitle><description>Dimensionality Reduction (DR) techniques can generate 2D projections and enable visual exploration of cluster structures of high-dimensional datasets. However, different DR techniques would yield various patterns, which significantly affect the performance of visual cluster analysis tasks. We present the results of a user study that investigates the influence of different DR techniques on visual cluster analysis. 
Our study focuses on the most concerned property types, namely the linearity and locality, and evaluates twelve representative DR techniques that cover the concerned properties. Four controlled experiments were conducted to evaluate how the DR techniques facilitate the tasks of 1) cluster identification, 2) membership identification, 3) distance comparison, and 4) density comparison, respectively. We also evaluated users' subjective preference of the DR techniques regarding the quality of projected clusters. The results show that: 1) Non-linear and Local techniques are preferred in cluster identification and membership identification; 2) Linear techniques perform better than non-linear techniques in density comparison; 3) UMAP (Uniform Manifold Approximation and Projection) and t-SNE (t-Distributed Stochastic Neighbor Embedding) perform the best in cluster identification and membership identification; 4) NMF (Nonnegative Matrix Factorization) has competitive performance in distance comparison; 5) t-SNLE (t-Distributed Stochastic Neighbor Linear Embedding) has competitive performance in density comparison.</description><subject>Cluster analysis</subject><subject>Density</subject><subject>Dimensionality reduction</subject><subject>Embedding</subject><subject>Empirical analysis</subject><subject>Identification</subject><subject>Linearity</subject><subject>Manifolds</subject><subject>Measurement</subject><subject>perception-based evaluation</subject><subject>Principal component analysis</subject><subject>Reduction</subject><subject>Task analysis</subject><subject>visual cluster analysis</subject><subject>Visual perception</subject><subject>Visual 
tasks</subject><subject>Visualization</subject><issn>1077-2626</issn><issn>1941-0506</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkEtLxDAQgIMovn-ACFLw4qVrJu94k_UJgqCrR0ubJhrpY01aYf-9WXb14GlmmG-GmQ-hI8ATAKzPZ6_T2wnBBCYUgAnNNtAuaAY55lhsphxLmRNBxA7ai_ETY2BM6W20QxlXEgPfRW9P9ttHP_juPbvyre2i77uy8cMie7L1aIZUZjNrPjr_NdqYuT5krz6OZZNNmzEONmSXiV9EHy9Sll23cx-8Se3nYawXB2jLlU20h-u4j15urmfTu_zh8fZ-evmQG6rlkIMpmQJROcccplY5LkRtqJIMa1dhSZWGqlK1qQlTREhBsKuN4ITR0oAldB-drfbOQ7-8cyhaH41tmrKz_RgLwqUCjplkCT39h372Y0g_JEoApUITrhIFK8qEPsZgXTEPvi3DogBcLOUXS_nFUn6xlp9mTtabx6q19d_Er-0EHK8Ab639a2vOCSGC_gDNcYgj</recordid><startdate>202201</startdate><enddate>202201</enddate><creator>Xia, Jiazhi</creator><creator>Zhang, Yuchen</creator><creator>Song, Jie</creator><creator>Chen, Yang</creator><creator>Wang, Yunhai</creator><creator>Liu, Shixia</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-4629-6268</orcidid></search><sort><creationdate>202201</creationdate><title>Revisiting Dimensionality Reduction Techniques for Visual Cluster Analysis: An Empirical Study</title><author>Xia, Jiazhi ; Zhang, Yuchen ; Song, Jie ; Chen, Yang ; Wang, Yunhai ; Liu, Shixia</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c397t-1ca4816bff4f03e8f566dc387409fb073891bb8dcd248267620fdc65243ac1e23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Cluster analysis</topic><topic>Density</topic><topic>Dimensionality reduction</topic><topic>Embedding</topic><topic>Empirical analysis</topic><topic>Identification</topic><topic>Linearity</topic><topic>Manifolds</topic><topic>Measurement</topic><topic>perception-based evaluation</topic><topic>Principal component analysis</topic><topic>Reduction</topic><topic>Task analysis</topic><topic>visual cluster analysis</topic><topic>Visual perception</topic><topic>Visual tasks</topic><topic>Visualization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xia, Jiazhi</creatorcontrib><creatorcontrib>Zhang, Yuchen</creatorcontrib><creatorcontrib>Song, Jie</creatorcontrib><creatorcontrib>Chen, Yang</creatorcontrib><creatorcontrib>Wang, Yunhai</creatorcontrib><creatorcontrib>Liu, Shixia</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library 
(IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on visualization and computer graphics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Xia, Jiazhi</au><au>Zhang, Yuchen</au><au>Song, Jie</au><au>Chen, Yang</au><au>Wang, Yunhai</au><au>Liu, Shixia</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Revisiting Dimensionality Reduction Techniques for Visual Cluster Analysis: An Empirical Study</atitle><jtitle>IEEE transactions on visualization and computer graphics</jtitle><stitle>TVCG</stitle><addtitle>IEEE Trans Vis Comput Graph</addtitle><date>2022-01</date><risdate>2022</risdate><volume>28</volume><issue>1</issue><spage>529</spage><epage>539</epage><pages>529-539</pages><issn>1077-2626</issn><eissn>1941-0506</eissn><coden>ITVGEA</coden><abstract>Dimensionality Reduction (DR) techniques can generate 2D projections and enable visual exploration of cluster structures of high-dimensional datasets. However, different DR techniques would yield various patterns, which significantly affect the performance of visual cluster analysis tasks. We present the results of a user study that investigates the influence of different DR techniques on visual cluster analysis. 
Our study focuses on the most concerned property types, namely the linearity and locality, and evaluates twelve representative DR techniques that cover the concerned properties. Four controlled experiments were conducted to evaluate how the DR techniques facilitate the tasks of 1) cluster identification, 2) membership identification, 3) distance comparison, and 4) density comparison, respectively. We also evaluated users' subjective preference of the DR techniques regarding the quality of projected clusters. The results show that: 1) Non-linear and Local techniques are preferred in cluster identification and membership identification; 2) Linear techniques perform better than non-linear techniques in density comparison; 3) UMAP (Uniform Manifold Approximation and Projection) and t-SNE (t-Distributed Stochastic Neighbor Embedding) perform the best in cluster identification and membership identification; 4) NMF (Nonnegative Matrix Factorization) has competitive performance in distance comparison; 5) t-SNLE (t-Distributed Stochastic Neighbor Linear Embedding) has competitive performance in density comparison.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>34587015</pmid><doi>10.1109/TVCG.2021.3114694</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0003-4629-6268</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1077-2626
ispartof IEEE transactions on visualization and computer graphics, 2022-01, Vol.28 (1), p.529-539
issn 1077-2626
1941-0506
language eng
recordid cdi_proquest_miscellaneous_2578150474
source IEEE Electronic Library (IEL)
subjects Cluster analysis
Density
Dimensionality reduction
Embedding
Empirical analysis
Identification
Linearity
Manifolds
Measurement
perception-based evaluation
Principal component analysis
Reduction
Task analysis
visual cluster analysis
Visual perception
Visual tasks
Visualization
title Revisiting Dimensionality Reduction Techniques for Visual Cluster Analysis: An Empirical Study
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-18T06%3A48%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Revisiting%20Dimensionality%20Reduction%20Techniques%20for%20Visual%20Cluster%20Analysis:%20An%20Empirical%20Study&rft.jtitle=IEEE%20transactions%20on%20visualization%20and%20computer%20graphics&rft.au=Xia,%20Jiazhi&rft.date=2022-01&rft.volume=28&rft.issue=1&rft.spage=529&rft.epage=539&rft.pages=529-539&rft.issn=1077-2626&rft.eissn=1941-0506&rft.coden=ITVGEA&rft_id=info:doi/10.1109/TVCG.2021.3114694&rft_dat=%3Cproquest_RIE%3E2578150474%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2613369258&rft_id=info:pmid/34587015&rft_ieee_id=9552226&rfr_iscdi=true