Absolute Cluster Validity

The application of clustering involves the interpretation of objects placed in multi-dimensional spaces. The task of clustering itself is inherently subjective; the optimal solution can be extremely costly to discover and sometimes even unreachable or nonexistent. This fact introduces...

Detailed description

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence 2020-09, Vol.42 (9), p.2096-2112
Main authors: Iglesias, Felix; Zseby, Tanja; Zimek, Arthur
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 2112
container_issue 9
container_start_page 2096
container_title IEEE transactions on pattern analysis and machine intelligence
container_volume 42
creator Iglesias, Felix
Zseby, Tanja
Zimek, Arthur
description The application of clustering involves the interpretation of objects placed in multi-dimensional spaces. The task of clustering itself is inherently subjective; the optimal solution can be extremely costly to discover and sometimes even unreachable or nonexistent. This fact introduces a trade-off between accuracy and computational effort, all the more so given that engineering applications usually work well with suboptimal solutions. In such applied scenarios, cluster validation is mandatory to refine algorithms and ensure that solutions are meaningful. Validity indices are commonly intended to benchmark diverse clustering setups; therefore, they are coefficients of a relative nature, i.e., useful when compared to one another. In this paper, we propose a validation methodology that enables absolute evaluations of clustering results. Our method performs geometric measurements of the solution space and provides a coherent interpretation of the data structure by using indices based on inter- and intra-cluster distances, density, and multimodality within clusters. Conducted tests and comparisons with well-known indices show that our validation methodology improves the robustness of the clustering application for knowledge discovery. While clustering is often performed as a black-box technique, our index is construable and therefore allows for the implementation of systems enriched with self-checking capabilities.
doi_str_mv 10.1109/TPAMI.2019.2912970
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_pubmed_primary_31027043</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8695871</ieee_id><sourcerecordid>2431701883</sourcerecordid><originalsourceid>FETCH-LOGICAL-c351t-c3da0dcbe4cddc0f88de73f37372a7213c4753e7cd1ceabaaf79a2c6b5d951323</originalsourceid><addsrcrecordid>eNpdkE1PAjEQhhujEUR_gJoYEi9eFjudLW2PhPhBgtEDem26bTdZsrDY7h749xZBDl5mDvO8k5mHkBugIwCqHhcfk7fZiFFQI6aAKUFPSB8Uqgw5qlPSpzBmmZRM9shFjEtKIecUz0kPgTJBc-yT60kRm7pr_XBad7H1Yfhl6spV7faSnJWmjv7q0Afk8_lpMX3N5u8vs-lknlnk0KbqDHW28Ll1ztJSSucFlihQMCMYoM0FRy-sA-tNYUwplGF2XHCnOCDDAXnY792E5rvzsdWrKlpf12btmy5qxtIXKhWe0Pt_6LLpwjpdp1mOIChIiYlie8qGJsbgS70J1cqErQaqd-L0rzi9E6cP4lLo7rC6K1beHSN_phJwuwcq7_1xLMeKSwH4A3ZScAg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2431701883</pqid></control><display><type>article</type><title>Absolute Cluster Validity</title><source>IEEE Electronic Library (IEL)</source><creator>Iglesias, Felix ; Zseby, Tanja ; Zimek, Arthur</creator><creatorcontrib>Iglesias, Felix ; Zseby, Tanja ; Zimek, Arthur</creatorcontrib><description>The application of clustering involves the interpretation of objects placed in multi-dimensional spaces. The task of clustering itself is inherently submitted to subjectivity, the optimal solution can be extremely costly to discover and sometimes even unreachable or nonexistent. This fact introduces a trade-off between accuracy and computational effort, moreover given that engineering applications usually work well with suboptimal solutions. In such applied scenarios, cluster validation is mandatory to refine algorithms and ensure that solutions are meaningful. Validity indices are commonly intended to benchmark diverse clustering setups, therefore they are coefficients with a relative nature, i.e., useful when compared to one another. 
In this paper, we propose a validation methodology that enables absolute evaluations of clustering results. Our method performs geometric measurements of the solution space and provides a coherent interpretation of the data structure by using indices based on inter- and intra-cluster distances, density, and multimodality within clusters. Conducted tests and comparisons with well-known indices show that our validation methodology improves the robustness of the clustering application for knowledge discovery. While clustering is often performed as a black box technique, our index is construable and therefore allows for the implementation of systems enriched with self-checking capabilities.</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2019.2912970</identifier><identifier>PMID: 31027043</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Algorithms ; Autonomous systems ; Benchmark testing ; cluster validity ; Clustering ; Clustering algorithms ; Data structures ; Indexes ; Proposals ; Solution space ; Task analysis</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2020-09, Vol.42 (9), p.2096-2112</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c351t-c3da0dcbe4cddc0f88de73f37372a7213c4753e7cd1ceabaaf79a2c6b5d951323</citedby><cites>FETCH-LOGICAL-c351t-c3da0dcbe4cddc0f88de73f37372a7213c4753e7cd1ceabaaf79a2c6b5d951323</cites><orcidid>0000-0002-5391-467X ; 0000-0001-6081-969X ; 0000-0001-7713-4208</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8695871$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54736</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8695871$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31027043$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Iglesias, Felix</creatorcontrib><creatorcontrib>Zseby, Tanja</creatorcontrib><creatorcontrib>Zimek, Arthur</creatorcontrib><title>Absolute Cluster Validity</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>The application of clustering involves the interpretation of objects placed in multi-dimensional spaces. The task of clustering itself is inherently submitted to subjectivity, the optimal solution can be extremely costly to discover and sometimes even unreachable or nonexistent. This fact introduces a trade-off between accuracy and computational effort, moreover given that engineering applications usually work well with suboptimal solutions. In such applied scenarios, cluster validation is mandatory to refine algorithms and ensure that solutions are meaningful. 
Validity indices are commonly intended to benchmark diverse clustering setups, therefore they are coefficients with a relative nature, i.e., useful when compared to one another. In this paper, we propose a validation methodology that enables absolute evaluations of clustering results. Our method performs geometric measurements of the solution space and provides a coherent interpretation of the data structure by using indices based on inter- and intra-cluster distances, density, and multimodality within clusters. Conducted tests and comparisons with well-known indices show that our validation methodology improves the robustness of the clustering application for knowledge discovery. While clustering is often performed as a black box technique, our index is construable and therefore allows for the implementation of systems enriched with self-checking capabilities.</description><subject>Algorithms</subject><subject>Autonomous systems</subject><subject>Benchmark testing</subject><subject>cluster validity</subject><subject>Clustering</subject><subject>Clustering algorithms</subject><subject>Data structures</subject><subject>Indexes</subject><subject>Proposals</subject><subject>Solution space</subject><subject>Task analysis</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkE1PAjEQhhujEUR_gJoYEi9eFjudLW2PhPhBgtEDem26bTdZsrDY7h749xZBDl5mDvO8k5mHkBugIwCqHhcfk7fZiFFQI6aAKUFPSB8Uqgw5qlPSpzBmmZRM9shFjEtKIecUz0kPgTJBc-yT60kRm7pr_XBad7H1Yfhl6spV7faSnJWmjv7q0Afk8_lpMX3N5u8vs-lknlnk0KbqDHW28Ll1ztJSSucFlihQMCMYoM0FRy-sA-tNYUwplGF2XHCnOCDDAXnY792E5rvzsdWrKlpf12btmy5qxtIXKhWe0Pt_6LLpwjpdp1mOIChIiYlie8qGJsbgS70J1cqErQaqd-L0rzi9E6cP4lLo7rC6K1beHSN_phJwuwcq7_1xLMeKSwH4A3ZScAg</recordid><startdate>20200901</startdate><enddate>20200901</enddate><creator>Iglesias, Felix</creator><creator>Zseby, 
Tanja</creator><creator>Zimek, Arthur</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-5391-467X</orcidid><orcidid>https://orcid.org/0000-0001-6081-969X</orcidid><orcidid>https://orcid.org/0000-0001-7713-4208</orcidid></search><sort><creationdate>20200901</creationdate><title>Absolute Cluster Validity</title><author>Iglesias, Felix ; Zseby, Tanja ; Zimek, Arthur</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c351t-c3da0dcbe4cddc0f88de73f37372a7213c4753e7cd1ceabaaf79a2c6b5d951323</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Autonomous systems</topic><topic>Benchmark testing</topic><topic>cluster validity</topic><topic>Clustering</topic><topic>Clustering algorithms</topic><topic>Data structures</topic><topic>Indexes</topic><topic>Proposals</topic><topic>Solution space</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Iglesias, Felix</creatorcontrib><creatorcontrib>Zseby, Tanja</creatorcontrib><creatorcontrib>Zimek, Arthur</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Iglesias, Felix</au><au>Zseby, Tanja</au><au>Zimek, Arthur</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Absolute Cluster Validity</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2020-09-01</date><risdate>2020</risdate><volume>42</volume><issue>9</issue><spage>2096</spage><epage>2112</epage><pages>2096-2112</pages><issn>0162-8828</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>The application of clustering involves the interpretation of objects placed in multi-dimensional spaces. The task of clustering itself is inherently submitted to subjectivity, the optimal solution can be extremely costly to discover and sometimes even unreachable or nonexistent. This fact introduces a trade-off between accuracy and computational effort, moreover given that engineering applications usually work well with suboptimal solutions. In such applied scenarios, cluster validation is mandatory to refine algorithms and ensure that solutions are meaningful. Validity indices are commonly intended to benchmark diverse clustering setups, therefore they are coefficients with a relative nature, i.e., useful when compared to one another. In this paper, we propose a validation methodology that enables absolute evaluations of clustering results. 
Our method performs geometric measurements of the solution space and provides a coherent interpretation of the data structure by using indices based on inter- and intra-cluster distances, density, and multimodality within clusters. Conducted tests and comparisons with well-known indices show that our validation methodology improves the robustness of the clustering application for knowledge discovery. While clustering is often performed as a black box technique, our index is construable and therefore allows for the implementation of systems enriched with self-checking capabilities.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>31027043</pmid><doi>10.1109/TPAMI.2019.2912970</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0002-5391-467X</orcidid><orcidid>https://orcid.org/0000-0001-6081-969X</orcidid><orcidid>https://orcid.org/0000-0001-7713-4208</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0162-8828
ispartof IEEE transactions on pattern analysis and machine intelligence, 2020-09, Vol.42 (9), p.2096-2112
issn 0162-8828
1939-3539
2160-9292
language eng
recordid cdi_pubmed_primary_31027043
source IEEE Electronic Library (IEL)
subjects Algorithms
Autonomous systems
Benchmark testing
cluster validity
Clustering
Clustering algorithms
Data structures
Indexes
Proposals
Solution space
Task analysis
title Absolute Cluster Validity
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T01%3A10%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Absolute%20Cluster%20Validity&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Iglesias,%20Felix&rft.date=2020-09-01&rft.volume=42&rft.issue=9&rft.spage=2096&rft.epage=2112&rft.pages=2096-2112&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2019.2912970&rft_dat=%3Cproquest_RIE%3E2431701883%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2431701883&rft_id=info:pmid/31027043&rft_ieee_id=8695871&rfr_iscdi=true