Absolute Cluster Validity
The application of clustering involves the interpretation of objects placed in multi-dimensional spaces. The task of clustering is inherently subjective, and the optimal solution can be extremely costly to discover and is sometimes even unreachable or nonexistent. This fact introduces...
Published in: | IEEE transactions on pattern analysis and machine intelligence 2020-09, Vol.42 (9), p.2096-2112 |
---|---|
Main authors: | Iglesias, Felix; Zseby, Tanja; Zimek, Arthur |
Format: | Article |
Language: | eng |
Subjects: | Algorithms; Autonomous systems; Benchmark testing; cluster validity; Clustering; Clustering algorithms; Data structures; Indexes; Proposals; Solution space; Task analysis |

container_end_page | 2112 |
---|---|
container_issue | 9 |
container_start_page | 2096 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 42 |
creator | Iglesias, Felix Zseby, Tanja Zimek, Arthur |
description | The application of clustering involves the interpretation of objects placed in multi-dimensional spaces. The task of clustering is inherently subjective, and the optimal solution can be extremely costly to discover and is sometimes even unreachable or nonexistent. This introduces a trade-off between accuracy and computational effort, all the more so because engineering applications usually work well with suboptimal solutions. In such applied scenarios, cluster validation is mandatory to refine algorithms and ensure that solutions are meaningful. Validity indices are commonly intended to benchmark diverse clustering setups; they are therefore coefficients of a relative nature, i.e., useful when compared to one another. In this paper, we propose a validation methodology that enables absolute evaluations of clustering results. Our method performs geometric measurements of the solution space and provides a coherent interpretation of the data structure by using indices based on inter- and intra-cluster distances, density, and multimodality within clusters. The tests conducted and the comparisons with well-known indices show that our validation methodology improves the robustness of clustering applications for knowledge discovery. While clustering is often performed as a black-box technique, our index is construable and therefore allows for the implementation of systems enriched with self-checking capabilities. |
doi_str_mv | 10.1109/TPAMI.2019.2912970 |
format | Article |
addtitle | TPAMI; IEEE Trans Pattern Anal Mach Intell |
backlink | https://www.ncbi.nlm.nih.gov/pubmed/31027043 |
coden | ITPIDJ |
collection | IEEE All-Society Periodicals Package (ASPP) 2005-present; IEEE All-Society Periodicals Package (ASPP) 1998-Present; IEEE Electronic Library (IEL); PubMed; CrossRef; Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional; MEDLINE - Academic |
date | 2020-09-01 |
eissn | 1939-3539 2160-9292 |
ieee_id | 8695871 |
linktohtml | https://ieeexplore.ieee.org/document/8695871 |
orcid | 0000-0002-5391-467X 0000-0001-6081-969X 0000-0001-7713-4208 |
pmid | 31027043 |
publisher | United States: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
toplevel | peer_reviewed online_resources |
tpages | 17 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2020-09, Vol.42 (9), p.2096-2112 |
issn | 0162-8828 1939-3539 2160-9292 |
language | eng |
recordid | cdi_pubmed_primary_31027043 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms Autonomous systems Benchmark testing cluster validity Clustering Clustering algorithms Data structures Indexes Proposals Solution space Task analysis |
title | Absolute Cluster Validity |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T01%3A10%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Absolute%20Cluster%20Validity&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Iglesias,%20Felix&rft.date=2020-09-01&rft.volume=42&rft.issue=9&rft.spage=2096&rft.epage=2112&rft.pages=2096-2112&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2019.2912970&rft_dat=%3Cproquest_RIE%3E2431701883%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2431701883&rft_id=info:pmid/31027043&rft_ieee_id=8695871&rfr_iscdi=true |
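
The description field mentions indices based on inter- and intra-cluster distances. The record does not give the paper's formulas, so the following is only a minimal sketch of what such distance measurements can look like, not the index proposed by the authors. The function names `centroid` and `distance_summary`, the use of centroids, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist
from statistics import fmean


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    return tuple(fmean(axis) for axis in zip(*points))


def distance_summary(clusters):
    """Return (mean intra-cluster distance, minimum inter-centroid distance).

    `clusters` maps a label to a list of points (tuples of floats).
    Hypothetical helper for illustration; NOT the paper's validity index.
    """
    centroids = {label: centroid(pts) for label, pts in clusters.items()}
    # Intra-cluster: average distance of every point to its own centroid.
    intra = fmean(
        dist(p, centroids[label])
        for label, pts in clusters.items()
        for p in pts
    )
    # Inter-cluster: smallest distance between any pair of centroids.
    inter = min(dist(a, b) for a, b in combinations(centroids.values(), 2))
    return intra, inter


# Toy data (assumed): two compact, well-separated clusters in 2-D.
clusters = {
    "a": [(0.0, 0.0), (0.0, 2.0)],
    "b": [(10.0, 0.0), (10.0, 2.0)],
}
intra, inter = distance_summary(clusters)
print(intra, inter)
```

A small intra-cluster value together with a large inter-cluster value suggests compact, well-separated clusters. On their own such raw distances are only meaningful relative to other solutions, which is the limitation the abstract says the proposed methodology addresses.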