Hierarchical Data Generator based on Tree-Structured Stick Breaking Process for Benchmarking Clustering Methods
Object Cluster Hierarchies is a new variant of Hierarchical Cluster Analysis that is gaining interest in the field of Machine Learning. Because the field is still at an early stage of development, the lack of tools for systematic analysis of Object Cluster Hierarchies inhibits its further improvement. In this paper we...
Published in: | arXiv.org 2020-04 |
---|---|
Main authors: | Olech, Łukasz P ; Spytkowski, Michał ; Kwaśnicka, Halina ; Michalewicz, Zbigniew |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Olech, Łukasz P ; Spytkowski, Michał ; Kwaśnicka, Halina ; Michalewicz, Zbigniew |
description | Object Cluster Hierarchies is a new variant of Hierarchical Cluster Analysis that is gaining interest in the field of Machine Learning. Because the field is still at an early stage of development, the lack of tools for systematic analysis of Object Cluster Hierarchies inhibits its further improvement. In this paper we address this issue by proposing a generator of synthetic hierarchical data that can be used for benchmarking Object Cluster Hierarchy methods. The article presents a thorough empirical and theoretical analysis of the generator and provides guidance on how to control its parameters. The experiments conducted show the usefulness of the data generator, which is capable of producing a wide range of differently structured data. Further, benchmarking datasets that mirror the most common types of hierarchies are generated and made available to the public, together with the developed generator (http://kio.pwr.edu.pl/?page_id=396). |
format | Article |
publisher | Ithaca: Cornell University Library, arXiv.org |
date | 2020-04-04 |
rights | 2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2020-04 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2079225214 |
source | Free E- Journals |
subjects | Cluster analysis ; Clustering ; Hierarchies ; Machine learning ; Pressurized water reactors ; Structural hierarchy |
title | Hierarchical Data Generator based on Tree-Structured Stick Breaking Process for Benchmarking Clustering Methods |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T05%3A13%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Hierarchical%20Data%20Generator%20based%20on%20Tree-Structured%20Stick%20Breaking%20Process%20for%20Benchmarking%20Clustering%20Methods&rft.jtitle=arXiv.org&rft.au=Olech,%20%C5%81ukasz%20P&rft.date=2020-04-04&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2079225214%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2079225214&rft_id=info:pmid/&rfr_iscdi=true |