A hybrid similarity measure-based clustering approach for mixed attribute data
In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is d...
Published in: | International journal of machine learning and cybernetics 2024-04, Vol.15 (4), p.1295-1311 |
---|---|
Main authors: | Chu, Kexin; Zhang, Min; Xun, Yaling; Zhang, Jifu |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1311 |
---|---|
container_issue | 4 |
container_start_page | 1295 |
container_title | International journal of machine learning and cybernetics |
container_volume | 15 |
creator | Chu, Kexin Zhang, Min Xun, Yaling Zhang, Jifu |
description | In mixed attribute clustering, the superposition of similarity measures is skewed because different attribute types are measured differently. This paper proposes a new clustering approach for mixed attribute data based on a hybrid similarity measure. First, a hybrid similarity measure formula is defined using information entropy, which effectively reduces the similarity difference among attribute types and alleviates the skew of the similarity measure superposition. Second, a formula for the similarity mean of mixed attributes is defined; it describes the central tendency of the data distribution and can be used to merge clusters, so that similarity threshold parameters need not be set manually. Third, a novel clustering algorithm for mixed attributes is proposed using the hybrid similarity measure and an allocation strategy for boundary data objects. Finally, experiments on UCI, artificial, and stellar spectral data sets validate that the algorithm performs well in clustering quality, scalability, and noise resistance, and confirm the stability and effectiveness of the similarity mean. |
doi_str_mv | 10.1007/s13042-023-01968-6 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2942205528</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2941511255</sourcerecordid><originalsourceid>FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</originalsourceid><addsrcrecordid>eNqFkE1LxDAQhoMouOj-AU8Bz9XJV9Mcl8UvWPSi4C2kaepm2W5rkoL7781a0ZvOZQbmfWdeHoQuCFwRAHkdCQNOC6CsAKLKqiiP0IxUeaigej3-mSU5RfMYN5CrBMaAztDjAq_3dfANjr7zWxN82uPOmTgGV9Qmugbb7RiTC373hs0whN7YNW77gDv_kbcmpeDrMTncmGTO0UlrttHNv_sZerm9eV7eF6unu4flYlVYqqpUkJZJIkrhFJemkTmmAGkBKssarjghSrXUGUeFqa0FIx0B3irJOWXOUMbO0OV0N-d5H11MetOPYZdfaqo4pSAErf5REUEIFSKr6KSyoY8xuFYPwXcm7DUBfQCsJ8A6A9ZfgHWZTWwyxeFAxoXf03-4PgHaYnw8</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2941511255</pqid></control><display><type>article</type><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><source>Springer Nature - Complete Springer Journals</source><creator>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</creator><creatorcontrib>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</creatorcontrib><description>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. 
Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</description><identifier>ISSN: 1868-8071</identifier><identifier>EISSN: 1868-808X</identifier><identifier>DOI: 10.1007/s13042-023-01968-6</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Algorithms ; Artificial Intelligence ; Cluster analysis ; Clustering ; Complex Systems ; Computational Intelligence ; Control ; Datasets ; Engineering ; Entropy ; Entropy (Information theory) ; Mechatronics ; Original Article ; Pattern Recognition ; Prototypes ; Robotics ; Similarity ; Similarity measures ; Systems Biology</subject><ispartof>International journal of machine learning and cybernetics, 2024-04, Vol.15 (4), p.1295-1311</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</cites><orcidid>0000-0002-0396-8901</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s13042-023-01968-6$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s13042-023-01968-6$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51298</link.rule.ids></links><search><creatorcontrib>Chu, Kexin</creatorcontrib><creatorcontrib>Zhang, Min</creatorcontrib><creatorcontrib>Xun, Yaling</creatorcontrib><creatorcontrib>Zhang, Jifu</creatorcontrib><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><title>International journal of machine learning and cybernetics</title><addtitle>Int. J. Mach. Learn. & Cyber</addtitle><description>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. 
Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Cluster analysis</subject><subject>Clustering</subject><subject>Complex Systems</subject><subject>Computational Intelligence</subject><subject>Control</subject><subject>Datasets</subject><subject>Engineering</subject><subject>Entropy</subject><subject>Entropy (Information theory)</subject><subject>Mechatronics</subject><subject>Original Article</subject><subject>Pattern Recognition</subject><subject>Prototypes</subject><subject>Robotics</subject><subject>Similarity</subject><subject>Similarity measures</subject><subject>Systems Biology</subject><issn>1868-8071</issn><issn>1868-808X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNqFkE1LxDAQhoMouOj-AU8Bz9XJV9Mcl8UvWPSi4C2kaepm2W5rkoL7781a0ZvOZQbmfWdeHoQuCFwRAHkdCQNOC6CsAKLKqiiP0IxUeaigej3-mSU5RfMYN5CrBMaAztDjAq_3dfANjr7zWxN82uPOmTgGV9Qmugbb7RiTC373hs0whN7YNW77gDv_kbcmpeDrMTncmGTO0UlrttHNv_sZerm9eV7eF6unu4flYlVYqqpUkJZJIkrhFJemkTmmAGkBKssarjghSrXUGUeFqa0FIx0B3irJOWXOUMbO0OV0N-d5H11MetOPYZdfaqo4pSAErf5REUEIFSKr6KSyoY8xuFYPwXcm7DUBfQCsJ8A6A9ZfgHWZTWwyxeFAxoXf03-4PgHaYnw8</recordid><startdate>20240401</startdate><enddate>20240401</enddate><creator>Chu, 
Kexin</creator><creator>Zhang, Min</creator><creator>Xun, Yaling</creator><creator>Zhang, Jifu</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><orcidid>https://orcid.org/0000-0002-0396-8901</orcidid></search><sort><creationdate>20240401</creationdate><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><author>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Cluster analysis</topic><topic>Clustering</topic><topic>Complex Systems</topic><topic>Computational Intelligence</topic><topic>Control</topic><topic>Datasets</topic><topic>Engineering</topic><topic>Entropy</topic><topic>Entropy (Information theory)</topic><topic>Mechatronics</topic><topic>Original Article</topic><topic>Pattern Recognition</topic><topic>Prototypes</topic><topic>Robotics</topic><topic>Similarity</topic><topic>Similarity measures</topic><topic>Systems Biology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chu, Kexin</creatorcontrib><creatorcontrib>Zhang, Min</creatorcontrib><creatorcontrib>Xun, Yaling</creatorcontrib><creatorcontrib>Zhang, Jifu</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>International journal of machine learning and cybernetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chu, Kexin</au><au>Zhang, Min</au><au>Xun, Yaling</au><au>Zhang, 
Jifu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A hybrid similarity measure-based clustering approach for mixed attribute data</atitle><jtitle>International journal of machine learning and cybernetics</jtitle><stitle>Int. J. Mach. Learn. & Cyber</stitle><date>2024-04-01</date><risdate>2024</risdate><volume>15</volume><issue>4</issue><spage>1295</spage><epage>1311</epage><pages>1295-1311</pages><issn>1868-8071</issn><eissn>1868-808X</eissn><abstract>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s13042-023-01968-6</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0002-0396-8901</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1868-8071 |
ispartof | International journal of machine learning and cybernetics, 2024-04, Vol.15 (4), p.1295-1311 |
issn | 1868-8071 1868-808X |
language | eng |
recordid | cdi_proquest_journals_2942205528 |
source | Springer Nature - Complete Springer Journals |
subjects | Algorithms Artificial Intelligence Cluster analysis Clustering Complex Systems Computational Intelligence Control Datasets Engineering Entropy Entropy (Information theory) Mechatronics Original Article Pattern Recognition Prototypes Robotics Similarity Similarity measures Systems Biology |
title | A hybrid similarity measure-based clustering approach for mixed attribute data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T11%3A41%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20hybrid%20similarity%20measure-based%20clustering%20approach%20for%20mixed%20attribute%20data&rft.jtitle=International%20journal%20of%20machine%20learning%20and%20cybernetics&rft.au=Chu,%20Kexin&rft.date=2024-04-01&rft.volume=15&rft.issue=4&rft.spage=1295&rft.epage=1311&rft.pages=1295-1311&rft.issn=1868-8071&rft.eissn=1868-808X&rft_id=info:doi/10.1007/s13042-023-01968-6&rft_dat=%3Cproquest_cross%3E2941511255%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2941511255&rft_id=info:pmid/&rfr_iscdi=true |