A hybrid similarity measure-based clustering approach for mixed attribute data

In mixed attribute clustering, the superposition of similarity measures is skewed because different attribute types are measured differently. In this paper, a new clustering approach for mixed attribute data is proposed using a hybrid similarity measure. Firstly, a hybrid similarity measure formula is d...

Detailed description

Bibliographic details
Published in:	International journal of machine learning and cybernetics 2024-04, Vol.15 (4), p.1295-1311
Main authors:	Chu, Kexin; Zhang, Min; Xun, Yaling; Zhang, Jifu
Format:	Article
Language:	eng
Subjects:	Algorithms; Artificial Intelligence; Cluster analysis; Clustering; Complex Systems; Computational Intelligence; Control; Datasets; Engineering; Entropy; Entropy (Information theory); Mechatronics; Original Article; Pattern Recognition; Prototypes; Robotics; Similarity; Similarity measures; Systems Biology
Online access:	Full text

container_end_page	1311
container_issue	4
container_start_page	1295
container_title	International journal of machine learning and cybernetics
container_volume	15
creator	Chu, Kexin; Zhang, Min; Xun, Yaling; Zhang, Jifu
description	In mixed attribute clustering, the superposition of similarity measures is skewed because different attribute types are measured differently. In this paper, a new clustering approach for mixed attribute data is proposed using a hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using information entropy, so that the similarity difference among attribute types is effectively reduced and the skew of the similarity measure superposition is alleviated. Secondly, a formula for the similarity mean of mixed attributes is defined, which describes the central tendency of the data distribution and can be used effectively to merge clusters, so that manual setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering algorithm for mixed attributes is proposed using the hybrid similarity measure and an allocation strategy for boundary data objects. Finally, experimental results on UCI, artificial, and stellar spectral data sets validate that the algorithm performs well in clustering quality, scalability, and noise resistance, and confirm the stability and effectiveness of the similarity mean.
doi_str_mv	10.1007/s13042-023-01968-6
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2942205528</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2941511255</sourcerecordid><originalsourceid>FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</originalsourceid><addsrcrecordid>eNqFkE1LxDAQhoMouOj-AU8Bz9XJV9Mcl8UvWPSi4C2kaepm2W5rkoL7781a0ZvOZQbmfWdeHoQuCFwRAHkdCQNOC6CsAKLKqiiP0IxUeaigej3-mSU5RfMYN5CrBMaAztDjAq_3dfANjr7zWxN82uPOmTgGV9Qmugbb7RiTC373hs0whN7YNW77gDv_kbcmpeDrMTncmGTO0UlrttHNv_sZerm9eV7eF6unu4flYlVYqqpUkJZJIkrhFJemkTmmAGkBKssarjghSrXUGUeFqa0FIx0B3irJOWXOUMbO0OV0N-d5H11MetOPYZdfaqo4pSAErf5REUEIFSKr6KSyoY8xuFYPwXcm7DUBfQCsJ8A6A9ZfgHWZTWwyxeFAxoXf03-4PgHaYnw8</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2941511255</pqid></control><display><type>article</type><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><source>Springer Nature - Complete Springer Journals</source><creator>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</creator><creatorcontrib>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</creatorcontrib><description>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. 
Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</description><identifier>ISSN: 1868-8071</identifier><identifier>EISSN: 1868-808X</identifier><identifier>DOI: 10.1007/s13042-023-01968-6</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Algorithms ; Artificial Intelligence ; Cluster analysis ; Clustering ; Complex Systems ; Computational Intelligence ; Control ; Datasets ; Engineering ; Entropy ; Entropy (Information theory) ; Mechatronics ; Original Article ; Pattern Recognition ; Prototypes ; Robotics ; Similarity ; Similarity measures ; Systems Biology</subject><ispartof>International journal of machine learning and cybernetics, 2024-04, Vol.15 (4), p.1295-1311</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</cites><orcidid>0000-0002-0396-8901</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s13042-023-01968-6$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s13042-023-01968-6$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51298</link.rule.ids></links><search><creatorcontrib>Chu, Kexin</creatorcontrib><creatorcontrib>Zhang, Min</creatorcontrib><creatorcontrib>Xun, Yaling</creatorcontrib><creatorcontrib>Zhang, Jifu</creatorcontrib><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><title>International journal of machine learning and cybernetics</title><addtitle>Int. J. Mach. Learn. & Cyber</addtitle><description>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. 
Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Cluster analysis</subject><subject>Clustering</subject><subject>Complex Systems</subject><subject>Computational Intelligence</subject><subject>Control</subject><subject>Datasets</subject><subject>Engineering</subject><subject>Entropy</subject><subject>Entropy (Information theory)</subject><subject>Mechatronics</subject><subject>Original Article</subject><subject>Pattern Recognition</subject><subject>Prototypes</subject><subject>Robotics</subject><subject>Similarity</subject><subject>Similarity measures</subject><subject>Systems Biology</subject><issn>1868-8071</issn><issn>1868-808X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNqFkE1LxDAQhoMouOj-AU8Bz9XJV9Mcl8UvWPSi4C2kaepm2W5rkoL7781a0ZvOZQbmfWdeHoQuCFwRAHkdCQNOC6CsAKLKqiiP0IxUeaigej3-mSU5RfMYN5CrBMaAztDjAq_3dfANjr7zWxN82uPOmTgGV9Qmugbb7RiTC373hs0whN7YNW77gDv_kbcmpeDrMTncmGTO0UlrttHNv_sZerm9eV7eF6unu4flYlVYqqpUkJZJIkrhFJemkTmmAGkBKssarjghSrXUGUeFqa0FIx0B3irJOWXOUMbO0OV0N-d5H11MetOPYZdfaqo4pSAErf5REUEIFSKr6KSyoY8xuFYPwXcm7DUBfQCsJ8A6A9ZfgHWZTWwyxeFAxoXf03-4PgHaYnw8</recordid><startdate>20240401</startdate><enddate>20240401</enddate><creator>Chu, 
Kexin</creator><creator>Zhang, Min</creator><creator>Xun, Yaling</creator><creator>Zhang, Jifu</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><orcidid>https://orcid.org/0000-0002-0396-8901</orcidid></search><sort><creationdate>20240401</creationdate><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><author>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Cluster analysis</topic><topic>Clustering</topic><topic>Complex Systems</topic><topic>Computational Intelligence</topic><topic>Control</topic><topic>Datasets</topic><topic>Engineering</topic><topic>Entropy</topic><topic>Entropy (Information theory)</topic><topic>Mechatronics</topic><topic>Original Article</topic><topic>Pattern Recognition</topic><topic>Prototypes</topic><topic>Robotics</topic><topic>Similarity</topic><topic>Similarity measures</topic><topic>Systems Biology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chu, Kexin</creatorcontrib><creatorcontrib>Zhang, Min</creatorcontrib><creatorcontrib>Xun, Yaling</creatorcontrib><creatorcontrib>Zhang, Jifu</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>International journal of machine learning and cybernetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chu, Kexin</au><au>Zhang, Min</au><au>Xun, Yaling</au><au>Zhang, 
Jifu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A hybrid similarity measure-based clustering approach for mixed attribute data</atitle><jtitle>International journal of machine learning and cybernetics</jtitle><stitle>Int. J. Mach. Learn. & Cyber</stitle><date>2024-04-01</date><risdate>2024</risdate><volume>15</volume><issue>4</issue><spage>1295</spage><epage>1311</epage><pages>1295-1311</pages><issn>1868-8071</issn><eissn>1868-808X</eissn><abstract>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s13042-023-01968-6</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0002-0396-8901</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 1868-8071
ispartof	International journal of machine learning and cybernetics, 2024-04, Vol.15 (4), p.1295-1311
issn	1868-8071 1868-808X
language	eng
recordid	cdi_proquest_journals_2942205528
source	Springer Nature - Complete Springer Journals
subjects	Algorithms; Artificial Intelligence; Cluster analysis; Clustering; Complex Systems; Computational Intelligence; Control; Datasets; Engineering; Entropy; Entropy (Information theory); Mechatronics; Original Article; Pattern Recognition; Prototypes; Robotics; Similarity; Similarity measures; Systems Biology
title	A hybrid similarity measure-based clustering approach for mixed attribute data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T11%3A41%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20hybrid%20similarity%20measure-based%20clustering%20approach%20for%20mixed%20attribute%20data&rft.jtitle=International%20journal%20of%20machine%20learning%20and%20cybernetics&rft.au=Chu,%20Kexin&rft.date=2024-04-01&rft.volume=15&rft.issue=4&rft.spage=1295&rft.epage=1311&rft.pages=1295-1311&rft.issn=1868-8071&rft.eissn=1868-808X&rft_id=info:doi/10.1007/s13042-023-01968-6&rft_dat=%3Cproquest_cross%3E2941511255%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2941511255&rft_id=info:pmid/&rfr_iscdi=true