A hybrid similarity measure-based clustering approach for mixed attribute data

In mixed attribute clustering, the superposition of similarity measures is skewed because different attribute types are measured differently. In this paper, a new clustering approach for mixed attribute data is proposed using a hybrid similarity measure. Firstly, a hybrid similarity measure formula is d...

Bibliographic details
Published in: International journal of machine learning and cybernetics 2024-04, Vol.15 (4), p.1295-1311
Main authors: Chu, Kexin, Zhang, Min, Xun, Yaling, Zhang, Jifu
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1311
container_issue 4
container_start_page 1295
container_title International journal of machine learning and cybernetics
container_volume 15
creator Chu, Kexin
Zhang, Min
Xun, Yaling
Zhang, Jifu
description In mixed attribute clustering, the superposition of similarity measures is skewed because different attribute types are measured differently. In this paper, a new clustering approach for mixed attribute data is proposed using a hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using information entropy, so that the similarity difference among attribute types is effectively reduced and the skew of the similarity measure superposition is alleviated. Secondly, a formula for the similarity mean of mixed attributes is defined, which describes the central tendency of the data distribution and can be used effectively to merge clusters. Thus, manual setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering algorithm for mixed attributes is proposed using the hybrid similarity measure and an allocation strategy for boundary data objects. Finally, experimental results on UCI, artificial, and stellar spectral data sets validate that the algorithm performs well in clustering quality, scalability, and noise resistance, and confirm the stability and effectiveness of the similarity mean.
doi_str_mv 10.1007/s13042-023-01968-6
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2942205528</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2941511255</sourcerecordid><originalsourceid>FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</originalsourceid><addsrcrecordid>eNqFkE1LxDAQhoMouOj-AU8Bz9XJV9Mcl8UvWPSi4C2kaepm2W5rkoL7781a0ZvOZQbmfWdeHoQuCFwRAHkdCQNOC6CsAKLKqiiP0IxUeaigej3-mSU5RfMYN5CrBMaAztDjAq_3dfANjr7zWxN82uPOmTgGV9Qmugbb7RiTC373hs0whN7YNW77gDv_kbcmpeDrMTncmGTO0UlrttHNv_sZerm9eV7eF6unu4flYlVYqqpUkJZJIkrhFJemkTmmAGkBKssarjghSrXUGUeFqa0FIx0B3irJOWXOUMbO0OV0N-d5H11MetOPYZdfaqo4pSAErf5REUEIFSKr6KSyoY8xuFYPwXcm7DUBfQCsJ8A6A9ZfgHWZTWwyxeFAxoXf03-4PgHaYnw8</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2941511255</pqid></control><display><type>article</type><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><source>Springer Nature - Complete Springer Journals</source><creator>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</creator><creatorcontrib>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</creatorcontrib><description>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. 
Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</description><identifier>ISSN: 1868-8071</identifier><identifier>EISSN: 1868-808X</identifier><identifier>DOI: 10.1007/s13042-023-01968-6</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Algorithms ; Artificial Intelligence ; Cluster analysis ; Clustering ; Complex Systems ; Computational Intelligence ; Control ; Datasets ; Engineering ; Entropy ; Entropy (Information theory) ; Mechatronics ; Original Article ; Pattern Recognition ; Prototypes ; Robotics ; Similarity ; Similarity measures ; Systems Biology</subject><ispartof>International journal of machine learning and cybernetics, 2024-04, Vol.15 (4), p.1295-1311</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</cites><orcidid>0000-0002-0396-8901</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s13042-023-01968-6$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s13042-023-01968-6$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51298</link.rule.ids></links><search><creatorcontrib>Chu, Kexin</creatorcontrib><creatorcontrib>Zhang, Min</creatorcontrib><creatorcontrib>Xun, Yaling</creatorcontrib><creatorcontrib>Zhang, Jifu</creatorcontrib><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><title>International journal of machine learning and cybernetics</title><addtitle>Int. J. Mach. Learn. &amp; Cyber</addtitle><description>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. 
Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Cluster analysis</subject><subject>Clustering</subject><subject>Complex Systems</subject><subject>Computational Intelligence</subject><subject>Control</subject><subject>Datasets</subject><subject>Engineering</subject><subject>Entropy</subject><subject>Entropy (Information theory)</subject><subject>Mechatronics</subject><subject>Original Article</subject><subject>Pattern Recognition</subject><subject>Prototypes</subject><subject>Robotics</subject><subject>Similarity</subject><subject>Similarity measures</subject><subject>Systems Biology</subject><issn>1868-8071</issn><issn>1868-808X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNqFkE1LxDAQhoMouOj-AU8Bz9XJV9Mcl8UvWPSi4C2kaepm2W5rkoL7781a0ZvOZQbmfWdeHoQuCFwRAHkdCQNOC6CsAKLKqiiP0IxUeaigej3-mSU5RfMYN5CrBMaAztDjAq_3dfANjr7zWxN82uPOmTgGV9Qmugbb7RiTC373hs0whN7YNW77gDv_kbcmpeDrMTncmGTO0UlrttHNv_sZerm9eV7eF6unu4flYlVYqqpUkJZJIkrhFJemkTmmAGkBKssarjghSrXUGUeFqa0FIx0B3irJOWXOUMbO0OV0N-d5H11MetOPYZdfaqo4pSAErf5REUEIFSKr6KSyoY8xuFYPwXcm7DUBfQCsJ8A6A9ZfgHWZTWwyxeFAxoXf03-4PgHaYnw8</recordid><startdate>20240401</startdate><enddate>20240401</enddate><creator>Chu, 
Kexin</creator><creator>Zhang, Min</creator><creator>Xun, Yaling</creator><creator>Zhang, Jifu</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><orcidid>https://orcid.org/0000-0002-0396-8901</orcidid></search><sort><creationdate>20240401</creationdate><title>A hybrid similarity measure-based clustering approach for mixed attribute data</title><author>Chu, Kexin ; Zhang, Min ; Xun, Yaling ; Zhang, Jifu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c298t-1f371565e947ad7808507c008c3d4941199f2eae25abcc0a7e104f974423ea233</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Cluster analysis</topic><topic>Clustering</topic><topic>Complex Systems</topic><topic>Computational Intelligence</topic><topic>Control</topic><topic>Datasets</topic><topic>Engineering</topic><topic>Entropy</topic><topic>Entropy (Information theory)</topic><topic>Mechatronics</topic><topic>Original Article</topic><topic>Pattern Recognition</topic><topic>Prototypes</topic><topic>Robotics</topic><topic>Similarity</topic><topic>Similarity measures</topic><topic>Systems Biology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chu, Kexin</creatorcontrib><creatorcontrib>Zhang, Min</creatorcontrib><creatorcontrib>Xun, Yaling</creatorcontrib><creatorcontrib>Zhang, Jifu</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>International journal of machine learning and cybernetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chu, Kexin</au><au>Zhang, Min</au><au>Xun, Yaling</au><au>Zhang, 
Jifu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A hybrid similarity measure-based clustering approach for mixed attribute data</atitle><jtitle>International journal of machine learning and cybernetics</jtitle><stitle>Int. J. Mach. Learn. &amp; Cyber</stitle><date>2024-04-01</date><risdate>2024</risdate><volume>15</volume><issue>4</issue><spage>1295</spage><epage>1311</epage><pages>1295-1311</pages><issn>1868-8071</issn><eissn>1868-808X</eissn><abstract>In mixed attribute clustering, the similarity measure superposition is skewed due to the difference of measuring different attribute types. In this paper, a new clustering approach for mixed attribute data is proposed using hybrid similarity measure. Firstly, a hybrid similarity measure formula is defined using the information entropy, therefore the similarity difference among various attribute types is effectively reduced, and the inclination of similarity measure superposition is alleviated. Secondly, a calculation formula of similarity mean for mixed attributes is defined, which can describe the centralized trend of data distribution, and can be effectively used to merge of clustering clusters. Thus, artificial setting of similarity threshold parameters can be avoided. Thirdly, a novel clustering analysis algorithm for mixed attributes is proposed using hybrid similarity measure and allocation strategy of boundary data objects. In the end, experimental results validate that the algorithm performs well on clustering effect, scalability and anti-noise, as well as the stability and effectiveness of the similarity mean by using UCI, artificial data sets and stellar spectral data sets.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s13042-023-01968-6</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0002-0396-8901</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1868-8071
ispartof International journal of machine learning and cybernetics, 2024-04, Vol.15 (4), p.1295-1311
issn 1868-8071
1868-808X
language eng
recordid cdi_proquest_journals_2942205528
source Springer Nature - Complete Springer Journals
subjects Algorithms
Artificial Intelligence
Cluster analysis
Clustering
Complex Systems
Computational Intelligence
Control
Datasets
Engineering
Entropy
Entropy (Information theory)
Mechatronics
Original Article
Pattern Recognition
Prototypes
Robotics
Similarity
Similarity measures
Systems Biology
title A hybrid similarity measure-based clustering approach for mixed attribute data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T11%3A41%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20hybrid%20similarity%20measure-based%20clustering%20approach%20for%20mixed%20attribute%20data&rft.jtitle=International%20journal%20of%20machine%20learning%20and%20cybernetics&rft.au=Chu,%20Kexin&rft.date=2024-04-01&rft.volume=15&rft.issue=4&rft.spage=1295&rft.epage=1311&rft.pages=1295-1311&rft.issn=1868-8071&rft.eissn=1868-808X&rft_id=info:doi/10.1007/s13042-023-01968-6&rft_dat=%3Cproquest_cross%3E2941511255%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2941511255&rft_id=info:pmid/&rfr_iscdi=true