A Fast Hybrid Feature Selection Based on Correlation-Guided Clustering and Particle Swarm Optimization for High-Dimensional Data

The "curse of dimensionality" and the high computational cost have still limited the application of the evolutionary algorithm in high-dimensional feature selection (FS) problems. This article proposes a new three-phase hybrid FS algorithm based on correlation-guided clustering and particl...

Bibliographic Details
Published in: IEEE transactions on cybernetics 2022-09, Vol.52 (9), p.9573-9586
Main authors: Song, Xian-Fang; Zhang, Yong; Gong, Dun-Wei; Gao, Xiao-Zhi
Format: Article
Language: eng
container_end_page 9586
container_issue 9
container_start_page 9573
container_title IEEE transactions on cybernetics
container_volume 52
creator Song, Xian-Fang
Zhang, Yong
Gong, Dun-Wei
Gao, Xiao-Zhi
description The "curse of dimensionality" and high computational cost still limit the application of evolutionary algorithms to high-dimensional feature selection (FS) problems. This article proposes a new three-phase hybrid FS algorithm based on correlation-guided clustering and particle swarm optimization (PSO), called HFS-C-P, to tackle both problems at the same time. To this end, three kinds of FS methods are integrated into the proposed algorithm, each used for its respective advantages. In the first and second phases, a filter FS method and a feature clustering-based method, both with low computational cost, reduce the search space used by the third phase. The third phase then searches for an optimal feature subset using an evolutionary algorithm with global search capability. Moreover, a symmetric uncertainty-based feature deletion method, a fast correlation-guided feature clustering strategy, and an improved integer PSO are developed to improve the performance of the three phases, respectively. Finally, the proposed algorithm is validated on 18 publicly available real-world datasets in comparison with nine FS algorithms. Experimental results show that the proposed algorithm can obtain a good feature subset with the lowest computational cost.
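The abstract names a symmetric uncertainty-based feature deletion step but gives no formulas or thresholds. The sketch below only illustrates the standard definition of symmetric uncertainty for discrete variables, SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), and a plain threshold filter built on it. The function names, the threshold value, and the filtering rule are assumptions made for illustration and are not taken from the article; the authors' actual deletion method may differ.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Standard SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant; treat as uninformative
    joint = entropy(list(zip(x, y)))  # H(X, Y)
    mutual_info = hx + hy - joint     # I(X; Y)
    return 2.0 * mutual_info / (hx + hy)


def filter_features(columns, labels, threshold):
    """Hypothetical filter: keep indices of features whose SU with the
    class label exceeds `threshold` (an assumed rule, not the article's)."""
    return [
        j for j, col in enumerate(columns)
        if symmetric_uncertainty(col, labels) > threshold
    ]


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    columns = [
        [0, 0, 1, 1],  # identical to the label
        [0, 1, 0, 1],  # independent of the label
        [0, 0, 0, 0],  # constant feature
    ]
    print(filter_features(columns, labels, threshold=0.1))
```

In this toy input only the first feature carries information about the label, so it is the only one retained. Continuous features would need to be discretized before this definition applies.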
doi_str_mv 10.1109/TCYB.2021.3061152
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2704098426</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9380778</ieee_id><sourcerecordid>2502807689</sourcerecordid><originalsourceid>FETCH-LOGICAL-c349t-add9060482f111b1bd102424a69535f5a5e55a737c460817421a847c619a24373</originalsourceid><addsrcrecordid>eNpdkU1r3DAQhkVpaUKaH1AKRdBLL97qW9YxcbLZQiCBpIeezKw9ThX8sZVkSnLKT4-2u91DddHw6nmHGb2EfORswTlz3-6rn-cLwQRfSGY41-INORbclIUQVr891MYekdMYH1k-ZZZc-Z4cSWmFc9Yck5czuoSY6OppHXxLlwhpDkjvsMcm-Wmk5xCxpbmophCwh61YXM2-zWrVzzFh8OMDhbGltxCSb_rs_gNhoDeb5Af__NdBuynQlX_4VVz4AceYJejpBST4QN510Ec83d8n5Mfy8r5aFdc3V9-rs-uikcqlAtrWMcNUKTrO-ZqvW86EEgqM01J3GjRqDVbaRpm8p1WCQ6lsY7gDoaSVJ-Trru8mTL9njKkefGyw72HEaY610EyUzOYPyuiX_9DHaQ554ExZppgrlTCZ4juqCVOMAbt6E_wA4anmrN4mVG8TqrcJ1fuEsufzvvO8HrA9OP7lkYFPO8Aj4uHZyTyZLeUrHQaS3g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2704098426</pqid></control><display><type>article</type><title>A Fast Hybrid Feature Selection Based on Correlation-Guided Clustering and Particle Swarm Optimization for High-Dimensional Data</title><source>IEEE Electronic Library (IEL)</source><creator>Song, Xian-Fang ; Zhang, Yong ; Gong, Dun-Wei ; Gao, Xiao-Zhi</creator><creatorcontrib>Song, Xian-Fang ; Zhang, Yong ; Gong, Dun-Wei ; Gao, Xiao-Zhi</creatorcontrib><description>The "curse of dimensionality" and the high computational cost have still limited the application of the evolutionary algorithm in high-dimensional feature selection (FS) problems. This article proposes a new three-phase hybrid FS algorithm based on correlation-guided clustering and particle swarm optimization (PSO) (HFS-C-P) to tackle the above two problems at the same time. To this end, three kinds of FS methods are effectively integrated into the proposed algorithm based on their respective advantages. 
In the first and second phases, a filter FS method and a feature clustering-based method with low computational cost are designed to reduce the search space used by the third phase. After that, the third phase applies oneself to finding an optimal feature subset by using an evolutionary algorithm with the global searchability. Moreover, a symmetric uncertainty-based feature deletion method, a fast correlation-guided feature clustering strategy, and an improved integer PSO are developed to improve the performance of the three phases, respectively. Finally, the proposed algorithm is validated on 18 publicly available real-world datasets in comparison with nine FS algorithms. Experimental results show that the proposed algorithm can obtain a good feature subset with the lowest computational cost.</description><identifier>ISSN: 2168-2267</identifier><identifier>EISSN: 2168-2275</identifier><identifier>DOI: 10.1109/TCYB.2021.3061152</identifier><identifier>PMID: 33729976</identifier><identifier>CODEN: ITCEB8</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Clustering ; Clustering algorithms ; Computational efficiency ; Computing costs ; Convergence ; Correlation ; Evolutionary algorithms ; Feature extraction ; Feature selection ; feature selection (FS) ; Genetic algorithms ; hybrid search ; Mutual information ; Particle swarm optimization ; particle swarm optimization (PSO) ; Search problems</subject><ispartof>IEEE transactions on cybernetics, 2022-09, Vol.52 (9), p.9573-9586</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c349t-add9060482f111b1bd102424a69535f5a5e55a737c460817421a847c619a24373</citedby><cites>FETCH-LOGICAL-c349t-add9060482f111b1bd102424a69535f5a5e55a737c460817421a847c619a24373</cites><orcidid>0000-0003-0026-8181 ; 0000-0002-0078-5675 ; 0000-0003-2838-4301</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9380778$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9380778$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33729976$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Song, Xian-Fang</creatorcontrib><creatorcontrib>Zhang, Yong</creatorcontrib><creatorcontrib>Gong, Dun-Wei</creatorcontrib><creatorcontrib>Gao, Xiao-Zhi</creatorcontrib><title>A Fast Hybrid Feature Selection Based on Correlation-Guided Clustering and Particle Swarm Optimization for High-Dimensional Data</title><title>IEEE transactions on cybernetics</title><addtitle>TCYB</addtitle><addtitle>IEEE Trans Cybern</addtitle><description>The "curse of dimensionality" and the high computational cost have still limited the application of the evolutionary algorithm in high-dimensional feature selection (FS) problems. This article proposes a new three-phase hybrid FS algorithm based on correlation-guided clustering and particle swarm optimization (PSO) (HFS-C-P) to tackle the above two problems at the same time. To this end, three kinds of FS methods are effectively integrated into the proposed algorithm based on their respective advantages. 
In the first and second phases, a filter FS method and a feature clustering-based method with low computational cost are designed to reduce the search space used by the third phase. After that, the third phase applies oneself to finding an optimal feature subset by using an evolutionary algorithm with the global searchability. Moreover, a symmetric uncertainty-based feature deletion method, a fast correlation-guided feature clustering strategy, and an improved integer PSO are developed to improve the performance of the three phases, respectively. Finally, the proposed algorithm is validated on 18 publicly available real-world datasets in comparison with nine FS algorithms. Experimental results show that the proposed algorithm can obtain a good feature subset with the lowest computational cost.</description><subject>Clustering</subject><subject>Clustering algorithms</subject><subject>Computational efficiency</subject><subject>Computing costs</subject><subject>Convergence</subject><subject>Correlation</subject><subject>Evolutionary algorithms</subject><subject>Feature extraction</subject><subject>Feature selection</subject><subject>feature selection (FS)</subject><subject>Genetic algorithms</subject><subject>hybrid search</subject><subject>Mutual information</subject><subject>Particle swarm optimization</subject><subject>particle swarm optimization (PSO)</subject><subject>Search 
problems</subject><issn>2168-2267</issn><issn>2168-2275</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkU1r3DAQhkVpaUKaH1AKRdBLL97qW9YxcbLZQiCBpIeezKw9ThX8sZVkSnLKT4-2u91DddHw6nmHGb2EfORswTlz3-6rn-cLwQRfSGY41-INORbclIUQVr891MYekdMYH1k-ZZZc-Z4cSWmFc9Yck5czuoSY6OppHXxLlwhpDkjvsMcm-Wmk5xCxpbmophCwh61YXM2-zWrVzzFh8OMDhbGltxCSb_rs_gNhoDeb5Af__NdBuynQlX_4VVz4AceYJejpBST4QN510Ec83d8n5Mfy8r5aFdc3V9-rs-uikcqlAtrWMcNUKTrO-ZqvW86EEgqM01J3GjRqDVbaRpm8p1WCQ6lsY7gDoaSVJ-Trru8mTL9njKkefGyw72HEaY610EyUzOYPyuiX_9DHaQ554ExZppgrlTCZ4juqCVOMAbt6E_wA4anmrN4mVG8TqrcJ1fuEsufzvvO8HrA9OP7lkYFPO8Aj4uHZyTyZLeUrHQaS3g</recordid><startdate>20220901</startdate><enddate>20220901</enddate><creator>Song, Xian-Fang</creator><creator>Zhang, Yong</creator><creator>Gong, Dun-Wei</creator><creator>Gao, Xiao-Zhi</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7TB</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-0026-8181</orcidid><orcidid>https://orcid.org/0000-0002-0078-5675</orcidid><orcidid>https://orcid.org/0000-0003-2838-4301</orcidid></search><sort><creationdate>20220901</creationdate><title>A Fast Hybrid Feature Selection Based on Correlation-Guided Clustering and Particle Swarm Optimization for High-Dimensional Data</title><author>Song, Xian-Fang ; Zhang, Yong ; Gong, Dun-Wei ; Gao, 
Xiao-Zhi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c349t-add9060482f111b1bd102424a69535f5a5e55a737c460817421a847c619a24373</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Clustering</topic><topic>Clustering algorithms</topic><topic>Computational efficiency</topic><topic>Computing costs</topic><topic>Convergence</topic><topic>Correlation</topic><topic>Evolutionary algorithms</topic><topic>Feature extraction</topic><topic>Feature selection</topic><topic>feature selection (FS)</topic><topic>Genetic algorithms</topic><topic>hybrid search</topic><topic>Mutual information</topic><topic>Particle swarm optimization</topic><topic>particle swarm optimization (PSO)</topic><topic>Search problems</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Song, Xian-Fang</creatorcontrib><creatorcontrib>Zhang, Yong</creatorcontrib><creatorcontrib>Gong, Dun-Wei</creatorcontrib><creatorcontrib>Gao, Xiao-Zhi</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – 
Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on cybernetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Song, Xian-Fang</au><au>Zhang, Yong</au><au>Gong, Dun-Wei</au><au>Gao, Xiao-Zhi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Fast Hybrid Feature Selection Based on Correlation-Guided Clustering and Particle Swarm Optimization for High-Dimensional Data</atitle><jtitle>IEEE transactions on cybernetics</jtitle><stitle>TCYB</stitle><addtitle>IEEE Trans Cybern</addtitle><date>2022-09-01</date><risdate>2022</risdate><volume>52</volume><issue>9</issue><spage>9573</spage><epage>9586</epage><pages>9573-9586</pages><issn>2168-2267</issn><eissn>2168-2275</eissn><coden>ITCEB8</coden><abstract>The "curse of dimensionality" and the high computational cost have still limited the application of the evolutionary algorithm in high-dimensional feature selection (FS) problems. This article proposes a new three-phase hybrid FS algorithm based on correlation-guided clustering and particle swarm optimization (PSO) (HFS-C-P) to tackle the above two problems at the same time. To this end, three kinds of FS methods are effectively integrated into the proposed algorithm based on their respective advantages. In the first and second phases, a filter FS method and a feature clustering-based method with low computational cost are designed to reduce the search space used by the third phase. After that, the third phase applies oneself to finding an optimal feature subset by using an evolutionary algorithm with the global searchability. 
Moreover, a symmetric uncertainty-based feature deletion method, a fast correlation-guided feature clustering strategy, and an improved integer PSO are developed to improve the performance of the three phases, respectively. Finally, the proposed algorithm is validated on 18 publicly available real-world datasets in comparison with nine FS algorithms. Experimental results show that the proposed algorithm can obtain a good feature subset with the lowest computational cost.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>33729976</pmid><doi>10.1109/TCYB.2021.3061152</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0003-0026-8181</orcidid><orcidid>https://orcid.org/0000-0002-0078-5675</orcidid><orcidid>https://orcid.org/0000-0003-2838-4301</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2168-2267
ispartof IEEE transactions on cybernetics, 2022-09, Vol.52 (9), p.9573-9586
issn 2168-2267
2168-2275
language eng
recordid cdi_proquest_journals_2704098426
source IEEE Electronic Library (IEL)
subjects Clustering
Clustering algorithms
Computational efficiency
Computing costs
Convergence
Correlation
Evolutionary algorithms
Feature extraction
Feature selection
feature selection (FS)
Genetic algorithms
hybrid search
Mutual information
Particle swarm optimization
particle swarm optimization (PSO)
Search problems
title A Fast Hybrid Feature Selection Based on Correlation-Guided Clustering and Particle Swarm Optimization for High-Dimensional Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T02%3A48%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Fast%20Hybrid%20Feature%20Selection%20Based%20on%20Correlation-Guided%20Clustering%20and%20Particle%20Swarm%20Optimization%20for%20High-Dimensional%20Data&rft.jtitle=IEEE%20transactions%20on%20cybernetics&rft.au=Song,%20Xian-Fang&rft.date=2022-09-01&rft.volume=52&rft.issue=9&rft.spage=9573&rft.epage=9586&rft.pages=9573-9586&rft.issn=2168-2267&rft.eissn=2168-2275&rft.coden=ITCEB8&rft_id=info:doi/10.1109/TCYB.2021.3061152&rft_dat=%3Cproquest_RIE%3E2502807689%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2704098426&rft_id=info:pmid/33729976&rft_ieee_id=9380778&rfr_iscdi=true