A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities

Specialized data preparation techniques, ranging from data cleaning, outlier detection, missing value imputation, feature selection (FS), amongst others, are procedures required to get the most out of data and, consequently, get the optimal performance of predictive models for classification tasks....

Full description


Bibliographic details
Published in:	Neural computing & applications 2021-11, Vol.33 (22), p.15091-15118
Main authors:	Abiodun, Esther Omolara; Alabdulatif, Abdulatif; Abiodun, Oludare Isaac; Alawida, Moatsum; Alabdulatif, Abdullah; Alkhawaldeh, Rami S.
Format:	Article
Language:	eng
Subjects:	Artificial Intelligence; Best practice; Classification; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Data analysis; Data Mining and Knowledge Discovery; Heuristic methods; Image Processing and Computer Vision; Literature reviews; New technology; Optimization; Outliers (statistics); Performance prediction; Prediction models; Probability and Statistics in Computer Science; Redundancy; Review; Review Article; Text categorization
Online access:	Full text

container_end_page	15118
container_issue	22
container_start_page	15091
container_title	Neural computing & applications
container_volume	33
creator	Abiodun, Esther Omolara; Alabdulatif, Abdulatif; Abiodun, Oludare Isaac; Alawida, Moatsum; Alabdulatif, Abdullah; Alkhawaldeh, Rami S.
description	Specialized data preparation techniques, ranging from data cleaning, outlier detection, missing value imputation, feature selection (FS), amongst others, are procedures required to get the most out of data and, consequently, get the optimal performance of predictive models for classification tasks. FS is a vital and indispensable technique that enables the model to perform faster, eliminate noisy data, remove redundancy, reduce overfitting, improve precision and increase generalization on testing data. While conventional FS techniques have been leveraged for classification tasks in the past few decades, they fail to optimally reduce the high dimensionality of the feature space of texts, thus breeding inefficient predictive models. Emerging technologies such as the metaheuristics and hyper-heuristics optimization methods provide a new paradigm for FS due to their efficiency in improving the accuracy of classification, computational demands, storage, as well as functioning seamlessly in solving complex optimization problems with less time. However, little details are known on best practices for case-to-case usage of emerging FS methods. The literature continues to be engulfed with clear and unclear findings in leveraging effective methods, which, if not performed accurately, alters precision, real-world-use feasibility, and the predictive model's overall performance. This paper reviews the present state of FS with respect to metaheuristics and hyper-heuristic methods. Through a systematic literature review of over 200 articles, we set out the most recent findings and trends to enlighten analysts, practitioners and researchers in the field of data analytics seeking clarity in understanding and implementing effective FS optimization methods for improved text classification tasks.
doi_str_mv	10.1007/s00521-021-06406-8
format	Article
fullrecord	<record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8361413</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2585232458</sourcerecordid><originalsourceid>FETCH-LOGICAL-c474t-c0d26d3bb3adaf1c20438219fcbc8991e36e513b80d2c65583c0cdd4819b6a063</originalsourceid><addsrcrecordid>eNp9UU1v1DAQtRCILgt_gAOyxIVLYBw7rsMBqarKh1SJC5wtx5nsukriYDtbyh_h79bZlPJx4DCy7PfmzRs_Qp4zeM0ATt9EgKpkBSwlBchCPSAbJjgvOFTqIdlALY4QPyFPYrwCACFV9ZiccCFA1FJsyM8zGm9iwsEkZ2nAg8Nr6juKA4adG3e0Q5PmgDRijzY5P1I_JTe4H-Z4GTDtfRtp58MKmJ4m_J6o7U2MrnP2yHtL0x7pFDDimGhMJiE1Y5tffJwW3QPm9smHNI8uOYxPyaPO9BGf3Z1b8vX9xZfzj8Xl5w-fzs8uCytORSostKVsedNw05qO2RIEVyWrO9tYVdcMucSK8UZlnpVVpbgF27ZCsbqRBiTfkner7jQ3A7Y22wum11PIm4Qb7Y3TfyOj2-udP2jFJROMZ4FXdwLBf5sxJj24aLHvzYh-jrqsZFkxybKxLXn5D_XKz2HM62WWqkpeimxwS8qVZfPfxIDdvRkGesldr7lrWGrJXS9NL_5c477lV9CZwFdCzNC4w_B79n9kbwE2Gb30</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2585232458</pqid></control><display><type>article</type><title>A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities</title><source>SpringerNature Complete Journals</source><creator>Abiodun, Esther Omolara ; Alabdulatif, Abdulatif ; Abiodun, Oludare Isaac ; Alawida, Moatsum ; Alabdulatif, Abdullah ; Alkhawaldeh, Rami S.</creator><creatorcontrib>Abiodun, Esther Omolara ; Alabdulatif, Abdulatif ; Abiodun, Oludare Isaac ; Alawida, Moatsum ; Alabdulatif, Abdullah ; Alkhawaldeh, Rami S.</creatorcontrib><description>Specialized data preparation techniques, ranging from data cleaning, outlier detection, missing value imputation, feature selection (FS), amongst others, are procedures required to get the most out of data and, consequently, get the optimal performance of predictive models for classification tasks. 
FS is a vital and indispensable technique that enables the model to perform faster, eliminate noisy data, remove redundancy, reduce overfitting, improve precision and increase generalization on testing data. While conventional FS techniques have been leveraged for classification tasks in the past few decades, they fail to optimally reduce the high dimensionality of the feature space of texts, thus breeding inefficient predictive models. Emerging technologies such as the metaheuristics and hyper-heuristics optimization methods provide a new paradigm for FS due to their efficiency in improving the accuracy of classification, computational demands, storage, as well as functioning seamlessly in solving complex optimization problems with less time. However, little details are known on best practices for case-to-case usage of emerging FS methods. The literature continues to be engulfed with clear and unclear findings in leveraging effective methods, which, if not performed accurately, alters precision, real-world-use feasibility, and the predictive model's overall performance. This paper reviews the present state of FS with respect to metaheuristics and hyper-heuristic methods. 
Through a systematic literature review of over 200 articles, we set out the most recent findings and trends to enlighten analysts, practitioners and researchers in the field of data analytics seeking clarity in understanding and implementing effective FS optimization methods for improved text classification tasks.</description><identifier>ISSN: 0941-0643</identifier><identifier>EISSN: 1433-3058</identifier><identifier>DOI: 10.1007/s00521-021-06406-8</identifier><identifier>PMID: 34404964</identifier><language>eng</language><publisher>London: Springer London</publisher><subject>Artificial Intelligence ; Best practice ; Classification ; Computational Biology/Bioinformatics ; Computational Science and Engineering ; Computer Science ; Data analysis ; Data Mining and Knowledge Discovery ; Heuristic methods ; Image Processing and Computer Vision ; Literature reviews ; New technology ; Optimization ; Outliers (statistics) ; Performance prediction ; Prediction models ; Probability and Statistics in Computer Science ; Redundancy ; Review ; Review Article ; Text categorization</subject><ispartof>Neural computing & applications, 2021-11, Vol.33 (22), p.15091-15118</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021</rights><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 
2021.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c474t-c0d26d3bb3adaf1c20438219fcbc8991e36e513b80d2c65583c0cdd4819b6a063</citedby><cites>FETCH-LOGICAL-c474t-c0d26d3bb3adaf1c20438219fcbc8991e36e513b80d2c65583c0cdd4819b6a063</cites><orcidid>0000-0002-7801-2541</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s00521-021-06406-8$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s00521-021-06406-8$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>230,314,780,784,885,27924,27925,41488,42557,51319</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34404964$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Abiodun, Esther Omolara</creatorcontrib><creatorcontrib>Alabdulatif, Abdulatif</creatorcontrib><creatorcontrib>Abiodun, Oludare Isaac</creatorcontrib><creatorcontrib>Alawida, Moatsum</creatorcontrib><creatorcontrib>Alabdulatif, Abdullah</creatorcontrib><creatorcontrib>Alkhawaldeh, Rami S.</creatorcontrib><title>A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities</title><title>Neural computing & applications</title><addtitle>Neural Comput & Applic</addtitle><addtitle>Neural Comput Appl</addtitle><description>Specialized data preparation techniques, ranging from data cleaning, outlier detection, missing value imputation, feature selection (FS), amongst others, are procedures required to get the most out of data and, consequently, get the optimal performance of predictive models for classification tasks. 
FS is a vital and indispensable technique that enables the model to perform faster, eliminate noisy data, remove redundancy, reduce overfitting, improve precision and increase generalization on testing data. While conventional FS techniques have been leveraged for classification tasks in the past few decades, they fail to optimally reduce the high dimensionality of the feature space of texts, thus breeding inefficient predictive models. Emerging technologies such as the metaheuristics and hyper-heuristics optimization methods provide a new paradigm for FS due to their efficiency in improving the accuracy of classification, computational demands, storage, as well as functioning seamlessly in solving complex optimization problems with less time. However, little details are known on best practices for case-to-case usage of emerging FS methods. The literature continues to be engulfed with clear and unclear findings in leveraging effective methods, which, if not performed accurately, alters precision, real-world-use feasibility, and the predictive model's overall performance. This paper reviews the present state of FS with respect to metaheuristics and hyper-heuristic methods. 
Through a systematic literature review of over 200 articles, we set out the most recent findings and trends to enlighten analysts, practitioners and researchers in the field of data analytics seeking clarity in understanding and implementing effective FS optimization methods for improved text classification tasks.</description><subject>Artificial Intelligence</subject><subject>Best practice</subject><subject>Classification</subject><subject>Computational Biology/Bioinformatics</subject><subject>Computational Science and Engineering</subject><subject>Computer Science</subject><subject>Data analysis</subject><subject>Data Mining and Knowledge Discovery</subject><subject>Heuristic methods</subject><subject>Image Processing and Computer Vision</subject><subject>Literature reviews</subject><subject>New technology</subject><subject>Optimization</subject><subject>Outliers (statistics)</subject><subject>Performance prediction</subject><subject>Prediction models</subject><subject>Probability and Statistics in Computer Science</subject><subject>Redundancy</subject><subject>Review</subject><subject>Review Article</subject><subject>Text 
categorization</subject><issn>0941-0643</issn><issn>1433-3058</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>AFKRA</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNp9UU1v1DAQtRCILgt_gAOyxIVLYBw7rsMBqarKh1SJC5wtx5nsukriYDtbyh_h79bZlPJx4DCy7PfmzRs_Qp4zeM0ATt9EgKpkBSwlBchCPSAbJjgvOFTqIdlALY4QPyFPYrwCACFV9ZiccCFA1FJsyM8zGm9iwsEkZ2nAg8Nr6juKA4adG3e0Q5PmgDRijzY5P1I_JTe4H-Z4GTDtfRtp58MKmJ4m_J6o7U2MrnP2yHtL0x7pFDDimGhMJiE1Y5tffJwW3QPm9smHNI8uOYxPyaPO9BGf3Z1b8vX9xZfzj8Xl5w-fzs8uCytORSostKVsedNw05qO2RIEVyWrO9tYVdcMucSK8UZlnpVVpbgF27ZCsbqRBiTfkner7jQ3A7Y22wum11PIm4Qb7Y3TfyOj2-udP2jFJROMZ4FXdwLBf5sxJj24aLHvzYh-jrqsZFkxybKxLXn5D_XKz2HM62WWqkpeimxwS8qVZfPfxIDdvRkGesldr7lrWGrJXS9NL_5c477lV9CZwFdCzNC4w_B79n9kbwE2Gb30</recordid><startdate>20211101</startdate><enddate>20211101</enddate><creator>Abiodun, Esther Omolara</creator><creator>Alabdulatif, Abdulatif</creator><creator>Abiodun, Oludare Isaac</creator><creator>Alawida, Moatsum</creator><creator>Alabdulatif, Abdullah</creator><creator>Alkhawaldeh, Rami S.</creator><general>Springer London</general><general>Springer Nature B.V</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-7801-2541</orcidid></search><sort><creationdate>20211101</creationdate><title>A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities</title><author>Abiodun, Esther Omolara ; Alabdulatif, Abdulatif ; Abiodun, Oludare Isaac ; Alawida, Moatsum ; 
Alabdulatif, Abdullah ; Alkhawaldeh, Rami S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c474t-c0d26d3bb3adaf1c20438219fcbc8991e36e513b80d2c65583c0cdd4819b6a063</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Artificial Intelligence</topic><topic>Best practice</topic><topic>Classification</topic><topic>Computational Biology/Bioinformatics</topic><topic>Computational Science and Engineering</topic><topic>Computer Science</topic><topic>Data analysis</topic><topic>Data Mining and Knowledge Discovery</topic><topic>Heuristic methods</topic><topic>Image Processing and Computer Vision</topic><topic>Literature reviews</topic><topic>New technology</topic><topic>Optimization</topic><topic>Outliers (statistics)</topic><topic>Performance prediction</topic><topic>Prediction models</topic><topic>Probability and Statistics in Computer Science</topic><topic>Redundancy</topic><topic>Review</topic><topic>Review Article</topic><topic>Text categorization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Abiodun, Esther Omolara</creatorcontrib><creatorcontrib>Alabdulatif, Abdulatif</creatorcontrib><creatorcontrib>Abiodun, Oludare Isaac</creatorcontrib><creatorcontrib>Alawida, Moatsum</creatorcontrib><creatorcontrib>Alabdulatif, Abdullah</creatorcontrib><creatorcontrib>Alkhawaldeh, Rami S.</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium 
Collection</collection><collection>ProQuest advanced technologies & aerospace journals</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Neural computing & applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Abiodun, Esther Omolara</au><au>Alabdulatif, Abdulatif</au><au>Abiodun, Oludare Isaac</au><au>Alawida, Moatsum</au><au>Alabdulatif, Abdullah</au><au>Alkhawaldeh, Rami S.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities</atitle><jtitle>Neural computing & applications</jtitle><stitle>Neural Comput & Applic</stitle><addtitle>Neural Comput Appl</addtitle><date>2021-11-01</date><risdate>2021</risdate><volume>33</volume><issue>22</issue><spage>15091</spage><epage>15118</epage><pages>15091-15118</pages><issn>0941-0643</issn><eissn>1433-3058</eissn><abstract>Specialized data preparation techniques, ranging from data cleaning, outlier detection, missing value imputation, feature selection (FS), amongst others, are procedures required to get the most out of data and, consequently, get the optimal performance of predictive models for classification tasks. FS is a vital and indispensable technique that enables the model to perform faster, eliminate noisy data, remove redundancy, reduce overfitting, improve precision and increase generalization on testing data. 
While conventional FS techniques have been leveraged for classification tasks in the past few decades, they fail to optimally reduce the high dimensionality of the feature space of texts, thus breeding inefficient predictive models. Emerging technologies such as the metaheuristics and hyper-heuristics optimization methods provide a new paradigm for FS due to their efficiency in improving the accuracy of classification, computational demands, storage, as well as functioning seamlessly in solving complex optimization problems with less time. However, little details are known on best practices for case-to-case usage of emerging FS methods. The literature continues to be engulfed with clear and unclear findings in leveraging effective methods, which, if not performed accurately, alters precision, real-world-use feasibility, and the predictive model's overall performance. This paper reviews the present state of FS with respect to metaheuristics and hyper-heuristic methods. Through a systematic literature review of over 200 articles, we set out the most recent findings and trends to enlighten analysts, practitioners and researchers in the field of data analytics seeking clarity in understanding and implementing effective FS optimization methods for improved text classification tasks.</abstract><cop>London</cop><pub>Springer London</pub><pmid>34404964</pmid><doi>10.1007/s00521-021-06406-8</doi><tpages>28</tpages><orcidid>https://orcid.org/0000-0002-7801-2541</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0941-0643
ispartof	Neural computing & applications, 2021-11, Vol.33 (22), p.15091-15118
issn	0941-0643 1433-3058
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8361413
source	SpringerNature Complete Journals
subjects	Artificial Intelligence; Best practice; Classification; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Data analysis; Data Mining and Knowledge Discovery; Heuristic methods; Image Processing and Computer Vision; Literature reviews; New technology; Optimization; Outliers (statistics); Performance prediction; Prediction models; Probability and Statistics in Computer Science; Redundancy; Review; Review Article; Text categorization
title	A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T06%3A20%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20systematic%20review%20of%20emerging%20feature%20selection%20optimization%20methods%20for%20optimal%20text%20classification:%20the%20present%20state%20and%20prospective%20opportunities&rft.jtitle=Neural%20computing%20&%20applications&rft.au=Abiodun,%20Esther%20Omolara&rft.date=2021-11-01&rft.volume=33&rft.issue=22&rft.spage=15091&rft.epage=15118&rft.pages=15091-15118&rft.issn=0941-0643&rft.eissn=1433-3058&rft_id=info:doi/10.1007/s00521-021-06406-8&rft_dat=%3Cproquest_pubme%3E2585232458%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2585232458&rft_id=info:pmid/34404964&rfr_iscdi=true