Relevant feature selection and ensemble classifier design using bi-objective genetic algorithm

In the era of the digital boom, a single classifier cannot perform well across varied datasets. An ensemble classifier aims to bridge this performance gap by combining multiple classifiers of diverse characteristics to achieve better generalization. However, classifier selection depends heavily on the dataset, and its ef...
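
The idea of combining several classifiers can be illustrated with a minimal sketch. It uses plain majority voting, which is a generic ensemble rule and not the CCBOGA method of the paper; the function name `majority_vote` and the prediction lists are hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of equal-length label lists, one list per classifier.
    This is simple majority voting, an assumed stand-in for illustration.
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        # Count the labels that the classifiers assign to sample i.
        votes = Counter(p[i] for p in predictions)
        # Pick the label with the highest count.
        combined.append(votes.most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers on four samples.
preds = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]
ensemble = majority_vote(preds)
print(ensemble)
```

With an odd number of classifiers and binary labels, as here, no ties can occur; other settings would need an explicit tie-breaking rule.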

Bibliographic details
Published in: Knowledge and information systems 2020-02, Vol.62 (2), p.423-455
Main authors: Das, Asit Kumar, Pati, Soumen Kumar, Ghosh, Arka
Format: Article
Language: eng
Online access: Full text
container_end_page 455
container_issue 2
container_start_page 423
container_title Knowledge and information systems
container_volume 62
creator Das, Asit Kumar
Pati, Soumen Kumar
Ghosh, Arka
description In the era of the digital boom, a single classifier cannot perform well across varied datasets. An ensemble classifier aims to bridge this performance gap by combining multiple classifiers of diverse characteristics to achieve better generalization. However, classifier selection depends heavily on the dataset, and its efficiency degrades sharply in the presence of irrelevant features. Feature selection improves classifier performance by removing those irrelevant features. First, we propose a bi-objective genetic algorithm-based feature selection method (FSBOGA), in which nonlinear, uniform, hybrid cellular automata are used to generate the initial population. The objective functions are defined using the lower bound approximation of rough set theory and the Kullback–Leibler divergence of information theory, to select unambiguous and informative features. The replacement strategy for creating the next-generation population is based on the Pareto optimal solutions with respect to both objective functions. Next, a novel bi-objective genetic algorithm-based ensemble classification method (CCBOGA) is devised to combine the individual classifiers built on the reduced datasets. The constructed ensemble classifier is observed to perform better than the individual classifiers. The performance of the proposed FSBOGA and CCBOGA is evaluated on several popular datasets and compared with state-of-the-art algorithms to demonstrate their effectiveness.
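
The Pareto-based replacement mentioned in the description can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes both objectives are to be maximized, and the score pairs are invented values standing in for the rough-set and Kullback–Leibler objectives. The names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """Return True if score tuple a Pareto-dominates b (maximization assumed).

    a dominates b when a is at least as good on every objective
    and strictly better on at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(scores):
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (objective_1, objective_2) values for five feature subsets.
candidates = [(0.9, 0.2), (0.7, 0.7), (0.6, 0.6), (0.2, 0.9), (0.9, 0.1)]
front = pareto_front(candidates)
print(front)
```

In a genetic algorithm, the non-dominated candidates would typically be carried into the next generation; how the paper fills the remaining population slots is not stated in this record.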
doi_str_mv 10.1007/s10115-019-01341-6
format Article
addtitle Knowl Inf Syst
creationdate 2020-02-01
lds50 peer_reviewed
publisher London: Springer London
rights Springer-Verlag London Ltd., part of Springer Nature 2019
tpages 33
fulltext fulltext
identifier ISSN: 0219-1377
ispartof Knowledge and information systems, 2020-02, Vol.62 (2), p.423-455
issn 0219-1377
0219-3116
language eng
recordid cdi_proquest_journals_2188083785
source Springer Nature - Complete Springer Journals
subjects Cellular automata
Classifiers
Computer Science
Data Mining and Knowledge Discovery
Database Management
Datasets
Divergence
Feature selection
Genetic algorithms
Information Storage and Retrieval
Information Systems and Communication Service
Information Systems Applications (incl.Internet)
Information theory
IT in Business
Lower bounds
Pareto optimum
Regular Paper
Set theory
title Relevant feature selection and ensemble classifier design using bi-objective genetic algorithm
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T19%3A48%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Relevant%20feature%20selection%20and%20ensemble%20classifier%20design%20using%20bi-objective%20genetic%20algorithm&rft.jtitle=Knowledge%20and%20information%20systems&rft.au=Das,%20Asit%20Kumar&rft.date=2020-02-01&rft.volume=62&rft.issue=2&rft.spage=423&rft.epage=455&rft.pages=423-455&rft.issn=0219-1377&rft.eissn=0219-3116&rft_id=info:doi/10.1007/s10115-019-01341-6&rft_dat=%3Cproquest_cross%3E2188083785%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2188083785&rft_id=info:pmid/&rfr_iscdi=true