Filter-based relevance and instance selection

Feature selection is an important technique in building prediction systems. In various searches, the judgment of the relevancy of a feature is often calculated using all the instances of the considered sample. However, when the dataset size grows, some of the instances are not useful for weighting t...

Full description

Bibliographic details
Main authors: Mourtji, Basma El, Ouaderhman, Tayeb, Chamlal, Hasna
Format: Conference proceeding
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page
container_title
container_volume 3034
creator Mourtji, Basma El
Ouaderhman, Tayeb
Chamlal, Hasna
description Feature selection is an important technique in building prediction systems. In many search strategies, the relevancy of a feature is computed using all the instances of the sample under consideration. However, as the dataset grows, some instances stop being useful for weighting the features. In this paper, we rank features according to a fitness function that relies both on the relevancy computed over all instances and on the relevancy computed over a maximal set of significant instances (those that actually contribute to the feature's positive relevancy with the target variable). The relevancy is based on preordonnance theory, in which the instances are expressed in pairs. The proposed algorithm can be divided into three main steps: (a) eliminating all features that disagree with the target feature; (b) finding, for each feature, the best subset of instances that maximizes the relevancy and whose cardinality approaches the number of instances in the original dataset; and (c) ranking the features. The second step works by dividing the dataset (the instances of each feature) into several consistent regions by fuzzy clustering, then performing GA-based instance selection independently within each cluster, and finally aggregating the partial results by ensemble voting. Experimental results verify the effectiveness of the proposed method.
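The abstract above does not give the relevancy formula, so the following is only a rough sketch of steps (a) and (c). It assumes a Kendall-style pairwise concordance score as a stand-in for the preordonnance-based relevancy, and it assumes that "disagreeing with the target" means a non-positive score. The function names `pairwise_relevance` and `rank_features` are hypothetical, and step (b) (fuzzy clustering, GA-based instance selection, ensemble voting) is deliberately left out.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pairwise_relevance(feature, target):
    """Kendall-style score in [-1, 1] computed over all pairs of instances.

    Assumption: this stands in for the paper's preordonnance-based
    relevancy, which the abstract does not define.
    """
    pairs = list(combinations(range(len(feature)), 2))
    if not pairs:
        return 0.0
    # A pair counts +1 if feature and target order it the same way,
    # -1 if they order it in opposite ways, 0 if either one ties.
    total = sum(
        sign(feature[i] - feature[j]) * sign(target[i] - target[j])
        for i, j in pairs
    )
    return total / len(pairs)


def rank_features(columns, target):
    """Steps (a) and (c) only: drop disagreeing features, rank the rest."""
    scores = {
        name: pairwise_relevance(values, target)
        for name, values in columns.items()
    }
    # Step (a), assumed threshold: keep features with a positive score.
    kept = {name: score for name, score in scores.items() if score > 0}
    # Step (c): highest relevancy first.
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    target = [1, 2, 3, 4, 5]
    columns = {
        "aligned": [10, 20, 30, 40, 50],
        "noisy": [1, 3, 2, 5, 4],
        "reversed": [5, 4, 3, 2, 1],
    }
    print(rank_features(columns, target))
    # prints [('aligned', 1.0), ('noisy', 0.6)]
```

In this toy data the `reversed` column orders every pair opposite to the target, so it is removed in step (a), while `noisy` has two discordant pairs out of ten and ranks below `aligned`.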
doi_str_mv 10.1063/5.0194692
format Conference Proceeding
fullrecord <record><control><sourceid>proquest_scita</sourceid><recordid>TN_cdi_scitation_primary_10_1063_5_0194692</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2937429968</sourcerecordid><originalsourceid>FETCH-LOGICAL-p1682-79268148fac6daebff868afe9b2d459a53e344684f799ff6b8337f4859573afb3</originalsourceid><addsrcrecordid>eNotUM1Kw0AYXETBWD34BgFvwtb9_Xa_oxSrQqGXCt6WTbILKTGJu6ng25vanoYZhplhCLnnbMkZyCe9ZBwVoLggBdeaUwMcLknBGCoqlPy8Jjc57xkTaIwtCF233RQSrXwOTZlCF358X4fS903Z9nn6J3mW66kd-ltyFX2Xw90ZF-Rj_bJbvdHN9vV99byhIwcrqEEBlisbfQ2ND1WMFqyPASvRKI1eyyCVAquiQYwRKiulicpq1Eb6WMkFeTjljmn4PoQ8uf1wSP1c6QRKowQi2Nn1eHLlup38cZ8bU_vl06_jzB3vcNqd75B_Q6pQVQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>2937429968</pqid></control><display><type>conference_proceeding</type><title>Filter-based relevance and instance selection</title><source>AIP Journals Complete</source><creator>Mourtji, Basma El ; Ouaderhman, Tayeb ; Chamlal, Hasna</creator><contributor>Belhamadia, Youssef ; Seaid, Mohammed</contributor><creatorcontrib>Mourtji, Basma El ; Ouaderhman, Tayeb ; Chamlal, Hasna ; Belhamadia, Youssef ; Seaid, Mohammed</creatorcontrib><description>Feature selection is an important technique in building prediction systems. In various searches, the judgment of the relevancy of a feature is often calculated using all the instances of the considered sample. However, when the dataset size grows, some of the instances are not useful for weighting the features. In this paper, we rank features according to a fitness function that relies on the relevancy using all instances and on the relevancy using a maximum number of significant instances (which really contribute to the feature positive relevancy with the target variable). The relevancy is based on preordonnances theory where the instances are expressed in pairs. 
The proposed algorithm can be mainly divided into three steps, namely, (a) eliminating all features that are in disagree with the target feature, (b) Finding the best subset of instances, to each feature, that maximize the relevancy and which the cardinal tends to the cardinal of instances in the original dataset, and (c) Ranking features. The second step is defined by dividing the dataset (instances of each feature) into several consistent regions by fuzzy clustering. Then, performing GA-based instances selection independently within each cluster. Finally, aggregating of the partial results by the ensemble voting. Experimental results verify the effectiveness of the proposed method.</description><identifier>ISSN: 0094-243X</identifier><identifier>EISSN: 1551-7616</identifier><identifier>DOI: 10.1063/5.0194692</identifier><identifier>CODEN: APCPCS</identifier><language>eng</language><publisher>Melville: American Institute of Physics</publisher><subject>Algorithms ; Clustering ; Datasets</subject><ispartof>AIP conference proceedings, 2024, Vol.3034 (1)</ispartof><rights>Author(s)</rights><rights>2024 Author(s). 
Published by AIP Publishing.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://pubs.aip.org/acp/article-lookup/doi/10.1063/5.0194692$$EHTML$$P50$$Gscitation$$H</linktohtml><link.rule.ids>309,310,314,777,781,786,787,791,4499,23912,23913,25122,27906,27907,76134</link.rule.ids></links><search><contributor>Belhamadia, Youssef</contributor><contributor>Seaid, Mohammed</contributor><creatorcontrib>Mourtji, Basma El</creatorcontrib><creatorcontrib>Ouaderhman, Tayeb</creatorcontrib><creatorcontrib>Chamlal, Hasna</creatorcontrib><title>Filter-based relevance and instance selection</title><title>AIP conference proceedings</title><description>Feature selection is an important technique in building prediction systems. In various searches, the judgment of the relevancy of a feature is often calculated using all the instances of the considered sample. However, when the dataset size grows, some of the instances are not useful for weighting the features. In this paper, we rank features according to a fitness function that relies on the relevancy using all instances and on the relevancy using a maximum number of significant instances (which really contribute to the feature positive relevancy with the target variable). The relevancy is based on preordonnances theory where the instances are expressed in pairs. The proposed algorithm can be mainly divided into three steps, namely, (a) eliminating all features that are in disagree with the target feature, (b) Finding the best subset of instances, to each feature, that maximize the relevancy and which the cardinal tends to the cardinal of instances in the original dataset, and (c) Ranking features. 
The second step is defined by dividing the dataset (instances of each feature) into several consistent regions by fuzzy clustering. Then, performing GA-based instances selection independently within each cluster. Finally, aggregating of the partial results by the ensemble voting. Experimental results verify the effectiveness of the proposed method.</description><subject>Algorithms</subject><subject>Clustering</subject><subject>Datasets</subject><issn>0094-243X</issn><issn>1551-7616</issn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2024</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNotUM1Kw0AYXETBWD34BgFvwtb9_Xa_oxSrQqGXCt6WTbILKTGJu6ng25vanoYZhplhCLnnbMkZyCe9ZBwVoLggBdeaUwMcLknBGCoqlPy8Jjc57xkTaIwtCF233RQSrXwOTZlCF358X4fS903Z9nn6J3mW66kd-ltyFX2Xw90ZF-Rj_bJbvdHN9vV99byhIwcrqEEBlisbfQ2ND1WMFqyPASvRKI1eyyCVAquiQYwRKiulicpq1Eb6WMkFeTjljmn4PoQ8uf1wSP1c6QRKowQi2Nn1eHLlup38cZ8bU_vl06_jzB3vcNqd75B_Q6pQVQ</recordid><startdate>20240305</startdate><enddate>20240305</enddate><creator>Mourtji, Basma El</creator><creator>Ouaderhman, Tayeb</creator><creator>Chamlal, Hasna</creator><general>American Institute of Physics</general><scope>8FD</scope><scope>H8D</scope><scope>L7M</scope></search><sort><creationdate>20240305</creationdate><title>Filter-based relevance and instance selection</title><author>Mourtji, Basma El ; Ouaderhman, Tayeb ; Chamlal, Hasna</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p1682-79268148fac6daebff868afe9b2d459a53e344684f799ff6b8337f4859573afb3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Clustering</topic><topic>Datasets</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mourtji, Basma El</creatorcontrib><creatorcontrib>Ouaderhman, Tayeb</creatorcontrib><creatorcontrib>Chamlal, 
Hasna</creatorcontrib><collection>Technology Research Database</collection><collection>Aerospace Database</collection><collection>Advanced Technologies Database with Aerospace</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mourtji, Basma El</au><au>Ouaderhman, Tayeb</au><au>Chamlal, Hasna</au><au>Belhamadia, Youssef</au><au>Seaid, Mohammed</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Filter-based relevance and instance selection</atitle><btitle>AIP conference proceedings</btitle><date>2024-03-05</date><risdate>2024</risdate><volume>3034</volume><issue>1</issue><issn>0094-243X</issn><eissn>1551-7616</eissn><coden>APCPCS</coden><abstract>Feature selection is an important technique in building prediction systems. In various searches, the judgment of the relevancy of a feature is often calculated using all the instances of the considered sample. However, when the dataset size grows, some of the instances are not useful for weighting the features. In this paper, we rank features according to a fitness function that relies on the relevancy using all instances and on the relevancy using a maximum number of significant instances (which really contribute to the feature positive relevancy with the target variable). The relevancy is based on preordonnances theory where the instances are expressed in pairs. The proposed algorithm can be mainly divided into three steps, namely, (a) eliminating all features that are in disagree with the target feature, (b) Finding the best subset of instances, to each feature, that maximize the relevancy and which the cardinal tends to the cardinal of instances in the original dataset, and (c) Ranking features. The second step is defined by dividing the dataset (instances of each feature) into several consistent regions by fuzzy clustering. Then, performing GA-based instances selection independently within each cluster. 
Finally, aggregating of the partial results by the ensemble voting. Experimental results verify the effectiveness of the proposed method.</abstract><cop>Melville</cop><pub>American Institute of Physics</pub><doi>10.1063/5.0194692</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0094-243X
ispartof AIP conference proceedings, 2024, Vol.3034 (1)
issn 0094-243X
1551-7616
language eng
recordid cdi_scitation_primary_10_1063_5_0194692
source AIP Journals Complete
subjects Algorithms
Clustering
Datasets
title Filter-based relevance and instance selection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T10%3A29%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Filter-based%20relevance%20and%20instance%20selection&rft.btitle=AIP%20conference%20proceedings&rft.au=Mourtji,%20Basma%20El&rft.date=2024-03-05&rft.volume=3034&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0194692&rft_dat=%3Cproquest_scita%3E2937429968%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2937429968&rft_id=info:pmid/&rfr_iscdi=true