Filter-based relevance and instance selection
Feature selection is an important technique in building prediction systems. In many approaches, the relevancy of a feature is computed using all the instances of the sample under consideration. However, as the dataset grows, some of the instances are not useful for weighting t...
Main authors: | Mourtji, Basma El; Ouaderhman, Tayeb; Chamlal, Hasna |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | Algorithms; Clustering; Datasets |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | 1 |
container_start_page | |
container_title | |
container_volume | 3034 |
creator | Mourtji, Basma El; Ouaderhman, Tayeb; Chamlal, Hasna |
description | Feature selection is an important technique in building prediction systems. In many approaches, the relevancy of a feature is computed using all the instances of the sample under consideration. However, as the dataset grows, some of the instances are not useful for weighting the features. In this paper, we rank features according to a fitness function that relies on the relevancy computed using all instances and on the relevancy computed using a maximum number of significant instances (those that actually contribute to the feature's positive relevancy with the target variable). The relevancy is based on preordonnance theory, in which the instances are expressed in pairs.
The proposed algorithm is divided into three main steps: (a) eliminating all features that disagree with the target feature, (b) finding, for each feature, the best subset of instances that maximizes the relevancy and whose cardinality tends to the cardinality of the instance set in the original dataset, and (c) ranking the features. The second step divides the dataset (the instances of each feature) into several consistent regions by fuzzy clustering, then performs GA-based instance selection independently within each cluster, and finally aggregates the partial results by ensemble voting. Experimental results verify the effectiveness of the proposed method. |
doi_str_mv | 10.1063/5.0194692 |
format | Conference Proceeding |
contributor | Belhamadia, Youssef; Seaid, Mohammed |
creationdate | 2024-03-05 |
publisher | Melville: American Institute of Physics |
rights | 2024 Author(s). Published by AIP Publishing. |
coden | APCPCS |
tpages | 8 |
genre | proceeding |
recordtype | conference_proceeding |
toplevel | peer_reviewed; online_resources |
oa | free_for_read |
collection | Technology Research Database; Aerospace Database; Advanced Technologies Database with Aerospace |
fulltext | fulltext |
identifier | ISSN: 0094-243X |
ispartof | AIP conference proceedings, 2024, Vol.3034 (1) |
issn | 0094-243X; 1551-7616 |
language | eng |
recordid | cdi_scitation_primary_10_1063_5_0194692 |
source | AIP Journals Complete |
subjects | Algorithms; Clustering; Datasets |
title | Filter-based relevance and instance selection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T10%3A29%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Filter-based%20relevance%20and%20instance%20selection&rft.btitle=AIP%20conference%20proceedings&rft.au=Mourtji,%20Basma%20El&rft.date=2024-03-05&rft.volume=3034&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0194692&rft_dat=%3Cproquest_scita%3E2937429968%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2937429968&rft_id=info:pmid/&rfr_iscdi=true |
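
The description above says that relevancy is measured over pairs of instances and that features disagreeing with the target are eliminated before ranking. The following Python sketch illustrates that idea only. It is not the authors' implementation: the abstract does not give the exact preordonnance coefficient or the fitness function, so a Kendall-style pairwise concordance score is used here as a stand-in, and the names `pairwise_relevance`, `rank_features`, and the `keep` parameter are hypothetical. Step (b), the fuzzy clustering and GA-based instance selection, is not reproduced; `keep` only marks where a selected subset of instances would plug in.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pairwise_relevance(feature, target, keep=None):
    """Kendall-style concordance between a feature and the target.

    Assumption: this stands in for the preordonnance-based relevancy,
    whose exact form is not stated in the abstract. Each pair of
    instances contributes +1 if feature and target order the pair the
    same way, -1 if they order it oppositely, and 0 on ties.
    `keep` is an optional list of instance indices (hypothetical hook
    for the subset chosen in step (b)).
    """
    indices = range(len(feature)) if keep is None else keep
    score, n_pairs = 0, 0
    for i, j in combinations(indices, 2):
        score += sign(feature[i] - feature[j]) * sign(target[i] - target[j])
        n_pairs += 1
    return score / n_pairs if n_pairs else 0.0


def rank_features(columns, target):
    """Drop features that disagree with the target, then rank the rest.

    `columns` maps a feature name to its list of values. Ranking here
    uses the all-instance score only, which is a simplification of the
    fitness function described in the abstract.
    """
    scores = {name: pairwise_relevance(col, target) for name, col in columns.items()}
    kept = {name: s for name, s in scores.items() if s > 0}  # step (a)
    return sorted(kept, key=kept.get, reverse=True)          # step (c)


target = [1, 2, 3, 4]
columns = {"a": [1, 2, 3, 4], "b": [4, 3, 2, 1], "c": [1, 3, 2, 4]}
print(rank_features(columns, target))
print(pairwise_relevance(columns["c"], target, keep=[0, 1, 3]))
```

In this toy data, feature `b` orders every pair opposite to the target and is dropped, while restricting feature `c` to instances 0, 1, and 3 removes its only discordant pair. That is the effect the instance-selection step aims for, although the paper also requires the selected subset to stay as close as possible in size to the full dataset.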