Fair Feature Subset Selection using Multiobjective Genetic Algorithm

The feature subset selection problem aims at selecting the relevant subset of features to improve the performance of a Machine Learning (ML) algorithm on training data. Some features in data can be inherently noisy, costly to compute, improperly scaled, or correlated to other features, and they can...

Bibliographic details
Published in:	arXiv.org 2022-04
Main authors:	Ayaz Ur Rehman; Nadeem, Anas; Malik, Muhammad Zubair
Format:	Article
Language:	eng
Subjects:	Accuracy; Decision making; Evolutionary algorithms; Genetic algorithms; Machine learning; Performance enhancement
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Ayaz Ur Rehman ; Nadeem, Anas ; Malik, Muhammad Zubair
description	The feature subset selection problem aims at selecting the relevant subset of features to improve the performance of a Machine Learning (ML) algorithm on training data. Some features in data can be inherently noisy, costly to compute, improperly scaled, or correlated with other features, and they can adversely affect the accuracy, cost, and complexity of the induced algorithm. The goal of traditional feature selection approaches has been to remove such irrelevant features. In recent years, ML has had a noticeable impact on the decision-making processes of our everyday lives. We want to ensure that these decisions do not reflect biased behavior towards certain groups or individuals based on protected attributes such as age, sex, or race. In this paper, we present a feature subset selection approach that improves both fairness and accuracy objectives and computes Pareto-optimal solutions using the NSGA-II algorithm. We use statistical disparity as a fairness metric and F1-Score as a metric for model performance. Our experiments on the most commonly used fairness benchmark datasets with three different machine learning algorithms show that, using the evolutionary algorithm, we can effectively explore the trade-off between fairness and accuracy.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2659401379</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2659401379</sourcerecordid><originalsourceid>FETCH-proquest_journals_26594013793</originalsourceid><addsrcrecordid>eNqNissKwjAUBYMgWLT_cMF1IU36sEtRqxtXdV_ScltTYqJ5-P1W8ANcHWbmLEjEOE-TXcbYisTOTZRSVpQsz3lEjrWQFmoUPliEJnQOPTSosPfSaAhO6hGuQc3UTV_5RjijRi972KvRWOnvjw1ZDkI5jH-7Jtv6dDtckqc1r4DOt5MJVs-pZUVeZTTlZcX_e30ARvU6wQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2659401379</pqid></control><display><type>article</type><title>Fair Feature Subset Selection using Multiobjective Genetic Algorithm</title><source>Free E- Journals</source><creator>Ayaz Ur Rehman ; Nadeem, Anas ; Malik, Muhammad Zubair</creator><creatorcontrib>Ayaz Ur Rehman ; Nadeem, Anas ; Malik, Muhammad Zubair</creatorcontrib><description>The feature subset selection problem aims at selecting the relevant subset of features to improve the performance of a Machine Learning (ML) algorithm on training data. Some features in data can be inherently noisy, costly to compute, improperly scaled, or correlated to other features, and they can adversely affect the accuracy, cost, and complexity of the induced algorithm. The goal of traditional feature selection approaches has been to remove such irrelevant features. In recent years ML is making a noticeable impact on the decision-making processes of our everyday lives. We want to ensure that these decisions do not reflect biased behavior towards certain groups or individuals based on protected attributes such as age, sex, or race. In this paper, we present a feature subset selection approach that improves both fairness and accuracy objectives and computes Pareto-optimal solutions using the NSGA-II algorithm. We use statistical disparity as a fairness metric and F1-Score as a metric for model performance. 
Our experiments on the most commonly used fairness benchmark datasets with three different machine learning algorithms show that using the evolutionary algorithm we can effectively explore the trade-off between fairness and accuracy.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Accuracy ; Decision making ; Evolutionary algorithms ; Genetic algorithms ; Machine learning ; Performance enhancement</subject><ispartof>arXiv.org, 2022-04</ispartof><rights>2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Ayaz Ur Rehman</creatorcontrib><creatorcontrib>Nadeem, Anas</creatorcontrib><creatorcontrib>Malik, Muhammad Zubair</creatorcontrib><title>Fair Feature Subset Selection using Multiobjective Genetic Algorithm</title><title>arXiv.org</title><description>The feature subset selection problem aims at selecting the relevant subset of features to improve the performance of a Machine Learning (ML) algorithm on training data. Some features in data can be inherently noisy, costly to compute, improperly scaled, or correlated to other features, and they can adversely affect the accuracy, cost, and complexity of the induced algorithm. The goal of traditional feature selection approaches has been to remove such irrelevant features. In recent years ML is making a noticeable impact on the decision-making processes of our everyday lives. 
We want to ensure that these decisions do not reflect biased behavior towards certain groups or individuals based on protected attributes such as age, sex, or race. In this paper, we present a feature subset selection approach that improves both fairness and accuracy objectives and computes Pareto-optimal solutions using the NSGA-II algorithm. We use statistical disparity as a fairness metric and F1-Score as a metric for model performance. Our experiments on the most commonly used fairness benchmark datasets with three different machine learning algorithms show that using the evolutionary algorithm we can effectively explore the trade-off between fairness and accuracy.</description><subject>Accuracy</subject><subject>Decision making</subject><subject>Evolutionary algorithms</subject><subject>Genetic algorithms</subject><subject>Machine learning</subject><subject>Performance enhancement</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNissKwjAUBYMgWLT_cMF1IU36sEtRqxtXdV_ScltTYqJ5-P1W8ANcHWbmLEjEOE-TXcbYisTOTZRSVpQsz3lEjrWQFmoUPliEJnQOPTSosPfSaAhO6hGuQc3UTV_5RjijRi972KvRWOnvjw1ZDkI5jH-7Jtv6dDtckqc1r4DOt5MJVs-pZUVeZTTlZcX_e30ARvU6wQ</recordid><startdate>20220430</startdate><enddate>20220430</enddate><creator>Ayaz Ur Rehman</creator><creator>Nadeem, Anas</creator><creator>Malik, Muhammad Zubair</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220430</creationdate><title>Fair Feature Subset Selection using Multiobjective Genetic Algorithm</title><author>Ayaz Ur Rehman ; Nadeem, Anas ; Malik, Muhammad Zubair</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_26594013793</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Decision making</topic><topic>Evolutionary algorithms</topic><topic>Genetic algorithms</topic><topic>Machine learning</topic><topic>Performance enhancement</topic><toplevel>online_resources</toplevel><creatorcontrib>Ayaz Ur Rehman</creatorcontrib><creatorcontrib>Nadeem, Anas</creatorcontrib><creatorcontrib>Malik, Muhammad Zubair</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT 
USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ayaz Ur Rehman</au><au>Nadeem, Anas</au><au>Malik, Muhammad Zubair</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Fair Feature Subset Selection using Multiobjective Genetic Algorithm</atitle><jtitle>arXiv.org</jtitle><date>2022-04-30</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>The feature subset selection problem aims at selecting the relevant subset of features to improve the performance of a Machine Learning (ML) algorithm on training data. Some features in data can be inherently noisy, costly to compute, improperly scaled, or correlated to other features, and they can adversely affect the accuracy, cost, and complexity of the induced algorithm. The goal of traditional feature selection approaches has been to remove such irrelevant features. In recent years ML is making a noticeable impact on the decision-making processes of our everyday lives. We want to ensure that these decisions do not reflect biased behavior towards certain groups or individuals based on protected attributes such as age, sex, or race. In this paper, we present a feature subset selection approach that improves both fairness and accuracy objectives and computes Pareto-optimal solutions using the NSGA-II algorithm. We use statistical disparity as a fairness metric and F1-Score as a metric for model performance. 
Our experiments on the most commonly used fairness benchmark datasets with three different machine learning algorithms show that using the evolutionary algorithm we can effectively explore the trade-off between fairness and accuracy.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-04
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2659401379
source	Free E- Journals
subjects	Accuracy ; Decision making ; Evolutionary algorithms ; Genetic algorithms ; Machine learning ; Performance enhancement
title	Fair Feature Subset Selection using Multiobjective Genetic Algorithm
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T22%3A09%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Fair%20Feature%20Subset%20Selection%20using%20Multiobjective%20Genetic%20Algorithm&rft.jtitle=arXiv.org&rft.au=Ayaz%20Ur%20Rehman&rft.date=2022-04-30&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2659401379%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2659401379&rft_id=info:pmid/&rfr_iscdi=true
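
The description field above names two objectives (a statistical disparity fairness metric and F1-Score) and a search for Pareto-optimal feature subsets. The sketch below illustrates only those two metrics and the non-dominated filtering step. It is not the authors' implementation and does not implement NSGA-II; the function names, the binary encoding of the protected attribute, the use of an absolute difference in positive-prediction rates as the disparity, and the candidate scores are all assumptions made for illustration.

```python
# Illustrative sketch only; all names and the example scores are hypothetical.

def statistical_parity_difference(y_pred, protected):
    """Absolute gap in positive-prediction rate between the two groups.

    Assumes binary predictions (0/1) and a binary protected attribute (0/1).
    """
    group0 = [p for p, a in zip(y_pred, protected) if a == 0]
    group1 = [p for p, a in zip(y_pred, protected) if a == 1]
    if not group0 or not group1:
        return 0.0  # disparity is undefined with an empty group; treat as 0 here
    return abs(sum(group0) / len(group0) - sum(group1) / len(group1))


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def dominates(a, b):
    """True if score a = (disparity, f1) is at least as good as b on both
    objectives and strictly better on one (lower disparity, higher F1)."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])


def pareto_front(scored):
    """Keep the (feature_mask, score) pairs that no other candidate dominates."""
    return [
        (mask, score)
        for mask, score in scored
        if not any(dominates(other, score) for _, other in scored)
    ]


# Hypothetical candidates: a feature mask paired with (disparity, F1).
candidates = [
    ((1, 1, 0), (0.10, 0.80)),
    ((1, 0, 1), (0.20, 0.70)),  # dominated by the first candidate
    ((0, 1, 1), (0.05, 0.60)),
]
front = pareto_front(candidates)
```

In an evolutionary search, each feature mask would be scored by training a model on the selected columns and evaluating both metrics; the filtering above then shows which masks represent distinct fairness/accuracy trade-offs.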