Exploring the performance of resampling strategies for the class imbalance problem

The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. Experiments using eight differ...


Bibliographic details
Main authors: García, Vicente; Sánchez, José Salvador; Mollineda, Ramón A.
Format: Conference proceeding
Language: eng
Subjects:
Online access: Full text
container_end_page 549
container_issue
container_start_page 541
container_title
container_volume
creator García, Vicente
Sánchez, José Salvador
Mollineda, Ramón A.
description The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. Experiments using eight different classifiers show that the most significant differences are for data sets with low or moderate imbalance: over-sampling clearly appears as better than under-sampling for local classifiers, whereas some under-sampling strategies outperform oversampling when employing classifiers with global learning.
doi_str_mv 10.5555/1945758.1945822
format Conference Proceeding
fullrecord <record><control><sourceid>acm</sourceid><recordid>TN_cdi_acm_books_10_5555_1945758_1945822_brief</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>acm_books_10_5555_1945758_1945822</sourcerecordid><originalsourceid>FETCH-LOGICAL-a157t-197b949e46cc1e5194568f4c4c656ad177ebd0d3fc878b96977a176b266df7b23</originalsourceid><addsrcrecordid>eNqNkDtPwzAUhS0hJKB0ZvXI0mA7fmVEVXlIlZAQzJbtXJeAU0d2Bn4-SckP6FnOcM85uvoQuqOkEpMeaMOFErqaXTN2gW5qyRmtCaPyCq1L-SaTuJKa0Gv0vvsdYsrd8YDHL8AD5JByb48ecAo4Q7H9EOdrGbMd4dBBwVPiFPbRloK73tl4Kgw5uQj9LboMNhZYL75Cn0-7j-3LZv_2_Lp93G8sFWrc0Ea5hjfApfcUxPyu1IF77qWQtqVKgWtJWwevlXaNbJSyVEnHpGyDcqxeoep_1_reuJR-iqHEzAzMwsAsDIzLHYSpcH9mof4DkjVeJQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Exploring the performance of resampling strategies for the class imbalance problem</title><source>Springer Books</source><creator>García, Vicente ; Sánchez, José Salvador ; Mollineda, Ramón A.</creator><contributor>García-Pedrajas, Nicolás ; Herrera, Francisco ; Fyfe, Colin ; Benítez, José Manuel ; Ali, Moonis</contributor><creatorcontrib>García, Vicente ; Sánchez, José Salvador ; Mollineda, Ramón A. ; García-Pedrajas, Nicolás ; Herrera, Francisco ; Fyfe, Colin ; Benítez, José Manuel ; Ali, Moonis</creatorcontrib><description>The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. 
Experiments using eight different classifiers show that the most significant differences are for data sets with low or moderate imbalance: over-sampling clearly appears as better than under-sampling for local classifiers, whereas some under-sampling strategies outperform oversampling when employing classifiers with global learning.</description><identifier>ISBN: 3642130216</identifier><identifier>ISBN: 9783642130212</identifier><identifier>DOI: 10.5555/1945758.1945822</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer-Verlag</publisher><subject>Computing methodologies ; Computing methodologies -- Machine learning ; Computing methodologies -- Machine learning -- Learning paradigms ; Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning ; Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning -- Supervised learning by classification ; Computing methodologies -- Machine learning -- Machine learning approaches ; Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression trees</subject><ispartof>Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I, 2010, p.541-549</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>309,310,776,780,785,786,27902</link.rule.ids></links><search><contributor>García-Pedrajas, Nicolás</contributor><contributor>Herrera, Francisco</contributor><contributor>Fyfe, Colin</contributor><contributor>Benítez, José Manuel</contributor><contributor>Ali, Moonis</contributor><creatorcontrib>García, Vicente</creatorcontrib><creatorcontrib>Sánchez, José 
Salvador</creatorcontrib><creatorcontrib>Mollineda, Ramón A.</creatorcontrib><title>Exploring the performance of resampling strategies for the class imbalance problem</title><title>Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I</title><description>The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. Experiments using eight different classifiers show that the most significant differences are for data sets with low or moderate imbalance: over-sampling clearly appears as better than under-sampling for local classifiers, whereas some under-sampling strategies outperform oversampling when employing classifiers with global learning.</description><subject>Computing methodologies</subject><subject>Computing methodologies -- Machine learning</subject><subject>Computing methodologies -- Machine learning -- Learning paradigms</subject><subject>Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning</subject><subject>Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning -- Supervised learning by classification</subject><subject>Computing methodologies -- Machine learning -- Machine learning approaches</subject><subject>Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression 
trees</subject><isbn>3642130216</isbn><isbn>9783642130212</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid/><recordid>eNqNkDtPwzAUhS0hJKB0ZvXI0mA7fmVEVXlIlZAQzJbtXJeAU0d2Bn4-SckP6FnOcM85uvoQuqOkEpMeaMOFErqaXTN2gW5qyRmtCaPyCq1L-SaTuJKa0Gv0vvsdYsrd8YDHL8AD5JByb48ecAo4Q7H9EOdrGbMd4dBBwVPiFPbRloK73tl4Kgw5uQj9LboMNhZYL75Cn0-7j-3LZv_2_Lp93G8sFWrc0Ea5hjfApfcUxPyu1IF77qWQtqVKgWtJWwevlXaNbJSyVEnHpGyDcqxeoep_1_reuJR-iqHEzAzMwsAsDIzLHYSpcH9mof4DkjVeJQ</recordid><startdate>20100601</startdate><enddate>20100601</enddate><creator>García, Vicente</creator><creator>Sánchez, José Salvador</creator><creator>Mollineda, Ramón A.</creator><general>Springer-Verlag</general><scope/></search><sort><creationdate>20100601</creationdate><title>Exploring the performance of resampling strategies for the class imbalance problem</title><author>García, Vicente ; Sánchez, José Salvador ; Mollineda, Ramón A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a157t-197b949e46cc1e5194568f4c4c656ad177ebd0d3fc878b96977a176b266df7b23</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Computing methodologies</topic><topic>Computing methodologies -- Machine learning</topic><topic>Computing methodologies -- Machine learning -- Learning paradigms</topic><topic>Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning</topic><topic>Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning -- Supervised learning by classification</topic><topic>Computing methodologies -- Machine learning -- Machine learning approaches</topic><topic>Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression 
trees</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>García, Vicente</creatorcontrib><creatorcontrib>Sánchez, José Salvador</creatorcontrib><creatorcontrib>Mollineda, Ramón A.</creatorcontrib></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>García, Vicente</au><au>Sánchez, José Salvador</au><au>Mollineda, Ramón A.</au><au>García-Pedrajas, Nicolás</au><au>Herrera, Francisco</au><au>Fyfe, Colin</au><au>Benítez, José Manuel</au><au>Ali, Moonis</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Exploring the performance of resampling strategies for the class imbalance problem</atitle><btitle>Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I</btitle><date>2010-06-01</date><risdate>2010</risdate><spage>541</spage><epage>549</epage><pages>541-549</pages><isbn>3642130216</isbn><isbn>9783642130212</isbn><abstract>The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. Experiments using eight different classifiers show that the most significant differences are for data sets with low or moderate imbalance: over-sampling clearly appears as better than under-sampling for local classifiers, whereas some under-sampling strategies outperform oversampling when employing classifiers with global learning.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer-Verlag</pub><doi>10.5555/1945758.1945822</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISBN: 3642130216
ispartof Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I, 2010, p.541-549
issn
language eng
recordid cdi_acm_books_10_5555_1945758_1945822_brief
source Springer Books
subjects Computing methodologies
Computing methodologies -- Machine learning
Computing methodologies -- Machine learning -- Learning paradigms
Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning
Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning -- Supervised learning by classification
Computing methodologies -- Machine learning -- Machine learning approaches
Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression trees
title Exploring the performance of resampling strategies for the class imbalance problem
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T15%3A52%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Exploring%20the%20performance%20of%20resampling%20strategies%20for%20the%20class%20imbalance%20problem&rft.btitle=Proceedings%20of%20the%2023rd%20international%20conference%20on%20Industrial%20engineering%20and%20other%20applications%20of%20applied%20intelligent%20systems%20-%20Volume%20Part%20I&rft.au=Garc%C3%ADa,%20Vicente&rft.date=2010-06-01&rft.spage=541&rft.epage=549&rft.pages=541-549&rft.isbn=3642130216&rft.isbn_list=9783642130212&rft_id=info:doi/10.5555/1945758.1945822&rft_dat=%3Cacm%3Eacm_books_10_5555_1945758_1945822%3C/acm%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true