Exploring the performance of resampling strategies for the class imbalance problem
The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. Experiments using eight differ...
Main authors: | García, Vicente; Sánchez, José Salvador; Mollineda, Ramón A. |
---|---|
Format: | Conference Proceeding |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 549 |
---|---|
container_issue | |
container_start_page | 541 |
container_title | |
container_volume | |
creator | García, Vicente Sánchez, José Salvador Mollineda, Ramón A. |
description | The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. Experiments using eight different classifiers show that the most significant differences are for data sets with low or moderate imbalance: over-sampling clearly appears as better than under-sampling for local classifiers, whereas some under-sampling strategies outperform oversampling when employing classifiers with global learning. |
doi_str_mv | 10.5555/1945758.1945822 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>acm</sourceid><recordid>TN_cdi_acm_books_10_5555_1945758_1945822_brief</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>acm_books_10_5555_1945758_1945822</sourcerecordid><originalsourceid>FETCH-LOGICAL-a157t-197b949e46cc1e5194568f4c4c656ad177ebd0d3fc878b96977a176b266df7b23</originalsourceid><addsrcrecordid>eNqNkDtPwzAUhS0hJKB0ZvXI0mA7fmVEVXlIlZAQzJbtXJeAU0d2Bn4-SckP6FnOcM85uvoQuqOkEpMeaMOFErqaXTN2gW5qyRmtCaPyCq1L-SaTuJKa0Gv0vvsdYsrd8YDHL8AD5JByb48ecAo4Q7H9EOdrGbMd4dBBwVPiFPbRloK73tl4Kgw5uQj9LboMNhZYL75Cn0-7j-3LZv_2_Lp93G8sFWrc0Ea5hjfApfcUxPyu1IF77qWQtqVKgWtJWwevlXaNbJSyVEnHpGyDcqxeoep_1_reuJR-iqHEzAzMwsAsDIzLHYSpcH9mof4DkjVeJQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Exploring the performance of resampling strategies for the class imbalance problem</title><source>Springer Books</source><creator>García, Vicente ; Sánchez, José Salvador ; Mollineda, Ramón A.</creator><contributor>García-Pedrajas, Nicolás ; Herrera, Francisco ; Fyfe, Colin ; Benítez, José Manuel ; Ali, Moonis</contributor><creatorcontrib>García, Vicente ; Sánchez, José Salvador ; Mollineda, Ramón A. ; García-Pedrajas, Nicolás ; Herrera, Francisco ; Fyfe, Colin ; Benítez, José Manuel ; Ali, Moonis</creatorcontrib><description>The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. 
Experiments using eight different classifiers show that the most significant differences are for data sets with low or moderate imbalance: over-sampling clearly appears as better than under-sampling for local classifiers, whereas some under-sampling strategies outperform oversampling when employing classifiers with global learning.</description><identifier>ISBN: 3642130216</identifier><identifier>ISBN: 9783642130212</identifier><identifier>DOI: 10.5555/1945758.1945822</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer-Verlag</publisher><subject>Computing methodologies ; Computing methodologies -- Machine learning ; Computing methodologies -- Machine learning -- Learning paradigms ; Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning ; Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning -- Supervised learning by classification ; Computing methodologies -- Machine learning -- Machine learning approaches ; Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression trees</subject><ispartof>Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I, 2010, p.541-549</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>309,310,776,780,785,786,27902</link.rule.ids></links><search><contributor>García-Pedrajas, Nicolás</contributor><contributor>Herrera, Francisco</contributor><contributor>Fyfe, Colin</contributor><contributor>Benítez, José Manuel</contributor><contributor>Ali, Moonis</contributor><creatorcontrib>García, Vicente</creatorcontrib><creatorcontrib>Sánchez, José 
Salvador</creatorcontrib><creatorcontrib>Mollineda, Ramón A.</creatorcontrib><title>Exploring the performance of resampling strategies for the class imbalance problem</title><title>Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I</title><description>The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. Experiments using eight different classifiers show that the most significant differences are for data sets with low or moderate imbalance: over-sampling clearly appears as better than under-sampling for local classifiers, whereas some under-sampling strategies outperform oversampling when employing classifiers with global learning.</description><subject>Computing methodologies</subject><subject>Computing methodologies -- Machine learning</subject><subject>Computing methodologies -- Machine learning -- Learning paradigms</subject><subject>Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning</subject><subject>Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning -- Supervised learning by classification</subject><subject>Computing methodologies -- Machine learning -- Machine learning approaches</subject><subject>Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression 
trees</subject><isbn>3642130216</isbn><isbn>9783642130212</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid/><recordid>eNqNkDtPwzAUhS0hJKB0ZvXI0mA7fmVEVXlIlZAQzJbtXJeAU0d2Bn4-SckP6FnOcM85uvoQuqOkEpMeaMOFErqaXTN2gW5qyRmtCaPyCq1L-SaTuJKa0Gv0vvsdYsrd8YDHL8AD5JByb48ecAo4Q7H9EOdrGbMd4dBBwVPiFPbRloK73tl4Kgw5uQj9LboMNhZYL75Cn0-7j-3LZv_2_Lp93G8sFWrc0Ea5hjfApfcUxPyu1IF77qWQtqVKgWtJWwevlXaNbJSyVEnHpGyDcqxeoep_1_reuJR-iqHEzAzMwsAsDIzLHYSpcH9mof4DkjVeJQ</recordid><startdate>20100601</startdate><enddate>20100601</enddate><creator>García, Vicente</creator><creator>Sánchez, José Salvador</creator><creator>Mollineda, Ramón A.</creator><general>Springer-Verlag</general><scope/></search><sort><creationdate>20100601</creationdate><title>Exploring the performance of resampling strategies for the class imbalance problem</title><author>García, Vicente ; Sánchez, José Salvador ; Mollineda, Ramón A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a157t-197b949e46cc1e5194568f4c4c656ad177ebd0d3fc878b96977a176b266df7b23</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Computing methodologies</topic><topic>Computing methodologies -- Machine learning</topic><topic>Computing methodologies -- Machine learning -- Learning paradigms</topic><topic>Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning</topic><topic>Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning -- Supervised learning by classification</topic><topic>Computing methodologies -- Machine learning -- Machine learning approaches</topic><topic>Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression 
trees</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>García, Vicente</creatorcontrib><creatorcontrib>Sánchez, José Salvador</creatorcontrib><creatorcontrib>Mollineda, Ramón A.</creatorcontrib></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>García, Vicente</au><au>Sánchez, José Salvador</au><au>Mollineda, Ramón A.</au><au>García-Pedrajas, Nicolás</au><au>Herrera, Francisco</au><au>Fyfe, Colin</au><au>Benítez, José Manuel</au><au>Ali, Moonis</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Exploring the performance of resampling strategies for the class imbalance problem</atitle><btitle>Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I</btitle><date>2010-06-01</date><risdate>2010</risdate><spage>541</spage><epage>549</epage><pages>541-549</pages><isbn>3642130216</isbn><isbn>9783642130212</isbn><abstract>The present paper studies the influence of two distinct factors on the performance of some resampling strategies for handling imbalanced data sets. In particular, we focus on the nature of the classifier used, along with the ratio between minority and majority classes. Experiments using eight different classifiers show that the most significant differences are for data sets with low or moderate imbalance: over-sampling clearly appears as better than under-sampling for local classifiers, whereas some under-sampling strategies outperform oversampling when employing classifiers with global learning.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer-Verlag</pub><doi>10.5555/1945758.1945822</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISBN: 3642130216 |
ispartof | Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I, 2010, p.541-549 |
issn | |
language | eng |
recordid | cdi_acm_books_10_5555_1945758_1945822_brief |
source | Springer Books |
subjects | Computing methodologies Computing methodologies -- Machine learning Computing methodologies -- Machine learning -- Learning paradigms Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning -- Supervised learning by classification Computing methodologies -- Machine learning -- Machine learning approaches Computing methodologies -- Machine learning -- Machine learning approaches -- Classification and regression trees |
title | Exploring the performance of resampling strategies for the class imbalance problem |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T15%3A52%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Exploring%20the%20performance%20of%20resampling%20strategies%20for%20the%20class%20imbalance%20problem&rft.btitle=Proceedings%20of%20the%2023rd%20international%20conference%20on%20Industrial%20engineering%20and%20other%20applications%20of%20applied%20intelligent%20systems%20-%20Volume%20Part%20I&rft.au=Garc%C3%ADa,%20Vicente&rft.date=2010-06-01&rft.spage=541&rft.epage=549&rft.pages=541-549&rft.isbn=3642130216&rft.isbn_list=9783642130212&rft_id=info:doi/10.5555/1945758.1945822&rft_dat=%3Cacm%3Eacm_books_10_5555_1945758_1945822%3C/acm%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |