Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation
Labeling errors in datasets are common, if not systematic, in practice. They naturally arise in a variety of contexts: human labeling, noisy labeling, and weak labeling (i.e., image classification), for example. This presents a persistent and pervasive stress on machine learning practice. In particular, neural network (NN) architectures can withstand minor amounts of dataset imperfection with traditional countermeasures such as regularization, data augmentation, and batch normalization. However, major dataset imperfections often prove insurmountable. We propose and study the implementation of Rockafellian Relaxation (RR), a new loss reweighting, architecture-independent methodology, for neural network training. Experiments indicate RR can enhance standard neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RR can mitigate the effects of dataset corruption due to both (heavy) labeling error and/or adversarial perturbation, demonstrating effectiveness across a variety of data domains and machine learning tasks.
Saved in:
Published in: | arXiv.org 2024-05 |
---|---|
Main authors: | Chen, Louis L; Chern, Bobbie; Eckstrand, Eric; Mahapatra, Amogh; Royset, Johannes O |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
container_title | arXiv.org |
creator | Chen, Louis L; Chern, Bobbie; Eckstrand, Eric; Mahapatra, Amogh; Royset, Johannes O |
description | Labeling errors in datasets are common, if not systematic, in practice. They naturally arise in a variety of contexts: human labeling, noisy labeling, and weak labeling (i.e., image classification), for example. This presents a persistent and pervasive stress on machine learning practice. In particular, neural network (NN) architectures can withstand minor amounts of dataset imperfection with traditional countermeasures such as regularization, data augmentation, and batch normalization. However, major dataset imperfections often prove insurmountable. We propose and study the implementation of Rockafellian Relaxation (RR), a new loss reweighting, architecture-independent methodology, for neural network training. Experiments indicate RR can enhance standard neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RR can mitigate the effects of dataset corruption due to both (heavy) labeling error and/or adversarial perturbation, demonstrating effectiveness across a variety of data domains and machine learning tasks. |
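The abstract describes RR only at a high level, as a loss-reweighting scheme applied during training. The sketch below illustrates the general idea of per-sample loss reweighting: samples with unusually high loss (a proxy for likely label noise) are downweighted before the losses are aggregated. The specific weight update and the `penalty` parameter here are illustrative assumptions, not the Rockafellian Relaxation formulation from the paper.

```python
import numpy as np

def cross_entropy(probs, labels):
    # Per-sample cross-entropy loss given predicted class probabilities
    # (rows sum to 1) and integer class labels.
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12)

def reweighted_loss(probs, labels, penalty=2.0):
    # Generic loss-reweighting sketch: start from uniform weights 1/n,
    # shift each weight down in proportion to its sample's loss (scaled
    # by a hypothetical `penalty` parameter), then clip to be nonnegative
    # and renormalize onto the probability simplex. High-loss samples,
    # which are more likely mislabeled, end up with the smallest weights.
    losses = cross_entropy(probs, labels)
    n = len(losses)
    w = np.full(n, 1.0 / n) - losses / (2.0 * penalty * n)
    w = np.clip(w, 0.0, None)
    w /= w.sum()
    return float(np.dot(w, losses)), w
```

In a training loop, the reweighted total would replace the usual mean loss in the gradient step, so a heavily corrupted sample contributes little to the parameter update.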
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-05 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3063930184 |
source | Free E-Journals |
subjects | Cognitive tasks; Computer vision; Data augmentation; Data mining; Datasets; Defects; Errors; Image classification; Labeling; Machine learning; Natural language processing; Neural networks; Regularization; Sentiment analysis |
title | Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T07%3A16%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Mitigating%20the%20Impact%20of%20Labeling%20Errors%20on%20Training%20via%20Rockafellian%20Relaxation&rft.jtitle=arXiv.org&rft.au=Chen,%20Louis%20L&rft.date=2024-05-30&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3063930184%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3063930184&rft_id=info:pmid/&rfr_iscdi=true |