Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation

Labeling errors in datasets are common, if not systematic, in practice. They arise naturally in a variety of contexts: human labeling, noisy labeling, and weak labeling (e.g., image classification). This places a persistent and pervasive stress on machine learning practice. In particular, neural network (NN) architectures can withstand minor amounts of dataset imperfection with traditional countermeasures such as regularization, data augmentation, and batch normalization; major dataset imperfections, however, often prove insurmountable. We propose and study the implementation of Rockafellian Relaxation (RR), a new loss-reweighting, architecture-independent methodology, for neural network training. Experiments indicate that RR can enhance standard neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RR can mitigate the effects of dataset corruption due to both (heavy) labeling error and adversarial perturbation, demonstrating effectiveness across a variety of data domains and machine learning tasks.

Full description

Saved in:
Bibliographic details
Published in: arXiv.org 2024-05
Main authors: Chen, Louis L, Chern, Bobbie, Eckstrand, Eric, Mahapatra, Amogh, Royset, Johannes O
Format: Article
Language: eng
Subjects:
Online access: Full text
container_title arXiv.org
creator Chen, Louis L
Chern, Bobbie
Eckstrand, Eric
Mahapatra, Amogh
Royset, Johannes O
description Labeling errors in datasets are common, if not systematic, in practice. They arise naturally in a variety of contexts: human labeling, noisy labeling, and weak labeling (e.g., image classification). This places a persistent and pervasive stress on machine learning practice. In particular, neural network (NN) architectures can withstand minor amounts of dataset imperfection with traditional countermeasures such as regularization, data augmentation, and batch normalization; major dataset imperfections, however, often prove insurmountable. We propose and study the implementation of Rockafellian Relaxation (RR), a new loss-reweighting, architecture-independent methodology, for neural network training. Experiments indicate that RR can enhance standard neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RR can mitigate the effects of dataset corruption due to both (heavy) labeling error and adversarial perturbation, demonstrating effectiveness across a variety of data domains and machine learning tasks.
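The abstract describes RR as an architecture-independent loss-reweighting method that downweights suspect training examples. As a rough, hypothetical illustration of the general loss-reweighting idea only (this is not the paper's Rockafellian formulation; the function name, schedule, and hyperparameters below are invented for the sketch), one can fit a logistic regression while periodically zeroing the weights of the highest-loss samples, which tend to be the ones with flipped labels:

```python
import numpy as np

def reweighted_logistic_fit(X, y, steps=300, lr=0.5, trim=0.2):
    """Gradient descent on weighted logistic loss; every 50 steps,
    the `trim` fraction of samples with the highest loss (suspected
    label errors) get weight 0. Illustrative sketch, not the paper's RR."""
    w = np.zeros(X.shape[1])
    weights = np.ones(len(y))  # per-sample weights, start uniform
    for t in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # predicted probabilities
        grad = X.T @ (weights * (p - y)) / weights.sum()
        w -= lr * grad
        if t % 50 == 49:                            # refresh weights from losses
            loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
            cutoff = np.quantile(loss, 1.0 - trim)
            weights = np.where(loss > cutoff, 0.0, 1.0)
    return w

# Tiny demo: linearly separable data with 20% of labels flipped.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y_true = (X[:, 0] + X[:, 1] > 0).astype(float)
y_noisy = y_true.copy()
flip = rng.choice(400, size=80, replace=False)
y_noisy[flip] = 1 - y_noisy[flip]

w = reweighted_logistic_fit(X, y_noisy)
acc = ((X @ w > 0).astype(float) == y_true).mean()  # accuracy vs. clean labels
```

Despite training on corrupted labels, the periodic downweighting lets the fit recover a decision boundary close to the clean one; the paper's contribution is a principled, optimization-theoretic version of this reweighting that applies to general neural network training.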
format Article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-05
issn 2331-8422
language eng
recordid cdi_proquest_journals_3063930184
source Free E-Journals
subjects Cognitive tasks
Computer vision
Data augmentation
Data mining
Datasets
Defects
Errors
Image classification
Labeling
Machine learning
Natural language processing
Neural networks
Regularization
Sentiment analysis
title Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T07%3A16%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Mitigating%20the%20Impact%20of%20Labeling%20Errors%20on%20Training%20via%20Rockafellian%20Relaxation&rft.jtitle=arXiv.org&rft.au=Chen,%20Louis%20L&rft.date=2024-05-30&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3063930184%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3063930184&rft_id=info:pmid/&rfr_iscdi=true