A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization

We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a...

Detailed description

Bibliographic details
Published in: Journal of machine learning research 2018-01, Vol.19 (1), p.517-564
Main authors: Chen, Ruidi; Paschalidis, Ioannis Ch
Format: Article
Language: eng
Online access: Full text
container_end_page 564
container_issue 1
container_start_page 517
container_title Journal of machine learning research
container_volume 19
creator Chen, Ruidi
Paschalidis, Ioannis Ch
description We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The set of distributions under consideration are close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem which encompasses a class of models. By selecting proper norm spaces for the Wasserstein metric, we are able to recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach to a host of regression models, in terms of the prediction and estimation accuracies. We also consider the application of our robust learning procedure to outlier detection, and show that our approach achieves a much higher AUC (Area Under the ROC Curve) than M-estimation (Huber, 1964, 1973).
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8378760</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2563706229</sourcerecordid><originalsourceid>FETCH-LOGICAL-p173t-e67530fe8b89bcf8721f27243d071d9a76a30d927693928fcc6b95ac4e2724953</originalsourceid><addsrcrecordid>eNpVj9FKwzAUhoMobk7fIZfeFNKkSZobYU6dwmQw9NaQpukWSZuapMJ8ejudF17955zv8MF_AqY5JSTjApenPzPOioLQCbiI8R2hnFPMzsGEFAXOieBT8DaHG18NMcGVUaGz3RbO-z54pXew8QFuzDaYGK3v4LOvjYvwVkVTw3G_szEFWw1phMq5_Z9o3Sfb2i91uF-Cs0a5aK6OOQOvD_cvi8dstV4-LearrM85SZlhnBLUmLIqRaWbkuO8wRwXpEY8r4XiTBFUC8yZIGO1RmtWCap0YQ5fgpIZuPn19kPVmlqbLgXlZB9sq8JeemXlf9LZndz6T1kSXnKGRsH1URD8x2Bikq2N2jinOuOHKDFlhCOGsSDfuDxsAg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2563706229</pqid></control><display><type>article</type><title>A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization</title><source>ACM Digital Library Complete</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Chen, Ruidi ; Paschalidis, Ioannis Ch</creator><creatorcontrib>Chen, Ruidi ; Paschalidis, Ioannis Ch</creatorcontrib><description>We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The set of distributions under consideration are close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem which encompasses a class of models. 
By selecting proper norm spaces for the Wasserstein metric, we are able to recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach to a host of regression models, in terms of the prediction and estimation accuracies. We also consider the application of our robust learning procedure to outlier detection, and show that our approach achieves a much higher AUC (Area Under the ROC Curve) than M-estimation ( Huber, 1964 , 1973 ).</description><identifier>ISSN: 1532-4435</identifier><identifier>EISSN: 1533-7928</identifier><identifier>PMID: 34421397</identifier><language>eng</language><ispartof>Journal of machine learning research, 2018-01, Vol.19 (1), p.517-564</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,776,780,881</link.rule.ids></links><search><creatorcontrib>Chen, Ruidi</creatorcontrib><creatorcontrib>Paschalidis, Ioannis Ch</creatorcontrib><title>A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization</title><title>Journal of machine learning research</title><description>We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially 
contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The set of distributions under consideration are close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem which encompasses a class of models. By selecting proper norm spaces for the Wasserstein metric, we are able to recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach to a host of regression models, in terms of the prediction and estimation accuracies. 
We also consider the application of our robust learning procedure to outlier detection, and show that our approach achieves a much higher AUC (Area Under the ROC Curve) than M-estimation ( Huber, 1964 , 1973 ).</description><issn>1532-4435</issn><issn>1533-7928</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNpVj9FKwzAUhoMobk7fIZfeFNKkSZobYU6dwmQw9NaQpukWSZuapMJ8ejudF17955zv8MF_AqY5JSTjApenPzPOioLQCbiI8R2hnFPMzsGEFAXOieBT8DaHG18NMcGVUaGz3RbO-z54pXew8QFuzDaYGK3v4LOvjYvwVkVTw3G_szEFWw1phMq5_Z9o3Sfb2i91uF-Cs0a5aK6OOQOvD_cvi8dstV4-LearrM85SZlhnBLUmLIqRaWbkuO8wRwXpEY8r4XiTBFUC8yZIGO1RmtWCap0YQ5fgpIZuPn19kPVmlqbLgXlZB9sq8JeemXlf9LZndz6T1kSXnKGRsH1URD8x2Bikq2N2jinOuOHKDFlhCOGsSDfuDxsAg</recordid><startdate>20180101</startdate><enddate>20180101</enddate><creator>Chen, Ruidi</creator><creator>Paschalidis, Ioannis Ch</creator><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20180101</creationdate><title>A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization</title><author>Chen, Ruidi ; Paschalidis, Ioannis Ch</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p173t-e67530fe8b89bcf8721f27243d071d9a76a30d927693928fcc6b95ac4e2724953</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chen, Ruidi</creatorcontrib><creatorcontrib>Paschalidis, Ioannis Ch</creatorcontrib><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of machine learning research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chen, Ruidi</au><au>Paschalidis, Ioannis 
Ch</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization</atitle><jtitle>Journal of machine learning research</jtitle><date>2018-01-01</date><risdate>2018</risdate><volume>19</volume><issue>1</issue><spage>517</spage><epage>564</epage><pages>517-564</pages><issn>1532-4435</issn><eissn>1533-7928</eissn><abstract>We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The set of distributions under consideration are close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem which encompasses a class of models. By selecting proper norm spaces for the Wasserstein metric, we are able to recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach to a host of regression models, in terms of the prediction and estimation accuracies. 
We also consider the application of our robust learning procedure to outlier detection, and show that our approach achieves a much higher AUC (Area Under the ROC Curve) than M-estimation ( Huber, 1964 , 1973 ).</abstract><pmid>34421397</pmid><tpages>48</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1532-4435
ispartof Journal of machine learning research, 2018-01, Vol.19 (1), p.517-564
issn 1532-4435
1533-7928
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8378760
source ACM Digital Library Complete; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
title A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T04%3A37%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Robust%20Learning%20Approach%20for%20Regression%20Models%20Based%20on%20Distributionally%20Robust%20Optimization&rft.jtitle=Journal%20of%20machine%20learning%20research&rft.au=Chen,%20Ruidi&rft.date=2018-01-01&rft.volume=19&rft.issue=1&rft.spage=517&rft.epage=564&rft.pages=517-564&rft.issn=1532-4435&rft.eissn=1533-7928&rft_id=info:doi/&rft_dat=%3Cproquest_pubme%3E2563706229%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2563706229&rft_id=info:pmid/34421397&rfr_iscdi=true