Novel and Efficient Approximations for Zero-One Loss of Linear Classifiers

The predictive quality of machine learning models is typically measured in terms of their (approximate) expected prediction accuracy or the so-called Area Under the Curve (AUC). Minimizing the reciprocals of these measures is the goal of supervised learning. However, when the models are constructed by means of empirical risk minimization (ERM), surrogate functions such as the logistic loss or hinge loss are optimized instead. In this work, we show that in the case of linear predictors, the expected error and the expected ranking loss can be effectively approximated by smooth functions whose closed form expressions and those of their first (and second) order derivatives depend on the first and second moments of the data distribution, which can be precomputed. Hence, the complexity of an optimization algorithm applied to these functions does not depend on the size of the training data. These approximation functions are derived under the assumption that the output of the linear classifier for a given data set has an approximately normal distribution. We argue that this assumption is significantly weaker than the Gaussian assumption on the data itself, and we support this claim by demonstrating that our new approximation is quite accurate on data sets that are not necessarily Gaussian. We present computational results showing that our proposed approximations and related optimization algorithms can produce linear classifiers with similar or better test accuracy or AUC than those obtained using state-of-the-art approaches, in a fraction of the time.
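The core idea in the abstract admits a short illustration. The sketch below assumes a two-class problem with labels ±1 and a linear classifier sign(w·x + b); the helper names (`class_moments`, `smoothed_zero_one_loss`) are illustrative, not the authors' implementation. If the classifier output w·x + b is approximately normal within each class, the per-class misclassification probability is a normal CDF of the standardized margin, so the expected zero-one loss becomes a smooth closed-form function of the precomputed class priors, means, and covariances.

```python
import numpy as np
from scipy.stats import norm


def class_moments(X, y):
    """Precompute per-class priors, means, and covariances in one pass over the data."""
    moments = {}
    for label in (+1, -1):
        Xc = X[y == label]
        moments[label] = (
            len(Xc) / len(X),            # class prior pi_y
            Xc.mean(axis=0),             # first moment mu_y
            np.cov(Xc, rowvar=False),    # second (central) moment Sigma_y
        )
    return moments


def smoothed_zero_one_loss(w, b, moments, eps=1e-12):
    """Smooth approximation of the expected 0-1 error of sign(w.x + b).

    Assumes w.x + b is roughly Gaussian within each class, so the per-class
    misclassification probability is Phi(-y * mean / std).
    """
    loss = 0.0
    for label, (prior, mu, Sigma) in moments.items():
        mean = w @ mu + b                    # mean of the classifier output on this class
        std = np.sqrt(w @ Sigma @ w) + eps   # its standard deviation
        loss += prior * norm.cdf(-label * mean / std)
    return loss
```

Once the moments are precomputed, each evaluation of this surrogate (and of its gradient, which involves the normal density instead of the CDF) costs O(d^2) in the feature dimension and is independent of the number of training points, so it can be handed to any smooth optimizer.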

Bibliographic Details
Published in: arXiv.org, 2019-02
Main authors: Ghanbari, Hiva; Li, Minhan; Scheinberg, Katya
Format: Article
Language: English
Subjects: Accuracy; Algorithms; Approximation; Classifiers; Machine learning; Mathematical analysis; Mathematical models; Normal distribution; Optimization; Predictions
Online access: Full text
Published: 2019-02-28
Publisher: Cornell University Library, arXiv.org (Ithaca)
Rights: 2019; this work is published under http://creativecommons.org/licenses/by/4.0/
EISSN: 2331-8422