Novel and Efficient Approximations for Zero-One Loss of Linear Classifiers
The predictive quality of machine learning models is typically measured in terms of their (approximate) expected prediction accuracy or the so-called Area Under the Curve (AUC). Minimizing the reciprocals of these measures is the goal of supervised learning. However, when the models are constructed...
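To make the idea concrete, here is a minimal sketch (not the paper's implementation) of the kind of smooth, moment-based approximation the abstract describes: assuming the linear score w·x + b is approximately normal within each class, the expected zero-one error and the AUC have closed forms in the per-class means and covariances. All names below (moments_by_class, approx_error, approx_auc, the ±1 label convention) are illustrative assumptions, not the authors' notation.

```python
import numpy as np
from scipy.stats import norm

def moments_by_class(X, y):
    """Precompute per-class means, covariances, and the positive-class prior.
    These are the only statistics the approximations below use, so each later
    evaluation is independent of the number of training points."""
    X_pos, X_neg = X[y == 1], X[y == -1]          # assumes labels in {+1, -1}
    return (X_pos.mean(axis=0), np.cov(X_pos, rowvar=False),
            X_neg.mean(axis=0), np.cov(X_neg, rowvar=False),
            len(X_pos) / len(X))

def approx_error(w, b, mu_pos, cov_pos, mu_neg, cov_neg, p_pos):
    """Smooth approximation of the expected zero-one error of sign(w.x + b),
    assuming the score is approximately normal within each class."""
    std_pos = np.sqrt(w @ cov_pos @ w)                 # score std, positive class
    std_neg = np.sqrt(w @ cov_neg @ w)                 # score std, negative class
    err_pos = norm.cdf(-(w @ mu_pos + b) / std_pos)    # P(score < 0 | y = +1)
    err_neg = norm.cdf((w @ mu_neg + b) / std_neg)     # P(score > 0 | y = -1)
    return p_pos * err_pos + (1.0 - p_pos) * err_neg

def approx_auc(w, mu_pos, cov_pos, mu_neg, cov_neg):
    """Normal approximation of AUC = P(score of a positive > score of a negative),
    treating the two scores as independent normals; 1 - AUC is a smooth stand-in
    for the expected ranking loss."""
    gap = w @ (mu_pos - mu_neg)
    spread = np.sqrt(w @ (cov_pos + cov_neg) @ w)
    return norm.cdf(gap / spread)
```

Because both quantities are compositions of the normal CDF with quadratic forms in w, their gradients (via the normal density) are also available in closed form, and each evaluation costs on the order of d² in the feature dimension rather than growing with the sample size, which is where the claimed speed advantage comes from.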
Saved in:
Published in: | arXiv.org 2019-02 |
---|---|
Main authors: | Ghanbari, Hiva; Li, Minhan; Scheinberg, Katya |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_title | arXiv.org |
---|---|
creator | Ghanbari, Hiva; Li, Minhan; Scheinberg, Katya |
description | The predictive quality of machine learning models is typically measured in terms of their (approximate) expected prediction accuracy or the so-called Area Under the Curve (AUC). Minimizing the reciprocals of these measures is the goal of supervised learning. However, when the models are constructed by means of empirical risk minimization (ERM), surrogate functions such as the logistic loss or hinge loss are optimized instead. In this work, we show that in the case of linear predictors, the expected error and the expected ranking loss can be effectively approximated by smooth functions whose closed-form expressions and those of their first (and second) order derivatives depend on the first and second moments of the data distribution, which can be precomputed. Hence, the complexity of an optimization algorithm applied to these functions does not depend on the size of the training data. These approximation functions are derived under the assumption that the output of the linear classifier for a given data set has an approximately normal distribution. We argue that this assumption is significantly weaker than the Gaussian assumption on the data itself, and we support this claim by demonstrating that our new approximation is quite accurate on data sets that are not necessarily Gaussian. We present computational results showing that our proposed approximations and related optimization algorithms can produce linear classifiers with similar or better test accuracy or AUC than those obtained using state-of-the-art approaches, in a fraction of the time. |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2019-02 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2187712983 |
source | Free E-Journals |
subjects | Accuracy; Algorithms; Approximation; Classifiers; Machine learning; Mathematical analysis; Mathematical models; Normal distribution; Optimization; Predictions |
title | Novel and Efficient Approximations for Zero-One Loss of Linear Classifiers |