The Binormal Assumption on Precision-Recall Curves

The precision-recall curve (PRC) has become a widespread conceptual basis for assessing classification performance. The curve relates the positive predictive value of a classifier to its true positive rate and often provides a useful alternative to the well-known receiver operating characteristic (R...

Bibliographic details
Main authors: Brodersen, K H; Ong, C S; Stephan, K E; Buhmann, J M
Format: Conference proceeding
Language: eng
container_end_page 4266
container_issue
container_start_page 4263
container_title
container_volume
creator Brodersen, K H
Ong, C S
Stephan, K E
Buhmann, J M
description The precision-recall curve (PRC) has become a widespread conceptual basis for assessing classification performance. The curve relates the positive predictive value of a classifier to its true positive rate and often provides a useful alternative to the well-known receiver operating characteristic (ROC). The empirical PRC, however, turns out to be a highly imprecise estimate of the true curve, especially in the case of a small sample size and class imbalance in favour of negative examples. Ironically, this situation tends to occur precisely in those applications where the curve would be most useful, e.g., in anomaly detection or information retrieval. Here, we propose to estimate the PRC on the basis of a simple distributional assumption about the decision values that generalizes the established binormal model for estimating smooth ROC curves. Using simulations, we show that our approach outperforms empirical estimates, and that an account of the class imbalance is crucial for obtaining unbiased PRC estimates.
doi_str_mv 10.1109/ICPR.2010.1036
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5597760</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5597760</ieee_id><sourcerecordid>5597760</sourcerecordid><originalsourceid>FETCH-LOGICAL-c246t-58c94561abccccb9e149766645f29132c150396299f20f0923cdda45283a88863</originalsourceid><addsrcrecordid>eNo1j1tLAzEUhOMNXOu--uLL_oHUnJOcXB7r4qVQsJT6XLJpFgO7bdm0gv_eFXUYGD4GBoaxOxBTAOEe5vVyNUXxg0LqM1Y6Y0GhUoYUqHNWoJXAzYgX7Oa_QLxkBQgCrjTBNStzTo1AbbQhooLh-iNWj2m3H3rfVbOcT_3hmPa7avRyiCHlEfgqBt91VX0aPmO-ZVet73Is_3LC3p-f1vUrX7y9zOvZggdU-sjJBqdIg2_CqMZFUM5orRW16EBiABLSaXSuRdEKhzJst17R-MJba7WcsPvf3RRj3ByG1Pvha0PkjNFCfgMuhUcb</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>The Binormal Assumption on Precision-Recall Curves</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Brodersen, K H ; Ong, C S ; Stephan, K E ; Buhmann, J M</creator><creatorcontrib>Brodersen, K H ; Ong, C S ; Stephan, K E ; Buhmann, J M</creatorcontrib><description>The precision-recall curve (PRC) has become a widespread conceptual basis for assessing classification performance. The curve relates the positive predictive value of a classifier to its true positive rate and often provides a useful alternative to the well-known receiver operating characteristic (ROC). The empirical PRC, however, turns out to be a highly imprecise estimate of the true curve, especially in the case of a small sample size and class imbalance in favour of negative examples. Ironically, this situation tends to occur precisely in those applications where the curve would be most useful, e.g., in anomaly detection or information retrieval. 
Here, we propose to estimate the PRC on the basis of a simple distributional assumption about the decision values that generalizes the established binormal model for estimating smooth ROC curves. Using simulations, we show that our approach outperforms empirical estimates, and that an account of the class imbalance is crucial for obtaining unbiased PRC estimates.</description><identifier>ISSN: 1051-4651</identifier><identifier>ISBN: 1424475422</identifier><identifier>ISBN: 9781424475421</identifier><identifier>EISSN: 2831-7475</identifier><identifier>EISBN: 9781424475414</identifier><identifier>EISBN: 9780769541099</identifier><identifier>EISBN: 1424475414</identifier><identifier>EISBN: 0769541097</identifier><identifier>DOI: 10.1109/ICPR.2010.1036</identifier><language>eng</language><publisher>IEEE</publisher><subject>Accuracy ; classification performance ; Computational modeling ; Data models ; Estimation ; false discovery rate ; generalizability ; information retrieval ; Mathematical model ; Predictive models ; receiver operating characteristic ; Solid modeling</subject><ispartof>2010 20th International Conference on Pattern Recognition, 2010, p.4263-4266</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c246t-58c94561abccccb9e149766645f29132c150396299f20f0923cdda45283a88863</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5597760$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5597760$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Brodersen, K H</creatorcontrib><creatorcontrib>Ong, C S</creatorcontrib><creatorcontrib>Stephan, K E</creatorcontrib><creatorcontrib>Buhmann, J 
M</creatorcontrib><title>The Binormal Assumption on Precision-Recall Curves</title><title>2010 20th International Conference on Pattern Recognition</title><addtitle>ICPR</addtitle><description>The precision-recall curve (PRC) has become a widespread conceptual basis for assessing classification performance. The curve relates the positive predictive value of a classifier to its true positive rate and often provides a useful alternative to the well-known receiver operating characteristic (ROC). The empirical PRC, however, turns out to be a highly imprecise estimate of the true curve, especially in the case of a small sample size and class imbalance in favour of negative examples. Ironically, this situation tends to occur precisely in those applications where the curve would be most useful, e.g., in anomaly detection or information retrieval. Here, we propose to estimate the PRC on the basis of a simple distributional assumption about the decision values that generalizes the established binormal model for estimating smooth ROC curves. 
Using simulations, we show that our approach outperforms empirical estimates, and that an account of the class imbalance is crucial for obtaining unbiased PRC estimates.</description><subject>Accuracy</subject><subject>classification performance</subject><subject>Computational modeling</subject><subject>Data models</subject><subject>Estimation</subject><subject>false discovery rate</subject><subject>generalizability</subject><subject>information retrieval</subject><subject>Mathematical model</subject><subject>Predictive models</subject><subject>receiver operating characteristic</subject><subject>Solid modeling</subject><issn>1051-4651</issn><issn>2831-7475</issn><isbn>1424475422</isbn><isbn>9781424475421</isbn><isbn>9781424475414</isbn><isbn>9780769541099</isbn><isbn>1424475414</isbn><isbn>0769541097</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1j1tLAzEUhOMNXOu--uLL_oHUnJOcXB7r4qVQsJT6XLJpFgO7bdm0gv_eFXUYGD4GBoaxOxBTAOEe5vVyNUXxg0LqM1Y6Y0GhUoYUqHNWoJXAzYgX7Oa_QLxkBQgCrjTBNStzTo1AbbQhooLh-iNWj2m3H3rfVbOcT_3hmPa7avRyiCHlEfgqBt91VX0aPmO-ZVet73Is_3LC3p-f1vUrX7y9zOvZggdU-sjJBqdIg2_CqMZFUM5orRW16EBiABLSaXSuRdEKhzJst17R-MJba7WcsPvf3RRj3ByG1Pvha0PkjNFCfgMuhUcb</recordid><startdate>201008</startdate><enddate>201008</enddate><creator>Brodersen, K H</creator><creator>Ong, C S</creator><creator>Stephan, K E</creator><creator>Buhmann, J M</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201008</creationdate><title>The Binormal Assumption on Precision-Recall Curves</title><author>Brodersen, K H ; Ong, C S ; Stephan, K E ; Buhmann, J 
M</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c246t-58c94561abccccb9e149766645f29132c150396299f20f0923cdda45283a88863</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Accuracy</topic><topic>classification performance</topic><topic>Computational modeling</topic><topic>Data models</topic><topic>Estimation</topic><topic>false discovery rate</topic><topic>generalizability</topic><topic>information retrieval</topic><topic>Mathematical model</topic><topic>Predictive models</topic><topic>receiver operating characteristic</topic><topic>Solid modeling</topic><toplevel>online_resources</toplevel><creatorcontrib>Brodersen, K H</creatorcontrib><creatorcontrib>Ong, C S</creatorcontrib><creatorcontrib>Stephan, K E</creatorcontrib><creatorcontrib>Buhmann, J M</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Brodersen, K H</au><au>Ong, C S</au><au>Stephan, K E</au><au>Buhmann, J M</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>The Binormal Assumption on Precision-Recall Curves</atitle><btitle>2010 20th International Conference on Pattern 
Recognition</btitle><stitle>ICPR</stitle><date>2010-08</date><risdate>2010</risdate><spage>4263</spage><epage>4266</epage><pages>4263-4266</pages><issn>1051-4651</issn><eissn>2831-7475</eissn><isbn>1424475422</isbn><isbn>9781424475421</isbn><eisbn>9781424475414</eisbn><eisbn>9780769541099</eisbn><eisbn>1424475414</eisbn><eisbn>0769541097</eisbn><abstract>The precision-recall curve (PRC) has become a widespread conceptual basis for assessing classification performance. The curve relates the positive predictive value of a classifier to its true positive rate and often provides a useful alternative to the well-known receiver operating characteristic (ROC). The empirical PRC, however, turns out to be a highly imprecise estimate of the true curve, especially in the case of a small sample size and class imbalance in favour of negative examples. Ironically, this situation tends to occur precisely in those applications where the curve would be most useful, e.g., in anomaly detection or information retrieval. Here, we propose to estimate the PRC on the basis of a simple distributional assumption about the decision values that generalizes the established binormal model for estimating smooth ROC curves. Using simulations, we show that our approach outperforms empirical estimates, and that an account of the class imbalance is crucial for obtaining unbiased PRC estimates.</abstract><pub>IEEE</pub><doi>10.1109/ICPR.2010.1036</doi><tpages>4</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1051-4651
ispartof 2010 20th International Conference on Pattern Recognition, 2010, p.4263-4266
issn 1051-4651
2831-7475
language eng
recordid cdi_ieee_primary_5597760
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Accuracy
classification performance
Computational modeling
Data models
Estimation
false discovery rate
generalizability
information retrieval
Mathematical model
Predictive models
receiver operating characteristic
Solid modeling
title The Binormal Assumption on Precision-Recall Curves
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T21%3A45%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=The%20Binormal%20Assumption%20on%20Precision-Recall%20Curves&rft.btitle=2010%2020th%20International%20Conference%20on%20Pattern%20Recognition&rft.au=Brodersen,%20K%20H&rft.date=2010-08&rft.spage=4263&rft.epage=4266&rft.pages=4263-4266&rft.issn=1051-4651&rft.eissn=2831-7475&rft.isbn=1424475422&rft.isbn_list=9781424475421&rft_id=info:doi/10.1109/ICPR.2010.1036&rft_dat=%3Cieee_6IE%3E5597760%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424475414&rft.eisbn_list=9780769541099&rft.eisbn_list=1424475414&rft.eisbn_list=0769541097&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5597760&rfr_iscdi=true
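
Illustrative sketch (not part of the catalogue record): the description field says the PRC can be estimated from a distributional assumption on the decision values that generalizes the binormal ROC model, and that the class imbalance must be accounted for. The Python sketch below shows one way such a curve could be computed, assuming the decision values of each class are normally distributed with parameters that are already known or fitted. The function names, the parameter names, and the threshold grid are hypothetical; the paper's own estimator may differ, and fitting the parameters from data is not shown.

```python
from statistics import NormalDist


def binormal_pr_point(threshold, mu_pos, sd_pos, mu_neg, sd_neg, prior_pos):
    """Return (recall, precision) at one threshold under a binormal model.

    Hypothetical helper: assumes positive-class decision values follow
    N(mu_pos, sd_pos), negative-class values follow N(mu_neg, sd_neg),
    and prior_pos is the fraction of positive examples.
    """
    # Recall (true positive rate): share of positives scoring above the threshold.
    tpr = 1.0 - NormalDist(mu_pos, sd_pos).cdf(threshold)
    # False positive rate: share of negatives scoring above the threshold.
    fpr = 1.0 - NormalDist(mu_neg, sd_neg).cdf(threshold)
    # Precision by Bayes' rule; the class prior enters explicitly here.
    predicted_pos = prior_pos * tpr + (1.0 - prior_pos) * fpr
    if predicted_pos > 0:
        precision = prior_pos * tpr / predicted_pos
    else:
        precision = float("nan")  # nothing is predicted positive
    return tpr, precision


def binormal_pr_curve(mu_pos, sd_pos, mu_neg, sd_neg, prior_pos, n_points=101):
    """Evaluate the model-based PRC on an evenly spaced threshold grid."""
    # Assumed grid: four standard deviations around both class means.
    lo = min(mu_pos - 4 * sd_pos, mu_neg - 4 * sd_neg)
    hi = max(mu_pos + 4 * sd_pos, mu_neg + 4 * sd_neg)
    step = (hi - lo) / (n_points - 1)
    return [
        binormal_pr_point(lo + i * step, mu_pos, sd_pos, mu_neg, sd_neg, prior_pos)
        for i in range(n_points)
    ]


if __name__ == "__main__":
    # Same score distributions, two different class priors.
    for prior in (0.5, 0.1):
        recall, precision = binormal_pr_point(0.0, 1.0, 1.0, -1.0, 1.0, prior)
        print(f"prior={prior}: recall={recall:.3f}, precision={precision:.3f}")
```

With unit-variance classes centred at +1 and -1 and a threshold of 0, recall is about 0.84 for either prior, while precision falls from about 0.84 with balanced classes to about 0.37 when only 10% of examples are positive. That dependence on the prior is why a ROC-style binormal fit alone does not determine the PRC.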