WEAKLY SUPERVISED CLUSTERING: LEARNING FINE-GRAINED SIGNALS FROM COARSE LABELS

Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering prob...


Bibliographic details
Published in: The annals of applied statistics 2015-06, Vol.9 (2), p.801-820
Main authors: Wager, Stefan, Blocker, Alexander, Cardin, Niall
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 820
container_issue 2
container_start_page 801
container_title The annals of applied statistics
container_volume 9
creator Wager, Stefan
Blocker, Alexander
Cardin, Niall
description Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering problem. We propose three approaches to solving the weakly supervised clustering problem, including a latent variables model that performs well in our experiments. We illustrate our methods on an analysis of aggregated elections data and an industry data set that was the original motivation for this research.
doi_str_mv 10.1214/15-AOAS812
format Article
fullrecord <record><control><sourceid>jstor_proje</sourceid><recordid>TN_cdi_projecteuclid_primary_oai_CULeuclid_euclid_aoas_1437397112</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>24522603</jstor_id><sourcerecordid>24522603</sourcerecordid><originalsourceid>FETCH-LOGICAL-c306t-25cc2f1be2630bbce7527d95e914b906e39f57cf141445c8d55de8b7676a32813</originalsourceid><addsrcrecordid>eNo9kE1Lw0AURQdRsFY37oVZC9F585FJ3I3pNAbHRJJGcRWSyQRaKpGkLvz3tjR0dS-P887iInQL5AEo8EcQnspUEQA9QzMIOXiSMXJ-6Ix6Pgh5ia7GcUOI4AGHGUo_tXo1X7go33X-kRR6gSNTFiudJ2n8hI1WebpveJmk2otztY8FLpI4VabAyzx7w1Gm8kJjo561Ka7RRVdvR3cz5RyVS72KXjyTxUmkjGcZ8XceFdbSDhpHfUaaxjopqGxD4ULgTUh8x8JOSNsBB86FDVohWhc00pd-zWgAbI7U0fsz9Btnd-7Xbtdt9TOsv-vhr-rrdRWVZrpOUff1WAFnkoUSgO4d90eHHfpxHFx3egdSHdasQFTTmnv47ghvxl0_nEjKBaU-YewfAIVp-Q</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>WEAKLY SUPERVISED CLUSTERING: LEARNING FINE-GRAINED SIGNALS FROM COARSE LABELS</title><source>Jstor Complete Legacy</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Project Euclid Complete</source><source>Alma/SFX Local Collection</source><source>JSTOR Mathematics &amp; Statistics</source><creator>Wager, Stefan ; Blocker, Alexander ; Cardin, Niall</creator><creatorcontrib>Wager, Stefan ; Blocker, Alexander ; Cardin, Niall</creatorcontrib><description>Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering problem. We propose three approaches to solving the weakly supervised clustering problem, including a latent variables model that performs well in our experiments. 
We illustrate our methods on an analysis of aggregated elections data and an industry data set that was the original motivation for this research.</description><identifier>ISSN: 1932-6157</identifier><identifier>EISSN: 1941-7330</identifier><identifier>DOI: 10.1214/15-AOAS812</identifier><language>eng</language><publisher>Institute of Mathematical Statistics</publisher><subject>Latent variables model ; uncertain class label</subject><ispartof>The annals of applied statistics, 2015-06, Vol.9 (2), p.801-820</ispartof><rights>Copyright © 2015 Institute of Mathematical Statistics</rights><rights>Copyright 2015 Institute of Mathematical Statistics</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c306t-25cc2f1be2630bbce7527d95e914b906e39f57cf141445c8d55de8b7676a32813</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/24522603$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/24522603$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,776,780,799,828,881,921,27903,27904,57995,57999,58228,58232</link.rule.ids></links><search><creatorcontrib>Wager, Stefan</creatorcontrib><creatorcontrib>Blocker, Alexander</creatorcontrib><creatorcontrib>Cardin, Niall</creatorcontrib><title>WEAKLY SUPERVISED CLUSTERING: LEARNING FINE-GRAINED SIGNALS FROM COARSE LABELS</title><title>The annals of applied statistics</title><description>Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering problem. 
We propose three approaches to solving the weakly supervised clustering problem, including a latent variables model that performs well in our experiments. We illustrate our methods on an analysis of aggregated elections data and an industry data set that was the original motivation for this research.</description><subject>Latent variables model</subject><subject>uncertain class label</subject><issn>1932-6157</issn><issn>1941-7330</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><recordid>eNo9kE1Lw0AURQdRsFY37oVZC9F585FJ3I3pNAbHRJJGcRWSyQRaKpGkLvz3tjR0dS-P887iInQL5AEo8EcQnspUEQA9QzMIOXiSMXJ-6Ix6Pgh5ia7GcUOI4AGHGUo_tXo1X7go33X-kRR6gSNTFiudJ2n8hI1WebpveJmk2otztY8FLpI4VabAyzx7w1Gm8kJjo561Ka7RRVdvR3cz5RyVS72KXjyTxUmkjGcZ8XceFdbSDhpHfUaaxjopqGxD4ULgTUh8x8JOSNsBB86FDVohWhc00pd-zWgAbI7U0fsz9Btnd-7Xbtdt9TOsv-vhr-rrdRWVZrpOUff1WAFnkoUSgO4d90eHHfpxHFx3egdSHdasQFTTmnv47ghvxl0_nEjKBaU-YewfAIVp-Q</recordid><startdate>20150601</startdate><enddate>20150601</enddate><creator>Wager, Stefan</creator><creator>Blocker, Alexander</creator><creator>Cardin, Niall</creator><general>Institute of Mathematical Statistics</general><general>The Institute of Mathematical Statistics</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20150601</creationdate><title>WEAKLY SUPERVISED CLUSTERING: LEARNING FINE-GRAINED SIGNALS FROM COARSE LABELS</title><author>Wager, Stefan ; Blocker, Alexander ; Cardin, Niall</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c306t-25cc2f1be2630bbce7527d95e914b906e39f57cf141445c8d55de8b7676a32813</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Latent variables model</topic><topic>uncertain class label</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wager, Stefan</creatorcontrib><creatorcontrib>Blocker, 
Alexander</creatorcontrib><creatorcontrib>Cardin, Niall</creatorcontrib><collection>CrossRef</collection><jtitle>The annals of applied statistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wager, Stefan</au><au>Blocker, Alexander</au><au>Cardin, Niall</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>WEAKLY SUPERVISED CLUSTERING: LEARNING FINE-GRAINED SIGNALS FROM COARSE LABELS</atitle><jtitle>The annals of applied statistics</jtitle><date>2015-06-01</date><risdate>2015</risdate><volume>9</volume><issue>2</issue><spage>801</spage><epage>820</epage><pages>801-820</pages><issn>1932-6157</issn><eissn>1941-7330</eissn><abstract>Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering problem. We propose three approaches to solving the weakly supervised clustering problem, including a latent variables model that performs well in our experiments. We illustrate our methods on an analysis of aggregated elections data and an industry data set that was the original motivation for this research.</abstract><pub>Institute of Mathematical Statistics</pub><doi>10.1214/15-AOAS812</doi><tpages>20</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1932-6157
ispartof The annals of applied statistics, 2015-06, Vol.9 (2), p.801-820
issn 1932-6157
1941-7330
language eng
recordid cdi_projecteuclid_primary_oai_CULeuclid_euclid_aoas_1437397112
source Jstor Complete Legacy; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Project Euclid Complete; Alma/SFX Local Collection; JSTOR Mathematics & Statistics
subjects Latent variables model
uncertain class label
title WEAKLY SUPERVISED CLUSTERING: LEARNING FINE-GRAINED SIGNALS FROM COARSE LABELS
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T03%3A57%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proje&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=WEAKLY%20SUPERVISED%20CLUSTERING:%20LEARNING%20FINE-GRAINED%20SIGNALS%20FROM%20COARSE%20LABELS&rft.jtitle=The%20annals%20of%20applied%20statistics&rft.au=Wager,%20Stefan&rft.date=2015-06-01&rft.volume=9&rft.issue=2&rft.spage=801&rft.epage=820&rft.pages=801-820&rft.issn=1932-6157&rft.eissn=1941-7330&rft_id=info:doi/10.1214/15-AOAS812&rft_dat=%3Cjstor_proje%3E24522603%3C/jstor_proje%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_jstor_id=24522603&rfr_iscdi=true