WEAKLY SUPERVISED CLUSTERING: LEARNING FINE-GRAINED SIGNALS FROM COARSE LABELS
Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering prob...
Published in: | The annals of applied statistics 2015-06, Vol.9 (2), p.801-820 |
---|---|
Main authors: | Wager, Stefan; Blocker, Alexander; Cardin, Niall |
Format: | Article |
Language: | eng |
Subjects: | Latent variables model; uncertain class label |
Online access: | Full text |
container_end_page | 820 |
---|---|
container_issue | 2 |
container_start_page | 801 |
container_title | The annals of applied statistics |
container_volume | 9 |
creator | Wager, Stefan ; Blocker, Alexander ; Cardin, Niall |
description | Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering problem. We propose three approaches to solving the weakly supervised clustering problem, including a latent variables model that performs well in our experiments. We illustrate our methods on an analysis of aggregated elections data and an industry data set that was the original motivation for this research. |
doi_str_mv | 10.1214/15-AOAS812 |
format | Article |
publisher | Institute of Mathematical Statistics |
eissn | 1941-7330 |
date | 2015-06-01 |
rights | Copyright © 2015 Institute of Mathematical Statistics |
tpages | 20 |
jstor_id | 24522603 |
lds50 | peer_reviewed |
oa | free_for_read |
linktopdf | https://www.jstor.org/stable/pdf/24522603 |
linktohtml | https://www.jstor.org/stable/24522603 |
fulltext | fulltext |
identifier | ISSN: 1932-6157 |
ispartof | The annals of applied statistics, 2015-06, Vol.9 (2), p.801-820 |
issn | 1932-6157 1941-7330 |
language | eng |
recordid | cdi_projecteuclid_primary_oai_CULeuclid_euclid_aoas_1437397112 |
source | Jstor Complete Legacy; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Project Euclid Complete; Alma/SFX Local Collection; JSTOR Mathematics & Statistics |
subjects | Latent variables model ; uncertain class label |
title | WEAKLY SUPERVISED CLUSTERING: LEARNING FINE-GRAINED SIGNALS FROM COARSE LABELS |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T03%3A57%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proje&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=WEAKLY%20SUPERVISED%20CLUSTERING:%20LEARNING%20FINE-GRAINED%20SIGNALS%20FROM%20COARSE%20LABELS&rft.jtitle=The%20annals%20of%20applied%20statistics&rft.au=Wager,%20Stefan&rft.date=2015-06-01&rft.volume=9&rft.issue=2&rft.spage=801&rft.epage=820&rft.pages=801-820&rft.issn=1932-6157&rft.eissn=1941-7330&rft_id=info:doi/10.1214/15-AOAS812&rft_dat=%3Cjstor_proje%3E24522603%3C/jstor_proje%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_jstor_id=24522603&rfr_iscdi=true |