Adding Seemingly Uninformative Labels Helps in Low Data Regimes
Evidence suggests that networks trained on large datasets generalize well not solely because of the numerous training examples, but also class diversity which encourages learning of enriched features. This raises the question of whether this remains true when data is scarce - is there an advantage to learning with additional labels in low-data regimes? In this work, we consider a task that requires difficult-to-obtain expert annotations: tumor segmentation in mammography images. We show that, in low-data settings, performance can be improved by complementing the expert annotations with seemingly uninformative labels from non-expert annotators, turning the task into a multi-class problem. We reveal that these gains increase when less expert data is available, and uncover several interesting properties through further studies. We demonstrate our findings on CSAW-S, a new dataset that we introduce here, and confirm them on two public datasets.
Saved in:
Main authors: | Matsoukas, Christos; Hernandez, Albert Bou I; Liu, Yue; Dembrower, Karin; Miranda, Gisele; Konuk, Emir; Haslum, Johan Fredin; Zouzos, Athanasios; Lindholm, Peter; Strand, Fredrik; Smith, Kevin |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Matsoukas, Christos; Hernandez, Albert Bou I; Liu, Yue; Dembrower, Karin; Miranda, Gisele; Konuk, Emir; Haslum, Johan Fredin; Zouzos, Athanasios; Lindholm, Peter; Strand, Fredrik; Smith, Kevin |
description | Evidence suggests that networks trained on large datasets generalize well not
solely because of the numerous training examples, but also class diversity
which encourages learning of enriched features. This raises the question of
whether this remains true when data is scarce - is there an advantage to
learning with additional labels in low-data regimes? In this work, we consider
a task that requires difficult-to-obtain expert annotations: tumor segmentation
in mammography images. We show that, in low-data settings, performance can be
improved by complementing the expert annotations with seemingly uninformative
labels from non-expert annotators, turning the task into a multi-class problem.
We reveal that these gains increase when less expert data is available, and
uncover several interesting properties through further studies. We demonstrate
our findings on CSAW-S, a new dataset that we introduce here, and confirm them
on two public datasets. |
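The core idea in the abstract, complementing an expert tumor mask with seemingly uninformative non-expert labels so that binary segmentation becomes a multi-class problem, can be sketched roughly as follows. Everything here (the label IDs, the overlap rule, and the helper name `build_multiclass_target`) is an illustrative assumption, not the paper's actual pipeline.

```python
import numpy as np

# Hypothetical label IDs; the paper's exact class set is not given here.
BACKGROUND, TUMOR, NONEXPERT_A, NONEXPERT_B = 0, 1, 2, 3

def build_multiclass_target(tumor_mask, nonexpert_masks):
    """Merge one expert binary mask with non-expert masks into a single
    per-pixel multi-class target map.

    tumor_mask:      2-D array, nonzero where the expert marked tumor.
    nonexpert_masks: dict mapping a label ID to a 2-D binary mask drawn
                     by a non-expert annotator.

    Assumption: where masks overlap, the expert tumor label wins.
    """
    target = np.zeros(tumor_mask.shape, dtype=np.int64)  # all background
    for label, mask in nonexpert_masks.items():
        target[mask.astype(bool)] = label
    target[tumor_mask.astype(bool)] = TUMOR  # expert label takes priority
    return target

# Toy example: a 4x4 image with one tumor pixel and one non-expert region.
tumor = np.zeros((4, 4))
tumor[0, 0] = 1
region = np.zeros((4, 4))
region[0, 0] = 1  # overlaps the tumor pixel
region[1, 1] = 1
target = build_multiclass_target(tumor, {NONEXPERT_A: region})
```

A standard multi-class loss (e.g. per-pixel cross-entropy over these labels) can then be trained on `target` in place of the original binary objective.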
doi_str_mv | 10.48550/arxiv.2008.00807 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2008.00807 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2008_00807 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Statistics - Machine Learning |
title | Adding Seemingly Uninformative Labels Helps in Low Data Regimes |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T14%3A47%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Adding%20Seemingly%20Uninformative%20Labels%20Helps%20in%20Low%20Data%20Regimes&rft.au=Matsoukas,%20Christos&rft.date=2020-07-20&rft_id=info:doi/10.48550/arxiv.2008.00807&rft_dat=%3Carxiv_GOX%3E2008_00807%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |