Recognition From Web Data: A Progressive Filtering Approach

Leveraging the abundance of web data is a promising strategy for addressing the lack of training data for convolutional neural networks (CNNs). However, web images often carry incorrect tags, which may compromise the learned CNN model. To address this problem, this paper focus...

Detailed description


Bibliographic details
Published in:	IEEE transactions on image processing, 2018-11, Vol.27 (11), p.5303-5315
Main authors:	Yang, Jufeng; Sun, Xiaoxiao; Lai, Yu-Kun; Zheng, Liang; Cheng, Ming-Ming
Format:	Article
Language:	eng
Subjects:	Artificial neural networks; CNN; Data models; Filtration; Ice; Image classification; Image quality; Iterative methods; Labels; multiple labels; Neural networks; Noise measurement; Noisy web data; progressive filtering; Recognition; Reliability; Task analysis; Training; Training data

container_end_page	5315
container_issue	11
container_start_page	5303
container_title	IEEE transactions on image processing
container_volume	27
creator	Yang, Jufeng; Sun, Xiaoxiao; Lai, Yu-Kun; Zheng, Liang; Cheng, Ming-Ming
description	Leveraging the abundance of web data is a promising strategy for addressing the lack of training data for convolutional neural networks (CNNs). However, web images often carry incorrect tags, which may compromise the learned CNN model. To address this problem, this paper focuses on image classification and proposes to iterate between filtering out noisy web labels and fine-tuning the CNN model using the crawled web images. Overall, the proposed method benefits from the growing modeling capability of the learned model to correct labels for web images, and from learning on such new data to produce a more effective model. Our contribution is two-fold. First, we propose an iterative method that progressively improves the discriminative ability of CNNs and the accuracy of web image selection. This method helps select high-quality web training images and expand the training set as the model improves. Second, since web images are usually complex and may not be accurately described by a single tag, we propose to assign multiple labels to a web image to reduce the impact of hard label assignment. This labeling strategy mines more training samples to improve the CNN model. In the experiments, we crawl 0.5 million web images covering all categories of four public image classification data sets. Compared with the baseline, which uses no web images for training, we show that the proposed method brings notable improvement. We also report recognition accuracy that is competitive with the state of the art.
doi_str_mv	10.1109/TIP.2018.2855449
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_pubmed_primary_30010575</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8410611</ieee_id><sourcerecordid>2070800733</sourcerecordid><originalsourceid>FETCH-LOGICAL-c436t-55a7cef5e7130f110eaf6fda58ac8784c673d70618ab36e80c95662c65639e8e3</originalsourceid><addsrcrecordid>eNpdkEtLw0AQgBdRfN8FQQJevKTOZJ_RU1GrhYIiisdlu53USJvU3VTw37ul1YOnGZhvXh9jJwg9RCgvX4ZPvQLQ9AojpRDlFtvHUmAOIIrtlIPUuUZR7rGDGD8AUEhUu2yPpzTV5D67fibfTpu6q9smG4R2nr3ROLt1nbvK-tlTaKeBYqy_KBvUs45C3Uyz_mIRWuffj9hO5WaRjjfxkL0O7l5uHvLR4_3wpj_KveCqy6V02lMlSSOHKt1NrlLVxEnjvNFGeKX5RINC48ZckQFfSqUKr6TiJRnih-xiPTet_VxS7Oy8jp5mM9dQu4y2AA0GQHOe0PN_6Ee7DE26zhaIGkELYxIFa8qHNsZAlV2Eeu7Ct0WwK7E2ibUrsXYjNrWcbQYvx3Oa_DX8mkzA6RqoieivbASmx5D_AC4qeWI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2117107488</pqid></control><display><type>article</type><title>Recognition From Web Data: A Progressive Filtering Approach</title><source>IEEE Electronic Library (IEL)</source><creator>Yang, Jufeng ; Sun, Xiaoxiao ; Lai, Yu-Kun ; Zheng, Liang ; Cheng, Ming-Ming</creator><creatorcontrib>Yang, Jufeng ; Sun, Xiaoxiao ; Lai, Yu-Kun ; Zheng, Liang ; Cheng, Ming-Ming</creatorcontrib><description>Leveraging the abundant number of web data is a promising strategy in addressing the problem of data lacking when training convolutional neural networks (CNNs). However, the web images often contain incorrect tags, which may compromise the learned CNN model. To address this problem, this paper focuses on image classification and proposes to iterate between filtering out noisy web labels and fine-tuning the CNN model using the crawled web images. Overall, the proposed method benefits from the growing modeling capability of the learned model to correct labels for web images and learning from such new data to produce a more effective model. Our contribution is two-fold. 
First, we propose an iterative method that progressively improves the discriminative ability of CNNs and the accuracy of web image selection. This method is beneficial toward selecting high-quality web training images and expanding the training set as the model gets ameliorated. Second, since web images are usually complex and may not be accurately described by a single tag, we propose to assign a web image multiple labels to reduce the impact of hard label assignment. This labeling strategy mines more training samples to improve the CNN model. In the experiments, we crawl 0.5 million web images covering all categories of four public image classification data sets. Compared with the baseline which has no web images for training, we show that the proposed method brings notable improvement. We also report the competitive recognition accuracy compared with the state of the art.</description><identifier>ISSN: 1057-7149</identifier><identifier>EISSN: 1941-0042</identifier><identifier>DOI: 10.1109/TIP.2018.2855449</identifier><identifier>PMID: 30010575</identifier><identifier>CODEN: IIPRE4</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Artificial neural networks ; CNN ; Data models ; Filtration ; Ice ; Image classification ; Image quality ; Iterative methods ; Labels ; multiple labels ; Neural networks ; Noise measurement ; Noisy web data ; progressive filtering ; Recognition ; Reliability ; Task analysis ; Training ; Training data</subject><ispartof>IEEE transactions on image processing, 2018-11, Vol.27 (11), p.5303-5315</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c436t-55a7cef5e7130f110eaf6fda58ac8784c673d70618ab36e80c95662c65639e8e3</citedby><cites>FETCH-LOGICAL-c436t-55a7cef5e7130f110eaf6fda58ac8784c673d70618ab36e80c95662c65639e8e3</cites><orcidid>0000-0003-0219-3443 ; 0000-0001-5550-8758 ; 0000-0002-1464-9500</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8410611$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8410611$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/30010575$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Yang, Jufeng</creatorcontrib><creatorcontrib>Sun, Xiaoxiao</creatorcontrib><creatorcontrib>Lai, Yu-Kun</creatorcontrib><creatorcontrib>Zheng, Liang</creatorcontrib><creatorcontrib>Cheng, Ming-Ming</creatorcontrib><title>Recognition From Web Data: A Progressive Filtering Approach</title><title>IEEE transactions on image processing</title><addtitle>TIP</addtitle><addtitle>IEEE Trans Image Process</addtitle><description>Leveraging the abundant number of web data is a promising strategy in addressing the problem of data lacking when training convolutional neural networks (CNNs). However, the web images often contain incorrect tags, which may compromise the learned CNN model. To address this problem, this paper focuses on image classification and proposes to iterate between filtering out noisy web labels and fine-tuning the CNN model using the crawled web images. 
Overall, the proposed method benefits from the growing modeling capability of the learned model to correct labels for web images and learning from such new data to produce a more effective model. Our contribution is two-fold. First, we propose an iterative method that progressively improves the discriminative ability of CNNs and the accuracy of web image selection. This method is beneficial toward selecting high-quality web training images and expanding the training set as the model gets ameliorated. Second, since web images are usually complex and may not be accurately described by a single tag, we propose to assign a web image multiple labels to reduce the impact of hard label assignment. This labeling strategy mines more training samples to improve the CNN model. In the experiments, we crawl 0.5 million web images covering all categories of four public image classification data sets. Compared with the baseline which has no web images for training, we show that the proposed method brings notable improvement. 
We also report the competitive recognition accuracy compared with the state of the art.</description><subject>Artificial neural networks</subject><subject>CNN</subject><subject>Data models</subject><subject>Filtration</subject><subject>Ice</subject><subject>Image classification</subject><subject>Image quality</subject><subject>Iterative methods</subject><subject>Labels</subject><subject>multiple labels</subject><subject>Neural networks</subject><subject>Noise measurement</subject><subject>Noisy web data</subject><subject>progressive filtering</subject><subject>Recognition</subject><subject>Reliability</subject><subject>Task analysis</subject><subject>Training</subject><subject>Training data</subject><issn>1057-7149</issn><issn>1941-0042</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkEtLw0AQgBdRfN8FQQJevKTOZJ_RU1GrhYIiisdlu53USJvU3VTw37ul1YOnGZhvXh9jJwg9RCgvX4ZPvQLQ9AojpRDlFtvHUmAOIIrtlIPUuUZR7rGDGD8AUEhUu2yPpzTV5D67fibfTpu6q9smG4R2nr3ROLt1nbvK-tlTaKeBYqy_KBvUs45C3Uyz_mIRWuffj9hO5WaRjjfxkL0O7l5uHvLR4_3wpj_KveCqy6V02lMlSSOHKt1NrlLVxEnjvNFGeKX5RINC48ZckQFfSqUKr6TiJRnih-xiPTet_VxS7Oy8jp5mM9dQu4y2AA0GQHOe0PN_6Ee7DE26zhaIGkELYxIFa8qHNsZAlV2Eeu7Ct0WwK7E2ibUrsXYjNrWcbQYvx3Oa_DX8mkzA6RqoieivbASmx5D_AC4qeWI</recordid><startdate>20181101</startdate><enddate>20181101</enddate><creator>Yang, Jufeng</creator><creator>Sun, Xiaoxiao</creator><creator>Lai, Yu-Kun</creator><creator>Zheng, Liang</creator><creator>Cheng, Ming-Ming</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-0219-3443</orcidid><orcidid>https://orcid.org/0000-0001-5550-8758</orcidid><orcidid>https://orcid.org/0000-0002-1464-9500</orcidid></search><sort><creationdate>20181101</creationdate><title>Recognition From Web Data: A Progressive Filtering Approach</title><author>Yang, Jufeng ; Sun, Xiaoxiao ; Lai, Yu-Kun ; Zheng, Liang ; Cheng, Ming-Ming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c436t-55a7cef5e7130f110eaf6fda58ac8784c673d70618ab36e80c95662c65639e8e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Artificial neural networks</topic><topic>CNN</topic><topic>Data models</topic><topic>Filtration</topic><topic>Ice</topic><topic>Image classification</topic><topic>Image quality</topic><topic>Iterative methods</topic><topic>Labels</topic><topic>multiple labels</topic><topic>Neural networks</topic><topic>Noise measurement</topic><topic>Noisy web data</topic><topic>progressive filtering</topic><topic>Recognition</topic><topic>Reliability</topic><topic>Task analysis</topic><topic>Training</topic><topic>Training data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yang, Jufeng</creatorcontrib><creatorcontrib>Sun, Xiaoxiao</creatorcontrib><creatorcontrib>Lai, Yu-Kun</creatorcontrib><creatorcontrib>Zheng, Liang</creatorcontrib><creatorcontrib>Cheng, Ming-Ming</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library 
(IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on image processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yang, Jufeng</au><au>Sun, Xiaoxiao</au><au>Lai, Yu-Kun</au><au>Zheng, Liang</au><au>Cheng, Ming-Ming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Recognition From Web Data: A Progressive Filtering Approach</atitle><jtitle>IEEE transactions on image processing</jtitle><stitle>TIP</stitle><addtitle>IEEE Trans Image Process</addtitle><date>2018-11-01</date><risdate>2018</risdate><volume>27</volume><issue>11</issue><spage>5303</spage><epage>5315</epage><pages>5303-5315</pages><issn>1057-7149</issn><eissn>1941-0042</eissn><coden>IIPRE4</coden><abstract>Leveraging the abundant number of web data is a promising strategy in addressing the problem of data lacking when training convolutional neural networks (CNNs). However, the web images often contain incorrect tags, which may compromise the learned CNN model. To address this problem, this paper focuses on image classification and proposes to iterate between filtering out noisy web labels and fine-tuning the CNN model using the crawled web images. 
Overall, the proposed method benefits from the growing modeling capability of the learned model to correct labels for web images and learning from such new data to produce a more effective model. Our contribution is two-fold. First, we propose an iterative method that progressively improves the discriminative ability of CNNs and the accuracy of web image selection. This method is beneficial toward selecting high-quality web training images and expanding the training set as the model gets ameliorated. Second, since web images are usually complex and may not be accurately described by a single tag, we propose to assign a web image multiple labels to reduce the impact of hard label assignment. This labeling strategy mines more training samples to improve the CNN model. In the experiments, we crawl 0.5 million web images covering all categories of four public image classification data sets. Compared with the baseline which has no web images for training, we show that the proposed method brings notable improvement. We also report the competitive recognition accuracy compared with the state of the art.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>30010575</pmid><doi>10.1109/TIP.2018.2855449</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0003-0219-3443</orcidid><orcidid>https://orcid.org/0000-0001-5550-8758</orcidid><orcidid>https://orcid.org/0000-0002-1464-9500</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1057-7149
ispartof	IEEE transactions on image processing, 2018-11, Vol.27 (11), p.5303-5315
issn	1057-7149 1941-0042
language	eng
recordid	cdi_pubmed_primary_30010575
source	IEEE Electronic Library (IEL)
subjects	Artificial neural networks; CNN; Data models; Filtration; Ice; Image classification; Image quality; Iterative methods; Labels; multiple labels; Neural networks; Noise measurement; Noisy web data; progressive filtering; Recognition; Reliability; Task analysis; Training; Training data
title	Recognition From Web Data: A Progressive Filtering Approach
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T09%3A23%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Recognition%20From%20Web%20Data:%20A%20Progressive%20Filtering%20Approach&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Yang,%20Jufeng&rft.date=2018-11-01&rft.volume=27&rft.issue=11&rft.spage=5303&rft.epage=5315&rft.pages=5303-5315&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2018.2855449&rft_dat=%3Cproquest_RIE%3E2070800733%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2117107488&rft_id=info:pmid/30010575&rft_ieee_id=8410611&rfr_iscdi=true