DIOR: Learning to Hash With Label Noise Via Dual Partition and Contrastive Learning
Due to the excellent computing efficiency, learning to hash has acquired broad popularity for Big Data retrieval. Although supervised hashing methods have achieved promising performance recently, they presume that all training samples are appropriately annotated. Unfortunately, label noise is ubiquitous owing to erroneous annotations in real-world applications.
Saved in:
Published in: | IEEE transactions on knowledge and data engineering 2024-04, Vol.36 (4), p.1502-1517 |
---|---|
Main authors: | Wang, Haixin; Jiang, Huiyu; Sun, Jinan; Zhang, Shikun; Chen, Chong; Hua, Xian-Sheng; Luo, Xiao |
Format: | Article |
Language: | English |
Subjects: | Annotations; Big Data; Big data retrieval; Codes; contrastive learning; Data models; Data retrieval; Datasets; Labels; Learning; learning to hash; learning with label noise; Loss measurement; Noise measurement; Optimization; Parameters; Semantics; Training |
Online access: | Order full text |
container_end_page | 1517 |
---|---|
container_issue | 4 |
container_start_page | 1502 |
container_title | IEEE transactions on knowledge and data engineering |
container_volume | 36 |
creator | Wang, Haixin; Jiang, Huiyu; Sun, Jinan; Zhang, Shikun; Chen, Chong; Hua, Xian-Sheng; Luo, Xiao |
description | Due to its excellent computing efficiency, learning to hash has acquired broad popularity for Big Data retrieval. Although supervised hashing methods have achieved promising performance recently, they presume that all training samples are appropriately annotated. Unfortunately, label noise is ubiquitous owing to erroneous annotations in real-world applications, which can seriously deteriorate retrieval performance through imprecise supervised guidance and severe memorization of noisy data. Here we propose a comprehensive method, DIOR, to handle the difficulties of learning to hash with label noise. DIOR performs partitions at two complementary levels, namely the sample level and the parameter level. On the one hand, DIOR divides the dataset into a labeled set with clean samples and an unlabeled set with noisy samples using an ensemble of perturbed views. Then we train the network in a contrastive semi-supervised manner by reconstructing label embeddings for both reliable supervision of clean data and sufficient exploration of noisy data. On the other hand, inspired by recent pruning techniques, DIOR divides the parameters of the hashing network into crucial and non-crucial parameters, and then optimizes them separately to reduce overfitting to noisy data. Extensive experiments on four popular benchmark datasets demonstrate the effectiveness of DIOR. |
doi_str_mv | 10.1109/TKDE.2023.3312109 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1041-4347 |
ispartof | IEEE transactions on knowledge and data engineering, 2024-04, Vol.36 (4), p.1502-1517 |
issn | 1041-4347 1558-2191 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TKDE_2023_3312109 |
source | IEEE Electronic Library (IEL) |
subjects | Annotations; Big Data; Big data retrieval; Codes; contrastive learning; Data models; Data retrieval; Datasets; Labels; Learning; learning to hash; learning with label noise; Loss measurement; Noise measurement; Optimization; Parameters; Semantics; Training |
title | DIOR: Learning to Hash With Label Noise Via Dual Partition and Contrastive Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T13%3A50%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DIOR:%20Learning%20to%20Hash%20With%20Label%20Noise%20Via%20Dual%20Partition%20and%20Contrastive%20Learning&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Wang,%20Haixin&rft.date=2024-04-01&rft.volume=36&rft.issue=4&rft.spage=1502&rft.epage=1517&rft.pages=1502-1517&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2023.3312109&rft_dat=%3Cproquest_RIE%3E2947827102%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2947827102&rft_id=info:pmid/&rft_ieee_id=10239525&rfr_iscdi=true |
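The dual partition described in the abstract can be sketched roughly as follows. This is a hypothetical, simplified illustration only, not the authors' implementation: the per-view loss ensemble, the fixed loss threshold, and the magnitude-based importance criterion are all assumptions standing in for the paper's actual selection rules.

```python
def partition_samples(view_losses, threshold):
    """Sample-level partition: average each sample's loss over an
    ensemble of perturbed views; samples with low average loss are
    treated as clean (labeled), the rest as noisy (unlabeled)."""
    n_views = len(view_losses)
    n_samples = len(view_losses[0])
    mean_loss = [sum(view[i] for view in view_losses) / n_views
                 for i in range(n_samples)]
    return [loss < threshold for loss in mean_loss]


def partition_parameters(weights, crucial_ratio=0.5):
    """Parameter-level partition: rank weights by magnitude and flag
    the top fraction as 'crucial', so the two groups can be updated
    under separate optimization rules to curb memorization."""
    k = int(len(weights) * crucial_ratio)
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]), reverse=True)
    crucial = set(ranked[:k])
    return [i in crucial for i in range(len(weights))]


# Two perturbed views, three samples: sample 1 has a large loss
# under both views, suggesting a noisy label.
clean_mask = partition_samples([[0.1, 2.0, 0.2],
                                [0.2, 1.8, 0.1]], threshold=0.5)
print(clean_mask)            # [True, False, True]

crucial_mask = partition_parameters([0.9, -0.1, 0.5, 0.05])
print(crucial_mask)          # [True, False, True, False]
```

In this toy setup, agreement across views stands in for the paper's ensemble criterion, and magnitude ranking stands in for its pruning-inspired importance measure; the real method learns both partitions jointly with the contrastive semi-supervised objective.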