Deep Hierarchy-Aware Proxy Hashing With Self-Paced Learning for Cross-Modal Retrieval


Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2024-11, Vol. 36 (11), p. 5926-5939
Main authors: Huo, Yadong; Qin, Qibing; Zhang, Wenfeng; Huang, Lei; Nie, Jie
Format: Article
Language: English
Subjects: Binary codes; Costs; Cross-modal retrieval; deep hashing; Feature extraction; Hash functions; hierarchy-aware proxy; Representation learning; self-paced learning; semantic hierarchy; Semantics; Training
Online access: Order full text
container_end_page 5939
container_issue 11
container_start_page 5926
container_title IEEE transactions on knowledge and data engineering
container_volume 36
creator Huo, Yadong
Qin, Qibing
Zhang, Wenfeng
Huang, Lei
Nie, Jie
description Due to its low storage cost and high retrieval efficiency, hashing is widely applied in both academia and industry, and it provides an attractive solution for cross-modal similarity retrieval. However, most existing supervised cross-modal hashing methods view the fixed-level semantic affinity defined by manual labels as the supervisory signal that guides hash learning, which represents only a small subset of the complex semantic relations between multi-modal samples, thus impeding hash function learning and degrading the obtained hash codes. In this paper, by learning shared hierarchy proxies, a novel deep cross-modal hashing framework, called Deep Hierarchy-aware Proxy Hashing (DHaPH), is proposed to construct the semantic hierarchy in a data-driven manner, thereby capturing accurate fine-grained semantic relationships and achieving small intra-class scatter and large inter-class scatter. Specifically, by regarding the hierarchical proxies as learnable ancestors, a novel hierarchy-aware proxy loss is designed to model the latent semantic hierarchical structures from different modalities without prior hierarchy knowledge, in which similar samples share the same Lowest Common Ancestor (LCA) and dissimilar points have different LCAs. Meanwhile, to adequately capture valuable semantic information from hard pairs, a multi-modal self-paced loss is introduced into cross-modal hashing to reweight multi-modal pairs dynamically, which enables the model to gradually focus on hard pairs while simultaneously learning universal patterns from multi-modal pairs. Extensive experiments on three public benchmark databases demonstrate that the proposed DHaPH framework outperforms the compared baselines on different evaluation metrics.
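The record reproduces only the abstract, not the paper's equations, so the exact DHaPH losses cannot be restated here. Purely as an illustration of the self-paced reweighting idea described above (easy pairs dominate early training, hard pairs are admitted gradually), the following PyTorch-style sketch shows one plausible way to reweight cross-modal pairs; every name, the logistic weighting function, and the age schedule are assumptions made for this sketch and are not taken from the authors' implementation.

# Hypothetical sketch of self-paced pair reweighting for cross-modal hashing.
# Names, the logistic weighting, and the `age` schedule are illustrative
# assumptions; they are NOT the DHaPH paper's actual formulation.
import torch
import torch.nn.functional as F

def self_paced_pair_weights(pair_losses: torch.Tensor, age: float) -> torch.Tensor:
    # Soft self-paced weights: easy pairs (low loss) get weight close to 1;
    # as `age` grows, pairs with larger losses are gradually admitted.
    return torch.sigmoid(age - pair_losses).detach()

def weighted_cross_modal_loss(img_codes: torch.Tensor,
                              txt_codes: torch.Tensor,
                              similarity: torch.Tensor,
                              age: float) -> torch.Tensor:
    # Cosine similarity between every image code and every text code.
    logits = F.normalize(img_codes, dim=1) @ F.normalize(txt_codes, dim=1).T
    # Per-pair loss: pull similar pairs together, push dissimilar pairs apart.
    pair_losses = similarity * (1 - logits) + (1 - similarity) * torch.clamp(logits, min=0)
    # Reweight each pair with the self-paced gate before averaging.
    weights = self_paced_pair_weights(pair_losses, age)
    return (weights * pair_losses).mean()

# Usage: img_codes / txt_codes are (batch, bits) relaxed codes from the two
# encoders, similarity is a float 0/1 matrix built from the labels, and `age`
# is increased over epochs so the gate slowly admits harder pairs, mirroring
# the "gradually focus on hard pairs" behaviour described in the abstract.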
doi_str_mv 10.1109/TKDE.2024.3401050
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 1041-4347
ispartof IEEE transactions on knowledge and data engineering, 2024-11, Vol.36 (11), p.5926-5939
issn 1041-4347
1558-2191
language eng
recordid cdi_ieee_primary_10530441
source IEEE/IET Electronic Library (IEL) - Journals and E-Books
subjects Binary codes
Costs
Cross-modal retrieval
deep hashing
Feature extraction
Hash functions
hierarchy-aware proxy
Representation learning
self-paced learning
semantic hierarchy
Semantics
Training
title Deep Hierarchy-Aware Proxy Hashing With Self-Paced Learning for Cross-Modal Retrieval
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T19%3A09%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Hierarchy-Aware%20Proxy%20Hashing%20With%20Self-Paced%20Learning%20for%20Cross-Modal%20Retrieval&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Huo,%20Yadong&rft.date=2024-11-01&rft.volume=36&rft.issue=11&rft.spage=5926&rft.epage=5939&rft.pages=5926-5939&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2024.3401050&rft_dat=%3Ccrossref_RIE%3E10_1109_TKDE_2024_3401050%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10530441&rfr_iscdi=true