Deep Hierarchy-Aware Proxy Hashing With Self-Paced Learning for Cross-Modal Retrieval


Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2024-11, Vol. 36 (11), p. 5926-5939
Main authors: Huo, Yadong; Qin, Qibing; Zhang, Wenfeng; Huang, Lei; Nie, Jie
Format: Article
Language: English
Subjects: Binary codes; Costs; Cross-modal retrieval; deep hashing; Feature extraction; Hash functions; hierarchy-aware proxy; Representation learning; self-paced learning; semantic hierarchy; Semantics; Training
Online access: Order full text
container_end_page 5939
container_issue 11
container_start_page 5926
container_title IEEE transactions on knowledge and data engineering
container_volume 36
creator Huo, Yadong
Qin, Qibing
Zhang, Wenfeng
Huang, Lei
Nie, Jie
description Due to its low storage cost and high retrieval efficiency, hashing is widely applied in both academia and industry, and it provides an attractive solution for cross-modal similarity retrieval. However, most existing supervised cross-modal hashing methods view the fixed-level semantic affinity defined by manual labels as the supervisory signal that guides hash learning, which represents only a small subset of the complex semantic relations between multi-modal samples, thus impeding hash function learning and degrading the obtained hash codes. In this paper, by learning shared hierarchy proxies, a novel deep cross-modal hashing framework, called Deep Hierarchy-aware Proxy Hashing (DHaPH), is proposed to construct the semantic hierarchy in a data-driven manner, thereby capturing accurate fine-grained semantic relationships and achieving small intra-class scatter and large inter-class scatter. Specifically, by regarding the hierarchical proxies as learnable ancestors, a novel hierarchy-aware proxy loss is designed to model the latent semantic hierarchical structures from different modalities without prior hierarchy knowledge, in which similar samples share the same Lowest Common Ancestor (LCA) and dissimilar points have different LCAs. Meanwhile, to adequately capture valuable semantic information from hard pairs, a multi-modal self-paced loss is introduced into cross-modal hashing to reweight multi-modal pairs dynamically, which enables the model to gradually focus on hard pairs while simultaneously learning universal patterns from multi-modal pairs. Extensive experiments on three public benchmark databases demonstrate that the proposed DHaPH framework outperforms the compared baselines on different evaluation metrics.
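The record reproduces only the abstract, not the paper's equations, so the exact DHaPH losses cannot be restated here. Purely as an illustration of the self-paced reweighting idea described above (easy pairs dominate early training, hard pairs are admitted gradually), the following PyTorch-style sketch shows one plausible way to reweight cross-modal pairs; every name, the logistic weighting function, and the age schedule are assumptions made for this sketch and are not taken from the authors' implementation.

# Hypothetical sketch of self-paced pair reweighting for cross-modal hashing.
# Names, the logistic weighting, and the `age` schedule are illustrative
# assumptions; they are NOT the DHaPH paper's actual formulation.
import torch
import torch.nn.functional as F

def self_paced_pair_weights(pair_losses: torch.Tensor, age: float) -> torch.Tensor:
    # Soft self-paced weights: easy pairs (low loss) get weight close to 1;
    # as `age` grows, pairs with larger losses are gradually admitted.
    return torch.sigmoid(age - pair_losses).detach()

def weighted_cross_modal_loss(img_codes: torch.Tensor,
                              txt_codes: torch.Tensor,
                              similarity: torch.Tensor,
                              age: float) -> torch.Tensor:
    # Cosine similarity between every image code and every text code.
    logits = F.normalize(img_codes, dim=1) @ F.normalize(txt_codes, dim=1).T
    # Per-pair loss: pull similar pairs together, push dissimilar pairs apart.
    pair_losses = similarity * (1 - logits) + (1 - similarity) * torch.clamp(logits, min=0)
    # Reweight each pair with the self-paced gate before averaging.
    weights = self_paced_pair_weights(pair_losses, age)
    return (weights * pair_losses).mean()

# Usage: img_codes / txt_codes are (batch, bits) relaxed codes from the two
# encoders, similarity is a float 0/1 matrix built from the labels, and `age`
# is increased over epochs so the gate slowly admits harder pairs, mirroring
# the "gradually focus on hard pairs" behaviour described in the abstract.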
doi_str_mv 10.1109/TKDE.2024.3401050
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 1041-4347
ispartof IEEE transactions on knowledge and data engineering, 2024-11, Vol.36 (11), p.5926-5939
issn 1041-4347
1558-2191
language eng
recordid cdi_ieee_primary_10530441
source IEEE/IET Electronic Library (IEL) - Journals and E-Books
subjects Binary codes
Costs
Cross-modal retrieval
deep hashing
Feature extraction
Hash functions
hierarchy-aware proxy
Representation learning
self-paced learning
semantic hierarchy
Semantics
Training
title Deep Hierarchy-Aware Proxy Hashing With Self-Paced Learning for Cross-Modal Retrieval
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T19%3A09%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Hierarchy-Aware%20Proxy%20Hashing%20With%20Self-Paced%20Learning%20for%20Cross-Modal%20Retrieval&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Huo,%20Yadong&rft.date=2024-11-01&rft.volume=36&rft.issue=11&rft.spage=5926&rft.epage=5939&rft.pages=5926-5939&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2024.3401050&rft_dat=%3Ccrossref_RIE%3E10_1109_TKDE_2024_3401050%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10530441&rfr_iscdi=true