Deep Hierarchy-Aware Proxy Hashing With Self-Paced Learning for Cross-Modal Retrieval
Due to its low storage cost and high retrieval efficiency, hashing is widely applied in both academia and industry, and it offers an attractive solution for cross-modal similarity retrieval. However, most existing supervised cross-modal hashing methods treat the fixed-level semantic affinity defined by manual labels as the supervisory signal for hash learning; this affinity captures only a small subset of the complex semantic relations between multi-modal samples, which impedes hash function learning and degrades the resulting hash codes. In this paper, by learning shared hierarchical proxies, a novel deep cross-modal hashing framework, Deep Hierarchy-aware Proxy Hashing (DHaPH), is proposed to construct the semantic hierarchy in a data-driven manner, thereby capturing accurate fine-grained semantic relationships and achieving small intra-class scatter and large inter-class scatter. Specifically, by regarding the hierarchical proxies as learnable ancestors, a novel hierarchy-aware proxy loss is designed to model the latent semantic hierarchical structures across modalities without prior hierarchy knowledge, such that similar samples share the same Lowest Common Ancestor (LCA) while dissimilar samples have different LCAs. Meanwhile, to adequately capture valuable semantic information from hard pairs, a multi-modal self-paced loss is introduced into cross-modal hashing to reweight multi-modal pairs dynamically, enabling the model to gradually focus on hard pairs while simultaneously learning universal patterns from all pairs. Extensive experiments on three public benchmark databases demonstrate that the proposed DHaPH framework outperforms the compared baselines under different evaluation metrics.
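The low storage cost and fast lookup mentioned above come from comparing compact binary codes with the Hamming distance. As an illustrative, framework-agnostic sketch (not the paper's DHaPH pipeline; the function name and toy codes below are hypothetical), ranking database codes against a query code looks like:

```python
import numpy as np

def hamming_rank(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Rank database items by Hamming distance to a query hash code.

    Codes are +/-1 vectors, the usual relaxation in deep hashing; for
    K-bit codes the Hamming distance satisfies d_H(a, b) = (K - a.b) / 2.
    """
    K = query_code.shape[0]
    dists = (K - db_codes @ query_code) // 2  # inner product -> Hamming
    return np.argsort(dists, kind="stable")   # closest items first

# Toy example: a 4-bit "image" query against three "text" codes.
query = np.array([1, -1, 1, 1])
db = np.array([
    [1, -1, 1, 1],    # identical        -> distance 0
    [1, 1, 1, 1],     # one bit flipped  -> distance 1
    [-1, 1, -1, -1],  # all bits flipped -> distance 4
])
print(hamming_rank(query, db))  # -> [0 1 2]
```

In a real cross-modal system, the query and database codes would come from two modality-specific hash networks trained to agree on a shared Hamming space.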
Saved in:
Published in: | IEEE Transactions on Knowledge and Data Engineering, 2024-11, Vol. 36 (11), p. 5926-5939 |
---|---|
Main authors: | Huo, Yadong; Qin, Qibing; Zhang, Wenfeng; Huang, Lei; Nie, Jie |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 5939 |
---|---|
container_issue | 11 |
container_start_page | 5926 |
container_title | IEEE transactions on knowledge and data engineering |
container_volume | 36 |
creator | Huo, Yadong; Qin, Qibing; Zhang, Wenfeng; Huang, Lei; Nie, Jie |
description | Due to its low storage cost and high retrieval efficiency, hashing is widely applied in both academia and industry, and it offers an attractive solution for cross-modal similarity retrieval. However, most existing supervised cross-modal hashing methods treat the fixed-level semantic affinity defined by manual labels as the supervisory signal for hash learning; this affinity captures only a small subset of the complex semantic relations between multi-modal samples, which impedes hash function learning and degrades the resulting hash codes. In this paper, by learning shared hierarchical proxies, a novel deep cross-modal hashing framework, Deep Hierarchy-aware Proxy Hashing (DHaPH), is proposed to construct the semantic hierarchy in a data-driven manner, thereby capturing accurate fine-grained semantic relationships and achieving small intra-class scatter and large inter-class scatter. Specifically, by regarding the hierarchical proxies as learnable ancestors, a novel hierarchy-aware proxy loss is designed to model the latent semantic hierarchical structures across modalities without prior hierarchy knowledge, such that similar samples share the same Lowest Common Ancestor (LCA) while dissimilar samples have different LCAs. Meanwhile, to adequately capture valuable semantic information from hard pairs, a multi-modal self-paced loss is introduced into cross-modal hashing to reweight multi-modal pairs dynamically, enabling the model to gradually focus on hard pairs while simultaneously learning universal patterns from all pairs. Extensive experiments on three public benchmark databases demonstrate that the proposed DHaPH framework outperforms the compared baselines under different evaluation metrics. |
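The multi-modal self-paced loss described above reweights pairs dynamically so that training focuses on easy pairs first and only gradually admits hard ones. A minimal sketch of the classic hard self-paced regularizer (a simplification for illustration, not DHaPH's exact loss; `lam` is an assumed "age" parameter that grows over training):

```python
import numpy as np

def self_paced_weights(pair_losses: np.ndarray, lam: float) -> np.ndarray:
    """Hard self-paced weighting: v_i = 1 if loss_i < lam, else 0.

    Pairs whose current loss exceeds the age parameter `lam` are ignored;
    raising `lam` over training gradually admits harder pairs.
    """
    return (pair_losses < lam).astype(float)

# Per-pair losses, ordered easy -> hard, with lam grown per "epoch".
losses = np.array([0.2, 0.9, 1.5, 3.0])
for lam in (0.5, 1.0, 2.0, 4.0):
    v = self_paced_weights(losses, lam)
    print(lam, v, (v * losses).sum())  # weighted loss drives the update
```

Early on only the easiest pair contributes; by the final step all pairs do, mirroring the "gradually focus on hard pairs" behavior the abstract describes.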
doi_str_mv | 10.1109/TKDE.2024.3401050 |
format | Article |
publisher | IEEE |
orcid | 0000-0003-4087-3677; 0000-0001-7459-2510; 0009-0008-8805-8958; 0000-0001-7976-318X; 0000-0003-4952-7666 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1041-4347 |
ispartof | IEEE transactions on knowledge and data engineering, 2024-11, Vol.36 (11), p.5926-5939 |
issn | 1041-4347 (print); 1558-2191 (electronic) |
language | eng |
recordid | cdi_ieee_primary_10530441 |
source | IEEE/IET Electronic Library (IEL) - Journals and E-Books |
subjects | Binary codes; Costs; Cross-modal retrieval; deep hashing; Feature extraction; Hash functions; hierarchy-aware proxy; Representation learning; self-paced learning; semantic hierarchy; Semantics; Training |
title | Deep Hierarchy-Aware Proxy Hashing With Self-Paced Learning for Cross-Modal Retrieval |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T19%3A09%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Hierarchy-Aware%20Proxy%20Hashing%20With%20Self-Paced%20Learning%20for%20Cross-Modal%20Retrieval&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Huo,%20Yadong&rft.date=2024-11-01&rft.volume=36&rft.issue=11&rft.spage=5926&rft.epage=5939&rft.pages=5926-5939&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2024.3401050&rft_dat=%3Ccrossref_RIE%3E10_1109_TKDE_2024_3401050%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10530441&rfr_iscdi=true |