HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval
With the rapid growth of multi-modal data, hashing methods for cross-modal retrieval have received considerable attention. Deep-networks-based cross-modal hashing methods are appealing as they can integrate feature learning and hash coding into end-to-end trainable frameworks. However, it is still cha...
Saved in:
Main Authors: | Zhang, Xi; Zhou, Siyu; Feng, Jiashi; Lai, Hanjiang; Li, Bo; Pan, Yan; Yin, Jian; Yan, Shuicheng |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computer Vision and Pattern Recognition |
Online Access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Zhang, Xi ; Zhou, Siyu ; Feng, Jiashi ; Lai, Hanjiang ; Li, Bo ; Pan, Yan ; Yin, Jian ; Yan, Shuicheng
description | With the rapid growth of multi-modal data, hashing methods for cross-modal
retrieval have received considerable attention. Deep-networks-based cross-modal
hashing methods are appealing as they can integrate feature learning and hash
coding into end-to-end trainable frameworks. However, it is still challenging
to find content similarities between different modalities of data due to the
heterogeneity gap. To address this problem, we propose an adversarial
hashing network with an attention mechanism that enhances the measurement of
content similarities by selectively focusing on the informative parts of
multi-modal data. The proposed adversarial network, HashGAN, consists of three
building blocks: 1) a feature learning module to obtain feature
representations, 2) a generative attention module to generate an attention
mask, which is used to obtain the attended (foreground) and unattended
(background) feature representations, and 3) a discriminative hash coding
module to learn hash functions that preserve the similarities between
different modalities. In our framework, the generative module and the
discriminative module are trained in an adversarial way: the generator is
trained so that the discriminator cannot preserve the similarities of
multi-modal data w.r.t. the background feature representations, while the
discriminator aims to preserve the similarities of multi-modal data w.r.t.
both the foreground and the background feature representations. Extensive
evaluations on several benchmark datasets demonstrate that the proposed
HashGAN brings substantial improvements over other state-of-the-art
cross-modal hashing methods. |
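
To make the described workflow more concrete, below is a minimal, hypothetical sketch (PyTorch-style Python, not the authors' released code) of the idea in the abstract: a generative attention module splits features into attended (foreground) and unattended (background) parts, a discriminative hash coding module tries to preserve cross-modal similarities on both parts, and the generator is updated adversarially so that the background codes fail to preserve them. All module sizes, the pairwise similarity loss, and every name below are illustrative assumptions, not the paper's actual architecture or objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GenerativeAttention(nn.Module):
    """Produce a soft attention mask over feature positions (assumed design)."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, feats):                            # feats: (batch, positions, dim)
        mask = torch.sigmoid(self.score(feats))          # (batch, positions, 1)
        foreground = (mask * feats).mean(dim=1)          # attended features, (batch, dim)
        background = ((1.0 - mask) * feats).mean(dim=1)  # unattended features
        return foreground, background

class HashCoder(nn.Module):
    """Map a feature vector to relaxed hash codes in [-1, 1]."""
    def __init__(self, dim, bits=32):
        super().__init__()
        self.fc = nn.Linear(dim, bits)

    def forward(self, x):
        return torch.tanh(self.fc(x))

def similarity_loss(code_a, code_b, sim):
    """Toy pairwise loss: pull codes of similar pairs (sim=1) together,
    push dissimilar pairs (sim=0) apart with a unit margin."""
    dist = ((code_a - code_b) ** 2).mean(dim=1)
    return (sim * dist + (1.0 - sim) * F.relu(1.0 - dist)).mean()

# Illustrative single training step on random "image" and "text" features.
dim, batch = 128, 8
attn_img, attn_txt = GenerativeAttention(dim), GenerativeAttention(dim)
hash_img, hash_txt = HashCoder(dim), HashCoder(dim)

opt_gen = torch.optim.Adam(list(attn_img.parameters()) + list(attn_txt.parameters()), lr=1e-4)
opt_dis = torch.optim.Adam(list(hash_img.parameters()) + list(hash_txt.parameters()), lr=1e-4)

img_feats = torch.randn(batch, 49, dim)        # e.g. spatial CNN features
txt_feats = torch.randn(batch, 20, dim)        # e.g. word-level text features
sim = torch.randint(0, 2, (batch,)).float()    # cross-modal similarity labels

# Discriminator step: preserve similarities on BOTH foreground and background codes.
img_fg, img_bg = attn_img(img_feats)
txt_fg, txt_bg = attn_txt(txt_feats)
d_loss = (similarity_loss(hash_img(img_fg.detach()), hash_txt(txt_fg.detach()), sim)
          + similarity_loss(hash_img(img_bg.detach()), hash_txt(txt_bg.detach()), sim))
opt_dis.zero_grad()
d_loss.backward()
opt_dis.step()

# Generator step: update the attention so that the background codes
# fail to preserve the similarities (maximize the discriminator's loss on them).
img_fg, img_bg = attn_img(img_feats)
txt_fg, txt_bg = attn_txt(txt_feats)
g_loss = -similarity_loss(hash_img(img_bg), hash_txt(txt_bg), sim)
opt_gen.zero_grad()
g_loss.backward()
opt_gen.step()
```

At retrieval time the relaxed foreground codes would typically be binarized (e.g. with torch.sign) to obtain the final hash bits; the actual HashGAN losses and network details should be taken from the paper itself.
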
doi_str_mv | 10.48550/arxiv.1711.09347 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.1711.09347 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_1711_09347 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T21%3A57%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HashGAN:Attention-aware%20Deep%20Adversarial%20Hashing%20for%20Cross%20Modal%20Retrieval&rft.au=Zhang,%20Xi&rft.date=2017-11-26&rft_id=info:doi/10.48550/arxiv.1711.09347&rft_dat=%3Carxiv_GOX%3E1711_09347%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |