Cross-Modal Generation and Pair Correlation Alignment Hashing


Bibliographic Details

Published in: IEEE Transactions on Intelligent Transportation Systems, 2023-03, Vol. 24 (3), pp. 3018-3026
Authors: Ou, Weihua; Deng, Jiaxin; Zhang, Lei; Gou, Jianping; Zhou, Quan
Format: Article
Language: English
Abstract

Cross-modal hashing is an effective cross-modal retrieval approach because of its low storage and high efficiency. However, most existing methods mainly use pre-trained networks to extract modality-specific features, ignoring position information and lacking information interaction between different modalities. To address these problems, this paper proposes a novel approach, named cross-modal generation and pair correlation alignment hashing (CMGCAH), which introduces a transformer to exploit position information and uses cross-modal generative adversarial networks (GAN) to boost cross-modal information interaction. Concretely, a cross-modal interaction network based on a conditional generative adversarial network and pair correlation alignment networks is proposed to generate cross-modal common representations. In addition, a transformer-based feature extraction network (TFEN) is designed to exploit position information, which can be propagated to the text modality and enforce the common representation to be semantically consistent. Experiments on widely used text-image datasets show that the proposed method achieves competitive performance compared with many existing methods.
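The correlation alignment and hashing steps described in the abstract can be sketched at a high level. The snippet below is a minimal illustration only, assuming a CORAL-style covariance-matching loss between image and text feature batches and sign binarization for hash codes; the function names (`coral_loss`, `to_hash_codes`) and the exact loss form are illustrative assumptions, not the paper's formulation — CMGCAH's actual networks and losses are defined in the full text.

```python
# Hypothetical sketch: aligning the second-order statistics of two modalities'
# feature batches, then binarizing a common representation into hash codes.
import numpy as np

def coral_loss(f_img: np.ndarray, f_txt: np.ndarray) -> float:
    """Frobenius-norm distance between the feature covariances of two modalities."""
    def cov(f: np.ndarray) -> np.ndarray:
        f = f - f.mean(axis=0, keepdims=True)
        return f.T @ f / (f.shape[0] - 1)
    d = f_img.shape[1]
    diff = cov(f_img) - cov(f_txt)
    return float(np.sum(diff ** 2) / (4 * d * d))

def to_hash_codes(features: np.ndarray) -> np.ndarray:
    """Binarize a common representation into {-1, +1} hash codes via the sign."""
    return np.where(features >= 0, 1, -1).astype(np.int8)

rng = np.random.default_rng(0)
img = rng.normal(size=(32, 16))   # batch of image features
txt = rng.normal(size=(32, 16))   # batch of text features
loss = coral_loss(img, txt)       # scalar alignment penalty; 0 when covariances match
codes = to_hash_codes(img)        # one 16-bit hash code per sample
```

In a full system the loss would be minimized jointly with the adversarial and semantic-consistency objectives, so that codes from matching image-text pairs end up close in Hamming distance.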
DOI: 10.1109/TITS.2022.3221787
ISSN: 1524-9050
EISSN: 1558-0016
Source: IEEE Electronic Library (IEL)
Subjects: Alignment; Codes; Correlation; correlation alignment; cross-modal generation; Cross-modal hashing; cross-modal interaction; Data mining; Feature extraction; Generative adversarial networks; position semantic information; Representations; Semantics; Transformers