Cross-Modal Generation and Pair Correlation Alignment Hashing
Cross-modal hashing is an effective cross-modal retrieval approach because of its low storage cost and high efficiency. However, most existing methods mainly use pre-trained networks to extract modality-specific features, ignoring position information and lacking information interaction between different modalities.
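The retrieval setting the abstract describes can be sketched abstractly: both modalities are mapped to short binary codes, and search reduces to cheap Hamming-distance comparison. The sketch below is a minimal illustration of that idea only; the random projections stand in for the paper's learned hash networks and are not part of CMGCAH itself:

```python
import random

random.seed(0)

def random_projection(dim_in, n_bits):
    """Random linear projection standing in for a learned hash network."""
    return [[random.gauss(0, 1) for _ in range(dim_in)] for _ in range(n_bits)]

def hash_code(features, projection):
    """Project a feature vector and binarize each output with sign()."""
    return [1 if sum(w * x for w, x in zip(row, features)) >= 0 else 0
            for row in projection]

def hamming(a, b):
    """Hamming distance between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))

# Toy "image" and "text" feature vectors in a shared 4-d space.
proj = random_projection(4, 8)  # 8-bit codes
image_code = hash_code([0.9, 0.1, -0.3, 0.5], proj)
text_code = hash_code([0.8, 0.2, -0.2, 0.4], proj)
```

Because both modalities share one projection, semantically similar features tend to receive nearby codes, which is what makes Hamming-space retrieval work.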
Saved in:
Published in: | IEEE transactions on intelligent transportation systems 2023-03, Vol.24 (3), p.3018-3026 |
---|---|
Main authors: | Ou, Weihua; Deng, Jiaxin; Zhang, Lei; Gou, Jianping; Zhou, Quan |
Format: | Article |
Language: | eng |
Subjects: | Alignment; Codes; Correlation; correlation alignment; cross-modal generation; Cross-modal hashing; cross-modal interaction; Data mining; Feature extraction; Generative adversarial networks; position semantic information; Representations; Semantics; Transformers |
Online access: | Order full text |
container_end_page | 3026 |
---|---|
container_issue | 3 |
container_start_page | 3018 |
container_title | IEEE transactions on intelligent transportation systems |
container_volume | 24 |
creator | Ou, Weihua Deng, Jiaxin Zhang, Lei Gou, Jianping Zhou, Quan |
description | Cross-modal hashing is an effective cross-modal retrieval approach because of its low storage cost and high efficiency. However, most existing methods mainly use pre-trained networks to extract modality-specific features, ignoring position information and lacking information interaction between different modalities. To address these problems, in this paper we propose a novel approach, named cross-modal generation and pair correlation alignment hashing (CMGCAH), which introduces a transformer to exploit position information and uses cross-modal generative adversarial networks (GAN) to boost cross-modal information interaction. Concretely, a cross-modal interaction network based on a conditional generative adversarial network and pair correlation alignment networks are proposed to generate cross-modal common representations. On the other hand, a transformer-based feature extraction network (TFEN) is designed to exploit position information, which can be propagated to the text modality and enforce the common representation to be semantically consistent. Experiments are performed on widely used datasets with text-image modalities, and results show that the proposed method achieves competitive performance compared with many existing methods. |
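The pair correlation alignment idea can be illustrated with a CORAL-style loss that penalizes the distance between the second-order statistics (feature covariances) of the two modalities. This is a hedged sketch of one common formulation of correlation alignment, not necessarily the exact loss used in CMGCAH:

```python
def mean_vec(X):
    """Column-wise mean of an n-by-d feature batch."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]

def covariance(X):
    """Sample covariance matrix of an n-by-d feature batch."""
    n, d = len(X), len(X[0])
    mu = mean_vec(X)
    C = [[0.0] * d for _ in range(d)]
    for row in X:
        centered = [x - m for x, m in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                C[i][j] += centered[i] * centered[j] / (n - 1)
    return C

def coral_loss(X_img, X_txt):
    """Squared Frobenius distance between modality covariances (CORAL-style)."""
    Ci, Ct = covariance(X_img), covariance(X_txt)
    d = len(Ci)
    return sum((Ci[i][j] - Ct[i][j]) ** 2
               for i in range(d) for j in range(d)) / (4 * d * d)

# Toy 2-d feature batches from an "image" branch and a "text" branch.
img = [[0.2, 1.0], [0.4, 0.8], [0.1, 1.2]]
txt_aligned = [[0.2, 1.0], [0.4, 0.8], [0.1, 1.2]]    # identical statistics
txt_shifted = [[2.0, -1.0], [0.5, 3.0], [-1.5, 0.2]]  # mismatched statistics
```

Minimizing such a loss pulls the two modalities' feature distributions toward a shared statistical structure, which is one way to realize the "common representation" alignment the abstract describes.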
doi_str_mv | 10.1109/TITS.2022.3221787 |
format | Article |
publisher | New York: IEEE |
coden | ITISFG |
orcid | 0000-0002-7894-7929; 0000-0001-5241-7703; 0000-0003-1413-0693 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1524-9050 |
ispartof | IEEE transactions on intelligent transportation systems, 2023-03, Vol.24 (3), p.3018-3026 |
issn | 1524-9050 1558-0016 |
language | eng |
recordid | cdi_ieee_primary_9954182 |
source | IEEE Electronic Library (IEL) |
subjects | Alignment Codes Correlation correlation alignment cross-modal generation Cross-modal hashing cross-modal interaction Data mining Feature extraction Generative adversarial networks position semantic information Representations Semantics Transformers |
title | Cross-Modal Generation and Pair Correlation Alignment Hashing |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T11%3A38%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cross-Modal%20Generation%20and%20Pair%20Correlation%20Alignment%20Hashing&rft.jtitle=IEEE%20transactions%20on%20intelligent%20transportation%20systems&rft.au=Ou,%20Weihua&rft.date=2023-03-01&rft.volume=24&rft.issue=3&rft.spage=3018&rft.epage=3026&rft.pages=3018-3026&rft.issn=1524-9050&rft.eissn=1558-0016&rft.coden=ITISFG&rft_id=info:doi/10.1109/TITS.2022.3221787&rft_dat=%3Cproquest_RIE%3E2780986934%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2780986934&rft_id=info:pmid/&rft_ieee_id=9954182&rfr_iscdi=true |