Cross-Modal Generation and Pair Correlation Alignment Hashing
Cross-modal hashing is an effective cross-modal retrieval approach because of its low storage cost and high efficiency. However, most existing methods mainly use pre-trained networks to extract modality-specific features, ignoring position information and lacking information interaction between different modalities.
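The retrieval setting the abstract describes can be sketched abstractly: both modalities are mapped to short binary codes, and search reduces to cheap Hamming-distance comparison. The sketch below is a minimal illustration of that idea only; the random projections stand in for the paper's learned hash networks and are not part of CMGCAH itself:

```python
import random

random.seed(0)

def random_projection(dim_in, n_bits):
    """Random linear projection standing in for a learned hash network."""
    return [[random.gauss(0, 1) for _ in range(dim_in)] for _ in range(n_bits)]

def hash_code(features, projection):
    """Project a feature vector and binarize each output with sign()."""
    return [1 if sum(w * x for w, x in zip(row, features)) >= 0 else 0
            for row in projection]

def hamming(a, b):
    """Hamming distance between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))

# Toy "image" and "text" feature vectors in a shared 4-d space.
proj = random_projection(4, 8)  # 8-bit codes
image_code = hash_code([0.9, 0.1, -0.3, 0.5], proj)
text_code = hash_code([0.8, 0.2, -0.2, 0.4], proj)
```

Because both modalities share one projection, semantically similar features tend to receive nearby codes, which is what makes Hamming-space retrieval work.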
Saved in:
Published in: | IEEE transactions on intelligent transportation systems 2023-03, Vol.24 (3), p.3018-3026 |
---|---|
Main authors: | Ou, Weihua; Deng, Jiaxin; Zhang, Lei; Gou, Jianping; Zhou, Quan |
Format: | Article |
Language: | eng |
Subjects: | Alignment; Codes; Correlation; correlation alignment; cross-modal generation; Cross-modal hashing; cross-modal interaction; Data mining; Feature extraction; Generative adversarial networks; position semantic information; Representations; Semantics; Transformers |
Online access: | Order full text |
container_end_page | 3026 |
---|---|
container_issue | 3 |
container_start_page | 3018 |
container_title | IEEE transactions on intelligent transportation systems |
container_volume | 24 |
creator | Ou, Weihua Deng, Jiaxin Zhang, Lei Gou, Jianping Zhou, Quan |
description | Cross-modal hashing is an effective cross-modal retrieval approach because of its low storage cost and high efficiency. However, most existing methods mainly use pre-trained networks to extract modality-specific features, ignoring position information and lacking information interaction between different modalities. To address these problems, in this paper we propose a novel approach, named cross-modal generation and pair correlation alignment hashing (CMGCAH), which introduces a transformer to exploit position information and uses cross-modal generative adversarial networks (GAN) to boost cross-modal information interaction. Concretely, a cross-modal interaction network based on a conditional generative adversarial network and pair correlation alignment networks are proposed to generate cross-modal common representations. On the other hand, a transformer-based feature extraction network (TFEN) is designed to exploit position information, which can be propagated to the text modality and enforce the common representation to be semantically consistent. Experiments are performed on widely used datasets with text-image modalities, and results show that the proposed method achieves competitive performance compared with many existing methods. |
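The pair correlation alignment idea can be illustrated with a CORAL-style loss that penalizes the distance between the second-order statistics (feature covariances) of the two modalities. This is a hedged sketch of one common formulation of correlation alignment, not necessarily the exact loss used in CMGCAH:

```python
def mean_vec(X):
    """Column-wise mean of an n-by-d feature batch."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]

def covariance(X):
    """Sample covariance matrix of an n-by-d feature batch."""
    n, d = len(X), len(X[0])
    mu = mean_vec(X)
    C = [[0.0] * d for _ in range(d)]
    for row in X:
        centered = [x - m for x, m in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                C[i][j] += centered[i] * centered[j] / (n - 1)
    return C

def coral_loss(X_img, X_txt):
    """Squared Frobenius distance between modality covariances (CORAL-style)."""
    Ci, Ct = covariance(X_img), covariance(X_txt)
    d = len(Ci)
    return sum((Ci[i][j] - Ct[i][j]) ** 2
               for i in range(d) for j in range(d)) / (4 * d * d)

# Toy 2-d feature batches from an "image" branch and a "text" branch.
img = [[0.2, 1.0], [0.4, 0.8], [0.1, 1.2]]
txt_aligned = [[0.2, 1.0], [0.4, 0.8], [0.1, 1.2]]    # identical statistics
txt_shifted = [[2.0, -1.0], [0.5, 3.0], [-1.5, 0.2]]  # mismatched statistics
```

Minimizing such a loss pulls the two modalities' feature distributions toward a shared statistical structure, which is one way to realize the "common representation" alignment the abstract describes.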
doi_str_mv | 10.1109/TITS.2022.3221787 |
format | Article |
publisher | New York: IEEE |
coden | ITISFG |
orcid | 0000-0002-7894-7929; 0000-0001-5241-7703; 0000-0003-1413-0693 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1524-9050 |
ispartof | IEEE transactions on intelligent transportation systems, 2023-03, Vol.24 (3), p.3018-3026 |
issn | 1524-9050 1558-0016 |
language | eng |
recordid | cdi_ieee_primary_9954182 |
source | IEEE Electronic Library (IEL) |
subjects | Alignment Codes Correlation correlation alignment cross-modal generation Cross-modal hashing cross-modal interaction Data mining Feature extraction Generative adversarial networks position semantic information Representations Semantics Transformers |
title | Cross-Modal Generation and Pair Correlation Alignment Hashing |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T11%3A38%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cross-Modal%20Generation%20and%20Pair%20Correlation%20Alignment%20Hashing&rft.jtitle=IEEE%20transactions%20on%20intelligent%20transportation%20systems&rft.au=Ou,%20Weihua&rft.date=2023-03-01&rft.volume=24&rft.issue=3&rft.spage=3018&rft.epage=3026&rft.pages=3018-3026&rft.issn=1524-9050&rft.eissn=1558-0016&rft.coden=ITISFG&rft_id=info:doi/10.1109/TITS.2022.3221787&rft_dat=%3Cproquest_RIE%3E2780986934%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2780986934&rft_id=info:pmid/&rft_ieee_id=9954182&rfr_iscdi=true |