Graph Embedding Contrastive Multi-Modal Representation Learning for Clustering
Saved in:
Published in: | IEEE transactions on image processing 2023-01, Vol.PP, p.1-1 |
---|---|
Main authors: | Xia, Wei; Wang, Tianxiu; Gao, Quanxue; Yang, Ming; Gao, Xinbo |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | 1 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE transactions on image processing |
container_volume | PP |
creator | Xia, Wei; Wang, Tianxiu; Gao, Quanxue; Yang, Ming; Gao, Xinbo |
description | Multi-modal clustering (MMC) aims to explore complementary information from diverse modalities to improve clustering performance. This article studies two challenging problems in MMC methods based on deep neural networks. On one hand, most existing methods lack a unified objective that simultaneously learns inter- and intra-modality consistency, resulting in limited representation learning capacity. On the other hand, most existing methods are modeled for a finite sample set and cannot handle out-of-sample data. To address these two challenges, we propose a novel Graph Embedding Contrastive Multi-modal Clustering network (GECMC), which treats representation learning and multi-modal clustering as two sides of one coin rather than as two separate problems. In brief, we design a contrastive loss that exploits pseudo-labels to explore consistency across modalities. Thus, GECMC offers an effective way to maximize the similarities of intra-cluster representations while minimizing the similarities of inter-cluster representations, at both the inter- and intra-modality levels, so that clustering and representation learning interact and jointly evolve in a co-training framework. Furthermore, we build a clustering layer parameterized with cluster centroids, showing that GECMC can learn clustering labels from given samples and handle out-of-sample data. GECMC outperforms 14 competitive methods on four challenging datasets. Codes and datasets are available: https://github.com/xdweixia/GECMC. |
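The two mechanisms the description names, a pseudo-label-guided contrastive loss and a clustering layer parameterized with cluster centroids, can be illustrated in a minimal NumPy sketch. This is not the authors' implementation (see the GitHub link for that): the function names, the Student's-t soft assignment, and the particular supervised-contrastive loss variant are all assumptions made for illustration.

```python
import numpy as np

def soft_assignments(z, centroids, alpha=1.0):
    """Soft cluster assignments from a centroid-parameterized clustering layer
    (Student's t-distribution kernel, as popularized by DEC). Because the
    assignment depends only on learned centroids, it also applies to
    out-of-sample embeddings."""
    # squared distance from each embedding to each centroid: shape (n, k)
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)   # rows sum to 1

def pseudo_label_contrastive_loss(z1, z2, pseudo_labels, tau=0.5):
    """Cross-modal contrastive loss where pairs sharing a pseudo-label are
    positives and all other pairs are negatives, so intra-cluster similarity
    is pulled up and inter-cluster similarity pushed down across modalities."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                       # (n, n) similarity logits
    exp_sim = np.exp(sim - sim.max(axis=1, keepdims=True))
    p = exp_sim / exp_sim.sum(axis=1, keepdims=True)
    pos = pseudo_labels[:, None] == pseudo_labels[None, :]
    # negative log of the mean probability mass on positive pairs per anchor
    return -np.log((p * pos).sum(axis=1) / pos.sum(axis=1)).mean()
```

In a co-training loop of the kind the description sketches, the soft assignments would supply the pseudo-labels that drive the contrastive loss, and the improved representations would in turn refine the centroids.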
doi_str_mv | 10.1109/TIP.2023.3240863 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1057-7149 |
ispartof | IEEE transactions on image processing, 2023-01, Vol.PP, p.1-1 |
issn | 1057-7149 1941-0042 |
language | eng |
recordid | cdi_proquest_miscellaneous_2797148768 |
source | IEEE Electronic Library (IEL) |
subjects | Artificial neural networks; Centroids; Clustering; Consistency; Datasets; Embedding; Graphical representations; Labels; Machine learning; Multi-modal learning; representation learning; self-supervision; Similarity |
title | Graph Embedding Contrastive Multi-Modal Representation Learning for Clustering |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T10%3A21%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Graph%20Embedding%20Contrastive%20Multi-Modal%20Representation%20Learning%20for%20Clustering&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Xia,%20Wei&rft.date=2023-01-01&rft.volume=PP&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2023.3240863&rft_dat=%3Cproquest_RIE%3E2776795838%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2776795838&rft_id=info:pmid/37022431&rft_ieee_id=10036442&rfr_iscdi=true |