Learning Representation for Clustering Via Prototype Scattering and Positive Sampling


Bibliographic details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-06, Vol. 45 (6), pp. 7509-7524
Main authors: Huang, Zhizhong; Chen, Jie; Zhang, Junping; Shan, Hongming
Format: Article
Language: English
Online access: Full text
Description: Existing deep clustering methods rely on either contrastive or non-contrastive representation learning for the downstream clustering task. Contrastive methods, thanks to negative pairs, learn uniform representations for clustering; these negative pairs, however, may inevitably lead to the class collision issue and consequently compromise clustering performance. Non-contrastive methods, on the other hand, avoid the class collision issue, but the resulting non-uniform representations may cause clustering to collapse. To enjoy the strengths of both worlds, this paper presents a novel end-to-end deep clustering method with prototype scattering and positive sampling, termed ProPos. Specifically, we first maximize the distance between prototypical representations, via a prototype scattering loss, which improves the uniformity of representations. Second, we align one augmented view of an instance with the sampled neighbors of another view (assumed to be a truly positive pair in the embedding space) to improve within-cluster compactness, termed positive sampling alignment. The strengths of ProPos are avoidance of the class collision issue, uniform representations, well-separated clusters, and within-cluster compactness. By optimizing ProPos in an end-to-end expectation-maximization framework, extensive experiments demonstrate that ProPos achieves competitive performance on moderate-scale clustering benchmarks and establishes new state-of-the-art performance on large-scale datasets. Source code is available at https://github.com/Hzzone/ProPos .
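The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (that is at the GitHub link above); the exact loss forms, the `temperature` and `sigma` values, and the use of a uniformity-style surrogate for prototype scattering and a Gaussian-perturbation neighbor for positive sampling are all simplifying assumptions for illustration.

```python
import numpy as np

def prototype_scattering_loss(prototypes, temperature=0.5):
    """Encourage prototypes (cluster centers) to spread apart on the unit
    sphere. A uniformity-style surrogate: log-mean-exp of pairwise cosine
    similarities, which is lower when prototypes are well scattered.
    Illustrative only; not the paper's exact loss."""
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sim = (p @ p.T) / temperature            # pairwise cosine similarities
    n = len(p)
    off_diag = sim[~np.eye(n, dtype=bool)]   # drop self-similarities
    return np.log(np.mean(np.exp(off_diag)))

def positive_sampling_alignment(z1, z2, sigma=0.1, rng=None):
    """Align view-1 embeddings with sampled neighbors of view-2 embeddings.
    Here a neighbor is drawn by Gaussian perturbation of z2 (an assumed
    stand-in for the paper's neighbor sampling), aligned with a simple MSE."""
    rng = np.random.default_rng() if rng is None else rng
    neighbor = z2 + sigma * rng.standard_normal(z2.shape)
    return np.mean((z1 - neighbor) ** 2)
```

As a sanity check, four orthogonal prototypes yield a lower scattering loss than four identical (collapsed) prototypes, matching the intuition that the loss rewards uniformity; with `sigma=0` the alignment term reduces to plain MSE between the two views.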
DOI: 10.1109/TPAMI.2022.3216454
PMID: 36269906
ISSN: 0162-8828
EISSN: 1939-3539, 2160-9292
Source: IEEE Electronic Library (IEL)
Subjects:
Clustering
Clustering methods
Collision avoidance
Contrastive learning
Datasets
deep clustering
Learning
Optimization
Prototypes
Representation learning
Representations
Sampling
Scattering
Self-supervised learning
Semantics
Source code
Task analysis
unsupervised learning