Unsupervised Manifold Linearizing and Clustering
We consider the problem of simultaneously clustering and learning a linear representation of data lying close to a union of low-dimensional manifolds, a fundamental task in machine learning and computer vision. When the manifolds are assumed to be linear subspaces, this reduces to the classical prob...
Main authors: | Ding, Tianjiao; Tong, Shengbang; Chan, Kwan Ho Ryan; Dai, Xili; Ma, Yi; Haeffele, Benjamin D |
---|---|
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Ding, Tianjiao ; Tong, Shengbang ; Chan, Kwan Ho Ryan ; Dai, Xili ; Ma, Yi ; Haeffele, Benjamin D |
description | We consider the problem of simultaneously clustering and learning a linear
representation of data lying close to a union of low-dimensional manifolds, a
fundamental task in machine learning and computer vision. When the manifolds
are assumed to be linear subspaces, this reduces to the classical problem of
subspace clustering, which has been studied extensively over the past two
decades. Unfortunately, many real-world datasets such as natural images cannot
be well approximated by linear subspaces. On the other hand, numerous works
have attempted to learn an appropriate transformation of the data, such that
data is mapped from a union of general non-linear manifolds to a union of
linear subspaces (with points from the same manifold being mapped to the same
subspace). However, many existing works have limitations such as assuming
knowledge of the membership of samples to clusters, requiring high sampling
density, or being shown theoretically to learn trivial representations. In this
paper, we propose to optimize the Maximal Coding Rate Reduction metric with
respect to both the data representation and a novel doubly stochastic cluster
membership, inspired by state-of-the-art subspace clustering results. We give a
parameterization of such a representation and membership, allowing efficient
mini-batching and one-shot initialization. Experiments on CIFAR-10, -20, -100,
and TinyImageNet-200 datasets show that the proposed method is much more
accurate and scalable than state-of-the-art deep clustering methods, and
further learns a latent linear representation of the data. |
doi_str_mv | 10.48550/arxiv.2301.01805 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2301_01805</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2301_01805</sourcerecordid><originalsourceid>FETCH-LOGICAL-a675-7f1f5815370255f60d17de5dac2966e01eeec457eeb9b01c27e29c4b44b4108e3</originalsourceid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Unsupervised Manifold Linearizing and Clustering</title><source>arXiv.org</source><creator>Ding, Tianjiao ; Tong, Shengbang ; Chan, Kwan Ho Ryan ; Dai, Xili ; Ma, Yi ; Haeffele, Benjamin D</creator><identifier>DOI: 10.48550/arxiv.2301.01805</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Learning</subject><creationdate>2023-01</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa></display><links><linktorsrc>$$Uhttps://arxiv.org/abs/2301.01805$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2301.01805$$DView paper in arXiv$$Hfree_for_read</backlink></links><addata><format>journal</format><genre>article</genre><ristype>JOUR</ristype><date>2023-01-04</date><risdate>2023</risdate><doi>10.48550/arxiv.2301.01805</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2301.01805 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2301_01805 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Learning |
title | Unsupervised Manifold Linearizing and Clustering |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T12%3A13%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Unsupervised%20Manifold%20Linearizing%20and%20Clustering&rft.au=Ding,%20Tianjiao&rft.date=2023-01-04&rft_id=info:doi/10.48550/arxiv.2301.01805&rft_dat=%3Carxiv_GOX%3E2301_01805%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
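
The description above names the Maximal Coding Rate Reduction metric as the objective. As a rough illustration only, the sketch below evaluates the rate reduction in its commonly stated form, ΔR = R(Z) − R_c(Z | Π), for fixed features and a soft membership matrix. It is not the authors' implementation. The function names, the value of `eps`, the toy data, and the use of an n × k row-stochastic membership (rather than the doubly stochastic membership the description mentions) are all assumptions made for this example.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z Z^T) for Z of shape (d, n)."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T))
    return 0.5 * logdet

def rate_reduction(Z, Pi, eps=0.5):
    """Delta R = R(Z) - sum_j mass_j / (2n) * logdet(I + d / (mass_j * eps^2) * Z diag(pi_j) Z^T).

    Pi has shape (n, k); column j holds the soft membership weights of cluster j.
    (Hypothetical helper: the row-stochastic Pi is an assumption of this sketch.)
    """
    d, n = Z.shape
    compressed = 0.0
    for j in range(Pi.shape[1]):
        w = Pi[:, j]
        mass = w.sum()
        if mass < 1e-8:          # skip empty clusters to avoid division by zero
            continue
        cov = (Z * w) @ Z.T      # Z diag(w) Z^T via broadcasting over columns
        _, logdet = np.linalg.slogdet(np.eye(d) + (d / (mass * eps**2)) * cov)
        compressed += (mass / (2 * n)) * logdet
    return coding_rate(Z, eps) - compressed

# Toy data: 40 unit-norm points, half in span(e1, e2), half in span(e3, e4).
rng = np.random.default_rng(0)
n = 40
Z = np.zeros((4, n))
Z[:2, :20] = rng.standard_normal((2, 20))
Z[2:, 20:] = rng.standard_normal((2, 20))
Z /= np.linalg.norm(Z, axis=0, keepdims=True)

Pi_true = np.zeros((n, 2))
Pi_true[:20, 0] = 1.0            # first subspace -> cluster 0
Pi_true[20:, 1] = 1.0            # second subspace -> cluster 1
Pi_uniform = np.full((n, 2), 0.5)  # uninformative membership

print("true membership:   ", rate_reduction(Z, Pi_true))
print("uniform membership:", rate_reduction(Z, Pi_uniform))
```

With an uninformative membership, every cluster term reproduces the total coding rate, so the reduction vanishes. A membership that matches the two subspaces yields a strictly positive value. That gap is what makes the quantity usable as a clustering objective.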