Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction

To learn intrinsic low-dimensional structures from high-dimensional data that most discriminate between classes, we propose the principle of Maximal Coding Rate Reduction ($\text{MCR}^2$), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum of each individual class. We clarify its relationships with most existing frameworks such as cross-entropy, information bottleneck, information gain, contractive and contrastive learning, and provide theoretical guarantees for learning diverse and discriminative features. The coding rate can be accurately computed from finite samples of degenerate subspace-like distributions, and the principle learns intrinsic representations in supervised, self-supervised, and unsupervised settings in a unified manner. Empirically, the representations learned using this principle alone are significantly more robust to label corruptions in classification than those using cross-entropy, and can lead to state-of-the-art results in clustering mixed data from self-learned invariant features.
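The abstract's central quantity has a compact closed form: up to notational differences from the paper, the rate of $n$ features $Z \in \mathbb{R}^{n \times d}$ (rows as samples) is $R(Z, \epsilon) = \frac{1}{2}\log\det\!\left(I + \frac{d}{n\epsilon^2} Z^\top Z\right)$, and the objective is $\Delta R = R(Z, \epsilon) - \sum_j \frac{n_j}{n} R(Z_j, \epsilon)$, where $Z_j$ collects the $n_j$ samples of class $j$. The sketch below is a minimal NumPy illustration of this computation, not the authors' released implementation; the precision parameter `eps`, the function names, and the toy data are all illustrative assumptions.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z, eps): rate-distortion estimate for n samples (rows of Z) in R^d,
    computed as 1/2 * logdet(I + d / (n * eps^2) * Z^T Z)."""
    n, d = Z.shape
    gram = np.eye(d) + (d / (n * eps ** 2)) * Z.T @ Z
    return 0.5 * np.linalg.slogdet(gram)[1]  # logdet of an SPD matrix

def coding_rate_reduction(Z, labels, eps=0.5):
    """Delta R: rate of the whole dataset minus the class-weighted sum of
    per-class rates -- the quantity MCR^2 maximizes."""
    per_class = sum(
        np.mean(labels == j) * coding_rate(Z[labels == j], eps)
        for j in np.unique(labels)
    )
    return coding_rate(Z, eps) - per_class

# Toy check (hypothetical data): features drawn from two low-dimensional
# subspaces of R^8, normalized to the unit sphere.
rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(size=(50, 2)) @ rng.normal(size=(2, 8)),
               rng.normal(size=(50, 2)) @ rng.normal(size=(2, 8))])
Z /= np.linalg.norm(Z, axis=1, keepdims=True)
labels = np.repeat([0, 1], 50)
print(coding_rate_reduction(Z, labels))  # larger when classes span distinct subspaces
```

The `eps=0.5` default is arbitrary; the paper treats the allowable distortion $\epsilon^2$ as a hyperparameter, and $\Delta R$ grows as the per-class features become low-dimensional while the classes jointly span a high-dimensional space.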

Bibliographic Details

Main Authors: Yu, Yaodong; Chan, Kwan Ho Ryan; You, Chong; Song, Chaobing; Ma, Yi
Format: Article
Language: English
Published: 2020-06-15
Subjects: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Information Theory; Computer Science - Learning; Mathematics - Information Theory; Statistics - Machine Learning
DOI: 10.48550/arXiv.2006.08558
Source: arXiv.org
Online Access: https://arxiv.org/abs/2006.08558