Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction
Main authors: , , , ,
Format: Article
Language: English
Keywords:
Online access: Order full text
Abstract: To learn intrinsic low-dimensional structures from high-dimensional data that most discriminate between classes, we propose the principle of Maximal Coding Rate Reduction ($\text{MCR}^2$), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum of each individual class. We clarify its relationships with most existing frameworks such as cross-entropy, information bottleneck, information gain, contractive and contrastive learning, and provide theoretical guarantees for learning diverse and discriminative features. The coding rate can be accurately computed from finite samples of degenerate subspace-like distributions and can learn intrinsic representations in supervised, self-supervised, and unsupervised settings in a unified manner. Empirically, the representations learned using this principle alone are significantly more robust to label corruptions in classification than those using cross-entropy, and can lead to state-of-the-art results in clustering mixed data from self-learned invariant features.
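As a rough illustration of the quantities the abstract names, the sketch below evaluates a coding rate $R(Z, \epsilon) \approx \tfrac{1}{2}\log\det(I + \tfrac{d}{m\epsilon^2} ZZ^\top)$ and the class-wise rate reduction $\Delta R$ that $\text{MCR}^2$ maximizes, assuming that rate estimate and unit-norm features. The function names, the choice $\epsilon = 0.5$, and the toy data are illustrative assumptions, not details taken from this record; see the paper at the DOI below for the exact objective and constraints.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Rate estimate R(Z, eps) = 0.5 * logdet(I + d/(m*eps^2) * Z Z^T), Z of shape (d, m)."""
    d, m = Z.shape
    scale = d / (m * eps ** 2)
    _, logdet = np.linalg.slogdet(np.eye(d) + scale * Z @ Z.T)
    return 0.5 * logdet

def coding_rate_reduction(Z, labels, eps=0.5):
    """Delta R = R(Z) - sum_j (m_j / m) * R(Z_j): the difference MCR^2 seeks to maximize."""
    m = Z.shape[1]
    rate_all = coding_rate(Z, eps)
    rate_per_class = sum(
        (np.sum(labels == c) / m) * coding_rate(Z[:, labels == c], eps)
        for c in np.unique(labels)
    )
    return rate_all - rate_per_class

# Toy check: random unit-norm features with two classes.
rng = np.random.default_rng(0)
Z = rng.standard_normal((32, 200))
Z /= np.linalg.norm(Z, axis=0, keepdims=True)  # features constrained to the unit sphere
labels = np.repeat([0, 1], 100)
print(coding_rate_reduction(Z, labels, eps=0.5))
```

In the supervised setting described in the abstract, this difference would be maximized over the parameters of a feature mapping; the sketch only evaluates the objective on fixed features.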
DOI: 10.48550/arxiv.2006.08558