The Challenges of Continuous Self-Supervised Learning
Self-supervised learning (SSL) aims to eliminate one of the major bottlenecks in representation learning - the need for human annotations. As a result, SSL holds the promise to learn representations from data in-the-wild, i.e., without the need for finite and static datasets. Instead, true SSL algor...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Self-supervised learning (SSL) aims to eliminate one of the major bottlenecks
in representation learning - the need for human annotations. As a result, SSL
holds the promise to learn representations from data in-the-wild, i.e., without
the need for finite and static datasets. Instead, true SSL algorithms should be
able to exploit the continuous stream of data being generated on the internet
or by agents exploring their environments. But do traditional self-supervised
learning approaches work in this setup? In this work, we investigate this
question by conducting experiments on the continuous self-supervised learning
problem. While learning in the wild, we expect to see a continuous (infinite)
non-IID data stream that follows a non-stationary distribution of visual
concepts. The goal is to learn a representation that can be robust, adaptive
yet not forgetful of concepts seen in the past. We show that a direct
application of current methods to such continuous setup is 1) inefficient both
computationally and in the amount of data required, 2) leads to inferior
representations due to temporal correlations (non-IID data) in some sources of
streaming data and 3) exhibits signs of catastrophic forgetting when trained on
sources with non-stationary data distributions. We propose the use of replay
buffers as an approach to alleviate the issues of inefficiency and temporal
correlations. We further propose a novel method to enhance the replay buffer by
maintaining the least redundant samples. Minimum redundancy (MinRed) buffers
allow us to learn effective representations even in the most challenging
streaming scenarios composed of sequential visual data obtained from a single
embodied agent, and alleviates the problem of catastrophic forgetting when
learning from data with non-stationary semantic distributions. |
---|---|
DOI: | 10.48550/arxiv.2203.12710 |