Self-supervised Auxiliary Loss for Metric Learning in Music Similarity-based Retrieval and Auto-tagging
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | In the realm of music information retrieval, similarity-based retrieval and
auto-tagging serve as essential components. Given the limitations and
non-scalability of human supervision signals, it becomes crucial for models to
learn from alternative sources to enhance their performance. Self-supervised
learning, which exclusively relies on learning signals derived from music audio
data, has demonstrated its efficacy in the context of auto-tagging. In this
study, we propose a model that builds on the self-supervised learning approach
to address the similarity-based retrieval challenge by introducing our method
of metric learning with a self-supervised auxiliary loss. Furthermore,
diverging from conventional self-supervised learning methodologies, we
discovered the advantages of concurrently training the model with both
self-supervision and supervision signals, without freezing pre-trained models.
We also found that refraining from employing augmentation during the
fine-tuning phase yields better results. Our experimental results confirm that
the proposed methodology enhances retrieval and tagging performance metrics in
two distinct scenarios: one where human-annotated tags are consistently
available for all music tracks, and another where such tags are accessible only
for a subset of tracks. |
---|---|
DOI: | 10.48550/arxiv.2304.07449 |