CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models
Main authors: | , , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Current music information retrieval systems struggle to manage
linguistic diversity and to integrate multiple musical modalities, which
limits their effectiveness in a global, multimodal music environment. To
address these issues, we introduce CLaMP 2, a system compatible with 101
languages that supports both ABC notation (a text-based musical notation
format) and MIDI (Musical Instrument Digital Interface) for music information
retrieval. CLaMP 2, pre-trained on 1.5 million ABC-MIDI-text triplets, includes
a multilingual text encoder and a multimodal music encoder aligned via
contrastive learning. By leveraging large language models, we obtain refined
and consistent multilingual descriptions at scale, significantly reducing
textual noise and balancing language distribution. Our experiments show that
CLaMP 2 achieves state-of-the-art results in both multilingual semantic search
and music classification across modalities, thus establishing a new standard
for inclusive and global music information retrieval. |
---|---|
DOI: | 10.48550/arxiv.2410.13267 |
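
The summary states that the text encoder and the music encoder are aligned via contrastive learning, but the record does not give the loss. The sketch below illustrates one common form of that idea, a symmetric InfoNCE-style loss over a batch of paired embeddings, followed by nearest-neighbour retrieval in the shared space. It is an assumption for illustration only and not the authors' implementation: the function names, the toy two-dimensional embeddings, and the temperature of 0.07 are all hypothetical, and embeddings are assumed to be non-zero vectors.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_entropy_row(logits, target):
    """Negative log-softmax of logits[target], computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def contrastive_loss(text_embs, music_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss; row i of each list is a matched pair.

    The temperature value is an illustrative assumption.
    """
    text = [l2_normalize(v) for v in text_embs]
    music = [l2_normalize(v) for v in music_embs]
    n = len(text)
    # Scaled cosine similarity between every text and every music item.
    sim = [[dot(t, m) / temperature for m in music] for t in text]
    # Text-to-music direction: each text should pick its own music item.
    t2m = sum(cross_entropy_row(sim[i], i) for i in range(n)) / n
    # Music-to-text direction: each music item should pick its own text.
    m2t = sum(
        cross_entropy_row([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (t2m + m2t)


def retrieve(query_emb, music_embs):
    """Return the index of the music embedding closest to the query."""
    q = l2_normalize(query_emb)
    scores = [dot(q, l2_normalize(m)) for m in music_embs]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    text = [[1.0, 0.0], [0.0, 1.0]]
    music_aligned = [[1.0, 0.0], [0.0, 1.0]]
    music_shuffled = [[0.0, 1.0], [1.0, 0.0]]
    print(contrastive_loss(text, music_aligned))
    print(contrastive_loss(text, music_shuffled))
    print(retrieve([0.9, 0.1], music_aligned))
```

In this toy setup the loss is lower when matched pairs point in the same direction than when they are shuffled, which is the property that makes semantic search across text and music possible once both encoders share one embedding space.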