Codec-SUPERB: An In-Depth Analysis of Sound Codec Models
Main authors: | , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The sound codec's dual roles in minimizing data transmission latency and
serving as tokenizers underscore its critical importance. Recent years have
witnessed significant developments in codec models. The ideal sound codec
should preserve content, paralinguistics, speakers, and audio information.
However, the question of which codec achieves optimal sound information
preservation remains unanswered, as in different papers, models are evaluated
on their selected experimental settings. This study introduces Codec-SUPERB, an
acronym for Codec sound processing Universal PERformance Benchmark. It is an
ecosystem designed to assess codec models across representative sound
applications and signal-level metrics rooted in sound domain
knowledge. Codec-SUPERB simplifies result sharing through an online leaderboard,
promoting collaboration within a community-driven benchmark database, thereby
stimulating new development cycles for codecs. Furthermore, we undertake an
in-depth analysis to offer insights into codec models from both application and
signal perspectives, diverging from previous codec papers mainly concentrating
on signal-level comparisons. Finally, we will release codes, the leaderboard,
and data to accelerate progress within the community. |
---|---|
DOI: | 10.48550/arxiv.2402.13071 |