Efficient And Scalable Neural Residual Waveform Coding With Collaborative Quantization
Format: Article
Language: English
Online access: Order full text
Abstract: Scalability and efficiency are desirable in neural speech codecs, which should support a wide range of bitrates for applications on various devices. We propose a collaborative quantization (CQ) scheme to jointly learn the codebook of LPC coefficients and the corresponding residuals. CQ does not simply shoehorn LPC into a neural network, but bridges the computational capacity of advanced neural network models with traditional, yet efficient and domain-specific, digital signal processing methods in an integrated manner. We demonstrate that CQ achieves much higher quality than its predecessor at 9 kbps with even lower model complexity. We also show that CQ can scale up to 24 kbps, where it outperforms AMR-WB and Opus. As a neural waveform codec, CQ models have fewer than 1 million parameters, significantly fewer than many other generative models.
DOI: 10.48550/arxiv.2002.05604
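The abstract's core idea, jointly learning a quantizer for the LPC coefficients and one for the residual representation so that both are trained end to end, can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, codebook sizes, dimensions, and the softness parameter `alpha` are illustrative assumptions, and soft-to-hard vector quantization is used here only as a common trainable-codebook technique.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftVectorQuantizer(nn.Module):
    """Soft-to-hard vector quantization with a learnable codebook.

    In training mode each input vector is replaced by a softmax-weighted
    mixture of codewords (differentiable); in eval mode the nearest
    codeword is selected. All hyperparameters are illustrative.
    """
    def __init__(self, num_codes: int, dim: int, alpha: float = 10.0):
        super().__init__()
        self.codebook = nn.Parameter(torch.randn(num_codes, dim))
        self.alpha = alpha  # controls how sharply the soft assignment peaks

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim); distance of each input to every codeword
        dist = torch.cdist(x, self.codebook)        # (batch, num_codes)
        if self.training:
            weights = F.softmax(-self.alpha * dist, dim=-1)
            return weights @ self.codebook          # soft (differentiable)
        idx = dist.argmin(dim=-1)
        return self.codebook[idx]                   # hard (nearest codeword)

# Hypothetical joint setup: one quantizer for LPC coefficients, one for the
# residual latent; a shared loss lets gradients shape both codebooks together.
lpc_vq = SoftVectorQuantizer(num_codes=256, dim=16)   # e.g. 16th-order LPC
res_vq = SoftVectorQuantizer(num_codes=1024, dim=64)  # residual code vectors

lpc = torch.randn(8, 16)       # stand-in LPC coefficients per frame
res_code = torch.randn(8, 64)  # stand-in residual latent per frame
loss = F.mse_loss(lpc_vq(lpc), lpc) + F.mse_loss(res_vq(res_code), res_code)
loss.backward()                # gradients reach both codebooks jointly
```

The point of the sketch is the shared backward pass: because both quantizers are differentiable during training, the LPC codebook can adapt to what the residual coder handles well, which is the "collaborative" aspect the abstract names, rather than quantizing the LPC coefficients in a fixed, separate step.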