LACE: A light-weight, causal model for enhancing coded speech through adaptive convolutions
Classical speech coding uses low-complexity postfilters with zero lookahead to enhance the quality of coded speech, but their effectiveness is limited by their simplicity. Deep Neural Networks (DNNs) can be much more effective, but require high complexity and model size, or added delay. We propose a...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Classical speech coding uses low-complexity postfilters with zero lookahead
to enhance the quality of coded speech, but their effectiveness is limited by
their simplicity. Deep Neural Networks (DNNs) can be much more effective, but
require high complexity and model size, or added delay. We propose a DNN model
that generates classical filter kernels on a per-frame basis with a model of
just 300~K parameters and 100~MFLOPS complexity, which is a practical
complexity for desktop or mobile device CPUs. The lack of added delay allows it
to be integrated into the Opus codec, and we demonstrate that it enables
effective wideband encoding for bitrates down to 6 kb/s. |
---|---|
DOI: | 10.48550/arxiv.2307.06610 |