Diversify, Contextualize, and Adapt: Efficient Entropy Modeling for Neural Image Codec
Main authors: , , ,
Format: Article
Language: English
Online access: Order full text
Abstract: Designing a fast and effective entropy model is challenging but essential for the practical application of neural codecs. Beyond spatial autoregressive entropy models, more efficient backward adaptation-based entropy models have recently been developed. They not only reduce decoding time by using a smaller number of modeling steps but also maintain or even improve rate-distortion performance by leveraging more diverse contexts for backward adaptation. Despite this significant progress, we argue that their performance has been limited by the simple adoption of the design convention for forward adaptation: using only a single type of hyper latent representation, which does not provide sufficient contextual information, especially in the first modeling step. In this paper, we propose a simple yet effective entropy modeling framework that leverages sufficient contexts for forward adaptation without compromising on bit-rate. Specifically, we introduce a strategy of diversifying the hyper latent representations used for forward adaptation, i.e., using two additional types of contexts along with the existing single type of context. In addition, we present a method to effectively use the diverse contexts for contextualizing the current elements to be encoded/decoded. By addressing the limitation of the previous approach, our proposed framework leads to significant performance improvements. Experimental results on popular datasets show that our proposed framework consistently improves rate-distortion performance across various bit-rate regions, e.g., a 3.73% BD-rate gain over the state-of-the-art baseline on the Kodak dataset.
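For intuition, the following is a minimal PyTorch sketch of the core idea described in the abstract: predicting the entropy-model parameters of the latents from several fused context types rather than from a single hyper latent. All class and variable names, channel sizes, and the fusion design below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: fusing diverse hyper-latent contexts for forward
# adaptation in a conditional Gaussian entropy model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiversifiedContextEntropyModel(nn.Module):
    def __init__(self, latent_channels: int = 192, ctx_channels: int = 64):
        super().__init__()
        # Three independent context branches: the conventional single hyper
        # latent context plus the two additional context types the abstract
        # describes (branch structure assumed for illustration).
        self.ctx_branches = nn.ModuleList(
            nn.Conv2d(latent_channels, ctx_channels, kernel_size=3, padding=1)
            for _ in range(3)
        )
        # Fusion network: contextualizes the elements to be encoded/decoded
        # by combining the diverse contexts into per-element mean and scale.
        self.fusion = nn.Sequential(
            nn.Conv2d(3 * ctx_channels, ctx_channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ctx_channels, 2 * latent_channels, kernel_size=1),
        )

    def forward(self, hyper_contexts):
        # hyper_contexts: list of three (N, C, H, W) tensors, e.g. decoded
        # from the three hyper latent representations.
        feats = [b(c) for b, c in zip(self.ctx_branches, hyper_contexts)]
        mean, scale = self.fusion(torch.cat(feats, dim=1)).chunk(2, dim=1)
        # Scales must be positive for the conditional Gaussian model.
        return mean, F.softplus(scale)

# Usage: three context tensors of matching spatial size.
ctxs = [torch.randn(1, 192, 16, 16) for _ in range(3)]
mean, scale = DiversifiedContextEntropyModel()(ctxs)
```

Relative to conditioning on a single hyper latent, the only structural change in this sketch is that the parameter predictor sees a concatenation of several context features, which is what supplies richer information in the first modeling step, before any backward-adaptation context is available.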
DOI: 10.48550/arxiv.2411.05832