End-to-End Learning-Based Image Compression With a Decoupled Framework
Published in: | IEEE Transactions on Circuits and Systems for Video Technology, 2024-05, Vol. 34 (5), p. 3067-3081 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Autoregressive models are widely used in learning-based image compression because of their superior context modeling capability. However, their sequential processing prevents parallel decoding and hinders deployment in real applications. In this paper, we propose a decoupled framework to resolve this issue. With the decoupled architecture, the entropy decoding process is independent of the latent sample reconstruction process. Entropy decoding can thus finish before latent sample prediction begins, and the two processes can be conducted in parallel, which leads to significant decoding time savings. To further reduce the decoding time, we introduce wavefront processing, in which multiple rows are processed simultaneously when reconstructing the latent samples. On top of that, we design a series of coding tools to improve the rate-distortion efficiency and reduce the decoding complexity. The proposed solution also supports device interoperability: the same bitstream can be successfully decoded on different CPU/GPU devices. Comprehensive experiments are conducted to validate the effectiveness of the proposed method. Under the objective evaluation metrics required by the JPEG AI Call for Proposals (CfP), the proposed method achieves an average BD-rate change of −29.6% with 2.44 times faster decoding than VVC image coding. Compared to commonly used learning-based benchmark methods, it achieves a BD-rate change of −30.5% and 101 times faster decoding than cheng2020attn. The proposed solution has been submitted to JPEG AI and IEEE 1857.11 in response to the CfP, and its core techniques have been adopted by both. |
---|---|
ISSN: | 1051-8215 1558-2205 |
DOI: | 10.1109/TCSVT.2023.3313974 |
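
The wavefront processing mentioned in the abstract can be illustrated with a minimal scheduling sketch. It is not the authors' implementation: the function name `wavefront_schedule`, the `lag` parameter, and the dependency pattern (each latent sample depends on samples to its left in the same row and on samples up to `lag - 1` columns ahead in the row above) are assumptions chosen for illustration, since the abstract does not specify the context shape.

```python
def wavefront_schedule(height, width, lag=2):
    """Group grid positions (row, col) into steps that can run in parallel.

    Assumption (not from the source): sample (r, c) needs its left neighbours
    in row r and the samples up to column c + lag - 1 in row r - 1, so each
    row may start once the row above is `lag` columns ahead.
    """
    if height <= 0 or width <= 0:
        return []
    if lag < 1:
        raise ValueError("lag must be at least 1")
    # Sample (r, c) becomes ready at step c + lag * r.
    steps = [[] for _ in range(width + lag * (height - 1))]
    for r in range(height):
        for c in range(width):
            steps[c + lag * r].append((r, c))
    return steps


if __name__ == "__main__":
    schedule = wavefront_schedule(height=3, width=4)
    for t, positions in enumerate(schedule):
        print(t, positions)
```

Under these assumptions, a 3 x 4 latent grid needs 8 parallel steps instead of 12 sequential ones; the saving grows with the number of rows, because the step count is `width + lag * (height - 1)` rather than `height * width`.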