Neural texture transfer assisted video coding with adaptive up-sampling

Bibliographic details
Published in: Signal processing. Image communication, 2022-09, Vol. 107, p. 116754, Article 116754
Main authors: Yu, Li; Chang, Wenshuai; Quan, Weize; Xiao, Jimin; Yan, Dong-Ming; Gabbouj, Moncef
Format: Article
Language: English
Online access: Full text
Description
Summary: Deep learning techniques have been extensively investigated for the purpose of further increasing the efficiency of traditional video compression. Some deep learning techniques for down/up-sampling-based video coding were found to be especially effective when the bandwidth or storage is limited. Existing works mainly differ in the super-resolution models used. Some works simply use a single image super-resolution model, ignoring the rich information in the correlation between video frames, while others explore the correlation between frames by simply concatenating the features across adjacent frames. This, however, may fail when the textures are not well aligned. In this paper, we propose to utilize neural texture transfer, which exploits the semantic correlation between frames and is able to explore the correlated information even when the textures are not aligned. Meanwhile, an adaptive group of pictures (GOP) method is proposed to automatically decide whether a frame should be down-sampled or not. Experimental results show that the proposed method outperforms the standard HEVC and state-of-the-art methods under different compression configurations. When compared to standard HEVC, the BD-rate (PSNR) and BD-rate (SSIM) of the proposed method are up to -19.1% and -26.5%, respectively.

• We introduce reference-based SR into down/up-sampling-based video coding, where the target and reference images are not required to be texture-aligned, as they are in existing methods.
• We propose an adaptive group of pictures (GOP) method to automatically decide the adaptive sampling scheme.
• The neural texture transfer model for reference-based SR produces realistic up-sampled frames at the decoding end.
ISSN: 0923-5965, 1879-2677
DOI: 10.1016/j.image.2022.116754
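
The summary reports its gains as BD-rate figures, where a negative value means the tested codec needs less bitrate than the anchor at equal quality. The following is a minimal sketch of the commonly used Bjøntegaard delta-rate calculation (cubic fit of log-rate against quality, averaged over the overlapping quality range). It is not taken from the article; the function name `bd_rate` and the sample rate/PSNR points are hypothetical and purely illustrative.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (%) of a test codec versus an anchor.

    Hypothetical helper: fits a cubic polynomial to log10(rate) as a
    function of quality for each codec, integrates both fits over the
    overlapping quality interval, and converts the mean log-rate gap
    to a percentage.
    """
    log_ra = np.log10(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log10(np.asarray(rate_test, dtype=float))
    q_a = np.asarray(psnr_anchor, dtype=float)
    q_t = np.asarray(psnr_test, dtype=float)

    # Cubic fit of log-rate versus quality (four RD points per codec is typical).
    fit_a = np.polyfit(q_a, log_ra, 3)
    fit_t = np.polyfit(q_t, log_rt, 3)

    # Only the quality range covered by both curves is compared.
    lo = max(q_a.min(), q_t.min())
    hi = min(q_a.max(), q_t.max())

    # Definite integrals of both fitted curves over [lo, hi].
    int_a = np.polyint(fit_a)
    int_t = np.polyint(fit_t)
    area_a = np.polyval(int_a, hi) - np.polyval(int_a, lo)
    area_t = np.polyval(int_t, hi) - np.polyval(int_t, lo)

    # Mean log10-rate difference, converted to a percentage.
    avg_diff = (area_t - area_a) / (hi - lo)
    return (10.0 ** avg_diff - 1.0) * 100.0


# Illustrative data only: the test codec uses 20% less bitrate at every quality level.
anchor_rate = [1000.0, 2000.0, 4000.0, 8000.0]  # kbps (made-up values)
anchor_psnr = [32.0, 35.0, 38.0, 41.0]          # dB (made-up values)
test_rate = [0.8 * r for r in anchor_rate]
result = bd_rate(anchor_rate, anchor_psnr, test_rate, anchor_psnr)
print(f"BD-rate: {result:.2f}%")
```

The same routine can be applied with SSIM in place of PSNR as the quality axis, which is presumably how a BD-rate (SSIM) figure would be obtained.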