One-for-All: Grouped Variation Network-Based Fractional Interpolation in Video Coding
Published in: IEEE Transactions on Image Processing, 2019-05, Vol. 28 (5), pp. 2140-2151
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Fractional interpolation is used to provide sub-pixel-level references for motion compensation in the inter prediction of video coding, which attempts to remove temporal redundancy in video sequences. Traditional handcrafted fractional interpolation filters face the challenge of modeling discontinuous regions in videos, while existing deep-learning-based methods are either designed for a single quantization parameter (QP), generate only half-pixel samples, or need to train a separate model for each sub-pixel position. In this paper, we present a one-for-all fractional interpolation method based on a grouped variation convolutional neural network (GVCNN). Our method can deal with video frames coded using different QPs and is capable of generating all sub-pixel positions at one sub-pixel level. Also, by predicting variations between integer-position pixels and sub-pixels, our network offers more expressive power. Moreover, we take specific measures in training data generation to simulate practical situations in video coding, including blurring the down-sampled sub-pixel samples to avoid aliasing effects and coding the integer-position pixels to simulate reconstruction errors. In addition, we theoretically analyze the impact of the size of the blur kernel. Experimental results verify the efficiency of GVCNN: compared with HEVC, our method achieves 2.2% bit saving on average and up to 5.2% under the low-delay P configuration.
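The abstract describes two concrete mechanisms: residual-style "variation" prediction (the network outputs the difference between the integer-position pixels and the target sub-pixel samples) and an anti-aliased, phase-shifted down-sampling scheme for generating training pairs. The sketch below illustrates both ideas in PyTorch under stated assumptions: the names VariationCNN and make_training_pair are hypothetical, the toy network does not reproduce the real GVCNN architecture or its QP/position grouping, and the HEVC coding of integer-position samples mentioned in the abstract is omitted.

```python
# Minimal sketch of the abstract's two ideas (hypothetical names, not the
# authors' implementation): (1) predict the variation between integer-position
# pixels and sub-pixel samples, (2) build training pairs by blurring and then
# phase-shifted down-sampling of a source frame.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationCNN(nn.Module):
    """Toy stand-in for one GVCNN branch; the real architecture differs."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),  # predicted variation map
        )

    def forward(self, integer_pixels):
        # sub-pixel sample = integer-position pixel + predicted variation
        return integer_pixels + self.body(integer_pixels)

def make_training_pair(frame, blur_kernel, dx, dy, stride=2):
    """Blur to suppress aliasing, then down-sample at two phases.

    Phase (0, 0) plays the integer-position grid; phase (dy, dx) plays the
    sub-pixel target. Assumes even spatial dims and an odd blur kernel.
    In the paper the integer-position samples are additionally compressed
    (e.g., HEVC at several QPs) to simulate reconstruction error; that
    step is skipped here.
    """
    blurred = F.conv2d(frame, blur_kernel, padding=blur_kernel.shape[-1] // 2)
    integer = blurred[..., 0::stride, 0::stride]
    subpel = blurred[..., dy::stride, dx::stride]
    return integer, subpel

# Usage: train the toy network to synthesize a half-pixel position.
frame = torch.rand(1, 1, 64, 64)
kernel = torch.ones(1, 1, 5, 5) / 25.0  # box blur stand-in for the paper's kernel
integer, half_pel = make_training_pair(frame, kernel, dx=1, dy=0)
loss = F.mse_loss(VariationCNN()(integer), half_pel)
```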
ISSN: 1057-7149, 1941-0042
DOI: 10.1109/TIP.2018.2882923