Real-Time Lightweight Video Super-Resolution With RRED-Based Perceptual Constraint

Bibliographic details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2024-10, Vol. 34 (10), p. 10310-10325
Main authors: Wu, Xinyi; Lopez-Tapia, Santiago; Wang, Xijun; Molina, Rafael; Katsaggelos, Aggelos K.
Format: Article
Language: English
Description
Summary: Real-time video services are gaining popularity in daily life, yet limited network bandwidth can constrain the delivered video quality. Video super-resolution (VSR) has emerged as a key solution, enhancing user experience by reconstructing high-resolution (HR) videos. Existing real-time VSR frameworks have primarily emphasized spatial quality metrics such as PSNR and SSIM, which do not account for temporal coherence, a critical factor for accurately reflecting the overall quality of super-resolved videos. Inspired by Video Quality Assessment (VQA) strategies, we propose a dual-frame training framework and a lightweight multi-branch network for real-time VSR. These designs exploit the spatio-temporal correlations between consecutive frames to ensure efficient video restoration. Furthermore, we incorporate ST-RRED, a powerful VQA approach that measures spatial and temporal consistency separately, in line with principles of human perception, into our loss functions. This guides the model to synthesize quality-aware perceptual features across both space and time for realistic reconstruction. Our model demonstrates remarkable efficiency, achieving near real-time processing of 4K videos. Compared to the state-of-the-art lightweight model MRVSR, our model is 60% smaller (0.483M vs. 1.21M parameters) and 106% faster (96.44 fps vs. 46.7 fps on 1080p frames), with significantly improved perceptual quality.
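The size and speed comparison in the summary can be checked with a few lines of arithmetic. The sketch below is an illustration added to this record, not code from the paper: the function names are hypothetical, and the only inputs are the four figures quoted in the summary (0.483M and 1.21M parameters, 96.44 fps and 46.7 fps on 1080p frames). The per-frame time is simply the reciprocal of the frame rate and is not a figure reported in the summary.

```python
# Figures quoted in the summary (proposed model vs. MRVSR).
OURS_PARAMS_M = 0.483
MRVSR_PARAMS_M = 1.21
OURS_FPS_1080P = 96.44
MRVSR_FPS_1080P = 46.7


def relative_reduction(ours: float, baseline: float) -> float:
    """Fraction by which `ours` is smaller than `baseline`."""
    return 1.0 - ours / baseline


def relative_speedup(ours_fps: float, baseline_fps: float) -> float:
    """Fraction by which `ours_fps` exceeds `baseline_fps`."""
    return ours_fps / baseline_fps - 1.0


def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


size_cut = relative_reduction(OURS_PARAMS_M, MRVSR_PARAMS_M)
speedup = relative_speedup(OURS_FPS_1080P, MRVSR_FPS_1080P)

print(f"parameter reduction: {size_cut:.1%}")   # about 60%
print(f"throughput gain:     {speedup:.1%}")    # about 106%
print(f"time per 1080p frame: {frame_time_ms(OURS_FPS_1080P):.2f} ms "
      f"vs. {frame_time_ms(MRVSR_FPS_1080P):.2f} ms")
```

Both ratios agree with the rounded percentages in the summary: the parameter count drops by roughly 60%, and the throughput gain is a little over 106%.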
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2024.3405827