A multi-scale no-reference video quality assessment method based on transformer
Published in: | Multimedia Systems 2024-08, Vol.30 (4), Article 201 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Video quality assessment is essential for optimizing user experience, enhancing network efficiency, supporting video production and editing, improving advertising effectiveness, and strengthening security in monitoring and other domains. Current research focuses mainly on distortion of video detail and overlooks both the temporal relationships between video frames and the impact of content-dependent characteristics of the human visual system on video quality. To address these gaps, this paper proposes a multi-scale no-reference video quality assessment method based on the transformer. On the spatial side, features are extracted by a network that combines swin-transformer and deformable convolution, and mixed pooling of the features in video frames preserves further information. On the temporal side, a pyramid aggregation module merges long-term and short-term memories, enhancing the ability to capture temporal changes. Experimental results on public datasets such as KoNViD-1k, CVD2014, and LIVE-VQC demonstrate the effectiveness of the proposed method in video quality prediction. |
---|---|
ISSN: | 0942-4962, 1432-1882 |
DOI: | 10.1007/s00530-024-01403-y |