SwinTExCo: Exemplar-based video colorization using Swin Transformer

Bibliographic details
Published in: Expert Systems with Applications, 2025-01, Vol. 260, p. 125437, Article 125437
Main authors: Tran, Duong Thanh; Nguyen, Nguyen Doan Hieu; Pham, Trung Thanh; Tran, Phuong-Nam; Vu, Thuy-Duong Thi; Nguyen, Cuong Tuan; Dang-Ngoc, Hanh; Dang, Duc Ngoc Minh
Format: Article
Language: English
Online access: Full text
Description
Summary: Video colorization represents a compelling domain within the field of Computer Vision. The traditional approach in this field relies on Convolutional Neural Networks (CNNs) to extract features from each video frame and employs a recurrent network to learn information between video frames. While demonstrating considerable success in colorization, most traditional CNNs suffer from a limited receptive field size, capturing local information within a fixed-sized window. Consequently, they struggle to directly grasp long-range dependencies or pixel relationships that span large image or video frame areas. To address this limitation, recent advancements in the field have leveraged Vision Transformers (ViTs) and their variants to enhance performance. This article introduces Swin Transformer Exemplar-based Video Colorization (SwinTExCo), an end-to-end model for the video colorization process that incorporates the Swin Transformer architecture as the backbone. The experimental results demonstrate that our proposed method outperforms many other state-of-the-art methods in both quantitative and qualitative metrics. The achievements of this research have significant implications for the domain of documentary and history video restoration, contributing to the broader goal of preserving cultural heritage and facilitating a deeper understanding of historical events through enhanced audiovisual materials.
• Apply Swin Transformer to enhance inference quality in video colorization.
• Promote a video colorization model with rapid inference speed.
• Conduct diverse metrics, measurements, and surveys to evaluate different models.
ISSN:0957-4174
DOI:10.1016/j.eswa.2024.125437