RSHAN: Image super-resolution network based on residual separation hybrid attention module

Bibliographic Details
Published in: Engineering Applications of Artificial Intelligence, 2023-06, Vol. 122, Article 106072
Authors: Shen, Ying; Zheng, Weihuang; Chen, Liqiong; Huang, Feng
Format: Article
Language: English
Description
Abstract: The Transformer has become one of the main architectures in deep learning, showing impressive performance in various vision tasks, especially image super-resolution (SR). However, because they operate on high-resolution inputs, most current Transformer-based SR models have a large number of parameters and high computational complexity. Moreover, some components of the Transformer may be redundant for SR tasks, which can limit SR performance. In this work, we propose an efficient and concise model for image super-resolution termed the Residual Separation Hybrid Attention Network (RSHAN), which addresses the Transformer's redundant components and its insufficient ability to extract high-frequency information. Specifically, we present the Residual Separation Hybrid Attention Module (RSHAM), which fuses the local features extracted by a convolutional neural network (CNN) branch with the long-range dependencies extracted by a Transformer branch to improve the performance of RSHAN. Extensive experiments on numerous benchmark datasets show that the proposed method outperforms state-of-the-art SR methods by up to 0.11 dB in peak signal-to-noise ratio (PSNR), while reducing computational complexity and inference time by 5% and 10%, respectively.
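
The abstract only outlines the RSHAM design, but the two-branch fusion pattern it describes can be sketched in a few lines of PyTorch. The code below is a minimal, assumption-laden illustration, not the authors' implementation: the class name HybridAttentionBlock, the 3x3 convolutions in the local branch, the full multi-head self-attention in the global branch, and the additive residual fusion are all placeholders for details specified in the paper. A psnr helper is included only to make the reported metric concrete.

```python
import torch
import torch.nn as nn

class HybridAttentionBlock(nn.Module):
    """Sketch of a two-branch hybrid attention block: a CNN branch for
    local features plus a self-attention branch for long-range
    dependencies, fused additively with a residual connection.
    All structural choices here are assumptions, not the paper's RSHAM."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local branch: plain 3x3 convolutions capture neighborhood detail.
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Global branch: multi-head self-attention over spatial positions.
        # Note: full attention is O((H*W)^2) in the number of pixels; the
        # efficiency the paper targets would require windowed or otherwise
        # restricted attention instead.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)
        # Flatten the spatial grid into a token sequence for attention.
        tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C)
        t = self.norm(tokens)
        glob, _ = self.attn(t, t, t)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        # Fuse both branches and keep the residual identity path.
        return x + local + glob

def psnr(pred: torch.Tensor, target: torch.Tensor,
         max_val: float = 1.0) -> torch.Tensor:
    """Peak signal-to-noise ratio, 10 * log10(MAX^2 / MSE): the metric in
    which the abstract reports gains of up to 0.11 dB."""
    mse = torch.mean((pred - target) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)

if __name__ == "__main__":
    block = HybridAttentionBlock(channels=32)
    x = torch.randn(1, 32, 24, 24)
    y = block(x)
    print(y.shape)  # torch.Size([1, 32, 24, 24])
    print(psnr(torch.rand(1, 3, 8, 8), torch.rand(1, 3, 8, 8)))
```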
ISSN: 0952-1976, 1873-6769
DOI: 10.1016/j.engappai.2023.106072