Multi-resolution Twinned Residual Auto-Encoders (MR-TRAE)—A Novel DL Model for Image Multi-resolution

Bibliographic details
Published in: Cognitive Computation, 2024-07, Vol. 16 (4), p. 1447-1469
Main authors: Momenzadeh, Alireza; Baccarelli, Enzo; Scarpiniti, Michele; Sarv Ahrabi, Sima
Format: Article
Language: English
Description
Summary: In this paper, we design and evaluate the Multi-resolution Twinned Residual Auto-Encoders (MR-TRAE) model, a deep learning (DL)-based architecture that produces super-resolved images at several scaling factors from low-resolution (LR) inputs. To this end, we extend the recently introduced Twinned Residual Auto-Encoders (TRAE) paradigm for single-image super-resolution (SISR) to the multi-resolution (MR) domain. The main contributions of this work are (i) the architecture of the MR-TRAE model, which uses cascaded trainable up-sampling modules to progressively increase the spatial resolution of LR input images at multiple scaling factors; (ii) a novel loss function designed for the joint and semi-blind training of all MR-TRAE model components; and (iii) a comprehensive analysis of the MR-TRAE trade-off between model complexity and performance. Furthermore, we explore the connections between the MR-TRAE architecture and broader cognitive paradigms, including knowledge distillation, the teacher-student learning model, and hierarchical cognition. We benchmarked MR-TRAE against state-of-the-art models (such as U-Net, generative adversarial network (GAN)-based, and single-resolution baselines) on publicly available datasets of LR computed tomography (CT) scans from patients with COVID-19. Our tests, which covered scaling factors of ×2, ×4, and ×8, showed that the MR-TRAE model can reduce training times by up to 60% compared to the baselines, without a noticeable impact on achieved performance.
ISSN: 1866-9956, 1866-9964
DOI: 10.1007/s12559-024-10293-1
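
Illustrative sketch: The summary describes cascaded up-sampling modules that raise the resolution step by step, trained jointly with one loss. The following is a minimal structural sketch of that idea only, not the authors' implementation. It rests on several assumptions: each stage is a fixed nearest-neighbour ×2 up-sampler standing in for a trainable module, the joint loss is a weighted sum of per-scale mean squared errors, and all names (`upsample_x2`, `cascade`, `joint_loss`, the weights) are hypothetical. The record does not specify the actual MR-TRAE layers or its semi-blind loss.

```python
from typing import Dict, List

Image = List[List[float]]  # a grayscale image as rows of pixel values


def upsample_x2(img: Image) -> Image:
    """Nearest-neighbour x2 up-sampling.

    ASSUMPTION: a fixed stand-in for one trainable up-sampling module.
    """
    out: Image = []
    for row in img:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))  # duplicate each row
    return out


def cascade(lr: Image, stages: int = 3) -> Dict[int, Image]:
    """Chain x2 stages; stage k yields the output at scale 2**k."""
    outputs: Dict[int, Image] = {}
    current = lr
    for k in range(1, stages + 1):
        current = upsample_x2(current)  # each stage feeds the next one
        outputs[2 ** k] = current
    return outputs


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two images of equal size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def joint_loss(preds: Dict[int, Image], targets: Dict[int, Image],
               weights: Dict[int, float]) -> float:
    """ASSUMPTION: joint training modelled as a weighted sum of per-scale MSEs."""
    return sum(weights[s] * mse(preds[s], targets[s]) for s in preds)


lr_image: Image = [[0.0, 1.0], [1.0, 0.0]]
outputs = cascade(lr_image)
sizes = {scale: (len(img), len(img[0])) for scale, img in outputs.items()}
print(sizes)
```

Because every stage reuses the previous stage's output, one forward pass yields all three scales at once, which is the structural property the summary attributes to the cascaded design.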