PMSGAN: Parallel Multistage GANs for Face Image Translation

Bibliographic Details
Published in:	IEEE Transactions on Neural Networks and Learning Systems, 2024-07, Vol.35 (7), p.9352-9365
Main Authors:	Liang, Changcheng; Zhu, Mingrui; Wang, Nannan; Yang, Heng; Gao, Xinbo
Format:	Article
Language:	English
Subjects:	Atrous spatial pyramid; Benchmarks; Decoding; Disintegration; Face; Face detection; Face image translation; Generative adversarial networks; Image analysis; Image quality; Information processing; Parallel multistage; Spatial discrimination; Spatial resolution; Training; Translation; Visual tasks

Description
Abstract:	In this article, we address the face image translation task, which aims to translate a face image from a source domain to a target domain. Although recent studies have made significant progress, face image translation remains challenging because it has stricter requirements for texture details: even a few artifacts greatly affect the impression of generated face images. To synthesize high-quality face images with good visual appearance, we revisit the coarse-to-fine strategy and propose a novel parallel multistage architecture based on generative adversarial networks (PMSGAN). More specifically, PMSGAN progressively learns the translation function by decomposing the general synthesis process into multiple parallel stages that take images with gradually decreasing spatial resolution as inputs. To promote information exchange between stages, a cross-stage atrous spatial pyramid (CSASP) structure is specially designed to receive and fuse contextual information from other stages. At the end of the parallel model, we introduce a novel attention-based module that leverages multistage decoded outputs as in situ supervised attention to refine the final activations and yield the target image. Extensive experiments on several face image translation benchmarks show that PMSGAN performs considerably better than state-of-the-art approaches.
ISSN:	2162-237X, 2162-2388
DOI:	10.1109/TNNLS.2022.3233025