Stable parallel training of Wasserstein conditional generative adversarial neural networks

We propose a stable, parallel approach to train Wasserstein conditional generative adversarial neural networks (W-CGANs) under the constraint of a fixed computational budget. Differently from previous distributed GANs training techniques, our approach avoids inter-process communications, reduces the...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:The Journal of supercomputing 2023-02, Vol.79 (2), p.1856-1876
Hauptverfasser: Lupo Pasini, Massimiliano, Yin, Junqi
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We propose a stable, parallel approach to train Wasserstein conditional generative adversarial neural networks (W-CGANs) under the constraint of a fixed computational budget. Differently from previous distributed GANs training techniques, our approach avoids inter-process communications, reduces the risk of mode collapse and enhances scalability by using multiple generators, each one of them concurrently trained on a single data label. The use of the Wasserstein metric also reduces the risk of cycling by stabilizing the training of each generator. We illustrate the approach on the CIFAR10, CIFAR100, and ImageNet1k datasets, three standard benchmark image datasets, maintaining the original resolution of the images for each dataset. Performance is assessed in terms of scalability and final accuracy within a limited fixed computational time and computational resources. To measure accuracy, we use the inception score, the Fréchet inception distance, and image quality. An improvement in inception score and Fréchet inception distance is shown in comparison to previous results obtained by performing the parallel approach on deep convolutional conditional generative adversarial neural networks as well as an improvement of image quality of the new images created by the GANs approach. Weak scaling is attained on both datasets using up to 2000 NVIDIA V100 GPUs on the OLCF supercomputer Summit.
ISSN:0920-8542
1573-0484
DOI:10.1007/s11227-022-04721-y