EM-LAST: Effective Multidimensional Latent Space Transport for an Unpaired Image-to-Image Translation With an Energy-Based Model

Bibliographic details
Published in: IEEE Access, 2022, Vol. 10, pp. 72839-72849
Main authors: Han, Giwoong; Min, Jinhong; Han, Sung Won
Format: Article
Language: English
Online access: Full text
Description
Summary: For unpaired image-to-image translation to work effectively, the latent space of each image domain must be well designed. The codes of each style must be translated toward the target while preserving the parts that correspond to the source content. In general, most Variational Autoencoder (VAE)-based models use a one-dimensional latent space. However, applying high-dimensional methodologies such as vector quantization requires control over a multidimensional latent space. In this study, among the VAE-based models that use relatively complex multidimensional latent spaces, we apply an Energy-Based Model and Vector-Quantized VAE v2, with the latter as the main model. We show that, among the latent spaces that represent each image domain, the importance of each feature in the top and bottom latent spaces must be interpreted differently for appropriate translation. We therefore argue that simply understanding the composition of the latent space well can yield effective image translation results. We also present various analyses and visual outcomes of multidimensional latent space transport.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3189352