High-fidelity synthesis with causal disentangled representation

Bibliographic details
Published in:	Expert Systems with Applications, 2025-03, Vol. 265, p. 125998, Article 125998
Main authors:	Yang, Tongsen; Shao, Youjia; Wang, Hao; Zhao, Wencang
Format:	Article
Language:	English
Keywords:	Disentangled representation learning; Generative Adversarial Network; Structural Causal Model; Variational Auto-Encoder
Online access:	Full text

Description
Abstract:	Many disentangled representation learning algorithms improve disentanglement performance at the expense of generation performance. To resolve this trade-off between disentanglement and generative performance, we propose a high-fidelity synthetic causal disentangled representation learning framework (HCAPE) that combines a Variational Auto-Encoder (VAE) and a Generative Adversarial Network (GAN). First, in the inference stage, the assumption that the latent factors of variation are mutually independent is abandoned, and a Structural Causal Model (SCM) in the encoder is used to learn causal representations. Second, a supervision term and a balance parameter η are introduced to reduce the interference of disentanglement performance with generation performance. Finally, in the generation stage, a discriminator estimates the model’s gradient information, which restores high-frequency details in the image and thus yields high-fidelity images. Experimental results show that the causal representations learned by the model are semantically interpretable and can be used to generate counterfactual data. Most importantly, the model is able to generate high-fidelity images, achieving an optimal balance between disentanglement and generative performance.
ISSN:	0957-4174
DOI:	10.1016/j.eswa.2024.125998
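
Illustrative note: the abstract says that a Structural Causal Model replaces the independence assumption on the latent factors and that the learned representations support counterfactual generation. The record does not specify how the SCM is parameterized, so the following is only a minimal sketch under stated assumptions: a linear SCM over latent factors indexed in topological order. The function name `scm_forward`, the `weights` matrix, and the three-factor chain are all hypothetical and are not taken from the paper.

```python
# Illustrative only: a linear SCM over latent factors, NOT the paper's implementation.
# Assumption: factors are indexed in topological order, so an edge j -> i
# (weights[j][i] != 0) can exist only when j < i.

def scm_forward(weights, noise, do=None):
    """Compute z_i = noise_i + sum over parents j < i of weights[j][i] * z_j.

    `do` optionally maps a factor index to a fixed value (an intervention):
    that factor ignores its parents and its noise, and its descendants
    are recomputed from the new value.
    """
    do = do or {}
    z = []
    for i in range(len(noise)):
        if i in do:
            z.append(float(do[i]))
            continue
        parent_effect = sum(weights[j][i] * z[j] for j in range(i))
        z.append(noise[i] + parent_effect)
    return z


# Hypothetical 3-factor chain: factor 0 -> factor 1 -> factor 2.
weights = [
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0],
]
noise = [1.0, 0.0, 3.0]

factual = scm_forward(weights, noise)                       # [1.0, 2.0, 4.0]
counterfactual = scm_forward(weights, noise, do={1: 10.0})  # [1.0, 10.0, 8.0]
```

In this sketch, fixing factor 1 changes its descendant (factor 2) but leaves its cause (factor 0) untouched, which is the behavior that distinguishes a causal latent space from one with mutually independent factors. In a VAE/GAN setting the resulting vector would presumably be passed to the decoder; that step is not shown here.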