Wav2Pix: Enhancement and evaluation of a speech-conditioned image generator

We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion and generates a face directly from the raw speech waveform, without any additional identity information (e.g., a reference image or a one-hot encoding).

Bibliographic details
Main author:	Tubau Pires, Miquel
Format:	Dissertation
Language:	English
Subjects:	adversarial learning; Aprenentatge automàtic; Computer vision; deep learning; face synthesis; Informàtica; Machine learning; Visió per ordinador; Àrees temàtiques de la UPC
