Wav2Pix Enhancement and evaluation of a speech-conditioned image generator

We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion and generates a face directly from the raw speech waveform, without any additional identity information (e.g., a reference image or a one-hot encoding).

Bibliographic details
Main author:	Tubau Pires, Miquel
Format:	Dissertation
Language:	eng
Subjects:	adversarial learning; Aprenentatge automàtic; Computer vision; deep learning; face synthesis; Informàtica; Machine learning; Visió per ordinador; Àrees temàtiques de la UPC

publisher	Universitat Politècnica de Catalunya
date	2019-01
recordtype	dissertation
sourcetype	Open Access Repository
rights	info:eu-repo/semantics/openAccess
linktorsrc	https://recercat.cat/handle/2072/354436
recordid	cdi_csuc_recercat_oai_recercat_cat_2072_354436
source	Recercat
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T04%3A21%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-csuc_XX2&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft.genre=dissertation&rft.btitle=Wav2Pix%20Enhancement%20and%20evaluation%20of%20a%20speech-conditioned%20image%20generator&rft.au=Tubau%20Pires,%20Miquel&rft.date=2019-01&rft_id=info:doi/&rft_dat=%3Ccsuc_XX2%3Eoai_recercat_cat_2072_354436%3C/csuc_XX2%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true