Wav2Pix Enhancement and evaluation of a speech-conditioned image generator

We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion to generate a face directly from the raw speech waveform, without any additional identity information (e.g., a reference image or a one-hot encoding).

Bibliographic details
Main author: Tubau Pires, Miquel
Format: Dissertation
Language: eng
creator Tubau Pires, Miquel
description We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion to generate a face directly from the raw speech waveform, without any additional identity information (e.g., a reference image or a one-hot encoding).
format Dissertation
fullrecord <record><control><sourceid>csuc_XX2</sourceid><recordid>TN_cdi_csuc_recercat_oai_recercat_cat_2072_354436</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>oai_recercat_cat_2072_354436</sourcerecordid><originalsourceid>FETCH-csuc_recercat_oai_recercat_cat_2072_3544363</originalsourceid><addsrcrecordid>eNqdizEKwkAQRdNYiHqHuUBAN9EcQCJiZSFYLsPkJ1lIZmV3Ezy-BAR7i8_jPfjr7Pbk2dzdm2rtWQUjNBFrQ5h5mDg5r-RbYoovQPpcvDZuqWjIjdyBOigCJx-22arlIWL35SY7XOrH-ZpLnMQGCIJwsp7dT5aZfWVscSzL4lT88_kA_3FCDg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>dissertation</recordtype></control><display><type>dissertation</type><title>Wav2Pix Enhancement and evaluation of a speech-conditioned image generator</title><source>Recercat</source><creator>Tubau Pires, Miquel</creator><creatorcontrib>Tubau Pires, Miquel</creatorcontrib><description>We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion, generating a face directly from the raw speech waveform without any additional identity information (e.g reference image or one-hot encoding).</description><language>eng</language><publisher>Universitat Politècnica de Catalunya</publisher><subject>adversarial learning ; Aprenentatge automàtic ; Computer vision ; deep learning ; face synthesis ; Informàtica ; Machine learning ; Visió per ordinador ; Àrees temàtiques de la 
UPC</subject><creationdate>2019</creationdate><rights>info:eu-repo/semantics/openAccess</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,311,780,885,26974</link.rule.ids><linktorsrc>$$Uhttps://recercat.cat/handle/2072/354436$$EView_record_in_Consorci_de_Serveis_Universitaris_de_Catalunya_(CSUC)$$FView_record_in_$$GConsorci_de_Serveis_Universitaris_de_Catalunya_(CSUC)$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Tubau Pires, Miquel</creatorcontrib><title>Wav2Pix Enhancement and evaluation of a speech-conditioned image generator</title><description>We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion, generating a face directly from the raw speech waveform without any additional identity information (e.g reference image or one-hot encoding).</description><subject>adversarial learning</subject><subject>Aprenentatge automàtic</subject><subject>Computer vision</subject><subject>deep learning</subject><subject>face synthesis</subject><subject>Informàtica</subject><subject>Machine learning</subject><subject>Visió per ordinador</subject><subject>Àrees temàtiques de la UPC</subject><fulltext>true</fulltext><rsrctype>dissertation</rsrctype><creationdate>2019</creationdate><recordtype>dissertation</recordtype><sourceid>XX2</sourceid><recordid>eNqdizEKwkAQRdNYiHqHuUBAN9EcQCJiZSFYLsPkJ1lIZmV3Ezy-BAR7i8_jPfjr7Pbk2dzdm2rtWQUjNBFrQ5h5mDg5r-RbYoovQPpcvDZuqWjIjdyBOigCJx-22arlIWL35SY7XOrH-ZpLnMQGCIJwsp7dT5aZfWVscSzL4lT88_kA_3FCDg</recordid><startdate>201901</startdate><enddate>201901</enddate><creator>Tubau Pires, Miquel</creator><general>Universitat Politècnica de Catalunya</general><scope>XX2</scope></search><sort><creationdate>201901</creationdate><title>Wav2Pix Enhancement and 
evaluation of a speech-conditioned image generator</title><author>Tubau Pires, Miquel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-csuc_recercat_oai_recercat_cat_2072_3544363</frbrgroupid><rsrctype>dissertations</rsrctype><prefilter>dissertations</prefilter><language>eng</language><creationdate>2019</creationdate><topic>adversarial learning</topic><topic>Aprenentatge automàtic</topic><topic>Computer vision</topic><topic>deep learning</topic><topic>face synthesis</topic><topic>Informàtica</topic><topic>Machine learning</topic><topic>Visió per ordinador</topic><topic>Àrees temàtiques de la UPC</topic><toplevel>online_resources</toplevel><creatorcontrib>Tubau Pires, Miquel</creatorcontrib><collection>Recercat</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Tubau Pires, Miquel</au><format>dissertation</format><genre>dissertation</genre><ristype>THES</ristype><btitle>Wav2Pix Enhancement and evaluation of a speech-conditioned image generator</btitle><date>2019-01</date><risdate>2019</risdate><abstract>We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion, generating a face directly from the raw speech waveform without any additional identity information (e.g reference image or one-hot encoding).</abstract><pub>Universitat Politècnica de Catalunya</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
language eng
recordid cdi_csuc_recercat_oai_recercat_cat_2072_354436
source Recercat
subjects adversarial learning
Aprenentatge automàtic
Computer vision
deep learning
face synthesis
Informàtica
Machine learning
Visió per ordinador
Àrees temàtiques de la UPC
title Wav2Pix Enhancement and evaluation of a speech-conditioned image generator
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T04%3A21%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-csuc_XX2&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft.genre=dissertation&rft.btitle=Wav2Pix%20Enhancement%20and%20evaluation%20of%20a%20speech-conditioned%20image%20generator&rft.au=Tubau%20Pires,%20Miquel&rft.date=2019-01&rft_id=info:doi/&rft_dat=%3Ccsuc_XX2%3Eoai_recercat_cat_2072_354436%3C/csuc_XX2%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true