Wav2Pix Enhancement and evaluation of a speech-conditioned image generator
We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion, generating a face directly from the raw speech waveform without any additional identity information (e.g., a reference image or one-hot encoding).
Main author: | Tubau Pires, Miquel |
---|---|
Format: | Dissertation |
Language: | eng |
creator | Tubau Pires, Miquel |
---|---|
description | We propose the enhancement and evaluation of a deep neural network that is trained from scratch in an end-to-end fashion, generating a face directly from the raw speech waveform without any additional identity information (e.g., a reference image or one-hot encoding). |
format | Dissertation |
publisher | Universitat Politècnica de Catalunya |
date | 2019-01 |
rights | info:eu-repo/semantics/openAccess |
linktorsrc | https://recercat.cat/handle/2072/354436 |
language | eng |
recordid | cdi_csuc_recercat_oai_recercat_cat_2072_354436 |
source | Recercat |
subjects | adversarial learning ; Aprenentatge automàtic ; Computer vision ; deep learning ; face synthesis ; Informàtica ; Machine learning ; Visió per ordinador ; Àrees temàtiques de la UPC |
title | Wav2Pix Enhancement and evaluation of a speech-conditioned image generator |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T04%3A21%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-csuc_XX2&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft.genre=dissertation&rft.btitle=Wav2Pix%20Enhancement%20and%20evaluation%20of%20a%20speech-conditioned%20image%20generator&rft.au=Tubau%20Pires,%20Miquel&rft.date=2019-01&rft_id=info:doi/&rft_dat=%3Ccsuc_XX2%3Eoai_recercat_cat_2072_354436%3C/csuc_XX2%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |