FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method

This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorith...

Full Description

Bibliographic Details
Published in:	IEEE access 2020, Vol.8, p.228740-228753
Main Authors:	Li, Li; Kameoka, Hirokazu; Inoue, Shota; Makino, Shoji
Format:	Article
Language:	eng
Subjects:	Algorithms; auxiliary classifier VAE; Computational efficiency; Computational modeling; Computing costs; Computing time; Decoding; FastMVAE algorithm; Information theory; Iterative methods; Multichannel source separation; multichannel variational autoencoder (MVAE) method; Neural networks; Optimization; Optimization algorithms; Separation; Source separation; Spectrogram; Spectrograms; Task analysis
Online Access:	Full text

container_end_page	228753
container_issue
container_start_page	228740
container_title	IEEE access
container_volume	8
creator	Li, Li; Kameoka, Hirokazu; Inoue, Shota; Makino, Shoji
description	This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorithm and the idea of jointly performing multi-speaker separation and speaker identification. However, one drawback is the high computational cost of the optimization algorithm. To overcome this drawback, this paper proposes using an auxiliary classifier VAE, an information-theoretic extension of the conditional VAE (CVAE), to train the generative model of the source spectrograms and using it to efficiently update the parameters of the source spectrogram models at each iteration of the source separation algorithm. We call the proposed algorithm "FastMVAE" (or fMVAE for short). Experimental evaluations revealed that the proposed fast algorithm can achieve high source separation performance in both speaker-dependent and speaker-independent scenarios while significantly reducing the computational time compared to the original MVAE method by more than 90% on both GPU and CPU. However, there is still room for improvement of about 3 dB compared to the original MVAE method.
doi_str_mv	10.1109/ACCESS.2020.3045704
format	Article
fullrecord	<record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_ieee_primary_9298772</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9298772</ieee_id><doaj_id>oai_doaj_org_article_bb3cce8ae2e746a7ad257dbc2dba2cad</doaj_id><sourcerecordid>2474859728</sourcerecordid><originalsourceid>FETCH-LOGICAL-c544t-d859486e552f270fd33aaf39489862e18601909225220a24dbf5729def8fc41b3</originalsourceid><addsrcrecordid>eNpNUcFqGzEQXUoDDWm-IBdBz3a1I2kl9bYYpw3E5JAmlx6EVhrFMmvL1cqH9usrZ0PoXGZ4vPdmhtc0Ny1dti3VX_vVav34uAQKdMkoF5LyD80ltJ1eMMG6j__Nn5rradrRWqpCQl42v27tVDbP_fob6cl5Jg_HEvfxry0xHUg_vqQcy3ZPQsqkbJFsTmOJbmsPBxzJs83xlWhH0p9KwoNLHjPZYNkm_7m5CHac8PqtXzVPt-ufqx-L-4fvd6v-fuEE52XhldBcdSgEBJA0eMasDaxiWnWArepoq6kGEADUAvdDEBK0x6CC4-3Arpq72dcnuzPHHPc2_zHJRvMKpPxibK5Hj2iGgTmHyiKg5J2V1oOQfnDgBwvO-ur1ZfY65vT7hFMxu3TK9b_JAJe8nipBVRabWS6nacoY3re21JxDMXMo5hyKeQulqm5mVUTEd4UGraQE9g89_4gO</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2474859728</pqid></control><display><type>article</type><title>FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Li, Li ; Kameoka, Hirokazu ; Inoue, Shota ; Makino, Shoji</creator><creatorcontrib>Li, Li ; Kameoka, Hirokazu ; Inoue, Shota ; Makino, Shoji</creatorcontrib><description>This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorithm and the idea of jointly performing multi-speaker separation and speaker identification. 
However, one drawback is the high computational cost of the optimization algorithm. To overcome this drawback, this paper proposes using an auxiliary classifier VAE, an information-theoretic extension of the conditional VAE (CVAE), to train the generative model of the source spectrograms and using it to efficiently update the parameters of the source spectrogram models at each iteration of the source separation algorithm. We call the proposed algorithm "FastMVAE" (or fMVAE for short). Experimental evaluations revealed that the proposed fast algorithm can achieve high source separation performance in both speaker-dependent and speaker-independent scenarios while significantly reducing the computational time compared to the original MVAE method by more than 90% on both GPU and CPU. However, there is still room for improvement of about 3 dB compared to the original MVAE method.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2020.3045704</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Algorithms ; auxiliary classifier VAE ; Computational efficiency ; Computational modeling ; Computing costs ; Computing time ; Decoding ; FastMVAE algorithm ; Information theory ; Iterative methods ; Multichannel source separation ; multichannel variational autoencoder (MVAE) method ; Neural networks ; Optimization ; Optimization algorithms ; Separation ; Source separation ; Spectrogram ; Spectrograms ; Task analysis</subject><ispartof>IEEE access, 2020, Vol.8, p.228740-228753</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c544t-d859486e552f270fd33aaf39489862e18601909225220a24dbf5729def8fc41b3</citedby><cites>FETCH-LOGICAL-c544t-d859486e552f270fd33aaf39489862e18601909225220a24dbf5729def8fc41b3</cites><orcidid>0000-0003-1934-640X ; 0000-0003-3102-0162 ; 0000-0002-3121-7857</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9298772$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,864,2100,4022,27632,27922,27923,27924,54932</link.rule.ids></links><search><creatorcontrib>Li, Li</creatorcontrib><creatorcontrib>Kameoka, Hirokazu</creatorcontrib><creatorcontrib>Inoue, Shota</creatorcontrib><creatorcontrib>Makino, Shoji</creatorcontrib><title>FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method</title><title>IEEE access</title><addtitle>Access</addtitle><description>This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorithm and the idea of jointly performing multi-speaker separation and speaker identification. However, one drawback is the high computational cost of the optimization algorithm. To overcome this drawback, this paper proposes using an auxiliary classifier VAE, an information-theoretic extension of the conditional VAE (CVAE), to train the generative model of the source spectrograms and using it to efficiently update the parameters of the source spectrogram models at each iteration of the source separation algorithm. 
We call the proposed algorithm "FastMVAE" (or fMVAE for short). Experimental evaluations revealed that the proposed fast algorithm can achieve high source separation performance in both speaker-dependent and speaker-independent scenarios while significantly reducing the computational time compared to the original MVAE method by more than 90% on both GPU and CPU. However, there is still room for improvement of about 3 dB compared to the original MVAE method.</description><subject>Algorithms</subject><subject>auxiliary classifier VAE</subject><subject>Computational efficiency</subject><subject>Computational modeling</subject><subject>Computing costs</subject><subject>Computing time</subject><subject>Decoding</subject><subject>FastMVAE algorithm</subject><subject>Information theory</subject><subject>Iterative methods</subject><subject>Multichannel source separation</subject><subject>multichannel variational autoencoder (MVAE) method</subject><subject>Neural networks</subject><subject>Optimization</subject><subject>Optimization algorithms</subject><subject>Separation</subject><subject>Source separation</subject><subject>Spectrogram</subject><subject>Spectrograms</subject><subject>Task 
analysis</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUcFqGzEQXUoDDWm-IBdBz3a1I2kl9bYYpw3E5JAmlx6EVhrFMmvL1cqH9usrZ0PoXGZ4vPdmhtc0Ny1dti3VX_vVav34uAQKdMkoF5LyD80ltJ1eMMG6j__Nn5rradrRWqpCQl42v27tVDbP_fob6cl5Jg_HEvfxry0xHUg_vqQcy3ZPQsqkbJFsTmOJbmsPBxzJs83xlWhH0p9KwoNLHjPZYNkm_7m5CHac8PqtXzVPt-ufqx-L-4fvd6v-fuEE52XhldBcdSgEBJA0eMasDaxiWnWArepoq6kGEADUAvdDEBK0x6CC4-3Arpq72dcnuzPHHPc2_zHJRvMKpPxibK5Hj2iGgTmHyiKg5J2V1oOQfnDgBwvO-ur1ZfY65vT7hFMxu3TK9b_JAJe8nipBVRabWS6nacoY3re21JxDMXMo5hyKeQulqm5mVUTEd4UGraQE9g89_4gO</recordid><startdate>2020</startdate><enddate>2020</enddate><creator>Li, Li</creator><creator>Kameoka, Hirokazu</creator><creator>Inoue, Shota</creator><creator>Makino, Shoji</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0003-1934-640X</orcidid><orcidid>https://orcid.org/0000-0003-3102-0162</orcidid><orcidid>https://orcid.org/0000-0002-3121-7857</orcidid></search><sort><creationdate>2020</creationdate><title>FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method</title><author>Li, Li ; Kameoka, Hirokazu ; Inoue, Shota ; Makino, Shoji</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c544t-d859486e552f270fd33aaf39489862e18601909225220a24dbf5729def8fc41b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>auxiliary classifier VAE</topic><topic>Computational efficiency</topic><topic>Computational modeling</topic><topic>Computing costs</topic><topic>Computing time</topic><topic>Decoding</topic><topic>FastMVAE algorithm</topic><topic>Information theory</topic><topic>Iterative methods</topic><topic>Multichannel source separation</topic><topic>multichannel variational autoencoder (MVAE) method</topic><topic>Neural networks</topic><topic>Optimization</topic><topic>Optimization algorithms</topic><topic>Separation</topic><topic>Source separation</topic><topic>Spectrogram</topic><topic>Spectrograms</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Li</creatorcontrib><creatorcontrib>Kameoka, Hirokazu</creatorcontrib><creatorcontrib>Inoue, Shota</creatorcontrib><creatorcontrib>Makino, Shoji</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 
2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Li, Li</au><au>Kameoka, Hirokazu</au><au>Inoue, Shota</au><au>Makino, Shoji</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2020</date><risdate>2020</risdate><volume>8</volume><spage>228740</spage><epage>228753</epage><pages>228740-228753</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorithm and the idea of jointly performing multi-speaker separation and speaker identification. 
However, one drawback is the high computational cost of the optimization algorithm. To overcome this drawback, this paper proposes using an auxiliary classifier VAE, an information-theoretic extension of the conditional VAE (CVAE), to train the generative model of the source spectrograms and using it to efficiently update the parameters of the source spectrogram models at each iteration of the source separation algorithm. We call the proposed algorithm "FastMVAE" (or fMVAE for short). Experimental evaluations revealed that the proposed fast algorithm can achieve high source separation performance in both speaker-dependent and speaker-independent scenarios while significantly reducing the computational time compared to the original MVAE method by more than 90% on both GPU and CPU. However, there is still room for improvement of about 3 dB compared to the original MVAE method.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2020.3045704</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0003-1934-640X</orcidid><orcidid>https://orcid.org/0000-0003-3102-0162</orcidid><orcidid>https://orcid.org/0000-0002-3121-7857</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2169-3536
ispartof	IEEE access, 2020, Vol.8, p.228740-228753
issn	2169-3536 2169-3536
language	eng
recordid	cdi_ieee_primary_9298772
source	IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects	Algorithms; auxiliary classifier VAE; Computational efficiency; Computational modeling; Computing costs; Computing time; Decoding; FastMVAE algorithm; Information theory; Iterative methods; Multichannel source separation; multichannel variational autoencoder (MVAE) method; Neural networks; Optimization; Optimization algorithms; Separation; Source separation; Spectrogram; Spectrograms; Task analysis
title	FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T12%3A23%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=FastMVAE:%20A%20Fast%20Optimization%20Algorithm%20for%20the%20Multichannel%20Variational%20Autoencoder%20Method&rft.jtitle=IEEE%20access&rft.au=Li,%20Li&rft.date=2020&rft.volume=8&rft.spage=228740&rft.epage=228753&rft.pages=228740-228753&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2020.3045704&rft_dat=%3Cproquest_ieee_%3E2474859728%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2474859728&rft_id=info:pmid/&rft_ieee_id=9298772&rft_doaj_id=oai_doaj_org_article_bb3cce8ae2e746a7ad257dbc2dba2cad&rfr_iscdi=true