DNN-based ensemble singing voice synthesis with interactions between singers

We propose a singing voice synthesis (SVS) method for a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice, and do not consider interactions between singers, i.e., adjusting one's own voice to the others' v...

Bibliographic details
Main authors:	Hyodo, Hiroaki; Takamichi, Shinnosuke; Nakamura, Tomohiko; Koguchi, Junya; Saruwatari, Hiroshi
Format:	Article
Language:	eng
Subjects:	Computer Science - Sound
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Hyodo, Hiroaki; Takamichi, Shinnosuke; Nakamura, Tomohiko; Koguchi, Junya; Saruwatari, Hiroshi
description	We propose a singing voice synthesis (SVS) method for a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice and do not consider interactions between singers, i.e., adjusting one's own voice to the others' voices. Since producing ensemble voices from solo singing voices ignores these interactions, it can degrade the unity of the vocal ensemble. Therefore, we propose an SVS method that reproduces the interactions. It is based on an architecture that uses musical scores of multiple voice parts, and on loss functions that simulate the interactions' effect on acoustic features. Experimental results show that our methods improve the unity of the vocal ensemble.
doi_str_mv	10.48550/arxiv.2409.09988
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2409_09988</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2409_09988</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2409_099883</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjGw1DOwtLSw4GTwcfHz001KLE5NUUjNK07NTcpJVSjOzEsHIoWy_MxkIK8yryQjtTizWKE8syRDITOvJLUoMbkkMz-vWCEptaQ8NTUPrCO1qJiHgTUtMac4lRdKczPIu7mGOHvogu2NLyjKzE0sqowH2R8Ptt-YsAoAPPo8DQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>DNN-based ensemble singing voice synthesis with interactions between singers</title><source>arXiv.org</source><creator>Hyodo, Hiroaki ; Takamichi, Shinnosuke ; Nakamura, Tomohiko ; Koguchi, Junya ; Saruwatari, Hiroshi</creator><creatorcontrib>Hyodo, Hiroaki ; Takamichi, Shinnosuke ; Nakamura, Tomohiko ; Koguchi, Junya ; Saruwatari, Hiroshi</creatorcontrib><description>We propose a singing voice synthesis (SVS) method for a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice, and do not consider interactions between singers, i.e., adjusting one's own voice to the others' voices. Since the production of ensemble voices from solo singing voices ignores the interactions, it can degrade the unity of the vocal ensemble. Therefore, we propose a SVS that reproduces the interactions. It is based on an architecture that uses musical scores of multiple voice parts, and loss functions that simulate the interactions' effect to acoustic features. 
Experimental results show that our methods improve the unity of the vocal ensemble.</description><identifier>DOI: 10.48550/arxiv.2409.09988</identifier><language>eng</language><subject>Computer Science - Sound</subject><creationdate>2024-09</creationdate><rights>http://creativecommons.org/licenses/by-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2409.09988$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2409.09988$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Hyodo, Hiroaki</creatorcontrib><creatorcontrib>Takamichi, Shinnosuke</creatorcontrib><creatorcontrib>Nakamura, Tomohiko</creatorcontrib><creatorcontrib>Koguchi, Junya</creatorcontrib><creatorcontrib>Saruwatari, Hiroshi</creatorcontrib><title>DNN-based ensemble singing voice synthesis with interactions between singers</title><description>We propose a singing voice synthesis (SVS) method for a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice, and do not consider interactions between singers, i.e., adjusting one's own voice to the others' voices. Since the production of ensemble voices from solo singing voices ignores the interactions, it can degrade the unity of the vocal ensemble. Therefore, we propose a SVS that reproduces the interactions. It is based on an architecture that uses musical scores of multiple voice parts, and loss functions that simulate the interactions' effect to acoustic features. 
Experimental results show that our methods improve the unity of the vocal ensemble.</description><subject>Computer Science - Sound</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjGw1DOwtLSw4GTwcfHz001KLE5NUUjNK07NTcpJVSjOzEsHIoWy_MxkIK8yryQjtTizWKE8syRDITOvJLUoMbkkMz-vWCEptaQ8NTUPrCO1qJiHgTUtMac4lRdKczPIu7mGOHvogu2NLyjKzE0sqowH2R8Ptt-YsAoAPPo8DQ</recordid><startdate>20240916</startdate><enddate>20240916</enddate><creator>Hyodo, Hiroaki</creator><creator>Takamichi, Shinnosuke</creator><creator>Nakamura, Tomohiko</creator><creator>Koguchi, Junya</creator><creator>Saruwatari, Hiroshi</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240916</creationdate><title>DNN-based ensemble singing voice synthesis with interactions between singers</title><author>Hyodo, Hiroaki ; Takamichi, Shinnosuke ; Nakamura, Tomohiko ; Koguchi, Junya ; Saruwatari, Hiroshi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2409_099883</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Sound</topic><toplevel>online_resources</toplevel><creatorcontrib>Hyodo, Hiroaki</creatorcontrib><creatorcontrib>Takamichi, Shinnosuke</creatorcontrib><creatorcontrib>Nakamura, Tomohiko</creatorcontrib><creatorcontrib>Koguchi, Junya</creatorcontrib><creatorcontrib>Saruwatari, Hiroshi</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hyodo, Hiroaki</au><au>Takamichi, Shinnosuke</au><au>Nakamura, Tomohiko</au><au>Koguchi, Junya</au><au>Saruwatari, 
Hiroshi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>DNN-based ensemble singing voice synthesis with interactions between singers</atitle><date>2024-09-16</date><risdate>2024</risdate><abstract>We propose a singing voice synthesis (SVS) method for a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice, and do not consider interactions between singers, i.e., adjusting one's own voice to the others' voices. Since the production of ensemble voices from solo singing voices ignores the interactions, it can degrade the unity of the vocal ensemble. Therefore, we propose a SVS that reproduces the interactions. It is based on an architecture that uses musical scores of multiple voice parts, and loss functions that simulate the interactions' effect to acoustic features. Experimental results show that our methods improve the unity of the vocal ensemble.</abstract><doi>10.48550/arxiv.2409.09988</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2409.09988
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2409_09988
source	arXiv.org
subjects	Computer Science - Sound
title	DNN-based ensemble singing voice synthesis with interactions between singers
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T14%3A56%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DNN-based%20ensemble%20singing%20voice%20synthesis%20with%20interactions%20between%20singers&rft.au=Hyodo,%20Hiroaki&rft.date=2024-09-16&rft_id=info:doi/10.48550/arxiv.2409.09988&rft_dat=%3Carxiv_GOX%3E2409_09988%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true