DNN-based ensemble singing voice synthesis with interactions between singers

We propose a singing voice synthesis (SVS) method for a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice, and do not consider interactions between singers, i.e., adjusting one's own voice to the others' v...

Bibliographic details
Main authors: Hyodo, Hiroaki; Takamichi, Shinnosuke; Nakamura, Tomohiko; Koguchi, Junya; Saruwatari, Hiroshi
Format: Article
Language: eng
creator Hyodo, Hiroaki
Takamichi, Shinnosuke
Nakamura, Tomohiko
Koguchi, Junya
Saruwatari, Hiroshi
description We propose a singing voice synthesis (SVS) method that produces a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice and do not consider interactions between singers, i.e., singers adjusting their own voices to the others' voices. Because producing ensemble voices from solo singing voices ignores these interactions, it can degrade the unity of the vocal ensemble. We therefore propose an SVS method that reproduces the interactions. It is based on an architecture that uses the musical scores of multiple voice parts, and on loss functions that simulate the interactions' effect on acoustic features. Experimental results show that our methods improve the unity of the vocal ensemble.
doi_str_mv 10.48550/arxiv.2409.09988
format Article
creationdate 2024-09-16
rights http://creativecommons.org/licenses/by-sa/4.0
oa free_for_read
linktorsrc https://arxiv.org/abs/2409.09988
backlink https://doi.org/10.48550/arXiv.2409.09988
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2409.09988
language eng
recordid cdi_arxiv_primary_2409_09988
source arXiv.org
subjects Computer Science - Sound
title DNN-based ensemble singing voice synthesis with interactions between singers
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T14%3A56%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DNN-based%20ensemble%20singing%20voice%20synthesis%20with%20interactions%20between%20singers&rft.au=Hyodo,%20Hiroaki&rft.date=2024-09-16&rft_id=info:doi/10.48550/arxiv.2409.09988&rft_dat=%3Carxiv_GOX%3E2409_09988%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true