Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis

A fundamental frequency (F0) control model, which can cope with F0 dynamic characteristics related to singing-voice perception, is required to construct natural singing-voice synthesis systems. This paper discusses importance of F0 dynamic characteristics in singing-voices and demonstrates how stron...

Full description

Bibliographic details
Published in:	Speech communication 2005-07, Vol.46 (3), p.405-417
Main authors:	Saitou, Takeshi; Unoki, Masashi; Akagi, Masato
Format:	Article
Language:	eng
Subjects:	Applied sciences; Exact sciences and technology; F0 control model; F0 fluctuation; Information, signal and communications theory; Signal processing; Singing-voice perception; Singing-voice synthesis; Speech processing; Telecommunications and information theory
Online access:	Full text

container_end_page	417
container_issue	3
container_start_page	405
container_title	Speech communication
container_volume	46
creator	Saitou, Takeshi ; Unoki, Masashi ; Akagi, Masato
description	A fundamental frequency (F0) control model, which can cope with F0 dynamic characteristics related to singing-voice perception, is required to construct natural singing-voice synthesis systems. This paper discusses importance of F0 dynamic characteristics in singing-voices and demonstrates how strongly they influence singing-voice perception through psychoacoustic experiments. This paper, then, proposes an F0 control model that can generate F0 contours of singing-voices based on these considerations, and a singing-voice synthesis system. The results show that several types of F0 fluctuation—overshoot, vibrato, preparation, and fine fluctuation—affect the perception and quality of a singing-voice, and that overshoot has the greatest effect. Moreover, the results show that the proposed F0 control model can control F0 fluctuations, generate F0 contours of singing-voices, and can be applied to natural singing-voice synthesis.
doi_str_mv	10.1016/j.specom.2005.01.010
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_85629011</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167639305000993</els_id><sourcerecordid>85629011</sourcerecordid><originalsourceid>FETCH-LOGICAL-c473t-dcfb8668ffb9f504e6dc7cc2cdf4ff3cdd5d42c902f21f71a856c5520172f6143</originalsourceid><addsrcrecordid>eNqNkUGLFDEQhYMoOK7-Aw-56K3HStKddF8EWXdVWPCi55CpVNwM3Z0x1Tsw_94eZ8GbCg_qUN-rKuoJ8VrBVoGy7_ZbPhCWaasBui2oVfBEbFTvdONUr5-KzYq5xprBPBcvmPcA0Pa93ojdRzrSWA4TzYssSYZZ3oLEMi-1jHIqkUa5C0xRlt-deJrDlFHifagBF6qZl4wsU6mS8_xjVXMsGUnyaV7uiTO_FM9SGJlePdYr8f325tv15-bu66cv1x_uGmydWZqIaddb26e0G1IHLdmIDlFjTG1KBmPsYqtxAJ20Sk6FvrPYdRqU08mq1lyJt5e5h1p-PhAvfsqMNI5hpvLAfuX1AEr9B2iMcab7J6gHrcHYM9heQKyFuVLyh5qnUE9egT9H5Pf-EpE_R-RBrYLV9uZxfmAMY6phxsx_vHbojXXng99fOFrfd8xUPWOmGSnmSrj4WPLfF_0C6k2prw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>29220365</pqid></control><display><type>article</type><title>Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis</title><source>ScienceDirect Journals (5 years ago - present)</source><creator>Saitou, Takeshi ; Unoki, Masashi ; Akagi, Masato</creator><creatorcontrib>Saitou, Takeshi ; Unoki, Masashi ; Akagi, Masato</creatorcontrib><description>A fundamental frequency (F0) control model, which can cope with F0 dynamic characteristics related to singing-voice perception, is required to construct natural singing-voice synthesis systems. This paper discusses importance of F0 dynamic characteristics in singing-voices and demonstrates how strongly they influence singing-voice perception through psychoacoustic experiments. This paper, then, proposes an F0 control model that can generate F0 contours of singing-voices based on these considerations, and a singing-voice synthesis system. 
The results show that several types of F0 fluctuation—overshoot, vibrato, preparation, and fine fluctuation—affect the perception and quality of a singing-voice, and that overshoot has the greatest effect. Moreover, the results show that the proposed F0 control model can control F0 fluctuations, generate F0 contours of singing-voices, and can be applied to natural singing-voice synthesis.</description><identifier>ISSN: 0167-6393</identifier><identifier>EISSN: 1872-7182</identifier><identifier>DOI: 10.1016/j.specom.2005.01.010</identifier><identifier>CODEN: SCOMDH</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Applied sciences ; Exact sciences and technology ; F0 control model ; F0 fluctuation ; Information, signal and communications theory ; Signal processing ; Singing-voice perception ; Singing-voice synthesis ; Speech processing ; Telecommunications and information theory</subject><ispartof>Speech communication, 2005-07, Vol.46 (3), p.405-417</ispartof><rights>2005 Elsevier B.V.</rights><rights>2005 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c473t-dcfb8668ffb9f504e6dc7cc2cdf4ff3cdd5d42c902f21f71a856c5520172f6143</citedby><cites>FETCH-LOGICAL-c473t-dcfb8668ffb9f504e6dc7cc2cdf4ff3cdd5d42c902f21f71a856c5520172f6143</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.specom.2005.01.010$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>309,310,314,780,784,789,790,3548,23929,23930,25139,27923,27924,45994</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=16983671$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Saitou, Takeshi</creatorcontrib><creatorcontrib>Unoki, 
Masashi</creatorcontrib><creatorcontrib>Akagi, Masato</creatorcontrib><title>Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis</title><title>Speech communication</title><description>A fundamental frequency (F0) control model, which can cope with F0 dynamic characteristics related to singing-voice perception, is required to construct natural singing-voice synthesis systems. This paper discusses importance of F0 dynamic characteristics in singing-voices and demonstrates how strongly they influence singing-voice perception through psychoacoustic experiments. This paper, then, proposes an F0 control model that can generate F0 contours of singing-voices based on these considerations, and a singing-voice synthesis system. The results show that several types of F0 fluctuation—overshoot, vibrato, preparation, and fine fluctuation—affect the perception and quality of a singing-voice, and that overshoot has the greatest effect. Moreover, the results show that the proposed F0 control model can control F0 fluctuations, generate F0 contours of singing-voices, and can be applied to natural singing-voice synthesis.</description><subject>Applied sciences</subject><subject>Exact sciences and technology</subject><subject>F0 control model</subject><subject>F0 fluctuation</subject><subject>Information, signal and communications theory</subject><subject>Signal processing</subject><subject>Singing-voice perception</subject><subject>Singing-voice synthesis</subject><subject>Speech processing</subject><subject>Telecommunications and information 
theory</subject><issn>0167-6393</issn><issn>1872-7182</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2005</creationdate><recordtype>article</recordtype><recordid>eNqNkUGLFDEQhYMoOK7-Aw-56K3HStKddF8EWXdVWPCi55CpVNwM3Z0x1Tsw_94eZ8GbCg_qUN-rKuoJ8VrBVoGy7_ZbPhCWaasBui2oVfBEbFTvdONUr5-KzYq5xprBPBcvmPcA0Pa93ojdRzrSWA4TzYssSYZZ3oLEMi-1jHIqkUa5C0xRlt-deJrDlFHifagBF6qZl4wsU6mS8_xjVXMsGUnyaV7uiTO_FM9SGJlePdYr8f325tv15-bu66cv1x_uGmydWZqIaddb26e0G1IHLdmIDlFjTG1KBmPsYqtxAJ20Sk6FvrPYdRqU08mq1lyJt5e5h1p-PhAvfsqMNI5hpvLAfuX1AEr9B2iMcab7J6gHrcHYM9heQKyFuVLyh5qnUE9egT9H5Pf-EpE_R-RBrYLV9uZxfmAMY6phxsx_vHbojXXng99fOFrfd8xUPWOmGSnmSrj4WPLfF_0C6k2prw</recordid><startdate>20050701</startdate><enddate>20050701</enddate><creator>Saitou, Takeshi</creator><creator>Unoki, Masashi</creator><creator>Akagi, Masato</creator><general>Elsevier B.V</general><general>Elsevier</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>8BM</scope><scope>7T9</scope></search><sort><creationdate>20050701</creationdate><title>Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis</title><author>Saitou, Takeshi ; Unoki, Masashi ; Akagi, Masato</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c473t-dcfb8668ffb9f504e6dc7cc2cdf4ff3cdd5d42c902f21f71a856c5520172f6143</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Applied sciences</topic><topic>Exact sciences and technology</topic><topic>F0 control model</topic><topic>F0 fluctuation</topic><topic>Information, signal and communications theory</topic><topic>Signal processing</topic><topic>Singing-voice perception</topic><topic>Singing-voice synthesis</topic><topic>Speech processing</topic><topic>Telecommunications and information 
theory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Saitou, Takeshi</creatorcontrib><creatorcontrib>Unoki, Masashi</creatorcontrib><creatorcontrib>Akagi, Masato</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ComDisDome</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><jtitle>Speech communication</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Saitou, Takeshi</au><au>Unoki, Masashi</au><au>Akagi, Masato</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis</atitle><jtitle>Speech communication</jtitle><date>2005-07-01</date><risdate>2005</risdate><volume>46</volume><issue>3</issue><spage>405</spage><epage>417</epage><pages>405-417</pages><issn>0167-6393</issn><eissn>1872-7182</eissn><coden>SCOMDH</coden><abstract>A fundamental frequency (F0) control model, which can cope with F0 dynamic characteristics related to singing-voice perception, is required to construct natural singing-voice synthesis systems. This paper discusses importance of F0 dynamic characteristics in singing-voices and demonstrates how strongly they influence singing-voice perception through psychoacoustic experiments. 
This paper, then, proposes an F0 control model that can generate F0 contours of singing-voices based on these considerations, and a singing-voice synthesis system. The results show that several types of F0 fluctuation—overshoot, vibrato, preparation, and fine fluctuation—affect the perception and quality of a singing-voice, and that overshoot has the greatest effect. Moreover, the results show that the proposed F0 control model can control F0 fluctuations, generate F0 contours of singing-voices, and can be applied to natural singing-voice synthesis.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.specom.2005.01.010</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0167-6393
ispartof	Speech communication, 2005-07, Vol.46 (3), p.405-417
issn	0167-6393 1872-7182
language	eng
recordid	cdi_proquest_miscellaneous_85629011
source	ScienceDirect Journals (5 years ago - present)
subjects	Applied sciences ; Exact sciences and technology ; F0 control model ; F0 fluctuation ; Information, signal and communications theory ; Signal processing ; Singing-voice perception ; Singing-voice synthesis ; Speech processing ; Telecommunications and information theory
title	Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T02%3A04%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Development%20of%20an%20F0%20control%20model%20based%20on%20F0%20dynamic%20characteristics%20for%20singing-voice%20synthesis&rft.jtitle=Speech%20communication&rft.au=Saitou,%20Takeshi&rft.date=2005-07-01&rft.volume=46&rft.issue=3&rft.spage=405&rft.epage=417&rft.pages=405-417&rft.issn=0167-6393&rft.eissn=1872-7182&rft.coden=SCOMDH&rft_id=info:doi/10.1016/j.specom.2005.01.010&rft_dat=%3Cproquest_cross%3E85629011%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=29220365&rft_id=info:pmid/&rft_els_id=S0167639305000993&rfr_iscdi=true