Prosody modeling for syllable-based concatenative speech synthesis of Hindi and Tamil

This paper describes ways to improve prosody modeling in syllable-based concatenative speech synthesis systems for two Indian languages, namely Hindi and Tamil, within the unit selection paradigm. The syllable is a larger unit than the diphone and contains most of the coarticulation information. Alt...

Bibliographic details
Main authors:	Bellur, A; Narayan, K B; Krishnan, K Raghava; Murthy, H A
Format:	Conference proceeding
Language:	eng
Subjects:	Acoustic measurements; Context; Speech; Speech synthesis; Synthesizers

container_end_page	5
container_issue
container_start_page	1
container_title
container_volume
creator	Bellur, A ; Narayan, K B ; Krishnan, K Raghava ; Murthy, H A
description	This paper describes ways to improve prosody modeling in syllable-based concatenative speech synthesis systems for two Indian languages, namely Hindi and Tamil, within the unit selection paradigm. The syllable is a larger unit than the diphone and contains most of the coarticulation information. Although syllable-based synthesis is quite intelligible compared to diphone-based systems, naturalness, especially in terms of prosody, requires additional processing. Since the synthesizer is built using a cluster unit framework, a hybrid approach that combines rule-based and statistical models is proposed to better model the prosody of syllable-like units. It is further observed that prediction of phrase boundaries is crucial, particularly because Indian languages are replete with polysyllabic words. CART-based phrase modeling for Hindi and Tamil is discussed. Perceptual experiments show a significant improvement in the MOS for both Hindi and Tamil synthesizers.
doi_str_mv	10.1109/NCC.2011.5734737
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5734737</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5734737</ieee_id><sourcerecordid>5734737</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-692edc34510ffded35e3b1d682838964970fc072d6dec0f0ed1c06e9f3b2b0573</originalsourceid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Prosody modeling for syllable-based concatenative speech synthesis of Hindi and Tamil</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Bellur, A ; Narayan, K B ; Krishnan, K Raghava ; Murthy, H A</creator><creatorcontrib>Bellur, A ; Narayan, K B ; Krishnan, K Raghava ; Murthy, H A</creatorcontrib><description>This paper describes ways to improve prosody modeling in syllable-based concatenative speech synthesis systems for two Indian languages, namely Hindi and Tamil, within the unit selection paradigm. The syllable is a larger unit than the diphone and contains most of the coarticulation information. Although syllable-based synthesis is quite intelligible compared to diphone-based systems, naturalness, especially in terms of prosody, requires additional processing. Since the synthesizer is built using a cluster unit framework, a hybrid approach that combines rule-based and statistical models is proposed to better model the prosody of syllable-like units. It is further observed that prediction of phrase boundaries is crucial, particularly because Indian languages are replete with polysyllabic words. CART-based phrase modeling for Hindi and Tamil is discussed. Perceptual experiments show a significant improvement in the MOS for both Hindi and Tamil synthesizers.</description><identifier>ISBN: 9781612840901</identifier><identifier>ISBN: 1612840906</identifier><identifier>EISBN: 9781612840895</identifier><identifier>EISBN: 1612840892</identifier><identifier>EISBN: 9781612840918</identifier><identifier>EISBN: 1612840914</identifier><identifier>DOI: 10.1109/NCC.2011.5734737</identifier><language>eng</language><publisher>IEEE</publisher><subject>Acoustic measurements ; Context ; Speech ; Speech synthesis ; Synthesizers</subject><ispartof>2011 National Conference on Communications (NCC), 2011, p.1-5</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5734737$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5734737$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Bellur, A</au><au>Narayan, K B</au><au>Krishnan, K Raghava</au><au>Murthy, H A</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Prosody modeling for syllable-based concatenative speech synthesis of Hindi and Tamil</atitle><btitle>2011 National Conference on Communications (NCC)</btitle><stitle>NCC</stitle><date>2011-01</date><risdate>2011</risdate><spage>1</spage><epage>5</epage><pages>1-5</pages><isbn>9781612840901</isbn><isbn>1612840906</isbn><eisbn>9781612840895</eisbn><eisbn>1612840892</eisbn><eisbn>9781612840918</eisbn><eisbn>1612840914</eisbn><pub>IEEE</pub><doi>10.1109/NCC.2011.5734737</doi><tpages>5</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISBN: 9781612840901
ispartof	2011 National Conference on Communications (NCC), 2011, p.1-5
issn
language	eng
recordid	cdi_ieee_primary_5734737
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Acoustic measurements ; Context ; Speech ; Speech synthesis ; Synthesizers
title	Prosody modeling for syllable-based concatenative speech synthesis of Hindi and Tamil
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T03%3A32%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Prosody%20modeling%20for%20syllable-based%20concatenative%20speech%20synthesis%20of%20Hindi%20and%20Tamil&rft.btitle=2011%20National%20Conference%20on%20Communications%20(NCC)&rft.au=Bellur,%20A&rft.date=2011-01&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.isbn=9781612840901&rft.isbn_list=1612840906&rft_id=info:doi/10.1109/NCC.2011.5734737&rft_dat=%3Cieee_6IE%3E5734737%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781612840895&rft.eisbn_list=1612840892&rft.eisbn_list=9781612840918&rft.eisbn_list=1612840914&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5734737&rfr_iscdi=true