High Quality Sinusoidal Modeling of Wideband Speech for the Purposes of Speech Synthesis and Modification

This paper describes an efficient sinusoidal modeling framework for high-quality wideband (WB) speech synthesis and modification. This technique may serve as a basis for speech compression in the context of small-footprint concatenative text-to-speech systems. In addition, it is a useful representa...

Detailed description

Bibliographic details
Main authors: Chazan, D., Hoory, R., Sagi, A., Shechtman, S., Sorin, A., Zhi Wei Shuang, Bakis, R.
Format: Conference proceeding
Language: eng
container_end_page I
container_issue
container_start_page I
container_title
container_volume 1
creator Chazan, D.
Hoory, R.
Sagi, A.
Shechtman, S.
Sorin, A.
Zhi Wei Shuang
Bakis, R.
description This paper describes an efficient sinusoidal modeling framework for high-quality wideband (WB) speech synthesis and modification. This technique may serve as a basis for speech compression in the context of small-footprint concatenative text-to-speech systems. In addition, it is a useful representation for voice transformation and morphing purposes, e.g., simultaneous pitch modification and spectral envelope warping. Conventional sinusoidal modeling is enhanced with an adaptive frequency dithering mechanism based on a degree-of-voicing analysis. A considerable reduction in the number of model parameters is achieved by high-band phase extension. The proposed model is evaluated and compared to the alternative STRAIGHT framework [1]. Being simpler and considerably more efficient than STRAIGHT, the proposed model also outperforms it in speech quality for both speech reconstruction and transformation.
doi_str_mv 10.1109/ICASSP.2006.1660161
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_1660161</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1660161</ieee_id><sourcerecordid>1660161</sourcerecordid><originalsourceid>FETCH-ieee_primary_16601613</originalsourceid><addsrcrecordid>eNp9j81Kw0AURgd_wKB5gm7mBRLvnYyTzlKKUhdCZQTdlbG5aa7ETMgki7y9LWTd1bc4hwOfECuEHBHs49vm2bldrgBMjsYAGrwSiSpKm6GF72uR2nKNWmkN2tj1jUjwSUFmUNs7kcb4CwBoTakLlQje8rGRH5NveZyl426KgSvfyvdQUcvdUYZafnFFP76rpOuJDo2swyDHhuRuGvoQKZ6dBbm5O5HIUZ79U4RrPviRQ_cgbmvfRkqXvRer15fPzTZjItr3A__5Yd4vh4rL9B_9jUyd</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>High Quality Sinusoidal Modeling of Wideband Speech for the Purposes of Speech Synthesis and Modification</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Chazan, D. ; Hoory, R. ; Sagi, A. ; Shechtman, S. ; Sorin, A. ; Zhi Wei Shuang ; Bakis, R.</creator><creatorcontrib>Chazan, D. ; Hoory, R. ; Sagi, A. ; Shechtman, S. ; Sorin, A. ; Zhi Wei Shuang ; Bakis, R.</creatorcontrib><description>This paper describes an efficient sinusoidal modeling framework for high quality wide band (WB) speech synthesis and modification. This technique may serve as a basis for speech compression in the context of small footprint concatenative Text to Speech systems. In addition, it is a useful representation for voice transformation and morphing purposes, e.g., simultaneous pitch modification and spectral envelope warping. The conventional sinusoidal modeling is enhanced with an adaptive frequency dithering mechanism, based on a degree of voicing analysis. Considerable reduction of the amount of model parameters is achieved by high band phase extension. The proposed model is evaluated and compared to the alternative STRAIGHT framework [1]. 
Being simpler and considerably more efficient than STRAIGHT, it outperforms it in speech quality for both speech reconstruction and transformation.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9781424404698</identifier><identifier>ISBN: 142440469X</identifier><identifier>EISSN: 2379-190X</identifier><identifier>DOI: 10.1109/ICASSP.2006.1660161</identifier><language>eng</language><publisher>IEEE</publisher><subject>Acoustic waves ; Frequency ; Laboratories ; Power harmonic filters ; Signal synthesis ; Speech analysis ; Speech coding ; Speech enhancement ; Speech synthesis ; Wideband</subject><ispartof>2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2006, Vol.1, p.I-I</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1660161$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1660161$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Chazan, D.</creatorcontrib><creatorcontrib>Hoory, R.</creatorcontrib><creatorcontrib>Sagi, A.</creatorcontrib><creatorcontrib>Shechtman, S.</creatorcontrib><creatorcontrib>Sorin, A.</creatorcontrib><creatorcontrib>Zhi Wei Shuang</creatorcontrib><creatorcontrib>Bakis, R.</creatorcontrib><title>High Quality Sinusoidal Modeling of Wideband Speech for the Purposes of Speech Synthesis and Modification</title><title>2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings</title><addtitle>ICASSP</addtitle><description>This paper describes an efficient sinusoidal modeling framework for high quality wide band (WB) speech synthesis and modification. 
This technique may serve as a basis for speech compression in the context of small footprint concatenative Text to Speech systems. In addition, it is a useful representation for voice transformation and morphing purposes, e.g., simultaneous pitch modification and spectral envelope warping. The conventional sinusoidal modeling is enhanced with an adaptive frequency dithering mechanism, based on a degree of voicing analysis. Considerable reduction of the amount of model parameters is achieved by high band phase extension. The proposed model is evaluated and compared to the alternative STRAIGHT framework [1]. Being simpler and considerably more efficient than STRAIGHT, it outperforms it in speech quality for both speech reconstruction and transformation.</description><subject>Acoustic waves</subject><subject>Frequency</subject><subject>Laboratories</subject><subject>Power harmonic filters</subject><subject>Signal synthesis</subject><subject>Speech analysis</subject><subject>Speech coding</subject><subject>Speech enhancement</subject><subject>Speech synthesis</subject><subject>Wideband</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>9781424404698</isbn><isbn>142440469X</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2006</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNp9j81Kw0AURgd_wKB5gm7mBRLvnYyTzlKKUhdCZQTdlbG5aa7ETMgki7y9LWTd1bc4hwOfECuEHBHs49vm2bldrgBMjsYAGrwSiSpKm6GF72uR2nKNWmkN2tj1jUjwSUFmUNs7kcb4CwBoTakLlQje8rGRH5NveZyl426KgSvfyvdQUcvdUYZafnFFP76rpOuJDo2swyDHhuRuGvoQKZ6dBbm5O5HIUZ79U4RrPviRQ_cgbmvfRkqXvRer15fPzTZjItr3A__5Yd4vh4rL9B_9jUyd</recordid><startdate>2006</startdate><enddate>2006</enddate><creator>Chazan, D.</creator><creator>Hoory, R.</creator><creator>Sagi, A.</creator><creator>Shechtman, S.</creator><creator>Sorin, A.</creator><creator>Zhi Wei Shuang</creator><creator>Bakis, 
R.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>2006</creationdate><title>High Quality Sinusoidal Modeling of Wideband Speech for the Purposes of Speech Synthesis and Modification</title><author>Chazan, D. ; Hoory, R. ; Sagi, A. ; Shechtman, S. ; Sorin, A. ; Zhi Wei Shuang ; Bakis, R.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-ieee_primary_16601613</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Acoustic waves</topic><topic>Frequency</topic><topic>Laboratories</topic><topic>Power harmonic filters</topic><topic>Signal synthesis</topic><topic>Speech analysis</topic><topic>Speech coding</topic><topic>Speech enhancement</topic><topic>Speech synthesis</topic><topic>Wideband</topic><toplevel>online_resources</toplevel><creatorcontrib>Chazan, D.</creatorcontrib><creatorcontrib>Hoory, R.</creatorcontrib><creatorcontrib>Sagi, A.</creatorcontrib><creatorcontrib>Shechtman, S.</creatorcontrib><creatorcontrib>Sorin, A.</creatorcontrib><creatorcontrib>Zhi Wei Shuang</creatorcontrib><creatorcontrib>Bakis, R.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chazan, D.</au><au>Hoory, R.</au><au>Sagi, A.</au><au>Shechtman, S.</au><au>Sorin, A.</au><au>Zhi Wei Shuang</au><au>Bakis, R.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>High Quality Sinusoidal Modeling of 
Wideband Speech for the Purposes of Speech Synthesis and Modification</atitle><btitle>2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings</btitle><stitle>ICASSP</stitle><date>2006</date><risdate>2006</risdate><volume>1</volume><spage>I</spage><epage>I</epage><pages>I-I</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>9781424404698</isbn><isbn>142440469X</isbn><abstract>This paper describes an efficient sinusoidal modeling framework for high quality wide band (WB) speech synthesis and modification. This technique may serve as a basis for speech compression in the context of small footprint concatenative Text to Speech systems. In addition, it is a useful representation for voice transformation and morphing purposes, e.g., simultaneous pitch modification and spectral envelope warping. The conventional sinusoidal modeling is enhanced with an adaptive frequency dithering mechanism, based on a degree of voicing analysis. Considerable reduction of the amount of model parameters is achieved by high band phase extension. The proposed model is evaluated and compared to the alternative STRAIGHT framework [1]. Being simpler and considerably more efficient than STRAIGHT, it outperforms it in speech quality for both speech reconstruction and transformation.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2006.1660161</doi></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-6149
ispartof 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2006, Vol.1, p.I-I
issn 1520-6149
2379-190X
language eng
recordid cdi_ieee_primary_1660161
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Acoustic waves
Frequency
Laboratories
Power harmonic filters
Signal synthesis
Speech analysis
Speech coding
Speech enhancement
Speech synthesis
Wideband
title High Quality Sinusoidal Modeling of Wideband Speech for the Purposes of Speech Synthesis and Modification
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T07%3A02%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=High%20Quality%20Sinusoidal%20Modeling%20of%20Wideband%20Speech%20for%20the%20Purposes%20of%20Speech%20Synthesis%20and%20Modification&rft.btitle=2006%20IEEE%20International%20Conference%20on%20Acoustics%20Speech%20and%20Signal%20Processing%20Proceedings&rft.au=Chazan,%20D.&rft.date=2006&rft.volume=1&rft.spage=I&rft.epage=I&rft.pages=I-I&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9781424404698&rft.isbn_list=142440469X&rft_id=info:doi/10.1109/ICASSP.2006.1660161&rft_dat=%3Cieee_6IE%3E1660161%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1660161&rfr_iscdi=true