Cepstral analysis based on the glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise

In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios su...

Bibliographic details
Main authors: Valentini-Botinhao, C., Maia, R., Yamagishi, J., King, S., Zen, H.
Format: Conference proceeding
Language: eng
Subjects:
Online access: Order full text
container_end_page 4000
container_issue
container_start_page 3997
container_title
container_volume
creator Valentini-Botinhao, C.
Maia, R.
Yamagishi, J.
King, S.
Zen, H.
description In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure operates and further show how we approximated it to integrate it into an existing spectral envelope parameter extraction method commonly used in the HMM-based speech synthesis framework. We then demonstrate how this new method changes the modelled spectrum according to the characteristics of the noise and show results for a listening test with vocoded and HMM-based synthetic speech. The test indicates that the proposed method can significantly improve intelligibility of synthetic speech in speech shaped noise.
doi_str_mv 10.1109/ICASSP.2012.6288794
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6288794</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6288794</ieee_id><sourcerecordid>6288794</sourcerecordid><originalsourceid>FETCH-LOGICAL-i220t-1e07fcd989ceb65dfc0c4eda0b721d1812cff12115d9bc8b4fbab57ca1aa92cc3</originalsourceid><addsrcrecordid>eNo1kN1KAzEQheMfWGufoDd5gV0z2exPLqWoFVoUquBdSbKTNpLuLskq7K1P7mLr3AzM-c6BM4TMgaUATN49L-43m9eUM-BpwauqlOKMzGRZgSjKjDFRyHMy4VkpE5Ds44Lc_Au5uCQTyDlLChDymsxi_GTjjFaWFRPys8Au9kF5qhrlh-gi1SpiTduG9nukO-8OXUTahbZrQ-_G8wFV_ApIbRvoKIb22zW7P9g1PXrvdk477_qBtpYu1-vkGBiHZmR6Z2jsEM1-pGnTuoi35MoqH3F22lPy_vjwtlgmq5ensfgqcZyzPgFkpTW1rKRBXeS1NcwIrBXTJYcaKuDGWuAAeS21qbSwWum8NAqUktyYbErmx1yHiNsuuIMKw_b0zuwXzVFplA</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Cepstral analysis based on the glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Valentini-Botinhao, C. ; Maia, R. ; Yamagishi, J. ; King, S. ; Zen, H.</creator><creatorcontrib>Valentini-Botinhao, C. ; Maia, R. ; Yamagishi, J. ; King, S. ; Zen, H.</creatorcontrib><description>In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure operates and further show how we approximated it to integrate it into an existing spectral envelope parameter extraction method commonly used in the HMM-based speech synthesis framework. 
We then demonstrate how this new method changes the modelled spectrum according to the characteristics of the noise and show results for a listening test with vocoded and HMM-based synthetic speech. The test indicates that the proposed method can significantly improve intelligibility of synthetic speech in speech shaped noise.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 1467300454</identifier><identifier>ISBN: 9781467300452</identifier><identifier>EISSN: 2379-190X</identifier><identifier>EISBN: 9781467300469</identifier><identifier>EISBN: 1467300446</identifier><identifier>EISBN: 9781467300445</identifier><identifier>EISBN: 1467300462</identifier><identifier>DOI: 10.1109/ICASSP.2012.6288794</identifier><language>eng</language><publisher>IEEE</publisher><subject>Accuracy ; Approximation methods ; Cepstral analysis ; cepstral coefficient extraction ; Hidden Markov models ; HMM-based speech synthesis ; Lombard speech ; Noise ; Noise measurement ; objective measure for speech intelligibility ; Speech</subject><ispartof>2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, p.3997-4000</ispartof><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6288794$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,27902,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6288794$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Valentini-Botinhao, C.</creatorcontrib><creatorcontrib>Maia, R.</creatorcontrib><creatorcontrib>Yamagishi, J.</creatorcontrib><creatorcontrib>King, S.</creatorcontrib><creatorcontrib>Zen, H.</creatorcontrib><title>Cepstral analysis based on the glimpse proportion 
measure for improving the intelligibility of HMM-based synthetic speech in noise</title><title>2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</title><addtitle>ICASSP</addtitle><description>In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure operates and further show how we approximated it to integrate it into an existing spectral envelope parameter extraction method commonly used in the HMM-based speech synthesis framework. We then demonstrate how this new method changes the modelled spectrum according to the characteristics of the noise and show results for a listening test with vocoded and HMM-based synthetic speech. 
The test indicates that the proposed method can significantly improve intelligibility of synthetic speech in speech shaped noise.</description><subject>Accuracy</subject><subject>Approximation methods</subject><subject>Cepstral analysis</subject><subject>cepstral coefficient extraction</subject><subject>Hidden Markov models</subject><subject>HMM-based speech synthesis</subject><subject>Lombard speech</subject><subject>Noise</subject><subject>Noise measurement</subject><subject>objective measure for speech intelligibility</subject><subject>Speech</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>1467300454</isbn><isbn>9781467300452</isbn><isbn>9781467300469</isbn><isbn>1467300446</isbn><isbn>9781467300445</isbn><isbn>1467300462</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1kN1KAzEQheMfWGufoDd5gV0z2exPLqWoFVoUquBdSbKTNpLuLskq7K1P7mLr3AzM-c6BM4TMgaUATN49L-43m9eUM-BpwauqlOKMzGRZgSjKjDFRyHMy4VkpE5Ds44Lc_Au5uCQTyDlLChDymsxi_GTjjFaWFRPys8Au9kF5qhrlh-gi1SpiTduG9nukO-8OXUTahbZrQ-_G8wFV_ApIbRvoKIb22zW7P9g1PXrvdk477_qBtpYu1-vkGBiHZmR6Z2jsEM1-pGnTuoi35MoqH3F22lPy_vjwtlgmq5ensfgqcZyzPgFkpTW1rKRBXeS1NcwIrBXTJYcaKuDGWuAAeS21qbSwWum8NAqUktyYbErmx1yHiNsuuIMKw_b0zuwXzVFplA</recordid><startdate>20120101</startdate><enddate>20120101</enddate><creator>Valentini-Botinhao, C.</creator><creator>Maia, R.</creator><creator>Yamagishi, J.</creator><creator>King, S.</creator><creator>Zen, H.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20120101</creationdate><title>Cepstral analysis based on the glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise</title><author>Valentini-Botinhao, C. ; Maia, R. ; Yamagishi, J. ; King, S. 
; Zen, H.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i220t-1e07fcd989ceb65dfc0c4eda0b721d1812cff12115d9bc8b4fbab57ca1aa92cc3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Accuracy</topic><topic>Approximation methods</topic><topic>Cepstral analysis</topic><topic>cepstral coefficient extraction</topic><topic>Hidden Markov models</topic><topic>HMM-based speech synthesis</topic><topic>Lombard speech</topic><topic>Noise</topic><topic>Noise measurement</topic><topic>objective measure for speech intelligibility</topic><topic>Speech</topic><toplevel>online_resources</toplevel><creatorcontrib>Valentini-Botinhao, C.</creatorcontrib><creatorcontrib>Maia, R.</creatorcontrib><creatorcontrib>Yamagishi, J.</creatorcontrib><creatorcontrib>King, S.</creatorcontrib><creatorcontrib>Zen, H.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Valentini-Botinhao, C.</au><au>Maia, R.</au><au>Yamagishi, J.</au><au>King, S.</au><au>Zen, H.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Cepstral analysis based on the glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise</atitle><btitle>2012 IEEE International Conference on Acoustics, Speech and Signal Processing 
(ICASSP)</btitle><stitle>ICASSP</stitle><date>2012-01-01</date><risdate>2012</risdate><spage>3997</spage><epage>4000</epage><pages>3997-4000</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>1467300454</isbn><isbn>9781467300452</isbn><eisbn>9781467300469</eisbn><eisbn>1467300446</eisbn><eisbn>9781467300445</eisbn><eisbn>1467300462</eisbn><abstract>In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure operates and further show how we approximated it to integrate it into an existing spectral envelope parameter extraction method commonly used in the HMM-based speech synthesis framework. We then demonstrate how this new method changes the modelled spectrum according to the characteristics of the noise and show results for a listening test with vocoded and HMM-based synthetic speech. The test indicates that the proposed method can significantly improve intelligibility of synthetic speech in speech shaped noise.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2012.6288794</doi><tpages>4</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-6149
ispartof 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, p.3997-4000
issn 1520-6149
2379-190X
language eng
recordid cdi_ieee_primary_6288794
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Accuracy
Approximation methods
Cepstral analysis
cepstral coefficient extraction
Hidden Markov models
HMM-based speech synthesis
Lombard speech
Noise
Noise measurement
objective measure for speech intelligibility
Speech
title Cepstral analysis based on the glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-11T20%3A46%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Cepstral%20analysis%20based%20on%20the%20glimpse%20proportion%20measure%20for%20improving%20the%20intelligibility%20of%20HMM-based%20synthetic%20speech%20in%20noise&rft.btitle=2012%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing%20(ICASSP)&rft.au=Valentini-Botinhao,%20C.&rft.date=2012-01-01&rft.spage=3997&rft.epage=4000&rft.pages=3997-4000&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=1467300454&rft.isbn_list=9781467300452&rft_id=info:doi/10.1109/ICASSP.2012.6288794&rft_dat=%3Cieee_6IE%3E6288794%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781467300469&rft.eisbn_list=1467300446&rft.eisbn_list=9781467300445&rft.eisbn_list=1467300462&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6288794&rfr_iscdi=true