Prediction of speech intelligibility based on an auditory preprocessing model

Classical speech intelligibility models, such as the speech transmission index (STI) and the speech intelligibility index (SII), are based on calculations on the physical acoustic signals. The present study predicts speech intelligibility by combining a psychoacoustically validated model of auditory...

Bibliographic details
Published in:	Speech communication 2010-07, Vol.52 (7), p.678-692
Main authors:	Christiansen, Claus; Pedersen, Michael Syskind; Dau, Torsten
Format:	Article
Language:	eng
Subjects:	Applied sciences; Auditory processing model; Exact sciences and technology; Ideal binary mask; Information, signal and communications theory; Intelligibility; Masks; Mathematical models; Noise; Preprocessing; Signal processing; Similarity; Speech; Speech intelligibility; Speech intelligibility index; Speech processing; Speech transmission index; Telecommunications and information theory; Training
Online access:	Full text

container_end_page	692
container_issue	7
container_start_page	678
container_title	Speech communication
container_volume	52
creator	Christiansen, Claus; Pedersen, Michael Syskind; Dau, Torsten
description	Classical speech intelligibility models, such as the speech transmission index (STI) and the speech intelligibility index (SII), are based on calculations on the physical acoustic signals. The present study predicts speech intelligibility by combining a psychoacoustically validated model of auditory preprocessing [Dau et al., 1997. J. Acoust. Soc. Am. 102, 2892–2905] with a simple central stage that describes the similarity of the test signal with the corresponding reference signal at the level of the internal representation of the signals. The model was compared with previous approaches, whereby a speech-in-noise experiment was used for training and an ideal binary mask experiment was used for evaluation. All three models were able to capture the trends in the speech-in-noise training data well, but the proposed model provides a better prediction of the binary mask test data, particularly when the binary masks degenerate to a noise vocoder.
doi_str_mv	10.1016/j.specom.2010.03.004
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_853230194</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167639310000440</els_id><sourcerecordid>853230194</sourcerecordid><originalsourceid>FETCH-LOGICAL-c433t-33479377ce9279b7cb3ca4d8c76cd9c61de66a7b0d2d625c3b6cdf19fce1583d3</originalsourceid><addsrcrecordid>eNqFkU-LFDEQxYMoOI5-Aw99Eb30bJLqTtIXQRb_LKzoQc8hXaleM_R0xqRHmG9vLbN4XKGgoPjVq-I9IV4ruVNSmav9rh4J82GnJY8k7KTsnoiNcla3Vjn9VGwYs62BAZ6LF7XuJRPO6Y34-r1QTLimvDR5aliH8FeTlpXmOd2lMc1pPTdjqBQbRgLXKaY1l3NzLHQsGanWtNw1hxxpfimeTWGu9Oqhb8XPTx9_XH9pb799vrn-cNtiB7C2AJ0dwFqkQdthtDgChi46tAbjgEZFMibYUUYdje4RRp5PapiQVO8gwla8vejyA79PVFd_SBX55bBQPlXvetAg1dD9l7Q9OC2NGZh89yjJBiqQRrLyVnQXFEuutdDkjyUdQjl7Jf19In7vL4n4-0S8BM9-89qbhwuhYpinEhZM9d-u1k65Xjvm3l84Ygv_JCq-YqIFOahCuPqY0-OH_gIAHKOY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1671306032</pqid></control><display><type>article</type><title>Prediction of speech intelligibility based on an auditory preprocessing model</title><source>Elsevier ScienceDirect Journals</source><creator>Christiansen, Claus ; Pedersen, Michael Syskind ; Dau, Torsten</creator><creatorcontrib>Christiansen, Claus ; Pedersen, Michael Syskind ; Dau, Torsten</creatorcontrib><description>Classical speech intelligibility models, such as the speech transmission index (STI) and the speech intelligibility index (SII) are based on calculations on the physical acoustic signals. The present study predicts speech intelligibility by combining a psychoacoustically validated model of auditory preprocessing [Dau et al., 1997. J. Acoust. Soc. Am. 102, 2892–2905] with a simple central stage that describes the similarity of the test signal with the corresponding reference signal at a level of the internal representation of the signals. 
The model was compared with previous approaches, whereby a speech in noise experiment was used for training and an ideal binary mask experiment was used for evaluation. All three models were able to capture the trends in the speech in noise training data well, but the proposed model provides a better prediction of the binary mask test data, particularly when the binary masks degenerate to a noise vocoder.</description><identifier>ISSN: 0167-6393</identifier><identifier>EISSN: 1872-7182</identifier><identifier>DOI: 10.1016/j.specom.2010.03.004</identifier><identifier>CODEN: SCOMDH</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Applied sciences ; Auditory processing model ; Exact sciences and technology ; Ideal binary mask ; Information, signal and communications theory ; Intelligibility ; Masks ; Mathematical models ; Noise ; Preprocessing ; Signal processing ; Similarity ; Speech ; Speech intelligibility ; Speech intelligibility index ; Speech processing ; Speech transmission index ; Telecommunications and information theory ; Training</subject><ispartof>Speech communication, 2010-07, Vol.52 (7), p.678-692</ispartof><rights>2010 Elsevier B.V.</rights><rights>2015 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c433t-33479377ce9279b7cb3ca4d8c76cd9c61de66a7b0d2d625c3b6cdf19fce1583d3</citedby><cites>FETCH-LOGICAL-c433t-33479377ce9279b7cb3ca4d8c76cd9c61de66a7b0d2d625c3b6cdf19fce1583d3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0167639310000440$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=22818528$$DView record in 
Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Christiansen, Claus</creatorcontrib><creatorcontrib>Pedersen, Michael Syskind</creatorcontrib><creatorcontrib>Dau, Torsten</creatorcontrib><title>Prediction of speech intelligibility based on an auditory preprocessing model</title><title>Speech communication</title><description>Classical speech intelligibility models, such as the speech transmission index (STI) and the speech intelligibility index (SII) are based on calculations on the physical acoustic signals. The present study predicts speech intelligibility by combining a psychoacoustically validated model of auditory preprocessing [Dau et al., 1997. J. Acoust. Soc. Am. 102, 2892–2905] with a simple central stage that describes the similarity of the test signal with the corresponding reference signal at a level of the internal representation of the signals. The model was compared with previous approaches, whereby a speech in noise experiment was used for training and an ideal binary mask experiment was used for evaluation. 
All three models were able to capture the trends in the speech in noise training data well, but the proposed model provides a better prediction of the binary mask test data, particularly when the binary masks degenerate to a noise vocoder.</description><subject>Applied sciences</subject><subject>Auditory processing model</subject><subject>Exact sciences and technology</subject><subject>Ideal binary mask</subject><subject>Information, signal and communications theory</subject><subject>Intelligibility</subject><subject>Masks</subject><subject>Mathematical models</subject><subject>Noise</subject><subject>Preprocessing</subject><subject>Signal processing</subject><subject>Similarity</subject><subject>Speech</subject><subject>Speech intelligibility</subject><subject>Speech intelligibility index</subject><subject>Speech processing</subject><subject>Speech transmission index</subject><subject>Telecommunications and information theory</subject><subject>Training</subject><issn>0167-6393</issn><issn>1872-7182</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2010</creationdate><recordtype>article</recordtype><recordid>eNqFkU-LFDEQxYMoOI5-Aw99Eb30bJLqTtIXQRb_LKzoQc8hXaleM_R0xqRHmG9vLbN4XKGgoPjVq-I9IV4ruVNSmav9rh4J82GnJY8k7KTsnoiNcla3Vjn9VGwYs62BAZ6LF7XuJRPO6Y34-r1QTLimvDR5aliH8FeTlpXmOd2lMc1pPTdjqBQbRgLXKaY1l3NzLHQsGanWtNw1hxxpfimeTWGu9Oqhb8XPTx9_XH9pb799vrn-cNtiB7C2AJ0dwFqkQdthtDgChi46tAbjgEZFMibYUUYdje4RRp5PapiQVO8gwla8vejyA79PVFd_SBX55bBQPlXvetAg1dD9l7Q9OC2NGZh89yjJBiqQRrLyVnQXFEuutdDkjyUdQjl7Jf19In7vL4n4-0S8BM9-89qbhwuhYpinEhZM9d-u1k65Xjvm3l84Ygv_JCq-YqIFOahCuPqY0-OH_gIAHKOY</recordid><startdate>20100701</startdate><enddate>20100701</enddate><creator>Christiansen, Claus</creator><creator>Pedersen, Michael Syskind</creator><creator>Dau, Torsten</creator><general>Elsevier 
B.V</general><general>Elsevier</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7T9</scope><scope>8BM</scope></search><sort><creationdate>20100701</creationdate><title>Prediction of speech intelligibility based on an auditory preprocessing model</title><author>Christiansen, Claus ; Pedersen, Michael Syskind ; Dau, Torsten</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c433t-33479377ce9279b7cb3ca4d8c76cd9c61de66a7b0d2d625c3b6cdf19fce1583d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Applied sciences</topic><topic>Auditory processing model</topic><topic>Exact sciences and technology</topic><topic>Ideal binary mask</topic><topic>Information, signal and communications theory</topic><topic>Intelligibility</topic><topic>Masks</topic><topic>Mathematical models</topic><topic>Noise</topic><topic>Preprocessing</topic><topic>Signal processing</topic><topic>Similarity</topic><topic>Speech</topic><topic>Speech intelligibility</topic><topic>Speech intelligibility index</topic><topic>Speech processing</topic><topic>Speech transmission index</topic><topic>Telecommunications and information theory</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Christiansen, Claus</creatorcontrib><creatorcontrib>Pedersen, Michael Syskind</creatorcontrib><creatorcontrib>Dau, Torsten</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with 
Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><collection>ComDisDome</collection><jtitle>Speech communication</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Christiansen, Claus</au><au>Pedersen, Michael Syskind</au><au>Dau, Torsten</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Prediction of speech intelligibility based on an auditory preprocessing model</atitle><jtitle>Speech communication</jtitle><date>2010-07-01</date><risdate>2010</risdate><volume>52</volume><issue>7</issue><spage>678</spage><epage>692</epage><pages>678-692</pages><issn>0167-6393</issn><eissn>1872-7182</eissn><coden>SCOMDH</coden><abstract>Classical speech intelligibility models, such as the speech transmission index (STI) and the speech intelligibility index (SII) are based on calculations on the physical acoustic signals. The present study predicts speech intelligibility by combining a psychoacoustically validated model of auditory preprocessing [Dau et al., 1997. J. Acoust. Soc. Am. 102, 2892–2905] with a simple central stage that describes the similarity of the test signal with the corresponding reference signal at a level of the internal representation of the signals. The model was compared with previous approaches, whereby a speech in noise experiment was used for training and an ideal binary mask experiment was used for evaluation. 
All three models were able to capture the trends in the speech in noise training data well, but the proposed model provides a better prediction of the binary mask test data, particularly when the binary masks degenerate to a noise vocoder.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.specom.2010.03.004</doi><tpages>15</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0167-6393
ispartof	Speech communication, 2010-07, Vol.52 (7), p.678-692
issn	0167-6393 1872-7182
language	eng
recordid	cdi_proquest_miscellaneous_853230194
source	Elsevier ScienceDirect Journals
subjects	Applied sciences; Auditory processing model; Exact sciences and technology; Ideal binary mask; Information, signal and communications theory; Intelligibility; Masks; Mathematical models; Noise; Preprocessing; Signal processing; Similarity; Speech; Speech intelligibility; Speech intelligibility index; Speech processing; Speech transmission index; Telecommunications and information theory; Training
title	Prediction of speech intelligibility based on an auditory preprocessing model
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T10%3A24%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prediction%20of%20speech%20intelligibility%20based%20on%20an%20auditory%20preprocessing%20model&rft.jtitle=Speech%20communication&rft.au=Christiansen,%20Claus&rft.date=2010-07-01&rft.volume=52&rft.issue=7&rft.spage=678&rft.epage=692&rft.pages=678-692&rft.issn=0167-6393&rft.eissn=1872-7182&rft.coden=SCOMDH&rft_id=info:doi/10.1016/j.specom.2010.03.004&rft_dat=%3Cproquest_cross%3E853230194%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1671306032&rft_id=info:pmid/&rft_els_id=S0167639310000440&rfr_iscdi=true