Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions

Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. Wh...
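
As a rough illustration of the idea in the abstract, the sketch below applies a binary gain and a continuous gain to the same time-frequency tiles. It is not the estimator derived in the article: the Wiener-style gain `snr / (1 + snr)`, the 0 dB threshold, the function names, and the sample values are all assumptions chosen only to show the difference between the two kinds of gain function.

```python
import math

def continuous_gain(snr):
    """Wiener-style gain for a linear SNR value (assumed stand-in, not the paper's estimator)."""
    return snr / (1.0 + snr)

def binary_gain(snr, threshold_db=0.0):
    """Return 1.0 to retain a tile whose SNR exceeds threshold_db, else 0.0 to suppress it."""
    if snr <= 0.0:
        return 0.0
    return 1.0 if 10.0 * math.log10(snr) > threshold_db else 0.0

def apply_gain(magnitudes, snrs, gain_fn):
    """Scale each noisy spectral magnitude by the gain computed from its SNR."""
    return [gain_fn(s) * m for m, s in zip(magnitudes, snrs)]

# Hypothetical noisy magnitudes and per-tile SNRs (linear scale) for four tiles.
noisy = [1.0, 2.0, 0.5, 4.0]
snr = [0.1, 3.0, 1.0, 9.0]

binary_out = apply_gain(noisy, snr, binary_gain)          # tiles are kept or zeroed
continuous_out = apply_gain(noisy, snr, continuous_gain)  # tiles are attenuated smoothly
```

With these assumed values the binary gain zeroes the two low-SNR tiles and passes the other two unchanged, while the continuous gain attenuates every tile by an amount that depends on its SNR.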

Detailed description

Bibliographic details
Published in:	IEEE transactions on audio, speech, and language processing, 2012-01, Vol.20 (1), p.92-102
Main authors:	Jensen, J., Hendriks, R. C.
Format:	Article
Language:	eng
Subjects:	Applied sciences; Auditory system; Detection, estimation, filtering, equalization, prediction; Discrete Fourier transforms; Exact sciences and technology; Ideal Binary Mask; Information, signal and communications theory; Noise measurement; Signal and communications theory; Signal processing; Signal to noise ratio; Signal, noise; Spectral Magnitude Estimation; Speech; Speech Enhancement; Speech Intelligibility; Speech processing; Speech Quality; Telecommunications and information theory; Time frequency analysis

container_end_page	102
container_issue	1
container_start_page	92
container_title	IEEE transactions on audio, speech, and language processing
container_volume	20
creator	Jensen, J. ; Hendriks, R. C.
description	Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. When implemented using discrete Fourier transform (DFT) techniques, the binary mask techniques can be seen as a special case of the broader class of DFT-based speech enhancement algorithms, for which the applied gain function is not constrained to be binary. In this context, we develop and compare binary mask techniques to state-of-the-art continuous gain techniques. We derive spectral magnitude minimum mean-square error binary gain estimators; the binary gain estimators turn out to be simple functions of the continuous gain estimators. We show that the optimal binary estimators are closely related to a range of existing, heuristically developed, binary gain estimators. The derived binary gain estimators perform better than existing binary gain estimators in simulation experiments with speech signals contaminated by several different noise sources as measured by speech quality and intelligibility measures. However, even the best binary mask method is significantly outperformed by state-of-the-art continuous gain estimators. The instrumental intelligibility results are confirmed in an intelligibility listening test.
doi_str_mv	10.1109/TASL.2011.2157685
format	Article
fullrecord	<record><control><sourceid>pascalfrancis_RIE</sourceid><recordid>TN_cdi_pascalfrancis_primary_25473436</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5773481</ieee_id><sourcerecordid>25473436</sourcerecordid><originalsourceid>FETCH-LOGICAL-c295t-85b559c15f3af6154390da551af753079ffe20c7e0ad574e30201ac367203a7e3</originalsourceid><addsrcrecordid>eNo9kDFvwjAQha2qlUppf0DVxUvHUF-ci8lIEdBKoA7AHB2OjVyBQ-1k4N83EYjpTnrvne59jL2CGAGI4mMzWS9HqQAYpYAqH-MdGwDiOFFFmt3fdsgf2VOMv0JkMs9gwHbrk9FNoANf0d67pq0MXznvju2Rrwz5ZP3XUjB8FkId-Cw27kiNqz3fRuf3_NN5CmdOvuLT2jfOt3Ub-YKc5_PW694Zn9mDpUM0L9c5ZNv5bDP9SpY_i-_pZJnotMAmGeMOsdCAVpLNATNZiIoQgaxCKVRhrUmFVkZQhSozUnR1SctcpUKSMnLI4HJXhzrGYGx5Ct234VyCKHtIZQ-p7CGVV0hd5v2SOVHUdLCBvHbxFkwxU7Ij1fneLj5njLnJqDp1DPIfsupwkg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions</title><source>IEEE Electronic Library (IEL)</source><creator>Jensen, J. ; Hendriks, R. C.</creator><creatorcontrib>Jensen, J. ; Hendriks, R. C.</creatorcontrib><description>Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. When implemented using discrete Fourier transform (DFT) techniques, the binary mask techniques can be seen as a special case of the broader class of DFT-based speech enhancement algorithms, for which the applied gain function is not constrained to be binary. In this context, we develop and compare binary mask techniques to state-of-the-art continuous gain techniques. 
We derive spectral magnitude minimum mean-square error binary gain estimators; the binary gain estimators turn out to be simple functions of the continuous gain estimators. We show that the optimal binary estimators are closely related to a range of existing, heuristically developed, binary gain estimators. The derived binary gain estimators perform better than existing binary gain estimators in simulation experiments with speech signals contaminated by several different noise sources as measured by speech quality and intelligibility measures. However, even the best binary mask method is significantly outperformed by state-of-the-art continuous gain estimators. The instrumental intelligibility results are confirmed in an intelligibility listening test.</description><identifier>ISSN: 1558-7916</identifier><identifier>EISSN: 1558-7924</identifier><identifier>DOI: 10.1109/TASL.2011.2157685</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>Piscataway, NJ: IEEE</publisher><subject>Applied sciences ; Auditory system ; Detection, estimation, filtering, equalization, prediction ; Discrete Fourier transforms ; Exact sciences and technology ; Ideal Binary Mask ; Information, signal and communications theory ; Noise measurement ; Signal and communications theory ; Signal processing ; Signal to noise ratio ; Signal, noise ; Spectral Magnitude Estimation ; Speech ; Speech Enhancement ; Speech Intelligibility ; Speech processing ; Speech Quality ; Telecommunications and information theory ; Time frequency analysis</subject><ispartof>IEEE transactions on audio, speech, and language processing, 2012-01, Vol.20 (1), p.92-102</ispartof><rights>2015 
INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c295t-85b559c15f3af6154390da551af753079ffe20c7e0ad574e30201ac367203a7e3</citedby><cites>FETCH-LOGICAL-c295t-85b559c15f3af6154390da551af753079ffe20c7e0ad574e30201ac367203a7e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5773481$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,4024,27923,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5773481$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=25473436$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Jensen, J.</creatorcontrib><creatorcontrib>Hendriks, R. C.</creatorcontrib><title>Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions</title><title>IEEE transactions on audio, speech, and language processing</title><addtitle>TASL</addtitle><description>Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. When implemented using discrete Fourier transform (DFT) techniques, the binary mask techniques can be seen as a special case of the broader class of DFT-based speech enhancement algorithms, for which the applied gain function is not constrained to be binary. In this context, we develop and compare binary mask techniques to state-of-the-art continuous gain techniques. 
We derive spectral magnitude minimum mean-square error binary gain estimators; the binary gain estimators turn out to be simple functions of the continuous gain estimators. We show that the optimal binary estimators are closely related to a range of existing, heuristically developed, binary gain estimators. The derived binary gain estimators perform better than existing binary gain estimators in simulation experiments with speech signals contaminated by several different noise sources as measured by speech quality and intelligibility measures. However, even the best binary mask method is significantly outperformed by state-of-the-art continuous gain estimators. The instrumental intelligibility results are confirmed in an intelligibility listening test.</description><subject>Applied sciences</subject><subject>Auditory system</subject><subject>Detection, estimation, filtering, equalization, prediction</subject><subject>Discrete Fourier transforms</subject><subject>Exact sciences and technology</subject><subject>Ideal Binary Mask</subject><subject>Information, signal and communications theory</subject><subject>Noise measurement</subject><subject>Signal and communications theory</subject><subject>Signal processing</subject><subject>Signal to noise ratio</subject><subject>Signal, noise</subject><subject>Spectral Magnitude Estimation</subject><subject>Speech</subject><subject>Speech Enhancement</subject><subject>Speech Intelligibility</subject><subject>Speech processing</subject><subject>Speech Quality</subject><subject>Telecommunications and information theory</subject><subject>Time frequency 
analysis</subject><issn>1558-7916</issn><issn>1558-7924</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kDFvwjAQha2qlUppf0DVxUvHUF-ci8lIEdBKoA7AHB2OjVyBQ-1k4N83EYjpTnrvne59jL2CGAGI4mMzWS9HqQAYpYAqH-MdGwDiOFFFmt3fdsgf2VOMv0JkMs9gwHbrk9FNoANf0d67pq0MXznvju2Rrwz5ZP3XUjB8FkId-Cw27kiNqz3fRuf3_NN5CmdOvuLT2jfOt3Ub-YKc5_PW694Zn9mDpUM0L9c5ZNv5bDP9SpY_i-_pZJnotMAmGeMOsdCAVpLNATNZiIoQgaxCKVRhrUmFVkZQhSozUnR1SctcpUKSMnLI4HJXhzrGYGx5Ct234VyCKHtIZQ-p7CGVV0hd5v2SOVHUdLCBvHbxFkwxU7Ij1fneLj5njLnJqDp1DPIfsupwkg</recordid><startdate>201201</startdate><enddate>201201</enddate><creator>Jensen, J.</creator><creator>Hendriks, R. C.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>201201</creationdate><title>Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions</title><author>Jensen, J. ; Hendriks, R. 
C.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c295t-85b559c15f3af6154390da551af753079ffe20c7e0ad574e30201ac367203a7e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Applied sciences</topic><topic>Auditory system</topic><topic>Detection, estimation, filtering, equalization, prediction</topic><topic>Discrete Fourier transforms</topic><topic>Exact sciences and technology</topic><topic>Ideal Binary Mask</topic><topic>Information, signal and communications theory</topic><topic>Noise measurement</topic><topic>Signal and communications theory</topic><topic>Signal processing</topic><topic>Signal to noise ratio</topic><topic>Signal, noise</topic><topic>Spectral Magnitude Estimation</topic><topic>Speech</topic><topic>Speech Enhancement</topic><topic>Speech Intelligibility</topic><topic>Speech processing</topic><topic>Speech Quality</topic><topic>Telecommunications and information theory</topic><topic>Time frequency analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jensen, J.</creatorcontrib><creatorcontrib>Hendriks, R. C.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><jtitle>IEEE transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Jensen, J.</au><au>Hendriks, R. 
C.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions</atitle><jtitle>IEEE transactions on audio, speech, and language processing</jtitle><stitle>TASL</stitle><date>2012-01</date><risdate>2012</risdate><volume>20</volume><issue>1</issue><spage>92</spage><epage>102</epage><pages>92-102</pages><issn>1558-7916</issn><eissn>1558-7924</eissn><coden>ITASD8</coden><abstract>Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. When implemented using discrete Fourier transform (DFT) techniques, the binary mask techniques can be seen as a special case of the broader class of DFT-based speech enhancement algorithms, for which the applied gain function is not constrained to be binary. In this context, we develop and compare binary mask techniques to state-of-the-art continuous gain techniques. We derive spectral magnitude minimum mean-square error binary gain estimators; the binary gain estimators turn out to be simple functions of the continuous gain estimators. We show that the optimal binary estimators are closely related to a range of existing, heuristically developed, binary gain estimators. The derived binary gain estimators perform better than existing binary gain estimators in simulation experiments with speech signals contaminated by several different noise sources as measured by speech quality and intelligibility measures. However, even the best binary mask method is significantly outperformed by state-of-the-art continuous gain estimators. 
The instrumental intelligibility results are confirmed in an intelligibility listening test.</abstract><cop>Piscataway, NJ</cop><pub>IEEE</pub><doi>10.1109/TASL.2011.2157685</doi><tpages>11</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1558-7916
ispartof	IEEE transactions on audio, speech, and language processing, 2012-01, Vol.20 (1), p.92-102
issn	1558-7916 1558-7924
language	eng
recordid	cdi_pascalfrancis_primary_25473436
source	IEEE Electronic Library (IEL)
subjects	Applied sciences; Auditory system; Detection, estimation, filtering, equalization, prediction; Discrete Fourier transforms; Exact sciences and technology; Ideal Binary Mask; Information, signal and communications theory; Noise measurement; Signal and communications theory; Signal processing; Signal to noise ratio; Signal, noise; Spectral Magnitude Estimation; Speech; Speech Enhancement; Speech Intelligibility; Speech processing; Speech Quality; Telecommunications and information theory; Time frequency analysis
title	Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T21%3A55%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Spectral%20Magnitude%20Minimum%20Mean-Square%20Error%20Estimation%20Using%20Binary%20and%20Continuous%20Gain%20Functions&rft.jtitle=IEEE%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Jensen,%20J.&rft.date=2012-01&rft.volume=20&rft.issue=1&rft.spage=92&rft.epage=102&rft.pages=92-102&rft.issn=1558-7916&rft.eissn=1558-7924&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASL.2011.2157685&rft_dat=%3Cpascalfrancis_RIE%3E25473436%3C/pascalfrancis_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5773481&rfr_iscdi=true