Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions
Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. Wh...
Published in: | IEEE transactions on audio, speech, and language processing, 2012-01, Vol.20 (1), p.92-102 |
---|---|
Main authors: | Jensen, J. ; Hendriks, R. C. |
Format: | Article |
Language: | eng |
Subjects: | |
container_end_page | 102 |
---|---|
container_issue | 1 |
container_start_page | 92 |
container_title | IEEE transactions on audio, speech, and language processing |
container_volume | 20 |
creator | Jensen, J. ; Hendriks, R. C. |
description | Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. When implemented using discrete Fourier transform (DFT) techniques, the binary mask techniques can be seen as a special case of the broader class of DFT-based speech enhancement algorithms, for which the applied gain function is not constrained to be binary. In this context, we develop and compare binary mask techniques to state-of-the-art continuous gain techniques. We derive spectral magnitude minimum mean-square error binary gain estimators; the binary gain estimators turn out to be simple functions of the continuous gain estimators. We show that the optimal binary estimators are closely related to a range of existing, heuristically developed, binary gain estimators. The derived binary gain estimators perform better than existing binary gain estimators in simulation experiments with speech signals contaminated by several different noise sources as measured by speech quality and intelligibility measures. However, even the best binary mask method is significantly outperformed by state-of-the-art continuous gain estimators. The instrumental intelligibility results are confirmed in an intelligibility listening test. |
doi_str_mv | 10.1109/TASL.2011.2157685 |
format | Article |
fullrecord | <record><control><sourceid>pascalfrancis_RIE</sourceid><recordid>TN_cdi_pascalfrancis_primary_25473436</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5773481</ieee_id><sourcerecordid>25473436</sourcerecordid><originalsourceid>FETCH-LOGICAL-c295t-85b559c15f3af6154390da551af753079ffe20c7e0ad574e30201ac367203a7e3</originalsourceid><addsrcrecordid>eNo9kDFvwjAQha2qlUppf0DVxUvHUF-ci8lIEdBKoA7AHB2OjVyBQ-1k4N83EYjpTnrvne59jL2CGAGI4mMzWS9HqQAYpYAqH-MdGwDiOFFFmt3fdsgf2VOMv0JkMs9gwHbrk9FNoANf0d67pq0MXznvju2Rrwz5ZP3XUjB8FkId-Cw27kiNqz3fRuf3_NN5CmdOvuLT2jfOt3Ub-YKc5_PW694Zn9mDpUM0L9c5ZNv5bDP9SpY_i-_pZJnotMAmGeMOsdCAVpLNATNZiIoQgaxCKVRhrUmFVkZQhSozUnR1SctcpUKSMnLI4HJXhzrGYGx5Ct234VyCKHtIZQ-p7CGVV0hd5v2SOVHUdLCBvHbxFkwxU7Ij1fneLj5njLnJqDp1DPIfsupwkg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions</title><source>IEEE Electronic Library (IEL)</source><creator>Jensen, J. ; Hendriks, R. C.</creator><creatorcontrib>Jensen, J. ; Hendriks, R. C.</creatorcontrib><description>Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. When implemented using discrete Fourier transform (DFT) techniques, the binary mask techniques can be seen as a special case of the broader class of DFT-based speech enhancement algorithms, for which the applied gain function is not constrained to be binary. In this context, we develop and compare binary mask techniques to state-of-the-art continuous gain techniques. 
We derive spectral magnitude minimum mean-square error binary gain estimators; the binary gain estimators turn out to be simple functions of the continuous gain estimators. We show that the optimal binary estimators are closely related to a range of existing, heuristically developed, binary gain estimators. The derived binary gain estimators perform better than existing binary gain estimators in simulation experiments with speech signals contaminated by several different noise sources as measured by speech quality and intelligibility measures. However, even the best binary mask method is significantly outperformed by state-of-the-art continuous gain estimators. The instrumental intelligibility results are confirmed in an intelligibility listening test.</description><identifier>ISSN: 1558-7916</identifier><identifier>EISSN: 1558-7924</identifier><identifier>DOI: 10.1109/TASL.2011.2157685</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>Piscataway, NJ: IEEE</publisher><subject>Applied sciences ; Auditory system ; Detection, estimation, filtering, equalization, prediction ; Discrete Fourier transforms ; Exact sciences and technology ; Ideal Binary Mask ; Information, signal and communications theory ; Noise measurement ; Signal and communications theory ; Signal processing ; Signal to noise ratio ; Signal, noise ; Spectral Magnitude Estimation ; Speech ; Speech Enhancement ; Speech Intelligibility ; Speech processing ; Speech Quality ; Telecommunications and information theory ; Time frequency analysis</subject><ispartof>IEEE transactions on audio, speech, and language processing, 2012-01, Vol.20 (1), p.92-102</ispartof><rights>2015 
INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c295t-85b559c15f3af6154390da551af753079ffe20c7e0ad574e30201ac367203a7e3</citedby><cites>FETCH-LOGICAL-c295t-85b559c15f3af6154390da551af753079ffe20c7e0ad574e30201ac367203a7e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5773481$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,4024,27923,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5773481$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=25473436$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Jensen, J.</creatorcontrib><creatorcontrib>Hendriks, R. C.</creatorcontrib><title>Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions</title><title>IEEE transactions on audio, speech, and language processing</title><addtitle>TASL</addtitle><description>Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. When implemented using discrete Fourier transform (DFT) techniques, the binary mask techniques can be seen as a special case of the broader class of DFT-based speech enhancement algorithms, for which the applied gain function is not constrained to be binary. In this context, we develop and compare binary mask techniques to state-of-the-art continuous gain techniques. 
We derive spectral magnitude minimum mean-square error binary gain estimators; the binary gain estimators turn out to be simple functions of the continuous gain estimators. We show that the optimal binary estimators are closely related to a range of existing, heuristically developed, binary gain estimators. The derived binary gain estimators perform better than existing binary gain estimators in simulation experiments with speech signals contaminated by several different noise sources as measured by speech quality and intelligibility measures. However, even the best binary mask method is significantly outperformed by state-of-the-art continuous gain estimators. The instrumental intelligibility results are confirmed in an intelligibility listening test.</description><subject>Applied sciences</subject><subject>Auditory system</subject><subject>Detection, estimation, filtering, equalization, prediction</subject><subject>Discrete Fourier transforms</subject><subject>Exact sciences and technology</subject><subject>Ideal Binary Mask</subject><subject>Information, signal and communications theory</subject><subject>Noise measurement</subject><subject>Signal and communications theory</subject><subject>Signal processing</subject><subject>Signal to noise ratio</subject><subject>Signal, noise</subject><subject>Spectral Magnitude Estimation</subject><subject>Speech</subject><subject>Speech Enhancement</subject><subject>Speech Intelligibility</subject><subject>Speech processing</subject><subject>Speech Quality</subject><subject>Telecommunications and information theory</subject><subject>Time frequency 
analysis</subject><issn>1558-7916</issn><issn>1558-7924</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kDFvwjAQha2qlUppf0DVxUvHUF-ci8lIEdBKoA7AHB2OjVyBQ-1k4N83EYjpTnrvne59jL2CGAGI4mMzWS9HqQAYpYAqH-MdGwDiOFFFmt3fdsgf2VOMv0JkMs9gwHbrk9FNoANf0d67pq0MXznvju2Rrwz5ZP3XUjB8FkId-Cw27kiNqz3fRuf3_NN5CmdOvuLT2jfOt3Ub-YKc5_PW694Zn9mDpUM0L9c5ZNv5bDP9SpY_i-_pZJnotMAmGeMOsdCAVpLNATNZiIoQgaxCKVRhrUmFVkZQhSozUnR1SctcpUKSMnLI4HJXhzrGYGx5Ct234VyCKHtIZQ-p7CGVV0hd5v2SOVHUdLCBvHbxFkwxU7Ij1fneLj5njLnJqDp1DPIfsupwkg</recordid><startdate>201201</startdate><enddate>201201</enddate><creator>Jensen, J.</creator><creator>Hendriks, R. C.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>201201</creationdate><title>Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions</title><author>Jensen, J. ; Hendriks, R. 
C.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c295t-85b559c15f3af6154390da551af753079ffe20c7e0ad574e30201ac367203a7e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Applied sciences</topic><topic>Auditory system</topic><topic>Detection, estimation, filtering, equalization, prediction</topic><topic>Discrete Fourier transforms</topic><topic>Exact sciences and technology</topic><topic>Ideal Binary Mask</topic><topic>Information, signal and communications theory</topic><topic>Noise measurement</topic><topic>Signal and communications theory</topic><topic>Signal processing</topic><topic>Signal to noise ratio</topic><topic>Signal, noise</topic><topic>Spectral Magnitude Estimation</topic><topic>Speech</topic><topic>Speech Enhancement</topic><topic>Speech Intelligibility</topic><topic>Speech processing</topic><topic>Speech Quality</topic><topic>Telecommunications and information theory</topic><topic>Time frequency analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jensen, J.</creatorcontrib><creatorcontrib>Hendriks, R. C.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><jtitle>IEEE transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Jensen, J.</au><au>Hendriks, R. 
C.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions</atitle><jtitle>IEEE transactions on audio, speech, and language processing</jtitle><stitle>TASL</stitle><date>2012-01</date><risdate>2012</risdate><volume>20</volume><issue>1</issue><spage>92</spage><epage>102</epage><pages>92-102</pages><issn>1558-7916</issn><eissn>1558-7924</eissn><coden>ITASD8</coden><abstract>Recently, binary mask techniques have been proposed as a tool for retrieving a target speech signal from a noisy observation. A binary gain function is applied to time-frequency tiles of the noisy observation in order to suppress noise dominated and retain target dominated time-frequency regions. When implemented using discrete Fourier transform (DFT) techniques, the binary mask techniques can be seen as a special case of the broader class of DFT-based speech enhancement algorithms, for which the applied gain function is not constrained to be binary. In this context, we develop and compare binary mask techniques to state-of-the-art continuous gain techniques. We derive spectral magnitude minimum mean-square error binary gain estimators; the binary gain estimators turn out to be simple functions of the continuous gain estimators. We show that the optimal binary estimators are closely related to a range of existing, heuristically developed, binary gain estimators. The derived binary gain estimators perform better than existing binary gain estimators in simulation experiments with speech signals contaminated by several different noise sources as measured by speech quality and intelligibility measures. However, even the best binary mask method is significantly outperformed by state-of-the-art continuous gain estimators. 
The instrumental intelligibility results are confirmed in an intelligibility listening test.</abstract><cop>Piscataway, NJ</cop><pub>IEEE</pub><doi>10.1109/TASL.2011.2157685</doi><tpages>11</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1558-7916 |
ispartof | IEEE transactions on audio, speech, and language processing, 2012-01, Vol.20 (1), p.92-102 |
issn | 1558-7916 1558-7924 |
language | eng |
recordid | cdi_pascalfrancis_primary_25473436 |
source | IEEE Electronic Library (IEL) |
subjects | Applied sciences ; Auditory system ; Detection, estimation, filtering, equalization, prediction ; Discrete Fourier transforms ; Exact sciences and technology ; Ideal Binary Mask ; Information, signal and communications theory ; Noise measurement ; Signal and communications theory ; Signal processing ; Signal to noise ratio ; Signal, noise ; Spectral Magnitude Estimation ; Speech ; Speech Enhancement ; Speech Intelligibility ; Speech processing ; Speech Quality ; Telecommunications and information theory ; Time frequency analysis |
title | Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T21%3A55%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Spectral%20Magnitude%20Minimum%20Mean-Square%20Error%20Estimation%20Using%20Binary%20and%20Continuous%20Gain%20Functions&rft.jtitle=IEEE%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Jensen,%20J.&rft.date=2012-01&rft.volume=20&rft.issue=1&rft.spage=92&rft.epage=102&rft.pages=92-102&rft.issn=1558-7916&rft.eissn=1558-7924&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASL.2011.2157685&rft_dat=%3Cpascalfrancis_RIE%3E25473436%3C/pascalfrancis_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5773481&rfr_iscdi=true |
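
The abstract describes applying a gain function to time-frequency tiles of the noisy observation, either constrained to be binary or left continuous. This record gives no formulas, so the following is only a minimal sketch of that distinction, not the estimators derived in the article. It assumes the per-tile speech and noise powers are known (an oracle setting), uses a local-SNR threshold for the binary gain, and uses the Wiener gain as a stand-in for a continuous gain. All function names, the 0 dB threshold, and the toy values are hypothetical.

```python
import numpy as np


def binary_gain(speech_power, noise_power, threshold_db=0.0):
    """Return 1.0 for tiles whose local SNR exceeds the threshold, else 0.0.

    Assumption: the true per-tile powers are available (oracle setting).
    """
    snr_db = 10.0 * np.log10(speech_power / noise_power)
    return (snr_db > threshold_db).astype(float)


def wiener_gain(speech_power, noise_power):
    """Return a continuous gain in (0, 1): the Wiener gain.

    Assumption: used here only as a familiar example of a continuous gain.
    """
    return speech_power / (speech_power + noise_power)


# Toy grid of 2 frequency bins x 3 time frames (hypothetical values).
speech_power = np.array([[4.0, 0.1, 1.0],
                         [0.5, 9.0, 0.01]])
noise_power = np.ones_like(speech_power)

# Magnitude of the noisy observation, assuming speech and noise powers add.
noisy_magnitude = np.sqrt(speech_power + noise_power)

g_bin = binary_gain(speech_power, noise_power)
g_cont = wiener_gain(speech_power, noise_power)

# Both approaches scale the noisy magnitude tile by tile.
enhanced_bin = g_bin * noisy_magnitude
enhanced_cont = g_cont * noisy_magnitude

print(g_bin)
print(np.round(g_cont, 3))
```

In this sketch the binary gain either keeps or discards a tile outright, while the continuous gain attenuates each tile in proportion to its estimated speech share, which is the structural difference the abstract compares.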