Strategies for distant speech recognition in reverberant environments

Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition sy...

Detailed description

Bibliographic details
Published in: EURASIP journal on advances in signal processing 2015-07, Vol.2015 (1), p.1-15, Article 60
Main authors: Delcroix, Marc, Yoshioka, Takuya, Ogawa, Atsunori, Kubo, Yotaro, Fujimoto, Masakiyo, Ito, Nobutaka, Kinoshita, Keisuke, Espi, Miquel, Araki, Shoko, Hori, Takaaki, Nakatani, Tomohiro
Format: Article
Language: eng
Online access: Full text
container_end_page 15
container_issue 1
container_start_page 1
container_title EURASIP journal on advances in signal processing
container_volume 2015
creator Delcroix, Marc
Yoshioka, Takuya
Ogawa, Atsunori
Kubo, Yotaro
Fujimoto, Masakiyo
Ito, Nobutaka
Kinoshita, Keisuke
Espi, Miquel
Araki, Shoko
Hori, Takaaki
Nakatani, Tomohiro
description Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition system that we developed at our laboratory to deal with reverberant speech. The system consists of a speech enhancement (SE) front-end that employs long-term linear prediction-based dereverberation followed by noise reduction. We combine our SE front-end with an ASR back-end that uses neural networks for acoustic and language modeling. The proposed system achieved top scores on the ASR task of the REVERB challenge. This paper describes the different technologies used in our system and presents detailed experimental results that justify our implementation choices and may provide hints for designing distant ASR systems.
doi_str_mv 10.1186/s13634-015-0245-7
format Article
publisher Cham: Springer International Publishing
creationdate 2015-07-19
rights Delcroix et al.; licensee Springer. 2015. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Author(s) 2015
peer_reviewed true
open_access free_for_read
fulltext fulltext
identifier ISSN: 1687-6180
ispartof EURASIP journal on advances in signal processing, 2015-07, Vol.2015 (1), p.1-15, Article 60
issn 1687-6180
1687-6172
1687-6180
language eng
recordid cdi_proquest_miscellaneous_1808047595
source DOAJ Directory of Open Access Journals; Springer Nature OA Free Journals; Springer Nature - Complete Springer Journals; Alma/SFX Local Collection
subjects Acoustic noise
Engineering
Mathematical models
Neural networks
Quantum Information Technology
Signal, Image and Speech Processing
Speech
Speech processing
Speech recognition
Spintronics
Strategy
Tasks
‘Silencing the Echoes’ – Processing of Reverberant Speech
title Strategies for distant speech recognition in reverberant environments
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T13%3A16%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Strategies%20for%20distant%20speech%20recognitionin%20reverberant%20environments&rft.jtitle=EURASIP%20journal%20on%20advances%20in%20signal%20processing&rft.au=Delcroix,%20Marc&rft.date=2015-07-19&rft.volume=2015&rft.issue=1&rft.spage=1&rft.epage=15&rft.pages=1-15&rft.artnum=60&rft.issn=1687-6180&rft.eissn=1687-6180&rft_id=info:doi/10.1186/s13634-015-0245-7&rft_dat=%3Cproquest_cross%3E1808047595%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1697065398&rft_id=info:pmid/&rfr_iscdi=true