Auditory scene analysis and hidden Markov model recognition of speech in noise
We describe a novel paradigm for automatic speech recognition in noisy environments in which an initial stage of auditory scene analysis separates out the evidence for the speech to be recognised from the evidence for other sounds. In general, this evidence will be incomplete, since intruding sound sources will dominate some spectro-temporal regions. We generalise continuous-density hidden Markov model recognition to this 'occluded speech' case. The technique is based on estimating the probability that a Gaussian mixture density distribution for an auditory firing rate map will generate an observation such that the separated components are at their observed values and the remaining components are not greater than their values in the acoustic mixture. Experiments on isolated digit recognition in noise demonstrate the potential of the new approach to yield performance comparable to that of listeners.
Saved in:
Main authors: | Green, P.D.; Cooke, M.P.; Crawford, M.D. |
---|---|
Format: | Conference Proceeding |
Language: | eng |
Subjects: | Acoustic noise; Automatic speech recognition; Computational modeling; Hidden Markov models; Image analysis; Noise robustness; Speech analysis; Speech enhancement; Speech recognition; Working environment noise |
Online access: | Order full text |
container_end_page | 404 vol.1 |
container_issue | |
container_start_page | 401 |
container_title | |
container_volume | 1 |
creator | Green, P.D.; Cooke, M.P.; Crawford, M.D. |
description | We describe a novel paradigm for automatic speech recognition in noisy environments in which an initial stage of auditory scene analysis separates out the evidence for the speech to be recognised from the evidence for other sounds. In general, this evidence will be incomplete, since intruding sound sources will dominate some spectro-temporal regions. We generalise continuous-density hidden Markov model recognition to this 'occluded speech' case. The technique is based on estimating the probability that a Gaussian mixture density distribution for an auditory firing rate map will generate an observation such that the separated components are at their observed values and the remaining components are not greater than their values in the acoustic mixture. Experiments on isolated digit recognition in noise demonstrate the potential of the new approach to yield performance comparable to that of listeners. |
doi_str_mv | 10.1109/ICASSP.1995.479606 |
format | Conference Proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1520-6149 |
ispartof | 1995 International Conference on Acoustics, Speech, and Signal Processing, 1995, Vol.1, p.401-404 vol.1 |
issn | 1520-6149; 2379-190X |
language | eng |
recordid | cdi_ieee_primary_479606 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Acoustic noise; Automatic speech recognition; Computational modeling; Hidden Markov models; Image analysis; Noise robustness; Speech analysis; Speech enhancement; Speech recognition; Working environment noise |
title | Auditory scene analysis and hidden Markov model recognition of speech in noise |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T13%3A02%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Auditory%20scene%20analysis%20and%20hidden%20Markov%20model%20recognition%20of%20speech%20in%20noise&rft.btitle=1995%20International%20Conference%20on%20Acoustics,%20Speech,%20and%20Signal%20Processing&rft.au=Green,%20P.D.&rft.date=1995&rft.volume=1&rft.spage=401&rft.epage=404%20vol.1&rft.pages=401-404%20vol.1&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=0780324315&rft.isbn_list=9780780324312&rft_id=info:doi/10.1109/ICASSP.1995.479606&rft_dat=%3Cieee_6IE%3E479606%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=479606&rfr_iscdi=true |
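The likelihood computation described in the abstract — reliable (speech-dominated) spectro-temporal components evaluated at their observed values, occluded components constrained only to lie at or below their values in the acoustic mixture — can be sketched for a single diagonal-covariance Gaussian mixture state. This is a minimal illustrative reconstruction, not the authors' code: the function name, argument layout, and the use of a per-dimension Gaussian CDF for the bounded integral are assumptions made for the sketch.

```python
from statistics import NormalDist

def occluded_likelihood(x, mask, weights, means, stds):
    """Likelihood of observation x under a diagonal-covariance GMM
    with missing-data ('occluded speech') evaluation.

    x       : observed feature vector (auditory firing-rate map frame)
    mask    : per-dimension flags; True = speech-dominated (reliable)
    weights : mixture weights, one per Gaussian component
    means   : per-component mean vectors
    stds    : per-component standard-deviation vectors
    """
    total = 0.0
    for w, mu, sd in zip(weights, means, stds):
        p = w
        for xd, reliable, mu_d, sd_d in zip(x, mask, mu, sd):
            dist = NormalDist(mu_d, sd_d)
            if reliable:
                # Reliable dimension: ordinary density evaluation.
                p *= dist.pdf(xd)
            else:
                # Occluded dimension: the clean speech value is unknown
                # but cannot exceed the mixture value, so integrate the
                # Gaussian from -inf up to the observed bound (CDF).
                p *= dist.cdf(xd)
        total += p
    return total
```

With every dimension marked reliable this reduces to the usual GMM density; marking a dimension occluded replaces its density term with the probability mass below the observed bound, which is the generalisation the abstract describes.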