The robustness of speech representations obtained from simulated auditory nerve fibers under different noise conditions

Different methods of extracting speech features from an auditory model were systematically investigated in terms of their robustness to different noises. The methods either computed the average firing rate within frequency channels (spectral features) or inter-spike-intervals (timing features) from...


Bibliographic details
Published in:	The Journal of the Acoustical Society of America 2013-09, Vol.134 (3), p.EL282-EL288
Main authors:	Jürgens, Tim; Brand, Thomas; Clark, Nicholas R; Meddis, Ray; Brown, Guy J
Format:	Article
Language:	eng
Subjects:	Cochlear Nerve - physiology; Computer Simulation; Fourier Analysis; Humans; Models, Neurological; Noise; Pattern Recognition, Automated; Signal-To-Noise Ratio; Sound Spectrography; Speech Acoustics; Speech Production Measurement - methods; Speech Recognition Software; Time Factors
Online access:	Full text

container_end_page	EL288
container_issue	3
container_start_page	EL282
container_title	The Journal of the Acoustical Society of America
container_volume	134
creator	Jürgens, Tim; Brand, Thomas; Clark, Nicholas R; Meddis, Ray; Brown, Guy J
description	Different methods of extracting speech features from an auditory model were systematically investigated in terms of their robustness to different noises. The methods either computed the average firing rate within frequency channels (spectral features) or inter-spike-intervals (timing features) from the simulated auditory nerve response. When used as the front-end for an automatic speech recognizer, timing features outperformed spectral features in Gaussian noise. However, this advantage was lost in babble, because timing features extracted the spectro-temporal structure of babble noise, which is similar to the target speaker. This suggests that different feature extraction methods are optimal depending on the background noise.
doi_str_mv	10.1121/1.4817912
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1443381794</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1443381794</sourcerecordid><originalsourceid>FETCH-LOGICAL-c320t-8a1927cbe4efee8cb9bad69a382affb7a93a23d03dbd6c29efba8299f3b56d483</originalsourceid><addsrcrecordid>eNo9kDtPwzAURi0EoqUw8AeQRxhS_Epqj6jiJVViKXNkx9eqUWIHOwHx70nVwnT16R6d4SB0TcmSUkbv6VJIulKUnaA5LRkpZMnEKZoTQmghVFXN0EXOH9MsJVfnaMa4qiSp6Bx9b3eAUzRjHgLkjKPDuQdodjhBnyBDGPTgY5g-ZtA-gMUuxQ5n342tHqapR-uHmH5wgPQF2HkDKeMxWEjYeucgTQ4cos-AmxgmeK-7RGdOtxmujneB3p8et-uXYvP2_Lp-2BQNZ2QopKaKrRoDAhyAbIwy2lZKc8m0c2alFdeMW8KtsVXDFDijJVPKcVNWVki-QLcHb5_i5wh5qDufG2hbHSCOuaZCcL6PJyb07oA2KeacwNV98p1OPzUl9b5zTetj54m9OWpH04H9J__C8l8BR3vI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1443381794</pqid></control><display><type>article</type><title>The robustness of speech representations obtained from simulated auditory nerve fibers under different noise conditions</title><source>MEDLINE</source><source>American Institute of Physics (AIP) Journals</source><source>Alma/SFX Local Collection</source><source>AIP Acoustical Society of America</source><creator>Jürgens, Tim ; Brand, Thomas ; Clark, Nicholas R ; Meddis, Ray ; Brown, Guy J</creator><creatorcontrib>Jürgens, Tim ; Brand, Thomas ; Clark, Nicholas R ; Meddis, Ray ; Brown, Guy J</creatorcontrib><description>Different methods of extracting speech features from an auditory model were systematically investigated in terms of their robustness to different noises. The methods either computed the average firing rate within frequency channels (spectral features) or inter-spike-intervals (timing features) from the simulated auditory nerve response. When used as the front-end for an automatic speech recognizer, timing features outperformed spectral features in Gaussian noise. 
However, this advantage was lost in babble, because timing features extracted the spectro-temporal structure of babble noise, which is similar to the target speaker. This suggests that different feature extraction methods are optimal depending on the background noise.</description><identifier>ISSN: 0001-4966</identifier><identifier>EISSN: 1520-8524</identifier><identifier>DOI: 10.1121/1.4817912</identifier><identifier>PMID: 23968061</identifier><language>eng</language><publisher>United States</publisher><subject>Cochlear Nerve - physiology ; Computer Simulation ; Fourier Analysis ; Humans ; Models, Neurological ; Noise ; Pattern Recognition, Automated ; Signal-To-Noise Ratio ; Sound Spectrography ; Speech Acoustics ; Speech Production Measurement - methods ; Speech Recognition Software ; Time Factors</subject><ispartof>The Journal of the Acoustical Society of America, 2013-09, Vol.134 (3), p.EL282-EL288</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c320t-8a1927cbe4efee8cb9bad69a382affb7a93a23d03dbd6c29efba8299f3b56d483</citedby><cites>FETCH-LOGICAL-c320t-8a1927cbe4efee8cb9bad69a382affb7a93a23d03dbd6c29efba8299f3b56d483</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>207,208,314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23968061$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Jürgens, Tim</creatorcontrib><creatorcontrib>Brand, Thomas</creatorcontrib><creatorcontrib>Clark, Nicholas R</creatorcontrib><creatorcontrib>Meddis, Ray</creatorcontrib><creatorcontrib>Brown, Guy J</creatorcontrib><title>The robustness of speech representations obtained from simulated auditory nerve fibers under different noise conditions</title><title>The Journal of the 
Acoustical Society of America</title><addtitle>J Acoust Soc Am</addtitle><description>Different methods of extracting speech features from an auditory model were systematically investigated in terms of their robustness to different noises. The methods either computed the average firing rate within frequency channels (spectral features) or inter-spike-intervals (timing features) from the simulated auditory nerve response. When used as the front-end for an automatic speech recognizer, timing features outperformed spectral features in Gaussian noise. However, this advantage was lost in babble, because timing features extracted the spectro-temporal structure of babble noise, which is similar to the target speaker. This suggests that different feature extraction methods are optimal depending on the background noise.</description><subject>Cochlear Nerve - physiology</subject><subject>Computer Simulation</subject><subject>Fourier Analysis</subject><subject>Humans</subject><subject>Models, Neurological</subject><subject>Noise</subject><subject>Pattern Recognition, Automated</subject><subject>Signal-To-Noise Ratio</subject><subject>Sound Spectrography</subject><subject>Speech Acoustics</subject><subject>Speech Production Measurement - methods</subject><subject>Speech Recognition Software</subject><subject>Time 
Factors</subject><issn>0001-4966</issn><issn>1520-8524</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNo9kDtPwzAURi0EoqUw8AeQRxhS_Epqj6jiJVViKXNkx9eqUWIHOwHx70nVwnT16R6d4SB0TcmSUkbv6VJIulKUnaA5LRkpZMnEKZoTQmghVFXN0EXOH9MsJVfnaMa4qiSp6Bx9b3eAUzRjHgLkjKPDuQdodjhBnyBDGPTgY5g-ZtA-gMUuxQ5n342tHqapR-uHmH5wgPQF2HkDKeMxWEjYeucgTQ4cos-AmxgmeK-7RGdOtxmujneB3p8et-uXYvP2_Lp-2BQNZ2QopKaKrRoDAhyAbIwy2lZKc8m0c2alFdeMW8KtsVXDFDijJVPKcVNWVki-QLcHb5_i5wh5qDufG2hbHSCOuaZCcL6PJyb07oA2KeacwNV98p1OPzUl9b5zTetj54m9OWpH04H9J__C8l8BR3vI</recordid><startdate>201309</startdate><enddate>201309</enddate><creator>Jürgens, Tim</creator><creator>Brand, Thomas</creator><creator>Clark, Nicholas R</creator><creator>Meddis, Ray</creator><creator>Brown, Guy J</creator><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>201309</creationdate><title>The robustness of speech representations obtained from simulated auditory nerve fibers under different noise conditions</title><author>Jürgens, Tim ; Brand, Thomas ; Clark, Nicholas R ; Meddis, Ray ; Brown, Guy J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c320t-8a1927cbe4efee8cb9bad69a382affb7a93a23d03dbd6c29efba8299f3b56d483</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Cochlear Nerve - physiology</topic><topic>Computer Simulation</topic><topic>Fourier Analysis</topic><topic>Humans</topic><topic>Models, Neurological</topic><topic>Noise</topic><topic>Pattern Recognition, Automated</topic><topic>Signal-To-Noise Ratio</topic><topic>Sound Spectrography</topic><topic>Speech Acoustics</topic><topic>Speech Production Measurement - methods</topic><topic>Speech Recognition 
Software</topic><topic>Time Factors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jürgens, Tim</creatorcontrib><creatorcontrib>Brand, Thomas</creatorcontrib><creatorcontrib>Clark, Nicholas R</creatorcontrib><creatorcontrib>Meddis, Ray</creatorcontrib><creatorcontrib>Brown, Guy J</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>The Journal of the Acoustical Society of America</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jürgens, Tim</au><au>Brand, Thomas</au><au>Clark, Nicholas R</au><au>Meddis, Ray</au><au>Brown, Guy J</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The robustness of speech representations obtained from simulated auditory nerve fibers under different noise conditions</atitle><jtitle>The Journal of the Acoustical Society of America</jtitle><addtitle>J Acoust Soc Am</addtitle><date>2013-09</date><risdate>2013</risdate><volume>134</volume><issue>3</issue><spage>EL282</spage><epage>EL288</epage><pages>EL282-EL288</pages><issn>0001-4966</issn><eissn>1520-8524</eissn><abstract>Different methods of extracting speech features from an auditory model were systematically investigated in terms of their robustness to different noises. The methods either computed the average firing rate within frequency channels (spectral features) or inter-spike-intervals (timing features) from the simulated auditory nerve response. When used as the front-end for an automatic speech recognizer, timing features outperformed spectral features in Gaussian noise. 
However, this advantage was lost in babble, because timing features extracted the spectro-temporal structure of babble noise, which is similar to the target speaker. This suggests that different feature extraction methods are optimal depending on the background noise.</abstract><cop>United States</cop><pmid>23968061</pmid><doi>10.1121/1.4817912</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0001-4966
ispartof	The Journal of the Acoustical Society of America, 2013-09, Vol.134 (3), p.EL282-EL288
issn	0001-4966 1520-8524
language	eng
recordid	cdi_proquest_miscellaneous_1443381794
source	MEDLINE; American Institute of Physics (AIP) Journals; Alma/SFX Local Collection; AIP Acoustical Society of America
subjects	Cochlear Nerve - physiology; Computer Simulation; Fourier Analysis; Humans; Models, Neurological; Noise; Pattern Recognition, Automated; Signal-To-Noise Ratio; Sound Spectrography; Speech Acoustics; Speech Production Measurement - methods; Speech Recognition Software; Time Factors
title	The robustness of speech representations obtained from simulated auditory nerve fibers under different noise conditions
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T09%3A06%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20robustness%20of%20speech%20representations%20obtained%20from%20simulated%20auditory%20nerve%20fibers%20under%20different%20noise%20conditions&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=J%C3%BCrgens,%20Tim&rft.date=2013-09&rft.volume=134&rft.issue=3&rft.spage=EL282&rft.epage=EL288&rft.pages=EL282-EL288&rft.issn=0001-4966&rft.eissn=1520-8524&rft_id=info:doi/10.1121/1.4817912&rft_dat=%3Cproquest_cross%3E1443381794%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1443381794&rft_id=info:pmid/23968061&rfr_iscdi=true
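
The description field distinguishes two feature families: the average firing rate within a frequency channel (spectral features) and inter-spike intervals (timing features). The following is only a minimal sketch of that distinction for a single channel, not the authors' implementation. The function names, the use of integer microsecond spike times, and the histogram parameters are all assumptions made for illustration; the record itself gives no such details.

```python
def mean_firing_rate(spike_times_us, duration_us):
    """Rate-style feature: spikes per second in one channel.

    spike_times_us: spike times in integer microseconds (assumed unit).
    duration_us: length of the analysis window in microseconds.
    """
    if duration_us <= 0:
        raise ValueError("duration_us must be positive")
    return len(spike_times_us) * 1_000_000 / duration_us


def isi_histogram(spike_times_us, bin_width_us, max_interval_us):
    """Timing-style feature: histogram of first-order inter-spike intervals.

    Intervals of max_interval_us or longer are ignored. Bin i counts
    intervals in [i * bin_width_us, (i + 1) * bin_width_us).
    """
    if bin_width_us <= 0:
        raise ValueError("bin_width_us must be positive")
    times = sorted(spike_times_us)
    n_bins = max_interval_us // bin_width_us
    counts = [0] * n_bins
    # Pair each spike with the next one to get successive intervals.
    for earlier, later in zip(times, times[1:]):
        index = (later - earlier) // bin_width_us
        if index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical spike train: four spikes within a 40 ms window.
spikes = [0, 10_000, 20_000, 35_000]
rate = mean_firing_rate(spikes, 40_000)
histogram = isi_histogram(spikes, bin_width_us=5_000, max_interval_us=20_000)
```

The rate feature discards spike timing within the window, while the interval histogram keeps it. That difference is the contrast the description draws between spectral and timing features.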