Sparse gammatone signal model optimized for English speech does not match the human auditory filters

Abstract Evidence that neurosensory systems use sparse signal representations as well as improved performance of signal processing algorithms using sparse signal models raised interest in sparse signal coding in the last years. For natural audio signals like speech and environmental sounds, gammaton...

Detailed description


Bibliographic details
Published in:	Brain research, 2008-07, Vol. 1220 (18 July), p. 224-233
Main authors:	Strahl, Stefan; Mertins, Alfred
Format:	Article
Language:	eng
Subjects:	Acoustic Stimulation - methods; Adult; Auditory Perception - physiology; Efficient coding hypothesis; Gammatone; Humans; Matching pursuit; Models, Biological; Neurology; Noise; Psychoacoustics; Reaction Time; Sparse coding; Speech - physiology
Online access:	Full text
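
Illustration (not part of the catalog record): the abstract refers to matching pursuit over a gammatone dictionary. The following is a minimal sketch of plain, brute-force matching pursuit, not the accelerated variant the authors used. The function names, the atom length, the filter order of 4, and the bandwidth rule (1.019 times the Glasberg and Moore ERB approximation) are assumptions chosen for illustration; the article itself argues that such psychoacoustically derived parameters are not optimal for English speech.

```python
import math


def gammatone_atom(fc, fs, n_samples, order=4, bandwidth=None):
    """Return a unit-norm gammatone atom t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t).

    Hypothetical helper: the default bandwidth b = 1.019 * ERB(fc) is an
    assumed psychoacoustic default, not a value taken from the article.
    """
    if bandwidth is None:
        erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)  # ERB approximation in Hz
        bandwidth = 1.019 * erb
    atom = []
    for n in range(n_samples):
        t = n / fs
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * bandwidth * t)
        atom.append(envelope * math.cos(2.0 * math.pi * fc * t))
    norm = math.sqrt(sum(a * a for a in atom))
    return [a / norm for a in atom]


def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse decomposition of `signal` over time-shifted copies of `atoms`.

    Each iteration picks the (atom, shift) pair with the largest absolute
    inner product with the residual and subtracts its contribution.
    Returns a list of (atom_index, shift, coefficient) and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        best = None
        for k, atom in enumerate(atoms):
            for shift in range(len(residual) - len(atom) + 1):
                c = sum(a * residual[shift + i] for i, a in enumerate(atom))
                if best is None or abs(c) > abs(best[2]):
                    best = (k, shift, c)
        k, shift, c = best
        # Remove the selected atom's projection from the residual.
        for i, a in enumerate(atoms[k]):
            residual[shift + i] -= c * a
        picks.append(best)
    return picks, residual


if __name__ == "__main__":
    fs = 8000.0
    atoms = [gammatone_atom(fc, fs, 128) for fc in (500.0, 1000.0, 2000.0)]
    # Synthetic test signal: one scaled, shifted atom in silence.
    signal = [0.0] * 256
    for i, a in enumerate(atoms[1]):
        signal[5 + i] += 2.0 * a
    picks, residual = matching_pursuit(signal, atoms, n_iter=1)
    print(picks[0][0], picks[0][1])
```

The exhaustive search over every atom and shift costs one full correlation per candidate in each iteration, which is the cost that makes an accelerated implementation necessary for real-time or large data set use.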

container_end_page	233
container_issue	18 July
container_start_page	224
container_title	Brain research
container_volume	1220
creator	Strahl, Stefan ; Mertins, Alfred
description	Abstract Evidence that neurosensory systems use sparse signal representations as well as improved performance of signal processing algorithms using sparse signal models raised interest in sparse signal coding in the last years. For natural audio signals like speech and environmental sounds, gammatone atoms have been derived as expansion functions that generate a nearly optimal sparse signal model (Smith, E., Lewicki, M., 2006. Efficient auditory coding. Nature 439, 978–982). Furthermore, gammatone functions are established models for the human auditory filters. Thus far, a practical application of a sparse gammatone signal model has been prevented by the fact that deriving the sparsest representation is, in general, computationally intractable. In this paper, we applied an accelerated version of the matching pursuit algorithm for gammatone dictionaries allowing real-time and large data set applications. We show that a sparse signal model in general has advantages in audio coding and that a sparse gammatone signal model encodes speech more efficiently in terms of sparseness than a sparse modified discrete cosine transform (MDCT) signal model. We also show that the optimal gammatone parameters derived for English speech do not match the human auditory filters, suggesting for signal processing applications to derive the parameters individually for each applied signal class instead of using psychometrically derived parameters. For brain research, it means that care should be taken with directly transferring findings of optimality for technical to biological systems.
doi_str_mv	10.1016/j.brainres.2007.11.059
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_85695793</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>1_s2_0_S000689930702865X</els_id><sourcerecordid>20876977</sourcerecordid><originalsourceid>FETCH-LOGICAL-c483t-9c76f0226ad075450b7ec5fe24cae195f056a1b9b66408264e0ff82e8464f1e93</originalsourceid><addsrcrecordid>eNqFks2LFDEQxYMo7rj6Lyw5eZu2ku7Ox0WUZf2ABQ-r4C2k05WZjN2dNukWxr_eDDMieNlTUfDeK6jfI-SGQcWAiTeHqks2TAlzxQFkxVgFrX5CNkxJvhW8gadkAwBiq7Sur8iLnA9lrWsNz8kVU7yEKL0h_cNsU0a6s-NolzghzWE32YGOsceBxnkJY_iNPfUx0btpN4S8p3lGdHvaR8x0igstzrIue6T7dbQTtWsflpiO1IdhwZRfkmfeDhlfXeY1-fbh7uvtp-39l4-fb9_fb12j6mWrnRQeOBe2B9k2LXQSXeuRN84i062HVljW6U6IBhQXDYL3iqNqROMZ6vqavD7nzin-XDEvZgzZ4TDYCeOajWqFbqWuHxVyUFJoKYtQnIUuxZwTejOnMNp0NAzMCYQ5mL8gzAmEYcwUEMV4c7mwdiP2_2yXzxfBu7MAy0N-BUwmu4CTwz4kdIvpY3j8xtv_ItwQpuDs8AOPmA9xTYVkNsxkbsA8nOpwagNI4Eq03-s_eR-zGQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>20876977</pqid></control><display><type>article</type><title>Sparse gammatone signal model optimized for English speech does not match the human auditory filters</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals Complete</source><creator>Strahl, Stefan ; Mertins, Alfred</creator><creatorcontrib>Strahl, Stefan ; Mertins, Alfred</creatorcontrib><description>Abstract Evidence that neurosensory systems use sparse signal representations as well as improved performance of signal processing algorithms using sparse signal models raised interest in sparse signal coding in the last years. For natural audio signals like speech and environmental sounds, gammatone atoms have been derived as expansion functions that generate a nearly optimal sparse signal model (Smith, E., Lewicki, M., 2006. Efficient auditory coding. Nature 439, 978–982). Furthermore, gammatone functions are established models for the human auditory filters. 
Thus far, a practical application of a sparse gammatone signal model has been prevented by the fact that deriving the sparsest representation is, in general, computationally intractable. In this paper, we applied an accelerated version of the matching pursuit algorithm for gammatone dictionaries allowing real-time and large data set applications. We show that a sparse signal model in general has advantages in audio coding and that a sparse gammatone signal model encodes speech more efficiently in terms of sparseness than a sparse modified discrete cosine transform (MDCT) signal model. We also show that the optimal gammatone parameters derived for English speech do not match the human auditory filters, suggesting for signal processing applications to derive the parameters individually for each applied signal class instead of using psychometrically derived parameters. For brain research, it means that care should be taken with directly transferring findings of optimality for technical to biological systems.</description><identifier>ISSN: 0006-8993</identifier><identifier>EISSN: 1872-6240</identifier><identifier>DOI: 10.1016/j.brainres.2007.11.059</identifier><identifier>PMID: 18201689</identifier><language>eng</language><publisher>Netherlands: Elsevier B.V</publisher><subject>Acoustic Stimulation - methods ; Adult ; Auditory Perception - physiology ; Efficient coding hypothesis ; Gammatone ; Humans ; Matching pursuit ; Models, Biological ; Neurology ; Noise ; Psychoacoustics ; Reaction Time ; Sparse coding ; Speech - physiology</subject><ispartof>Brain research, 2008-07, Vol.1220 (18 July), p.224-233</ispartof><rights>Elsevier B.V.</rights><rights>2007 Elsevier 
B.V.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c483t-9c76f0226ad075450b7ec5fe24cae195f056a1b9b66408264e0ff82e8464f1e93</citedby><cites>FETCH-LOGICAL-c483t-9c76f0226ad075450b7ec5fe24cae195f056a1b9b66408264e0ff82e8464f1e93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.brainres.2007.11.059$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/18201689$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Strahl, Stefan</creatorcontrib><creatorcontrib>Mertins, Alfred</creatorcontrib><title>Sparse gammatone signal model optimized for English speech does not match the human auditory filters</title><title>Brain research</title><addtitle>Brain Res</addtitle><description>Abstract Evidence that neurosensory systems use sparse signal representations as well as improved performance of signal processing algorithms using sparse signal models raised interest in sparse signal coding in the last years. For natural audio signals like speech and environmental sounds, gammatone atoms have been derived as expansion functions that generate a nearly optimal sparse signal model (Smith, E., Lewicki, M., 2006. Efficient auditory coding. Nature 439, 978–982). Furthermore, gammatone functions are established models for the human auditory filters. Thus far, a practical application of a sparse gammatone signal model has been prevented by the fact that deriving the sparsest representation is, in general, computationally intractable. In this paper, we applied an accelerated version of the matching pursuit algorithm for gammatone dictionaries allowing real-time and large data set applications. 
We show that a sparse signal model in general has advantages in audio coding and that a sparse gammatone signal model encodes speech more efficiently in terms of sparseness than a sparse modified discrete cosine transform (MDCT) signal model. We also show that the optimal gammatone parameters derived for English speech do not match the human auditory filters, suggesting for signal processing applications to derive the parameters individually for each applied signal class instead of using psychometrically derived parameters. For brain research, it means that care should be taken with directly transferring findings of optimality for technical to biological systems.</description><subject>Acoustic Stimulation - methods</subject><subject>Adult</subject><subject>Auditory Perception - physiology</subject><subject>Efficient coding hypothesis</subject><subject>Gammatone</subject><subject>Humans</subject><subject>Matching pursuit</subject><subject>Models, Biological</subject><subject>Neurology</subject><subject>Noise</subject><subject>Psychoacoustics</subject><subject>Reaction Time</subject><subject>Sparse coding</subject><subject>Speech - physiology</subject><issn>0006-8993</issn><issn>1872-6240</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2008</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFks2LFDEQxYMo7rj6Lyw5eZu2ku7Ox0WUZf2ABQ-r4C2k05WZjN2dNukWxr_eDDMieNlTUfDeK6jfI-SGQcWAiTeHqks2TAlzxQFkxVgFrX5CNkxJvhW8gadkAwBiq7Sur8iLnA9lrWsNz8kVU7yEKL0h_cNsU0a6s-NolzghzWE32YGOsceBxnkJY_iNPfUx0btpN4S8p3lGdHvaR8x0igstzrIue6T7dbQTtWsflpiO1IdhwZRfkmfeDhlfXeY1-fbh7uvtp-39l4-fb9_fb12j6mWrnRQeOBe2B9k2LXQSXeuRN84i062HVljW6U6IBhQXDYL3iqNqROMZ6vqavD7nzin-XDEvZgzZ4TDYCeOajWqFbqWuHxVyUFJoKYtQnIUuxZwTejOnMNp0NAzMCYQ5mL8gzAmEYcwUEMV4c7mwdiP2_2yXzxfBu7MAy0N-BUwmu4CTwz4kdIvpY3j8xtv_ItwQpuDs8AOPmA9xTYVkNsxkbsA8nOpwagNI4Eq03-s_eR-zGQ</recordid><startdate>20080718</startdate><enddate>20080718</enddate><creator>Strahl, 
Stefan</creator><creator>Mertins, Alfred</creator><general>Elsevier B.V</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7TK</scope><scope>7T9</scope></search><sort><creationdate>20080718</creationdate><title>Sparse gammatone signal model optimized for English speech does not match the human auditory filters</title><author>Strahl, Stefan ; Mertins, Alfred</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c483t-9c76f0226ad075450b7ec5fe24cae195f056a1b9b66408264e0ff82e8464f1e93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Acoustic Stimulation - methods</topic><topic>Adult</topic><topic>Auditory Perception - physiology</topic><topic>Efficient coding hypothesis</topic><topic>Gammatone</topic><topic>Humans</topic><topic>Matching pursuit</topic><topic>Models, Biological</topic><topic>Neurology</topic><topic>Noise</topic><topic>Psychoacoustics</topic><topic>Reaction Time</topic><topic>Sparse coding</topic><topic>Speech - physiology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Strahl, Stefan</creatorcontrib><creatorcontrib>Mertins, Alfred</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Neurosciences Abstracts</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><jtitle>Brain research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Strahl, Stefan</au><au>Mertins, Alfred</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Sparse gammatone signal 
model optimized for English speech does not match the human auditory filters</atitle><jtitle>Brain research</jtitle><addtitle>Brain Res</addtitle><date>2008-07-18</date><risdate>2008</risdate><volume>1220</volume><issue>18 July</issue><spage>224</spage><epage>233</epage><pages>224-233</pages><issn>0006-8993</issn><eissn>1872-6240</eissn><abstract>Abstract Evidence that neurosensory systems use sparse signal representations as well as improved performance of signal processing algorithms using sparse signal models raised interest in sparse signal coding in the last years. For natural audio signals like speech and environmental sounds, gammatone atoms have been derived as expansion functions that generate a nearly optimal sparse signal model (Smith, E., Lewicki, M., 2006. Efficient auditory coding. Nature 439, 978–982). Furthermore, gammatone functions are established models for the human auditory filters. Thus far, a practical application of a sparse gammatone signal model has been prevented by the fact that deriving the sparsest representation is, in general, computationally intractable. In this paper, we applied an accelerated version of the matching pursuit algorithm for gammatone dictionaries allowing real-time and large data set applications. We show that a sparse signal model in general has advantages in audio coding and that a sparse gammatone signal model encodes speech more efficiently in terms of sparseness than a sparse modified discrete cosine transform (MDCT) signal model. We also show that the optimal gammatone parameters derived for English speech do not match the human auditory filters, suggesting for signal processing applications to derive the parameters individually for each applied signal class instead of using psychometrically derived parameters. 
For brain research, it means that care should be taken with directly transferring findings of optimality for technical to biological systems.</abstract><cop>Netherlands</cop><pub>Elsevier B.V</pub><pmid>18201689</pmid><doi>10.1016/j.brainres.2007.11.059</doi><tpages>10</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0006-8993
ispartof	Brain research, 2008-07, Vol.1220 (18 July), p.224-233
issn	0006-8993 1872-6240
language	eng
recordid	cdi_proquest_miscellaneous_85695793
source	MEDLINE; Elsevier ScienceDirect Journals Complete
subjects	Acoustic Stimulation - methods ; Adult ; Auditory Perception - physiology ; Efficient coding hypothesis ; Gammatone ; Humans ; Matching pursuit ; Models, Biological ; Neurology ; Noise ; Psychoacoustics ; Reaction Time ; Sparse coding ; Speech - physiology
title	Sparse gammatone signal model optimized for English speech does not match the human auditory filters
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T06%3A30%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sparse%20gammatone%20signal%20model%20optimized%20for%20English%20speech%20does%20not%20match%20the%20human%20auditory%20filters&rft.jtitle=Brain%20research&rft.au=Strahl,%20Stefan&rft.date=2008-07-18&rft.volume=1220&rft.issue=18%20July&rft.spage=224&rft.epage=233&rft.pages=224-233&rft.issn=0006-8993&rft.eissn=1872-6240&rft_id=info:doi/10.1016/j.brainres.2007.11.059&rft_dat=%3Cproquest_cross%3E20876977%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=20876977&rft_id=info:pmid/18201689&rft_els_id=1_s2_0_S000689930702865X&rfr_iscdi=true