Cochlea-scaled spectral entropy predicts rate-invariant intelligibility of temporally distorted sentences

Bibliographic details
Published in: The Journal of the Acoustical Society of America 2010-10, Vol.128 (4), p.2112-2126
Main authors: Stilp, Christian E., Kiefte, Michael, Alexander, Joshua M., Kluender, Keith R.
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 2126
container_issue 4
container_start_page 2112
container_title The Journal of the Acoustical Society of America
container_volume 128
creator Stilp, Christian E.
Kiefte, Michael
Alexander, Joshua M.
Kluender, Keith R.
description Some evidence, mostly drawn from experiments using only a single moderate rate of speech, suggests that low-frequency amplitude modulations may be particularly important for intelligibility. Here, two experiments investigated intelligibility of temporally distorted sentences across a wide range of simulated speaking rates, and two metrics were used to predict results. Sentence intelligibility was assessed when successive segments of fixed duration were temporally reversed (exp. 1), and when sentences were processed through four third-octave-band filters, the outputs of which were desynchronized (exp. 2). For both experiments, intelligibility decreased with increasing distortion. However, in exp. 2, intelligibility recovered modestly with longer desynchronization. Across conditions, performances measured as a function of proportion of utterance distorted converged to a common function. Estimates of intelligibility derived from modulation transfer functions predict a substantial proportion of the variance in listeners' responses in exp. 1, but fail to predict performance in exp. 2. By contrast, a metric of potential information, quantified as relative dissimilarity (change) between successive cochlear-scaled spectra, is introduced. This metric reliably predicts listeners' intelligibility across the full range of speaking rates in both experiments. Results support an information-theoretic approach to speech perception and the significance of spectral change rather than physical units of time.
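The two manipulations named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `reverse_segments` and `spectral_change` are hypothetical, the signal is a plain list of samples, and the "spectra" are assumed to be precomputed lists of band magnitudes. The description says only that the metric is a relative dissimilarity between successive cochlea-scaled spectra; the Euclidean distance used here is an assumption chosen for illustration, and no cochlear filtering is performed.

```python
import math


def reverse_segments(samples, segment_len):
    """Reverse each successive fixed-duration segment (as in exp. 1).

    Segment order is preserved; only the samples inside each segment
    are time-reversed. A shorter final segment is reversed as well.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    out = []
    for start in range(0, len(samples), segment_len):
        segment = samples[start:start + segment_len]
        out.extend(reversed(segment))
    return out


def spectral_change(spectra):
    """Dissimilarity between each pair of successive spectra.

    Assumption: Euclidean distance stands in for the paper's measure,
    and each spectrum is a list of band magnitudes of equal length.
    """
    changes = []
    for prev, curr in zip(spectra, spectra[1:]):
        dist = math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr)))
        changes.append(dist)
    return changes


if __name__ == "__main__":
    # Seven samples, segments of three: the last segment has one sample.
    print(reverse_segments([1, 2, 3, 4, 5, 6, 7], 3))
    # Three toy two-band spectra: one large change, then no change.
    print(spectral_change([[0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]))
```

In this toy setting, a stretch of signal whose successive spectra barely differ contributes little to the summed change, which is the intuition behind relating intelligibility to spectral change rather than to elapsed time.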
doi_str_mv 10.1121/1.3483719
format Article
pmid 20968382
coden JASMAN
publisher Melville, NY: Acoustical Society of America
rights 2010 Acoustical Society of America
tpages 15
oa free_for_read
fulltext fulltext
identifier ISSN: 0001-4966
ispartof The Journal of the Acoustical Society of America, 2010-10, Vol.128 (4), p.2112-2126
issn 0001-4966
1520-8524
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2981123
source MEDLINE; AIP Journals Complete; Alma/SFX Local Collection; AIP Acoustical Society of America
subjects Acoustic Stimulation
Audiometry
Audition
Biological and medical sciences
Cochlea - physiology
Distortion
Ear and associated structures. Auditory pathways and centers. Hearing. Vocal organ. Phonation. Sound production. Echolocation
Entropy
Fundamental and applied biological sciences. Psychology
Humans
Intelligibility
Male
Models, Theoretical
Perception
Phonetics
Psychology. Psychoanalysis. Psychiatry
Psychology. Psychophysiology
Segments
Sentences
Sound Spectrography
Spectra
Speech
Speech Acoustics
Speech Intelligibility
Speech Perception
Time Factors
Vertebrates: nervous system and sense organs
title Cochlea-scaled spectral entropy predicts rate-invariant intelligibility of temporally distorted sentences
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T06%3A18%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cochlea-scaled%20spectral%20entropy%20predicts%20rate-invariant%20intelligibility%20of%20temporally%20distorted%20sentences&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=Stilp,%20Christian%20E.&rft.date=2010-10-01&rft.volume=128&rft.issue=4&rft.spage=2112&rft.epage=2126&rft.pages=2112-2126&rft.issn=0001-4966&rft.eissn=1520-8524&rft.coden=JASMAN&rft_id=info:doi/10.1121/1.3483719&rft_dat=%3Cproquest_pubme%3E831205954%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=760234855&rft_id=info:pmid/20968382&rfr_iscdi=true