Cochlea-scaled spectral entropy predicts rate-invariant intelligibility of temporally distorted sentences
Some evidence, mostly drawn from experiments using only a single moderate rate of speech, suggests that low-frequency amplitude modulations may be particularly important for intelligibility. Here, two experiments investigated intelligibility of temporally distorted sentences across a wide range of s...
Published in: | The Journal of the Acoustical Society of America 2010-10, Vol.128 (4), p.2112-2126 |
---|---|
Main authors: | Stilp, Christian E.; Kiefte, Michael; Alexander, Joshua M.; Kluender, Keith R. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 2126 |
---|---|
container_issue | 4 |
container_start_page | 2112 |
container_title | The Journal of the Acoustical Society of America |
container_volume | 128 |
creator | Stilp, Christian E.; Kiefte, Michael; Alexander, Joshua M.; Kluender, Keith R. |
description | Some evidence, mostly drawn from experiments using only a single moderate rate of speech, suggests that low-frequency amplitude modulations may be particularly important for intelligibility. Here, two experiments investigated intelligibility of temporally distorted sentences across a wide range of simulated speaking rates, and two metrics were used to predict results. Sentence intelligibility was assessed when successive segments of fixed duration were temporally reversed (exp. 1), and when sentences were processed through four third-octave-band filters, the outputs of which were desynchronized (exp. 2). For both experiments, intelligibility decreased with increasing distortion. However, in exp. 2, intelligibility recovered modestly with longer desynchronization. Across conditions, performances measured as a function of proportion of utterance distorted converged to a common function. Estimates of intelligibility derived from modulation transfer functions predict a substantial proportion of the variance in listeners' responses in exp. 1, but fail to predict performance in exp. 2. By contrast, a metric of potential information, quantified as relative dissimilarity (change) between successive cochlear-scaled spectra, is introduced. This metric reliably predicts listeners' intelligibility across the full range of speaking rates in both experiments. Results support an information-theoretic approach to speech perception and the significance of spectral change rather than physical units of time. |
doi_str_mv | 10.1121/1.3483719 |
format | Article |
coden | JASMAN |
linktohtml | https://pubs.aip.org/jasa/article-lookup/doi/10.1121/1.3483719 |
oa | free_for_read |
pmid | 20968382 |
publisher | Melville, NY: Acoustical Society of America |
rights | 2010 Acoustical Society of America; 2015 INIST-CNRS |
toplevel | peer_reviewed; online_resources |
tpages | 15 |
fulltext | fulltext |
identifier | ISSN: 0001-4966 |
ispartof | The Journal of the Acoustical Society of America, 2010-10, Vol.128 (4), p.2112-2126 |
issn | 0001-4966 1520-8524 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2981123 |
source | MEDLINE; AIP Journals Complete; Alma/SFX Local Collection; AIP Acoustical Society of America |
subjects | Acoustic Stimulation; Audiometry; Audition; Biological and medical sciences; Cochlea - physiology; Distortion; Ear and associated structures. Auditory pathways and centers. Hearing. Vocal organ. Phonation. Sound production. Echolocation; Entropy; Fundamental and applied biological sciences. Psychology; Humans; Intelligibility; Male; Models, Theoretical; Perception; Phonetics; Psychology. Psychoanalysis. Psychiatry; Psychology. Psychophysiology; Segments; Sentences; Sound Spectrography; Spectra; Speech; Speech Acoustics; Speech Intelligibility; Speech Perception; Time Factors; Vertebrates: nervous system and sense organs |
title | Cochlea-scaled spectral entropy predicts rate-invariant intelligibility of temporally distorted sentences |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T06%3A18%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cochlea-scaled%20spectral%20entropy%20predicts%20rate-invariant%20intelligibility%20of%20temporally%20distorted%20sentences&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=Stilp,%20Christian%20E.&rft.date=2010-10-01&rft.volume=128&rft.issue=4&rft.spage=2112&rft.epage=2126&rft.pages=2112-2126&rft.issn=0001-4966&rft.eissn=1520-8524&rft.coden=JASMAN&rft_id=info:doi/10.1121/1.3483719&rft_dat=%3Cproquest_pubme%3E831205954%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=760234855&rft_id=info:pmid/20968382&rfr_iscdi=true |
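
The abstract in this record describes two operations that a short sketch may make more concrete: reversing successive fixed-duration segments of a signal (exp. 1) and quantifying change between successive spectra. The Python below is only an illustration under stated assumptions and is not the authors' implementation. The function names `reverse_segments` and `successive_spectral_change` are hypothetical, the use of Euclidean distance as the dissimilarity measure is an assumption not stated in the abstract, and the input spectra are assumed to have been cochlea-scaled beforehand by a step that is not shown.

```python
import math


def reverse_segments(samples, segment_len):
    """Reverse each successive fixed-length segment of a sample sequence.

    Segment order is preserved; only the samples inside each segment are
    flipped. A shorter final segment, if any, is reversed as well.
    (Hypothetical helper, illustrating the exp. 1 manipulation.)
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    out = []
    for start in range(0, len(samples), segment_len):
        out.extend(reversed(samples[start:start + segment_len]))
    return out


def successive_spectral_change(spectra):
    """Distance between each pair of successive spectral slices.

    ASSUMPTION: Euclidean distance stands in for the "relative
    dissimilarity" named in the abstract, and each slice is assumed to be
    an already cochlea-scaled spectrum of equal length.
    """
    return [math.dist(a, b) for a, b in zip(spectra, spectra[1:])]


if __name__ == "__main__":
    # Toy signal: seven samples, segments of three samples each.
    print(reverse_segments([1, 2, 3, 4, 5, 6, 7], 3))
    # Toy spectra: three two-channel slices.
    print(successive_spectral_change([[0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]))
```

In this sketch a slice identical to its predecessor contributes zero change, which is one way to read the abstract's emphasis on spectral change rather than physical units of time.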