The cocktail-party problem revisited: early processing and selection of multi-talker speech

How do we recognize what one person is saying when others are speaking at the same time? This review summarizes widespread research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. Impor...

Full description

Bibliographic details
Published in: Attention, perception & psychophysics, 2015-07, Vol.77 (5), p.1465-1487
Main author: Bronkhorst, Adelbert W.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1487
container_issue 5
container_start_page 1465
container_title Attention, perception & psychophysics
container_volume 77
creator Bronkhorst, Adelbert W.
description How do we recognize what one person is saying when others are speaking at the same time? This review summarizes widespread research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and “unmasking” resulting from binaural listening. Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping—the segregation and streaming of sounds—represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped—and subsequently selected—using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there also is evidence that the processing depth depends on the task-relevancy of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of to-be-selected input. Despite recent progress, there are still many unresolved issues: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on other than spatial or voice-related cues, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.
doi_str_mv 10.3758/s13414-015-0882-9
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4469089</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1689624263</sourcerecordid><originalsourceid>FETCH-LOGICAL-c584t-24a4e3253536bdbf94d028a4d209b86ac5b2ca02f568582c295748d4f223d0783</originalsourceid><addsrcrecordid>eNp1kc1rFTEUxQdRbK3-AW4k4MZNNN8vcSFI8QsKbioILkImc-e9tJnJmGQK_e_N89VHFVzdwPnl3HM5Xfecktd8I_WbQrmgAhMqMdGaYfOgO6VGcMwN__7w-Gb0pHtSyhUhiqsNedydMKmZFoqfdj8ud4B88tfVhYgXl-stWnLqI0wow00oocLwFoHL8bfgoZQwb5GbB1Qggq8hzSiNaFpjDbi6eA0ZlQXA7552j0YXCzy7m2fdt48fLs8_44uvn76cv7_AXmpRMRNOAGeSS676oR-NGAjTTgyMmF4r52XPvCNslEq33J4ZuRF6ECNjfCAbzc-6dwffZe0nGDzMNbtolxwml29tcsH-rcxhZ7fpxgqhDNGmGby6M8jp5wql2ikUDzG6GdJaLFXaKCaY4g19-Q96ldY8t_MaZajUilPRKHqgfE6lZBiPYSix--rsoTrbqrP76uw-xIv7Vxx__OmqAewAlCbNW8j3Vv_X9RdTP6Up</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1691586314</pqid></control><display><type>article</type><title>The cocktail-party problem revisited: early processing and selection of multi-talker speech</title><source>MEDLINE</source><source>Alma/SFX Local Collection</source><source>SpringerLink Journals - AutoHoldings</source><creator>Bronkhorst, Adelbert W.</creator><creatorcontrib>Bronkhorst, Adelbert W.</creatorcontrib><description>How do we recognize what one person is saying when others are speaking at the same time? This review summarizes widespread research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and “unmasking” resulting from binaural listening. 
Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping—the segregation and streaming of sounds—represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped—and subsequently selected—using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there also is evidence that the processing depth depends on the task-relevancy of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of to-be-selected input. Despite recent progress, there are still many unresolved issues: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on other than spatial or voice-related cues, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.</description><identifier>ISSN: 1943-3921</identifier><identifier>EISSN: 1943-393X</identifier><identifier>DOI: 10.3758/s13414-015-0882-9</identifier><identifier>PMID: 25828463</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Attention ; Attention - physiology ; Auditory Perception ; Behavioral Science and Psychology ; Cognitive Psychology ; Cues ; Ears &amp; hearing ; Feedback (Response) ; Hearing Impairments ; Hearing loss ; Humans ; Listening Comprehension ; Noise ; Perceptual Masking - physiology ; Phonemes ; Psychoacoustics ; Psycholinguistics ; Psychology ; Semantics ; Sound ; Sound Localization - physiology ; Sound 
Spectrography ; Speech ; Speech - physiology ; Speech Acoustics ; Speech Communication ; Speech Perception - physiology ; Stimuli ; Studies</subject><ispartof>Attention, perception &amp; psychophysics, 2015-07, Vol.77 (5), p.1465-1487</ispartof><rights>The Author(s) 2015</rights><rights>Copyright Springer Science &amp; Business Media Jul 2015</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c584t-24a4e3253536bdbf94d028a4d209b86ac5b2ca02f568582c295748d4f223d0783</citedby><cites>FETCH-LOGICAL-c584t-24a4e3253536bdbf94d028a4d209b86ac5b2ca02f568582c295748d4f223d0783</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.3758/s13414-015-0882-9$$EPDF$$P50$$Gspringer$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.3758/s13414-015-0882-9$$EHTML$$P50$$Gspringer$$Hfree_for_read</linktohtml><link.rule.ids>230,314,776,780,881,27903,27904,41467,42536,51297</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/25828463$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Bronkhorst, Adelbert W.</creatorcontrib><title>The cocktail-party problem revisited: early processing and selection of multi-talker speech</title><title>Attention, perception &amp; psychophysics</title><addtitle>Atten Percept Psychophys</addtitle><addtitle>Atten Percept Psychophys</addtitle><description>How do we recognize what one person is saying when others are speaking at the same time? This review summarizes widespread research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. 
Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and “unmasking” resulting from binaural listening. Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping—the segregation and streaming of sounds—represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped—and subsequently selected—using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there also is evidence that the processing depth depends on the task-relevancy of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of to-be-selected input. Despite recent progress, there are still many unresolved issues: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on other than spatial or voice-related cues, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.</description><subject>Attention</subject><subject>Attention - physiology</subject><subject>Auditory Perception</subject><subject>Behavioral Science and Psychology</subject><subject>Cognitive Psychology</subject><subject>Cues</subject><subject>Ears &amp; hearing</subject><subject>Feedback (Response)</subject><subject>Hearing Impairments</subject><subject>Hearing loss</subject><subject>Humans</subject><subject>Listening Comprehension</subject><subject>Noise</subject><subject>Perceptual Masking - 
physiology</subject><subject>Phonemes</subject><subject>Psychoacoustics</subject><subject>Psycholinguistics</subject><subject>Psychology</subject><subject>Semantics</subject><subject>Sound</subject><subject>Sound Localization - physiology</subject><subject>Sound Spectrography</subject><subject>Speech</subject><subject>Speech - physiology</subject><subject>Speech Acoustics</subject><subject>Speech Communication</subject><subject>Speech Perception - physiology</subject><subject>Stimuli</subject><subject>Studies</subject><issn>1943-3921</issn><issn>1943-393X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>EIF</sourceid><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNp1kc1rFTEUxQdRbK3-AW4k4MZNNN8vcSFI8QsKbioILkImc-e9tJnJmGQK_e_N89VHFVzdwPnl3HM5Xfecktd8I_WbQrmgAhMqMdGaYfOgO6VGcMwN__7w-Gb0pHtSyhUhiqsNedydMKmZFoqfdj8ud4B88tfVhYgXl-stWnLqI0wow00oocLwFoHL8bfgoZQwb5GbB1Qggq8hzSiNaFpjDbi6eA0ZlQXA7552j0YXCzy7m2fdt48fLs8_44uvn76cv7_AXmpRMRNOAGeSS676oR-NGAjTTgyMmF4r52XPvCNslEq33J4ZuRF6ECNjfCAbzc-6dwffZe0nGDzMNbtolxwml29tcsH-rcxhZ7fpxgqhDNGmGby6M8jp5wql2ikUDzG6GdJaLFXaKCaY4g19-Q96ldY8t_MaZajUilPRKHqgfE6lZBiPYSix--rsoTrbqrP76uw-xIv7Vxx__OmqAewAlCbNW8j3Vv_X9RdTP6Up</recordid><startdate>20150701</startdate><enddate>20150701</enddate><creator>Bronkhorst, Adelbert W.</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>C6C</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>0-V</scope><scope>3V.</scope><scope>4T-</scope><scope>4U-</scope><scope>7X7</scope><scope>7XB</scope><scope>88B</scope><scope>88E</scope><scope>88G</scope><scope>88J</scope><scope>8AO</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ALSLI</scope><scope>AN0</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>CJNVE</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>K9.</scope><scope>M0P</scope><scope>M0S</scope><scope>M1P</scope><scope>M2M</scope><scope>M2O</scope><scope>M2R</scope><scope>MBDVC</scope><scope>PQEDU</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PSYQQ</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20150701</creationdate><title>The cocktail-party problem revisited: early processing and selection of multi-talker speech</title><author>Bronkhorst, Adelbert W.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c584t-24a4e3253536bdbf94d028a4d209b86ac5b2ca02f568582c295748d4f223d0783</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Attention</topic><topic>Attention - physiology</topic><topic>Auditory Perception</topic><topic>Behavioral Science and Psychology</topic><topic>Cognitive Psychology</topic><topic>Cues</topic><topic>Ears &amp; hearing</topic><topic>Feedback (Response)</topic><topic>Hearing Impairments</topic><topic>Hearing loss</topic><topic>Humans</topic><topic>Listening Comprehension</topic><topic>Noise</topic><topic>Perceptual Masking - 
physiology</topic><topic>Phonemes</topic><topic>Psychoacoustics</topic><topic>Psycholinguistics</topic><topic>Psychology</topic><topic>Semantics</topic><topic>Sound</topic><topic>Sound Localization - physiology</topic><topic>Sound Spectrography</topic><topic>Speech</topic><topic>Speech - physiology</topic><topic>Speech Acoustics</topic><topic>Speech Communication</topic><topic>Speech Perception - physiology</topic><topic>Stimuli</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bronkhorst, Adelbert W.</creatorcontrib><collection>Springer Nature OA/Free Journals</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Social Sciences Premium Collection</collection><collection>ProQuest Central (Corporate)</collection><collection>Docstoc</collection><collection>University Readers</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Education Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Psychology Database (Alumni)</collection><collection>Social Science Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Social Science Premium Collection</collection><collection>British Nursing Database</collection><collection>ProQuest Central 
Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>Education Collection</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Education Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>ProQuest Psychology</collection><collection>Research Library</collection><collection>Social Science Database</collection><collection>Research Library (Corporate)</collection><collection>ProQuest One Education</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest One Psychology</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Attention, perception &amp; psychophysics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Bronkhorst, Adelbert W.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The cocktail-party problem revisited: early processing and selection of multi-talker speech</atitle><jtitle>Attention, perception &amp; psychophysics</jtitle><stitle>Atten Percept Psychophys</stitle><addtitle>Atten Percept 
Psychophys</addtitle><date>2015-07-01</date><risdate>2015</risdate><volume>77</volume><issue>5</issue><spage>1465</spage><epage>1487</epage><pages>1465-1487</pages><issn>1943-3921</issn><eissn>1943-393X</eissn><abstract>How do we recognize what one person is saying when others are speaking at the same time? This review summarizes widespread research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and “unmasking” resulting from binaural listening. Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping—the segregation and streaming of sounds—represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped—and subsequently selected—using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there also is evidence that the processing depth depends on the task-relevancy of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of to-be-selected input. 
Despite recent progress, there are still many unresolved issues: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on other than spatial or voice-related cues, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.</abstract><cop>New York</cop><pub>Springer US</pub><pmid>25828463</pmid><doi>10.3758/s13414-015-0882-9</doi><tpages>23</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1943-3921
ispartof Attention, perception & psychophysics, 2015-07, Vol.77 (5), p.1465-1487
issn 1943-3921
1943-393X
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4469089
source MEDLINE; Alma/SFX Local Collection; SpringerLink Journals - AutoHoldings
subjects Attention
Attention - physiology
Auditory Perception
Behavioral Science and Psychology
Cognitive Psychology
Cues
Ears & hearing
Feedback (Response)
Hearing Impairments
Hearing loss
Humans
Listening Comprehension
Noise
Perceptual Masking - physiology
Phonemes
Psychoacoustics
Psycholinguistics
Psychology
Semantics
Sound
Sound Localization - physiology
Sound Spectrography
Speech
Speech - physiology
Speech Acoustics
Speech Communication
Speech Perception - physiology
Stimuli
Studies
title The cocktail-party problem revisited: early processing and selection of multi-talker speech
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T17%3A00%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20cocktail-party%20problem%20revisited:%20early%20processing%20and%20selection%20of%20multi-talker%20speech&rft.jtitle=Attention,%20perception%20&%20psychophysics&rft.au=Bronkhorst,%20Adelbert%20W.&rft.date=2015-07-01&rft.volume=77&rft.issue=5&rft.spage=1465&rft.epage=1487&rft.pages=1465-1487&rft.issn=1943-3921&rft.eissn=1943-393X&rft_id=info:doi/10.3758/s13414-015-0882-9&rft_dat=%3Cproquest_pubme%3E1689624263%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1691586314&rft_id=info:pmid/25828463&rfr_iscdi=true