Listeners track talker-specific prosody to deal with talker-variability


Bibliographic details
Published in: Brain Research 2021-10, Vol. 1769, p. 147605, Article 147605
Authors: Severijnen, Giulio G.A., Bosker, Hans Rutger, Piai, Vitória, McQueen, James M.
Format: Article
Language: English
Online access: Full text
Abstract

Highlights:
• We investigated how listeners deal with prosodic talker variability.
• Listeners learn which suprasegmental cues are used by talkers to signal stress.
• Listeners predict upcoming speech based on this learned information.
• The N200 was not a marker for prediction of prosodic cues.

One of the challenges in speech perception is that listeners must deal with considerable segmental and suprasegmental variability in the acoustic signal due to differences between talkers. Most previous studies have focused on how listeners deal with segmental variability. In this EEG experiment, we investigated whether listeners track talker-specific usage of suprasegmental cues to lexical stress to recognize spoken words correctly. In a three-day training phase, Dutch participants learned to map non-word minimal stress pairs onto different object referents (e.g., USklot meant “lamp”; usKLOT meant “train”). These non-words were produced by two male talkers. Critically, each talker used only one suprasegmental cue to signal stress (e.g., Talker A used only F0 and Talker B only intensity). We expected participants to learn which talker used which cue to signal stress. In the test phase, participants indicated whether spoken sentences including these non-words were correct (“The word for lamp is…”). We found that participants were slower to indicate that a stimulus was correct if the non-word was produced with the unexpected cue (e.g., Talker A using intensity). That is, if in training Talker A used F0 to signal stress, participants experienced a mismatch between predicted and perceived phonological word-forms if, at test, Talker A unexpectedly used intensity to cue stress. In contrast, the N200 amplitude, an event-related potential related to phonological prediction, was not modulated by the cue mismatch. Theoretical implications of these contrasting results are discussed. The behavioral findings illustrate talker-specific prediction of prosodic cues, picked up through perceptual learning during training.
ISSN: 0006-8993, 1872-6240
DOI: 10.1016/j.brainres.2021.147605