Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | We advance a novel explanation of similarity-based interference effects in
subject-verb and reflexive pronoun agreement processing, grounded in surprisal
values computed from a pretrained large-scale Transformer model, GPT-2.
Specifically, we show that surprisal of the verb or reflexive pronoun predicts
facilitatory interference effects in ungrammatical sentences, where a
distractor noun that matches in number with the verb or pronoun leads to faster
reading times, despite the distractor not participating in the agreement
relation. We review the human empirical evidence for such effects, including
recent meta-analyses and large-scale studies. We also show that attention
patterns (indexed by entropy and other measures) in the Transformer show
patterns of diffuse attention in the presence of similar distractors,
consistent with cue-based retrieval models of parsing. But in contrast to these
models, the attentional cues and memory representations are learned entirely
from the simple self-supervised task of predicting the next word. |
---|---|
DOI: | 10.48550/arxiv.2104.12874 |
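
The summary names two quantities, surprisal and attention entropy. The sketch below shows how each is commonly defined, and is not the authors' code. It does not load GPT-2: the probabilities and attention weights are hypothetical toy values, the function names are illustrative, and the use of base-2 logarithms (bits) is an assumption. The "other measures" mentioned in the summary are not covered.

```python
import math


def surprisal_bits(prob):
    """Surprisal of a word given its context: -log2 P(word | context)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)


def attention_entropy_bits(weights):
    """Shannon entropy (bits) of one attention distribution over earlier tokens.

    Weights are renormalized first, so they only need to be non-negative.
    """
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    probs = [w / total for w in weights]
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical attention from a verb back to four earlier tokens.
# "focused": most weight on a single token (e.g. the subject noun).
# "diffuse": weight split between the subject and a similar distractor.
focused = [0.85, 0.05, 0.05, 0.05]
diffuse = [0.45, 0.45, 0.05, 0.05]

if __name__ == "__main__":
    print("surprisal at P=0.25:", surprisal_bits(0.25), "bits")
    print("entropy (focused):", round(attention_entropy_bits(focused), 3))
    print("entropy (diffuse):", round(attention_entropy_bits(diffuse), 3))
```

Under these definitions, a less expected word has higher surprisal, and attention spread across several tokens has higher entropy than attention concentrated on one. In practice the probabilities and weights would be read from a pretrained model rather than entered by hand.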