Neural Encoding and Decoding With Distributed Sentence Representations

Building computational models to account for the cortical representation of language plays an important role in understanding the human linguistic system. Recent progress in distributed semantic models (DSMs), especially transformer-based methods, has driven advances in many language understanding t...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transaction on neural networks and learning systems 2021-02, Vol.32 (2), p.589-603
Hauptverfasser: Sun, Jingyuan, Wang, Shaonan, Zhang, Jiajun, Zong, Chengqing
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Building computational models to account for the cortical representation of language plays an important role in understanding the human linguistic system. Recent progress in distributed semantic models (DSMs), especially transformer-based methods, has driven advances in many language understanding tasks, making DSM a promising methodology to probe brain language processing. DSMs have been shown to reliably explain cortical responses to word stimuli. However, characterizing the brain activities for sentence processing is much less exhaustively explored with DSMs, especially the deep neural network-based methods. What is the relationship between cortical sentence representations against DSMs? What linguistic features that a DSM catches better explain its correlation with the brain activities aroused by sentence stimuli? Could distributed sentence representations help to reveal the semantic selectivity of different brain areas? We address these questions through the lens of neural encoding and decoding, fueled by the latest developments in natural language representation learning. We begin by evaluating the ability of a wide range of 12 DSMs to predict and decipher the functional magnetic resonance imaging (fMRI) images from humans reading sentences. Most models deliver high accuracy in the left middle temporal gyrus (LMTG) and left occipital complex (LOC). Notably, encoders trained with transformer-based DSMs consistently outperform other unsupervised structured models and all the unstructured baselines. With probing and ablation tasks, we further find that differences in the performance of the DSMs in modeling brain activities can be at least partially explained by the granularity of their semantic representations. We also illustrate the DSM's selectivity for concept categories and show that the topics are represented by spatially overlapping and distributed cortical patterns. Our results corroborate and extend previous findings in understanding the relation between DSMs and neural activation patterns and contribute to building solid brain-machine interfaces with deep neural network representations.
ISSN:2162-237X
2162-2388
DOI:10.1109/TNNLS.2020.3027595