Foundational GPT Model for MEG
Saved in:
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Deep learning techniques can be used to first train unsupervised models on large amounts of unlabelled data, before fine-tuning the models on specific tasks. This approach has seen massive success for various kinds of data, e.g. images, language, and audio, and holds the promise of improving performance in various downstream tasks (e.g. encoding or decoding brain data). However, there has been limited progress in applying this approach to modelling brain signals, such as magneto-/electroencephalography (M/EEG). Here we propose two classes of deep learning foundational models that can be trained by forecasting unlabelled MEG. First, we consider a modified Wavenet; second, a modified Transformer-based (GPT2) model. The modified GPT2 includes a novel application of tokenisation and embedding methods, allowing a model developed initially for the discrete domain of language to be applied to continuous multichannel time series data. We also extend the forecasting framework to include condition labels as inputs, enabling better modelling (encoding) of task data. We compare the performance of these deep learning models with standard linear autoregressive (AR) modelling on MEG data. This shows that GPT2-based models provide better modelling capabilities than Wavenet and linear AR models, better reproducing the temporal, spatial and spectral characteristics of real data and of evoked activity in task data. We show how the GPT2 model scales well to multiple subjects, while adapting to each subject through a subject embedding. Finally, we show how such a model can be useful in downstream decoding tasks through data simulation. All code is available on GitHub (https://github.com/ricsinaruto/MEG-transfer-decoding). |
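
The abstract's central technical idea is to let a discrete-vocabulary language model (GPT2) consume continuous multichannel recordings: quantise amplitudes into tokens, embed them, and add a learned per-subject embedding so a single model adapts to each subject. The paper's exact tokenisation and embedding scheme lives in the linked repository; below is only a minimal PyTorch sketch under assumed choices (uniform amplitude binning, additive subject embeddings), and every name and constant (`quantise`, `N_BINS`, `TokenAndSubjectEmbedding`) is illustrative rather than taken from the codebase.

```python
# Minimal sketch, not the authors' pipeline: quantise continuous MEG
# amplitudes into discrete tokens so a GPT2-style forecaster can model them.
import numpy as np
import torch
import torch.nn as nn

N_BINS = 256      # assumed vocabulary size after amplitude quantisation
N_SUBJECTS = 10   # assumed number of subjects in the training set
D_MODEL = 64      # assumed embedding dimension

def quantise(x, n_bins=N_BINS):
    """Map a continuous 1-D signal to integer tokens via uniform binning."""
    edges = np.linspace(x.min(), x.max(), n_bins - 1)  # n_bins - 1 bin edges
    return np.digitize(x, edges)                       # tokens in [0, n_bins)

class TokenAndSubjectEmbedding(nn.Module):
    """Token embedding plus a learned per-subject embedding, mirroring the
    abstract's idea of one model adapting to each subject."""
    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(N_BINS, D_MODEL)
        self.subj = nn.Embedding(N_SUBJECTS, D_MODEL)

    def forward(self, tokens, subject_id):
        # tokens: (batch, time) int64; subject_id: (batch,) int64
        return self.tok(tokens) + self.subj(subject_id)[:, None, :]

# Toy usage: one synthetic "channel", subject 3.
signal = np.sin(np.linspace(0, 20, 1000)) + 0.1 * np.random.randn(1000)
tokens = torch.tensor(quantise(signal))[None]          # shape (1, 1000)
embedded = TokenAndSubjectEmbedding()(tokens, torch.tensor([3]))
print(embedded.shape)                                  # torch.Size([1, 1000, 64])
```

The resulting (batch, time, d_model) sequence is what a standard GPT2 backbone would consume; training would then amount to cross-entropy prediction of the next amplitude token, which is one way the unlabelled-MEG forecasting objective described in the abstract could be realised.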
DOI: | 10.48550/arxiv.2404.09256 |