A new benchmark dataset for P300 ERP-based BCI applications

Bibliographic details
Published in: Digital signal processing, 2023-04, Vol. 135, p. 103950, Article 103950
Main authors: Yağan, Mehmet; Musellim, Serkan; Arslan, Suayb S.; Çakar, Tuna; Alp, Nihan; Ozkan, Huseyin
Format: Article
Language: English
Online access: Full text
Description
Summary: The P300 electroencephalogram (EEG) signal is one of the most commonly used event-related potentials in brain-computer interface (BCI) system designs because it can be recorded non-invasively. The fact that the P300 response can easily be elicited and measured is particularly important for participants with severe motor disabilities. To train and test P300-based BCI speller systems in more realistic high-speed settings, there is a pressing need for a large and challenging benchmark dataset. Various datasets already exist in the literature, but most of them are not publicly available, and they either have a limited number of participants or use relatively long stimulus durations (SD) and inter-stimulus intervals (ISI). They are also typically based on a 36-target (6×6) character matrix. The use of long ISI, in particular, not only reduces the speed and the information transfer rates (ITRs) but also oversimplifies P300 detection, which leaves little challenge for state-of-the-art machine learning and signal processing algorithms. In fact, near-perfect P300 classification accuracies are reported with the existing datasets. A large-scale dataset with challenging settings is therefore needed to fully exploit recent advances in algorithm design (machine learning and signal processing) and to achieve high-performance speller results. To this end, in this article we introduce a new freely and publicly accessible P300 dataset obtained using 32-channel EEG, in the hope that it will lead to new research findings and eventually more efficient BCI designs. The introduced dataset comprises 18 participants performing a 40-target (5×8) cued-spelling task, with reduced SD (66.6 ms) and ISI (33.3 ms) for fast spelling. We have also processed, analyzed, and character-classified the introduced dataset, and we present the accuracy and ITR results as a benchmark.
The introduced dataset and the code of our experiments are publicly accessible at https://data.mendeley.com/datasets/vyczny2r4w.

Highlights:
• A challenging P300 speller dataset with short stimulation duration for fast spelling.
• Large data size of 18 participants, each spelling 160 characters on a 5×8 character matrix.
• Performance benchmark with state-of-the-art algorithms.
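
The timing and ITR figures mentioned in the summary can be illustrated with a short calculation. The sketch below is not taken from the article. It assumes the widely used Wolpaw definition of ITR, which this record does not confirm as the authors' choice, and the function names and the time per selection are hypothetical placeholders. Only the 40 targets, the 66.6 ms SD and the 33.3 ms ISI come from the summary.

```python
import math

# Values stated in the summary.
N_TARGETS = 40   # 5x8 character matrix
SD_MS = 66.6     # stimulus duration
ISI_MS = 33.3    # inter-stimulus interval

# Time between the onsets of consecutive stimuli (about 100 ms).
SOA_MS = SD_MS + ISI_MS


def wolpaw_bits_per_selection(n_targets: int, accuracy: float) -> float:
    """Bits per selection under the Wolpaw ITR definition (an assumption here).

    The formula is normally applied for accuracies at or above chance (1/N).
    """
    if n_targets < 2:
        raise ValueError("n_targets must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_targets)
    # Each term is skipped where its limit is zero, to avoid log2(0).
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits


def itr_bits_per_minute(n_targets: int, accuracy: float,
                        seconds_per_selection: float) -> float:
    """Scale bits per selection by the number of selections per minute."""
    if seconds_per_selection <= 0.0:
        raise ValueError("seconds_per_selection must be positive")
    bits = wolpaw_bits_per_selection(n_targets, accuracy)
    return bits * 60.0 / seconds_per_selection


if __name__ == "__main__":
    # Hypothetical operating point: 90% accuracy, 10 s per character.
    print(f"SOA: {SOA_MS:.1f} ms")
    print(f"ITR: {itr_bits_per_minute(N_TARGETS, 0.90, 10.0):.2f} bits/min")
```

Under this definition, a 40-target matrix carries at most log2(40) bits (about 5.32) per selection, compared with log2(36) (about 5.17) for the conventional 6×6 matrix. The time needed per selection therefore has a much larger effect on the ITR than the matrix size.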
ISSN: 1051-2004, 1095-4333
DOI: 10.1016/j.dsp.2023.103950