Zied Tayeb, Juri Fedjaev, Nejla Ghaboosi, Christoph Richter, Lukas Everding, Xingwei Qu, Yingyu Wu, Gordon Cheng, Jörg Conradt.
Abstract
Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) for motor imagery translate the subject's motor intention into control signals by classifying the EEG patterns caused by different imagination tasks, e.g., hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features. Extracting such features is difficult because EEG signals are highly non-stationary, which is a major cause of the stagnating progress in classification performance. Remarkable advances in deep learning methods allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models: (1) a long short-term memory (LSTM) network; (2) a spectrogram-based convolutional neural network (CNN); and (3) a recurrent convolutional neural network (RCNN), for decoding motor imagery movements directly from raw EEG signals without any manual feature engineering. Results were evaluated on our own publicly available EEG data collected from 20 subjects and on an existing dataset, the 2b EEG dataset from "BCI Competition IV". Overall, the deep learning models achieved better classification performance than state-of-the-art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN-based BCI.
Keywords: Brain-Computer Interfaces; Deep Learning; electroencephalography (EEG); long short-term memory (LSTM); recurrent convolutional neural network (RCNN); spectrogram-based convolutional neural network model (pCNN)
Year: 2019 PMID: 30626132 PMCID: PMC6338892 DOI: 10.3390/s19010210
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Experimental setup and an example of a motor imagery-electroencephalography (MI-EEG) recording session.
Figure 2. Example of the spectrograms generated from the C3 and C4 electrodes during imagined left (class 1) and right (class 2) hand movements.
Figure 3. The pragmatic convolutional neural network (pCNN) model's architecture, where E is the number of electrodes, T is the number of timesteps, and K is the number of classes.
Figure 4. Training and validation loss of the pCNN model. The blue and green lines represent the average of the 5 folds for training and validation, respectively.
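Spectrogram inputs of the kind shown in Figure 2 come from a short-time Fourier transform (STFT) of each electrode's signal. A minimal, dependency-free sketch of that idea follows. The 250 Hz sampling rate, the 50-sample Hann window, the 25-sample hop, and the synthetic 10 Hz test tone are illustrative assumptions, not the settings used in the paper, and `stft_magnitude` is a hypothetical helper.

```python
import cmath
import math


def stft_magnitude(signal, window_size, hop):
    """Return one list of DFT magnitudes (bins 0..window_size//2) per frame."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        for k in range(window_size // 2 + 1):
            # Direct DFT of the windowed segment at frequency bin k.
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


FS = 250          # assumed sampling rate in Hz
DURATION_S = 4    # 4-second window, as in the table captions
WINDOW = 50       # assumed STFT window length in samples

# Synthetic single-channel signal: a 10 Hz tone standing in for one electrode.
signal = [math.sin(2 * math.pi * 10 * n / FS) for n in range(FS * DURATION_S)]

frames = stft_magnitude(signal, window_size=WINDOW, hop=25)
peak_bin = max(range(len(frames[0])), key=lambda k: frames[0][k])
peak_hz = peak_bin * FS / WINDOW
print(len(frames), len(frames[0]), peak_hz)  # 39 26 10.0
```

Applying the same function to each electrode (for example C3 and C4) would give one time-frequency image per channel, which is the form of input a spectrogram-based CNN consumes.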
Pragmatic CNN (pCNN) architecture. E is the number of channels, T is the number of timesteps, and K is the number of classes. Input and output sizes are shown for cropped training with E = 3 (electrodes C3, C4, and Cz) and a window size of 4 seconds; binary classification with two classes, K = 2.
| Layer | Input | Operation | Output | Parameters |
|---|---|---|---|---|
| 1 | | STFT | | - |
| 2 | | | | 10,392 |
| | | BatchNorm | | 260 |
| | | MaxPool2D | | - |
| | | ReLU | | - |
| 2 | | | | 73,776 |
| | | BatchNorm | | 128 |
| | | MaxPool2D | | - |
| | | ReLU | | - |
| 3 | | | | 73,824 |
| | | BatchNorm | | 64 |
| | | MaxPool2D (…) | | - |
| | | ReLU | | - |
| | | Dropout (…) | | - |
| 4 | | Flatten | 6144 | - |
| | 6144 | Softmax | | 12,290 |
| Total | | | | |
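The parameter count of the final softmax layer can be checked by hand: a fully connected layer with n inputs and K outputs has n·K weights plus K biases. The helper below is a hypothetical illustration, not code from the paper. With the 6144 flattened features and the two classes given for this table, it reproduces the 12,290 parameters listed.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


# pCNN: 6144 flattened features, two classes (binary classification).
pcnn_softmax = dense_params(6144, 2)
print(pcnn_softmax)  # 12290
```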
The proposed recurrent convolutional neural network (RCNN) architecture
| Layer Type | Size | Output Shape |
|---|---|---|
| Convolutional | | (144,1,1280,256) |
| Max pooling | Pool size 4, stride 4 | (144,1,1280,256) |
| RCL | 256 filters (…) | (144,1,1280,256) |
| Max pooling | Pool size 4, stride 4 | (144,1,320,256) |
| RCL | 256 filters (…) | (144,1,320,256) |
| Max pooling | Pool size 4, stride 4 | (144,1,80,256) |
| RCL | 256 filters (…) | (144,1,20,256) |
| Max pooling | Pool size 4, stride 4 | (144,1,5,256) |
| RCL | 256 filters (…) | (144,1,20,256) |
| Max pooling | Pool size 4, stride 4 | (144,1,5,256) |
| Fully connected | | (144,2) |
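The time-axis lengths that appear in the output shapes (1280, 320, 80, 20, 5) follow from repeated pooling with size 4 and stride 4. As a rough sketch, assuming no padding so that each stage keeps only complete pooling windows, the progression can be traced as follows; `pooled_length` is a hypothetical helper.

```python
def pooled_length(length, pool_size=4, stride=4):
    """Output length of 1-D max pooling without padding."""
    return (length - pool_size) // stride + 1


lengths = [1280]
for _ in range(4):
    # Each pooling stage shrinks the time axis by a factor of four.
    lengths.append(pooled_length(lengths[-1]))
print(lengths)  # [1280, 320, 80, 20, 5]
```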
Deep CNN (dCNN) architecture as proposed by Schirrmeister et al. [18] (re-implemented in this work), where E is the number of channels, T is the number of timesteps, and K is the number of classes. Input and output sizes are shown for cropped training with E = 3 (electrodes C3, C4, and Cz) and a window size of 4 seconds; binary classification with two classes, K = 2.
| Layer | Input | Operation | Output | Parameters |
|---|---|---|---|---|
| 1 | | | | 275 |
| 2 | | Reshape | | - |
| | | | | 1900 |
| | | BatchNorm | | 100 |
| | | ELU | | - |
| | | MaxPool2D (…) | | - |
| | | Dropout (0.5) | | - |
| 3 | | | | 12,550 |
| | | BatchNorm | | 200 |
| | | ELU | | - |
| | | MaxPool2D (…) | | - |
| | | Dropout (0.5) | | - |
| 4 | | Reshape | | - |
| | | | | 50,100 |
| | | BatchNorm | | 400 |
| | | ELU | | - |
| | | MaxPool2D | | - |
| | | Dropout (0.5) | | - |
| 5 | | Reshape | | - |
| | | | | 200,200 |
| | | BatchNorm | | 800 |
| | | ELU | | - |
| | | MaxPool2D (…) | | - |
| 6 | | Flatten | 1600 | - |
| | 1600 | Softmax | | 3202 |
| Total | | | | |
Shallow CNN (sCNN) architecture by Schirrmeister et al. [18] (re-implemented in this work), where E is the number of channels, T is the number of timesteps, and K is the number of classes. Input and output sizes are shown for cropped training with E = 3 (electrodes C3, C4, and Cz) and a window size of 4 seconds; binary classification with two classes, K = 2.
| Layer | Input | Operation | Output | Parameters |
|---|---|---|---|---|
| 1 | | | | 1040 |
| | | Dropout (0.5) | | - |
| 2 | | Reshape | | - |
| | | | | 4840 |
| | | BatchNorm | | 160 |
| | | | | - |
| | | Dropout (0.5) | | - |
| 3 | | Reshape | | - |
| | | AvgPool2d (…) | | - |
| | | | | - |
| 4 | | Flatten | 2480 | - |
| | 2480 | Softmax | | 4962 |
| Total | | | | |
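The convolution and batch-normalization entries in the dCNN and sCNN tables are consistent with the filter sizes published by Schirrmeister et al. The check below is a sketch under assumptions that the extracted tables no longer show: temporal kernels of length 10 with 25, 50, 100, and 200 filters for the deep network; a length-25 temporal kernel with 40 filters for the shallow one; a spatial convolution spanning all three electrodes; and batch normalization counted as four values per channel (scale, shift, moving mean, moving variance), which is how some frameworks report it. The helper names are hypothetical.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per output channel of a 2-D convolution."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


def batchnorm_params(channels):
    """Scale, shift, moving mean, and moving variance per channel."""
    return 4 * channels


E = 3  # electrodes C3, C4, and Cz

# Deep CNN (assumed kernel sizes and filter counts, see lead-in).
dcnn_conv = [
    conv_params(1, 10, 1, 25),     # temporal convolution
    conv_params(E, 1, 25, 25),     # spatial convolution across electrodes
    conv_params(1, 10, 25, 50),
    conv_params(1, 10, 50, 100),
    conv_params(1, 10, 100, 200),
]
dcnn_bn = [batchnorm_params(c) for c in (25, 50, 100, 200)]

# Shallow CNN (assumed kernel size and filter count, see lead-in).
scnn_conv = [conv_params(1, 25, 1, 40), conv_params(E, 1, 40, 40)]
scnn_bn = batchnorm_params(40)

print(dcnn_conv)           # [275, 1900, 12550, 50100, 200200]
print(dcnn_bn)             # [100, 200, 400, 800]
print(scnn_conv, scnn_bn)  # [1040, 4840] 160
```

Every value printed matches a parameter entry in the two tables, which supports reading the empty operation cells as convolutions of these shapes, although the tables themselves do not confirm it.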
Figure 5. MI classification accuracies from 20 subjects using (a) traditional machine learning approaches and (b) different neural classifiers. The polar bar plot shows the accuracy range (mean ± standard deviation) achieved by the 5 models for each of the 20 subjects. The lower panel summarizes, for each algorithm, the 20 mean accuracies achieved; black bars indicate the median result.
Figure 6. MI classification accuracies from nine subjects using five different classifiers. The polar bar plot shows the accuracy range (mean ± standard deviation) achieved by the five models for each of the nine subjects. The lower panel summarizes, for each algorithm, the nine mean accuracies achieved; black bars indicate the median result.
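The summary statistics in Figures 5 and 6 (a mean ± standard deviation per subject, and the median of the per-subject means for each algorithm) can be computed as sketched here. The accuracies are made-up placeholder values for three hypothetical subjects, not results from the paper. Taking the spread over cross-validation folds and using the sample standard deviation are both assumptions, since the captions do not specify either.

```python
from statistics import mean, median, stdev

# Placeholder fold accuracies for one classifier (hypothetical values).
fold_accuracies = {
    "subject_01": [0.80, 0.85, 0.90, 0.85, 0.85],
    "subject_02": [0.60, 0.70, 0.65, 0.70, 0.60],
    "subject_03": [0.75, 0.75, 0.70, 0.80, 0.75],
}

# Mean and sample standard deviation per subject (one polar bar each).
per_subject = {
    subject: (mean(accs), stdev(accs))
    for subject, accs in fold_accuracies.items()
}

# Median of the per-subject means (the black bar in the lower panel).
median_of_means = median(m for m, _ in per_subject.values())

for subject, (m, s) in per_subject.items():
    print(f"{subject}: {m:.3f} ± {s:.3f}")
print(f"median of means: {median_of_means:.3f}")
```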
Figure 7. A frame of a live stream. Top: Filtered signal during a trial. Blue and red traces illustrate channel 1 and channel 2, respectively. Vertical lines indicate visual (orange) and acoustic cues (red). Bottom: Generated spectrograms from data within the grey rectangle shown above.
Figure 8. Live setup for real-time EEG signal decoding and Katana robot arm control. P(L) and P(R) represent the probability of left and right hand movements, respectively.
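In a live setup like the one in Figure 8, the classifier's two outputs P(L) and P(R) have to be turned into a robot command. One possible way to do this is sketched here: a softmax over two scores followed by a confidence threshold. The score values, the 0.7 threshold, and the command names are hypothetical and do not describe the Katana interface or the authors' control logic.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def decide(p_left, p_right, threshold=0.7):
    """Issue a movement command only when one class is confident enough."""
    if p_left >= threshold:
        return "move_left"
    if p_right >= threshold:
        return "move_right"
    return "hold"


# Hypothetical classifier scores for one EEG window.
p_left, p_right = softmax([2.0, 0.5])
command = decide(p_left, p_right)
print(f"P(L)={p_left:.3f} P(R)={p_right:.3f} -> {command}")
```

Holding still below the threshold is a design choice in this sketch: it trades responsiveness for fewer unintended arm movements when the two probabilities are close.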