Abstract
With the rapid development of deep learning, researchers have gradually applied it to motor imagery brain-computer interfaces (MI-BCI) and demonstrated its initial advantages over traditional machine learning. However, its application still faces many challenges, and the recognition rate of electroencephalogram (EEG) signals remains the bottleneck restricting the development of MI-BCI. To improve the accuracy of EEG classification, a DSC-ConvLSTM model based on the attention mechanism is proposed for the multi-class classification of motor imagery EEG signals. To address the small sample size of well-labeled, accurate EEG data, the preprocessing uses sliding windows for data augmentation, and the average prediction loss over the sliding windows is used as the final prediction loss for that trial. This not only increases the training sample size, which benefits the training of complex neural network models, but also means that the network no longer extracts the global features of the whole trial and so avoids learning the difference features among trials, which can effectively eliminate the influence of individual specificity. For feature extraction and classification, the overall network structure is designed according to the characteristics of EEG signals. First, depthwise separable convolution (DSC) is used to extract spatial features of EEG signals. On the one hand, this reduces the number of parameters and improves the response speed of the system. On the other hand, the network structure we designed is more conducive to the direct extraction of spatial features of EEG signals. Second, the internal structure of the Long Short-Term Memory (LSTM) unit is improved by using convolution and an attention mechanism, and a novel bidirectional convolutional LSTM (ConvLSTM) structure is proposed by comparing the effects of embedding convolution and the attention mechanism in the input and in different gates, respectively.
In the ConvLSTM module, the convolutional structure is introduced only into the input-to-state transition, while the gates retain the original fully connected mechanism, and the attention mechanism is introduced into the input to further improve the overall decoding performance of the model. This bidirectional ConvLSTM extracts the time-domain features of EEG signals and integrates the feature extraction capability of the CNN with the sequence processing capability of the LSTM. The experimental results show that the average classification accuracy of the model reaches 73.7% and 92.6% on two datasets, BCI Competition IV Dataset 2a and the High Gamma Dataset, respectively, which demonstrates the robustness and effectiveness of the proposed model. The model can extract significant EEG features from the raw EEG signals, performs well across different subjects and different datasets, and reduces the influence of individual variability on classification performance, which is of practical significance for promoting the development of brain-computer interface technology in a practical and marketable direction.
Year: 2022 PMID: 35571721 PMCID: PMC9098272 DOI: 10.1155/2022/8187009
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Sliding windows for data augmentation. T denotes the number of time steps in the whole trial, T' denotes the length of the sliding window, y_1, y_2, …, y_n are the predictions obtained for each sliding-window crop, and the final loss is the difference between the mean of all predictions and the target y.
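The cropping and averaging described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the window length, the stride, the `toy_predict` stand-in for the network, and the choice of negative log-likelihood as the "difference" between the mean prediction and the target are all assumptions made for illustration.

```python
import math
from typing import Callable, List

Trial = List[List[float]]  # one trial, indexed as [channel][time step]


def sliding_windows(trial: Trial, window: int, stride: int) -> List[Trial]:
    """Cut a [C][T] trial into crops of shape [C][window], stepping by `stride`."""
    n_steps = len(trial[0])
    if not 0 < window <= n_steps:
        raise ValueError("window must be between 1 and the trial length")
    starts = range(0, n_steps - window + 1, stride)
    return [[channel[s:s + window] for channel in trial] for s in starts]


def mean_prediction(crops: List[Trial],
                    predict: Callable[[Trial], List[float]]) -> List[float]:
    """Average the per-crop class probabilities into one trial-level prediction."""
    preds = [predict(crop) for crop in crops]
    n_classes = len(preds[0])
    return [sum(p[k] for p in preds) / len(preds) for k in range(n_classes)]


def trial_loss(mean_pred: List[float], target: int) -> float:
    """Negative log-likelihood of the true class (an assumed loss function)."""
    return -math.log(mean_pred[target])


def toy_predict(crop: Trial) -> List[float]:
    """Hypothetical stand-in for the network: returns two-class probabilities."""
    return [0.75, 0.25] if crop[0][0] < 4 else [0.25, 0.75]


trial = [[float(t) for t in range(10)], [0.0] * 10]  # C = 2, T = 10
crops = sliding_windows(trial, window=4, stride=2)   # T' = 4, four crops
y_mean = mean_prediction(crops, toy_predict)         # mean of y_1 ... y_n
loss = trial_loss(y_mean, target=0)
```

Each crop inherits the label of its trial, so one trial yields several training examples while the loss is still computed once per trial.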
Figure 2. Overall network structure for EEG decoding. The input signal has the form $X \in \mathbb{R}^{C \times T}$, where C denotes the number of channels and T the number of time steps. To simplify the diagram, the pooling layer of each block is not shown.
Network structure.
| Block | Layer | Filters | Output | Options |
|---|---|---|---|---|
| 1 | Input | | [C, T] | |
| | Reshape | | [1, C, T] | |
| | Conv_time | F1 | [F1, C, T] | kernel_size = (1, 64), stride = (1, 1), padding = (0, F1 |
| | BatchNorm | | [F1, C, T] | num_features = F1, eps = 0.001, momentum = 0.01 |
| | Conv_spat | D | [D | kernel_size = (C, 1), stride = (1, 1), groups = F1 |
| | BatchNorm | | [D | num_features = D |
| | Activation | | [D | ELU |
| | MaxPool2D | | [D | kernel_size = (1, 3), stride = (1, 3) |
| | Dropout | | [D | |
| 2 | DepthwiseConv2D | F2 | [F2, 1, T//3] | kernel_size = (1, 16), stride = (1, 1), groups = F2 |
| | PointwiseConv2D | D | [D | kernel_size = (1, 1), stride = (1, 1) |
| | BatchNorm | | [D | num_features = D |
| | Activation | | [D | ELU |
| | MaxPool2D | | [D | kernel_size = (1, 3), stride = (1, 3) |
| | Dropout | | | |
| 3 | DepthwiseConv2D | F3 | [F3, 1, T//9] | kernel_size = (1, 16), stride = (1, 1), groups = F3 |
| | PointwiseConv2D | D | [D | kernel_size = (1, 1), stride = (1, 1) |
| | BatchNorm | | [D | num_features = D |
| | Activation | | [D | ELU |
| | MaxPool2D | | [D | kernel_size = (1, 3), stride = (1, 3) |
| | Dropout | | [D | |
| 4 | SeparableConv2D | F4 | [F4, 1, T//27] | kernel_size = (1, 16), stride = (1, 1), groups = F4 |
| | PointwiseConv2D | D | [D | kernel_size = (1, 1), stride = (1, 1) |
| | BatchNorm | | [D | num_features = D |
| | Activation | | [D | ELU |
| | MeanPool2D | | [D | kernel_size = (1, 3), stride = (1, 3) |
| | Dropout | | [D | |
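The time-axis lengths T//3, T//9, and T//27 listed for blocks 2 to 4 follow from applying a pooling window of size 3 with stride 3 at the end of each block. The sketch below traces that arithmetic. It assumes that every convolution leaves the time length unchanged (as the [F1, C, T] output of Conv_time suggests) and that pooling is unpadded; the trial length `T = 1000` is a hypothetical value chosen only for the example.

```python
from typing import List


def pooled_length(n_steps: int, kernel: int = 3, stride: int = 3) -> int:
    """Output length of an unpadded pooling window along the time axis."""
    return (n_steps - kernel) // stride + 1


def trace_time_axis(n_steps: int, n_blocks: int = 4) -> List[int]:
    """Time-axis length at the input and after each block's pooling layer.

    Assumes the convolutions inside a block do not change the time length,
    so only the (1, 3) pooling with stride (1, 3) shortens the signal.
    """
    lengths = [n_steps]
    for _ in range(n_blocks):
        lengths.append(pooled_length(lengths[-1]))
    return lengths


T = 1000  # hypothetical trial length in samples
lengths = trace_time_axis(T)
```

With this value of T, the entries after the first three pooling layers equal `T // 3`, `T // 9`, and `T // 27`, matching the inputs of blocks 2, 3, and 4 in the table.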
Figure 3. Structural differences between LSTM and ConvLSTM.
Figure 4. ConvLSTM network structure.
Figure 5. Structure diagram of the attention mechanism.
Figure 6. Iteration results on BCIC IV 2a. The red solid and dashed lines show the accuracy on the training and validation sets during training, respectively, while the blue solid and dashed lines show the loss on the training and validation sets.
Figure 7. Iteration results on the High Gamma Dataset. The red solid and dashed lines show the accuracy on the training and validation sets during training, respectively, while the blue solid and dashed lines show the loss on the training and validation sets.
Number of trainable parameters per model.
| | DeepConvNet | ShallowConvNet | EEGNet-4,2 | EEGNet-8,2 | Ours |
|---|---|---|---|---|---|
| Number of parameters | 152219 | 40644 | 796 | 1716 | 17972 |
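To illustrate why a depthwise separable convolution needs fewer weights than a standard convolution, the sketch below counts the parameters of both layer types. The formulas are the standard ones for 2-D convolutions. The channel count of 16 is a hypothetical value, while the (1, 16) kernel follows the network structure table; the result is not meant to reproduce the totals reported above.

```python
from typing import Tuple


def conv2d_params(in_ch: int, out_ch: int, kernel: Tuple[int, int],
                  bias: bool = False) -> int:
    """Weights of a standard convolution: every output sees every input channel."""
    kh, kw = kernel
    return in_ch * out_ch * kh * kw + (out_ch if bias else 0)


def separable_conv2d_params(in_ch: int, out_ch: int, kernel: Tuple[int, int],
                            bias: bool = False) -> int:
    """Weights of a depthwise convolution followed by a 1x1 pointwise convolution."""
    kh, kw = kernel
    depthwise = in_ch * kh * kw + (in_ch if bias else 0)  # one filter per channel
    pointwise = in_ch * out_ch + (out_ch if bias else 0)  # 1x1 channel mixing
    return depthwise + pointwise


channels = 16      # hypothetical number of feature maps
kernel = (1, 16)   # temporal kernel size used in blocks 2 to 4
standard = conv2d_params(channels, channels, kernel)
separable = separable_conv2d_params(channels, channels, kernel)
```

For these example values the standard layer has 4096 weights and the separable pair has 512, an eightfold reduction; the saving grows with the kernel size and the number of output channels.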
Within-subject classification accuracy (%) on BCIC IV 2a.
| Author | Algorithm | A01 | A02 | A03 | A04 | A05 | A06 | A07 | A08 | A09 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sakhavi S | FBCSP | 76.11 | 44.33 | 81.52 | 66.30 | 58.96 | 50.75 | 85.69 | 77.35 | 75.63 | 68.0 |
| R. T. Schirr | Deep ConvNet | 69.75 | 45.44 | 87.46 | 63.85 | 52.04 | 53.39 | 87.46 | 83.46 | 83.99 | 69.6 |
| | Shallow ConvNet | 81.60 | 45.49 | 88.54 | 68.06 | 60.42 | 51.74 | 88.54 | 81.25 | 79.51 | 71.7 |
| V. J. Lawhern | EEGNet-4, 2 | 80.56 | 48.96 | 87.50 | 64.93 | 62.50 | 58.68 | 87.85 | 77.78 | 81.25 | 72.2 |
| | EEGNet-8, 2 | 85.76 | 43.06 | 92.36 | 62.50 | 63.19 | 62.85 | 83.33 | 73.96 | 76.39 | 72.5 |
| Ours | Ours | 91.32 | 44.10 | 90.97 | 67.71 | 60.07 | 56.25 | 89.58 | 82.99 | 79.51 | 73.7 |
Within-subject classification accuracy (%) on HGD.
| Algorithm | A01 | A02 | A03 | A04 | A05 | A06 | A07 | A08 | A09 | A10 | A11 | A12 | A13 | A14 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FBCSP | 76.12 | 85.92 | 95.79 | 94.22 | 97.73 | 89.89 | 88.87 | 94.90 | 78.90 | 89.87 | 85.87 | 95.78 | 95.12 | 69.70 | 88.5 |
| Deep ConvNet | 75.62 | 87.50 | 97.50 | 98.12 | 94.37 | 93.75 | 86.88 | 86.25 | 74.38 | 94.37 | 78.56 | 94.37 | 98.75 | 67.50 | 88.1 |
| Shallow ConvNet | 79.37 | 81.25 | 96.88 | 97.50 | 91.87 | 92.50 | 82.50 | 93.75 | 76.88 | 88.75 | 80.00 | 93.13 | 95.63 | 67.50 | 87.0 |
| EEGNet-4, 2 | 82.32 | 90.63 | 94.36 | 95.31 | 90.34 | 91.10 | 90.13 | 90.99 | 85.36 | 92.34 | 88.33 | 94.18 | 95.21 | 73.10 | 89.6 |
| EEGNet-8, 2 | 81.81 | 90.86 | 97.63 | 97.67 | 93.48 | 94.46 | 92.66 | 91.29 | 84.66 | 91.64 | 87.63 | 93.48 | 93.51 | 70.40 | 90.1 |
| Ours | 85.25 | 93.55 | 97.46 | 98.68 | 97.47 | 95.27 | 93.28 | 92.44 | 88.20 | 93.66 | 90.65 | 95.67 | 97.38 | 77.79 | 92.6 |
Figure 8. Comparison of classification accuracy on the BCIC IV 2a dataset.
Figure 9. Comparison of classification accuracy on the High Gamma Dataset.
Evaluated deep learning structure.
| Design structure | Our choice | Aim | Average accuracy, BCIC IV 2a (%) | Average accuracy, HGD (%) |
|---|---|---|---|---|
| Convolution in first layer | Split convolution | The first convolution layer is divided into a time-domain convolution and a spatial filter, which better processes the EEG input and improves classification accuracy. | 60.2 | 82.3 |
| ConvNet | Separable convolution | First, it reduces the number of network parameters and improves training speed; second, it exposes the relationships within and across decoupled feature maps. | 60.8 | 82.9 |
| LSTM | BiConvLSTM | Overcomes the LSTM's limitation of extracting only temporal features of EEG signals by enabling it to extract spatial features as well. | 65.3 | 84.6 |
| Data processing | Sliding window | Increases the number of training samples and fully extracts the differential and global features of all EEG data. | 73.7 | 92.6 |