Data Augmentation for Deep Neural Networks Model in EEG Classification Task: A Review.

Chao He¹, Jialu Liu¹, Yuesheng Zhu², Wencai Du³.

Abstract

Classification of the electroencephalogram (EEG) is a key approach to measuring the rhythmic oscillations of neural activity and is one of the core technologies of brain-computer interface systems (BCIs). However, extracting features from non-linear and non-stationary EEG signals remains a challenging task for current algorithms. With the development of artificial intelligence, various advanced algorithms have been proposed for signal classification in recent years. Among them, deep neural networks (DNNs) have become the most attractive type of method due to their end-to-end structure and powerful ability of automatic feature extraction. However, it is difficult to collect large-scale datasets in practical applications of BCIs, which may lead to overfitting or weak generalizability of the classifier. To address these issues, data augmentation (DA) has been proposed as a promising technique to improve the performance of the decoding model. In this article, we investigate recent studies and developments of various DA strategies for EEG classification based on DNNs. The review consists of three parts: which EEG-based BCI paradigms are used, which types of DA methods are adopted to improve the DNN models, and what accuracy can be obtained. Our survey summarizes current practices and performance outcomes, with the aim of promoting or guiding the deployment of DA for EEG classification in future research and development.

Keywords: EEG; brain-computer interface; classification; data augmentation; deep neural networks

Year: 2021; PMID: 34975434; PMCID: PMC8718399; DOI: 10.3389/fnhum.2021.765525

Source DB: PubMed; Journal: Front Hum Neurosci; ISSN: 1662-5161; Impact factor: 3.169

Introduction

As a key tool to capture the intention of brain activity, electroencephalography (EEG) can be used to measure rhythmic oscillations of the brain and reflect the synchronized activity of substantial populations of neurons (Atagün, 2016). The rhythmic oscillation is closely related to the state change of the nerve center that directly reflects the mental activity of the brain (Pfurtscheller, 2000; Villena-González et al., 2018). The brain-computer interface (BCI) is one of the typical applications, used as a communication protocol between users and computers that does not rely on the normal neural pathways of the brain and muscles (Nicolas-Alonso and Gomez-Gil, 2012). Based on how the EEG is acquired, BCIs can be divided into three types: non-invasive BCIs, invasive BCIs, and partially invasive BCIs (Rao, 2013; Levitskaya and Lebedev, 2016). Due to their low risk, low cost, and convenience, EEG-based non-invasive BCIs are the most popular type of BCI and are the main type discussed in this article. During the execution of the interaction, the automatic classification of EEG is an important step toward making the use of BCI more practical in applications (Lotte et al., 2007). However, some limitations present challenges for classification algorithms (Boernama et al., 2021). Firstly, EEG signals have weak amplitudes and are always accompanied by irrelevant components, which results in a low signal-to-noise ratio. Secondly, the essence of EEG is the potential change of the cluster activity of neurons, which makes it a non-stationary signal. The technologies of machine learning and non-linear theory are widely used for EEG classification in current research (Lotte et al., 2018). However, a long calibration time and weak generalization ability limit their application in practice. In the past few years, deep neural networks (DNNs) have achieved excellent results in the fields of image, speech, and natural text processing (Hinton et al., 2012; Bengio et al., 2013). 
The features can be automatically extracted from the input data by successive non-linear transformations based on hierarchical representations and mapping. Due to their ability to minimize the interference of redundant information and to extract non-linear features, EEG decoding based on DNNs has attracted more and more attention. However, one of the prior conditions for obtaining the expected results is the support of large-scale datasets that can ensure the robustness and generalization ability of DNNs (Nguyen et al., 2015). There are still some challenges for EEG collection. First, it is difficult to collect large-scale data due to strict requirements for the experimental environment and subjects, and the resulting small datasets may cause overfitting and increase the structural risk of the model (Zhang D. et al., 2019). Moreover, EEG signals are highly susceptible to changes in psychological and physiological conditions, which cause high variability of the feature distribution across subjects/sessions (Zhang D. et al., 2018). This not only reduces the accuracy of the decoding model, but also limits the generalization of the model to an independent test set. One promising approach is regularization (Yu et al., 2008; Xie et al., 2015), which could effectively improve the generalization ability and robustness of DNNs. There are three ways to achieve regularization: adding a term to the loss function (e.g., L2 regularization), regularizing directly in the model (e.g., dropout, batch normalization, kernel max norm constraint), and data augmentation (DA). Compared to the first two approaches, DA addresses the problem of overfitting by using a more comprehensive set of data to minimize the distance between the training and test datasets. This is especially useful for EEG signals, where the limitation of small-scale datasets greatly affects the performance of classifiers. Therefore, researchers are increasingly concerned with optimizing deep learning (DL) models using DA in the task of EEG classification. 
The framework of the methodology is shown in Figure 1.

FIGURE 1

The framework of EEG classification using DA strategy. Different colors represent different classes.

The rest of the article is organized as follows. The search methods for identifying relevant studies are described in detail in section “Method.” In section “Results,” the basic concept and specific methods of DA in EEG classification based on DNNs are presented. Section “Discussion” discusses the current research status and challenges. Finally, conclusions are drawn in section “Conclusion.”

Method

A wide literature search from 2016 to 2021 was conducted through Web of Science, PubMed, and IEEE Xplore. The keywords used for the search contain DA, EEG, deep learning, DNNs. Table 1 lists the collection criteria for inclusion or exclusion.

TABLE 1

Collection criteria for inclusion or exclusion.

Inclusion criteria	Exclusion criteria
• Published within the last 5 years	• Research for invasive EEG, electrocorticography (ECoG), magnetoencephalography (MEG), source imaging, fMRI, and so on, or joint studies with EEG
• A focus on non-invasive EEG signals	• No specific description of DA
• A specific explanation of how to apply DA to EEG signal
• At least one DNN is included for the classifier
• EEG used in the BCI task or the detection of sleep states or epileptic seizures task

This review was conducted following PRISMA guidelines (Liberati et al., 2009). Results are summarized in a flowchart in Figure 2. The flowchart identifies and narrows down the collection of related studies. Duplicates between all datasets and studies that meet the exclusion criteria are excluded. Finally, 56 papers that meet the inclusion criteria are included.

FIGURE 2

The search method for identifying relevant studies.

Results

Concepts and Methods for Data Augmentation

Data augmentation aims to prevent the overfitting of the DNN model by artificially generating new data based on existing training data (Shorten and Khoshgoftaar, 2019). There are three main strategies of this technology: basic image manipulations, deep learning, and feature transformation. The first approach performs augmentation directly in the input space while the last two methods realize DA based on the feature space of datasets. Here, we briefly describe these methods in the following parts.

Data Augmentation Based on Image Manipulations

Data augmentation based on image manipulations performs simple transformations using geometric features in an intuitive and low-cost way. Typical methods can be divided into the following categories.

Geometry Transformations

The geometric features of images are generally a visual representation of the physical information that contains both direction and contour elements (Cui et al., 2015; Paschali et al., 2019). Common operations include:

Flipping

This method is realized by mirroring the image about the horizontal or vertical axis, under the premise that the size of the matrix remains consistent.

Cropping

The operation of cropping can be realized by cropping the central patch of images randomly and then mixing the remaining parts.

Rotation

Rotation-based augmentation is realized by rotating images about a chosen coordinate axis. The selection of the rotation parameters is an important factor that affects the enhancement effect.
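As a rough illustration of the three geometric operations listed above, the following sketch applies them to a small 2-D array standing in for an image (for example, a spectrogram). It is a minimal NumPy sketch under stated assumptions, not the procedure of any study cited here: the function names are hypothetical, cropping is simplified to cutting a randomly positioned patch, and rotation is restricted to multiples of 90 degrees.

```python
import numpy as np

def flip(image, axis):
    """Mirror a 2-D array about one axis (0 = up/down, 1 = left/right)."""
    return np.flip(image, axis=axis)

def random_crop(image, crop_h, crop_w, rng):
    """Cut a randomly positioned patch of size (crop_h, crop_w).

    Simplifying assumption: only the patch is returned; the remaining
    parts of the image are not re-mixed.
    """
    h, w = image.shape
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]

def rotate90(image, k):
    """Rotate a 2-D array by k * 90 degrees counter-clockwise."""
    return np.rot90(image, k=k)

rng = np.random.default_rng(0)
image = np.arange(12, dtype=float).reshape(3, 4)  # toy 3 x 4 "image"

flipped = flip(image, axis=1)          # left/right mirror
patch = random_crop(image, 2, 2, rng)  # random 2 x 2 patch
rotated = rotate90(image, 1)           # shape becomes 4 x 3
```

Whether such transformations preserve the class label depends on the data: for a time-frequency image, for instance, a left/right flip reverses time, so the choice of operation is itself a modelling decision.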

Photometric and Color Transformations

Performing augmentation in the color-channel space is another practical method (Heyne et al., 2009). During the operation, the raw data are converted to a form such as a power spectrum or stress diagram, which represents the distribution of spatial features.

Color Transformations

Color transformation realizes the generation of new data by adjusting the RGB matrix.

Noise Injection

Another approach to increase the diversity of data is injecting random matrices into the raw data, which are usually derived from Gaussian distributions (Okafor et al., 2017).
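The noise-injection idea can be sketched in a few lines. The example below is an illustrative assumption rather than the implementation of any cited study: the array layout `(n_trials, n_channels, n_samples)`, the function name `inject_gaussian_noise`, and the noise level `noise_std` are all hypothetical choices.

```python
import numpy as np

def inject_gaussian_noise(trials, noise_std, n_copies, rng):
    """Return n_copies noisy versions of every trial.

    trials: array of shape (n_trials, n_channels, n_samples).
    Zero-mean Gaussian noise with standard deviation noise_std is
    added independently to every sample.
    """
    repeated = np.repeat(trials, n_copies, axis=0)
    noise = rng.normal(loc=0.0, scale=noise_std, size=repeated.shape)
    return repeated + noise

rng = np.random.default_rng(42)
trials = rng.standard_normal((10, 8, 256))  # toy stand-in for EEG trials

augmented = inject_gaussian_noise(trials, noise_std=0.1, n_copies=3, rng=rng)
training_set = np.concatenate([trials, augmented], axis=0)
```

In practice the noise level would need to be tuned against the amplitude of the signal, since too much noise can mask the features the classifier is supposed to learn.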

Data Augmentation Based on Deep Learning

Augmentation methods by image manipulations perform the transformation in input space of data. However, these approaches cannot take advantage of underlying features of data to perform augmentation (Arslan et al., 2019). Recently, a novel DA method has attracted the attention of researchers. It applies DNNs to map data space from high-dimensional to low-dimensional and realize feature extraction to reconstruct the artificial data (Cui et al., 2014). There are two typical deep learning strategies for DA: autoencoder (AE) and generative adversarial networks (GAN).

Autoencoder and Its Improved Version

As shown in Figure 3, an AE is a feed-forward neural network used to encode the raw data into low-dimensional vector representations by one-half of the network and to reconstruct these vectors back into the artificial data using another half of the network (Yun et al., 2019).

FIGURE 3

The structure of the autoencoder. Blue represents input layers and green represents output layers.

To obtain the expected generated data, the variational autoencoder (VAE) was proposed to improve the performance of the autoencoder. Compared with the AE, the VAE ensures that the generated data follow a specific probability distribution by adding constraints to the structure (Figure 4).

FIGURE 4

The structure of variational autoencoder. Blue represents input layers and green represents output layers.

In Figure 4, μ is the mean value of the probability distribution, σ² represents the variance, and ε is the deviation.
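One common way these quantities are combined in a VAE is the reparameterization step, in which a latent vector is sampled as the mean plus the standard deviation times a random deviation. The sketch below assumes this standard formulation, with a standard normal prior as the constraint on the latent distribution; the source does not state which variant the reviewed studies used, and the function names are hypothetical.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps drawn from N(0, I)."""
    sigma = np.exp(0.5 * log_var)         # log-variance -> standard deviation
    eps = rng.standard_normal(mu.shape)   # random deviation
    return mu + sigma * eps

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, I), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

rng = np.random.default_rng(0)
mu = np.zeros((4, 2))        # toy encoder output: 4 samples, 2 latent dims
log_var = np.zeros((4, 2))   # unit variance

z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

Here the KL term is zero because the toy encoder output already matches the assumed prior; during training this term penalizes latent codes that drift away from it.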

Generative Adversarial Networks and Their Improved Version

Generative adversarial networks refer to artificially generating data based on the principle of adversarial learning. As shown in Figure 5, it performs a competition between bilateral networks to achieve a dynamic balance that learns the statistical distribution of the target data (Deng et al., 2014). The optimization problem of GAN can be defined as follows:

FIGURE 5

The structure of generative adversarial networks.

$$\min_{G}\max_{D} V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x;\theta)\right] + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z);\theta)\right)\right]$$

where $p_{\mathrm{data}}(x)$ is the distribution of the training data, $p_{z}(z)$ is the prior of the input noise $z$, $G(z)$ is the generated data, and $D(x;\theta)$ is the discriminative model used to estimate the probability $p(\cdot)$ that a sample comes from the real data $x$ rather than from the generated data. $V(D, G)$ represents the value function and $\mathbb{E}$ is the expected value. During training, the goal of GAN is to find the Nash equilibrium of a non-convex game with high-dimensional parameters. However, the optimization process places no constraint on the loss function, so the model can easily generate meaningless output during training. To address this issue and expand its application scope, researchers proposed improved structures such as deep convolution GAN (DCGAN), conditional GANs, cycle GANs, and so on (Goodfellow et al., 2014). Among these new architectures for DA, DCGAN employs CNNs rather than multilayer perceptrons to build the generator and discriminator networks, which expands the internal complexity compared with GAN (Radford et al., 2015). To improve the stability of the training process, an additional cycle-consistency loss function was proposed to optimize the structure of GAN, which was defined as cycle GANs (Kaneko et al., 2019). Conditional GANs effectively alleviate the limitations of mode collapse by adding a conditional vector to both the generator and the discriminator (Regmi and Borji, 2018). Another architecture of interest is the Wasserstein GAN (WGAN). This architecture employs the Wasserstein distance, rather than the Jensen–Shannon or Kullback–Leibler divergence, to measure the distance between generated data and real data and thereby improve training performance (Yang et al., 2018).
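As a small numerical illustration of the adversarial objective, the sketch below estimates the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 − D(G(z)))] from discriminator outputs. It assumes this standard form; the function name `gan_value` and the toy discriminator outputs are hypothetical, and no network is trained here.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of V(D, G).

    d_real: discriminator outputs D(x) on real samples, in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, in (0, 1).
    """
    d_real = np.clip(d_real, eps, 1.0 - eps)  # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# Discriminator that cannot tell real from generated data (outputs 0.5).
undecided = gan_value(np.full(100, 0.5), np.full(100, 0.5))

# Discriminator that separates the two almost perfectly.
confident = gan_value(np.full(100, 0.99), np.full(100, 0.01))
```

The discriminator tries to push this value up while the generator tries to pull it down; when the discriminator outputs 0.5 everywhere the value equals −log 4, which is the equilibrium value in the standard formulation.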

Data Augmentation Based on Feature Transformation

Compared with the methods based on image manipulations and deep learning, feature transformation performs DA using a spatial transformation of features in low dimensions, which generates artificial data with a diverse distribution. However, only a few studies have reported related methods. A novel spatial filtering method has been proposed to generate data using a time-delay strategy by combining it with a common spectral-spatial pattern (CSSP; Blankertz and BCI Competition, 2005). Another study applied empirical mode decomposition to divide EEG into multiple modes for DA (Freer and Yang, 2019). To clearly show the taxonomy of DA, Figure 6 summarizes all the DA methodologies collected in this review.

FIGURE 6

A taxonomy of data augmentation in EEG decoding.

Typical EEG Paradigms

Based on the form of interaction, BCIs can be divided into two types: active and passive. Among them, active BCI is defined by neural activity in response to a specific external stimulus and contains three typical paradigms: motor imagery (MI), visual evoked potentials (VEP), and event-related potentials (ERP). MI is a mental process that imitates motor intention without real output. Different imagery tasks activate the corresponding region of the brain, and this activation is reflected by various feature representations of EEG (Bonassi et al., 2017). Visual evoked potentials are continuous responses from the visual region when humans receive flashing visual stimuli (Tobimatsu and Celesia, 2006). When external stimuli are presented at a fixed frequency, the visual region is modulated to produce a continuous response related to this frequency, i.e., Steady-State Visually Evoked Potentials (SSVEP; Wu et al., 2008). Event-related potentials refer to a potential response when receiving a specific stimulus, such as a visual, audio, or tactile stimulus (Luck, 2005). Compared with active BCI, passive BCI aims to output the EEG signals from subjects’ arbitrary brain activity, which is a form of BCI that does not rely on a voluntary task (Roy et al., 2013; Arico et al., 2017). In this section, we review recent reports on DA in EEG classification based on DNNs.

Data Augmentation Strategy for EEG Classification

In recent years, scientific interest in the application of DA for EEG classification has grown considerably. Abdelfattah et al. (2018) employed recurrent GANs (RGAN) to improve the performance of classification models in MI-BCI tasks. Unlike the standard GAN structure, they used recurrent neural networks for the generator component. Owing to its ability to capture the time dependencies of signals, RGAN shows great advantages in generating time-series data. Verification with three models showed that the classification accuracy was significantly improved after DA. Zhang et al. (2020) carried out image augmentation using a deep convolution GAN (DCGAN) that replaces pooling layers with fractional-strided convolutions in the generator and strided convolutions in the discriminator. Considering the rule of feature distribution, they transformed the time-series signal into spectrogram form and applied adversarial training with convolution operations to generate data. Meanwhile, they compared the performance of different DA models and verified that the data generated by DCGAN show the best similarity and diversity. Freer and Yang (2019) proposed a convolutional long short-term memory network (CLSTM) to perform binary classification of MI EEG. To enhance the robustness of the classifier, they applied noise injection, multiplication, flipping, and frequency shift to augment the data, respectively. Results show that the average classification accuracy obtained a 14.0% improvement after DA. Zhang Z. et al. (2019) created a novel DA method for the MI-BCI task in which they applied empirical mode decomposition (EMD) to divide the raw EEG frame into multiple modes. The process of decomposition was defined as:

$$x(t) = \sum_{i=1}^{s} \mathrm{IMF}_{i}(t) + r(t)$$

where $x(t)$ is the signal recovered by EMD, $\mathrm{IMF}_{i}$ represents the intrinsic mode functions, $s$ represents the number of IMFs, and $r(t)$ is the final residual value. 
In the training stage, they mixed the intrinsic mode functions to generate new data and then transformed the result into tensors using complex Morlet wavelets, which were finally input into a convolutional neural network (CNN). Experimental results verified that the artificial EEG frames could enhance the performance of the classifier and obtain higher accuracy. Panwar et al. (2020) proposed a WGAN (Eq. 3) with gradient penalty to synthesize EEG data for the rapid serial visual presentation (RSVP) task. It is worth noting that WGAN applies the Wasserstein distance to measure the distance between real and generated data:

$$W(P_{r}, P_{g}) = \inf_{\gamma \in \Pi(P_{r}, P_{g})} \mathbb{E}_{(x, \tilde{x}) \sim \gamma}\left[\lVert x - \tilde{x} \rVert\right] \qquad (3)$$

where $P_{r}$ and $P_{g}$ are the distributions of the real data $x$ and the generated data $\tilde{x}$, $\Pi(P_{r}, P_{g})$ is the set of joint distributions whose marginals are $P_{r}$ and $P_{g}$, $W$ represents the distance between the two distributions, and $\mathbb{E}$ is the mean value. To improve the training stability and convergence, they utilized a gradient penalty to optimize the training process. Meanwhile, the proposed method addressed the problems of frequency artifacts and instability in the training process of DA. To evaluate the effectiveness of DA, they proposed two evaluation indices (visual inspection and log-likelihood score from Gaussian mixture models) to assess the quality of the generated data. Experiments show that presentation-associated patterns of EEG could be seen clearly in the generated data, and they obtained a significant improvement based on the EEGNet model after DA in the RSVP task (Lawhern et al., 2018). A similar method was used by Aznan et al. (2019), who applied WGAN to generate synthetic EEG data that optimize the efficiency of interaction in the SSVEP task. They then used the generated EEG to pre-train the classifier in the offline stage and fine-tuned the classifier with real collected EEG. This approach was used to control a robot and achieve real-time navigation. Results show that the DA method significantly improves the accuracy in real-time navigation tasks across multiple subjects. Yang et al. 
(2020) argued that the typical DA methods of geometric transformation (GT) and noise injection (NI) ignore the effect of the signal-to-noise ratio (SNR) across trials. Therefore, they proposed a novel DA method based on randomly averaging EEG data, which artificially generates EEG data with different SNR patterns. The DA was achieved by randomly taking n (1 < n < N) examples from the same category and calculating the average potential at each iteration, where N represents the number of all trials. RNN and CNN were used to classify different specific frequencies in the visual evoked potential (VEP) task and obtained significant improvement after DA. Li et al. (2019) discussed the effect of noise addition on time-series and spectral forms of the signal in the MI-BCI task, respectively. They applied a CNN combined with channel-projection and mixed-scale to classify 4-class MI signals and concluded that noise may destroy the amplitude and phase information of time-series signals, but cannot change the feature distribution of the spectrum. Therefore, they performed STFT to transform the time-series EEG signal into spectral images, which was defined as amplitude-perturbation DA. Results show that the performance was improved by DA for almost all subjects in two public datasets. Lee et al. (2020) investigated a novel DA method called borderline-synthetic minority over-sampling technique (Borderline-SMOTE). It generates synthetic data for the minority class by using the m nearest neighbors of each minority-class instance and then adds these instances to the real data by weighted calculation. The effectiveness of DA was evaluated on EEG data collected from the P300 task. Results show that the proposed method could enhance the robustness of decision boundaries and thus improve the classification accuracy of P300-based BCIs. 
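The random-averaging strategy described above for Yang et al. (2020) can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' code: the function name `random_average`, the array layout `(n_trials, n_channels, n_samples)`, and the reading of N as the number of trials available in each class are hypothetical choices.

```python
import numpy as np

def random_average(trials, labels, n_new, rng):
    """Create n_new synthetic trials per class by averaging random subsets.

    trials: array of shape (n_trials, n_channels, n_samples).
    labels: array of shape (n_trials,).
    For each new trial, n examples of one class are drawn without
    replacement, with 1 < n < N (N = trials in that class), and averaged.
    """
    new_trials, new_labels = [], []
    for cls in np.unique(labels):
        members = trials[labels == cls]
        n_total = len(members)
        if n_total < 3:
            continue  # 1 < n < N requires at least 3 trials in the class
        for _ in range(n_new):
            n = rng.integers(2, n_total)  # upper bound exclusive: 2 <= n <= N - 1
            picked = rng.choice(n_total, size=n, replace=False)
            new_trials.append(members[picked].mean(axis=0))
            new_labels.append(cls)
    return np.stack(new_trials), np.array(new_labels)

rng = np.random.default_rng(0)
trials = rng.standard_normal((20, 4, 100))  # toy stand-in for EEG trials
labels = np.array([0] * 10 + [1] * 10)

aug_trials, aug_labels = random_average(trials, labels, n_new=5, rng=rng)
```

Because averaging more trials suppresses uncorrelated noise, varying n from one synthetic trial to the next yields examples with different signal-to-noise ratios, which is the property the method relies on.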
EEG-based passive BCIs have gradually become more prominent in research (Zander et al., 2009; Cotrina et al., 2014; Aricò et al., 2018) and are used to detect and monitor the affective states of humans. In this part, we introduce some cases of DA application to passive BCIs. Kalaganis et al. (2020) proposed a DA method based on graph-empirical mode decomposition (EMD) to generate EEG data, which combines the advantages of the multiplex network model and the graph variant of classical empirical mode decomposition. They designed a sustained attention driving task in a virtual reality environment and realized automatic detection of the state of humans using a graph CNN. The experimental results show that exploring the graph structure of the EEG signal could reflect the spatial features of the signal, and the methodology of integrating a graph CNN with DA obtained a more stable performance. Wang et al. (2018) discussed the limitations of DA for EEG in emotion recognition tasks and pointed out that the features of EEG in emotion detection tasks have a high correlation with time sequence. However, direct geometric transformations and noise injection may destroy the features in the time domain, which may cause a negative DA effect. Based on these considerations, they added Gaussian noise to each feature matrix of the original data to obtain new training samples. The calculation could be defined as:

$$P(z) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(z-\mu)^{2}}{2\sigma^{2}}\right), \qquad \tilde{x} = x + z$$

where $\mu$ and $\sigma$ are the mean value and standard deviation, respectively, $P$ is the probability density function, and $z$ is a Gaussian random variable. $\tilde{x}$ is the generated data after noise injection into the original data $x$. Three classification models, namely LeNet, ResNet, and SVM, were used to evaluate the performance. Results show that the generated data could significantly improve the performance of the classifiers based on LeNet and ResNet. However, it had little effect on the SVM model. Luo et al. 
(2020) applied conditional Wasserstein GAN (cWGAN) and selective VAE (sVAE) to enhance the performance of the classifier in emotion recognition tasks. The loss function of sVAE is defined through the evidence lower bound (ELBO), computed from the real data $x$ and the generated data $\tilde{x}$. The goal of optimization was to maximize the ELBO, which was equal to minimizing the KL divergence between the real data and generated data. Based on the loss function of GAN, an extra penalty term is added to it:

$$L = L_{\mathrm{GAN}} + \lambda\, \mathbb{E}_{\hat{x}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_{2} - 1\right)^{2}\right]$$

where $L_{\mathrm{GAN}}$ is the original objective, $D$ is the discriminator, $\lambda$ is the weight coefficient for the trade-off between the original objective and the gradient penalty, and $\hat{x}$ represents the data points sampled from the straight line between the real distribution and the generated distribution. $\lVert \cdot \rVert_{2}$ is the 2-norm. In their work, the training samples of the DA models were transformed into the forms of power spectral density or differential entropy, and the performance of different classifiers was compared after DA. Experiments show that both representations of EEG signals were suitable for generating artificial datasets that enhance the performance of the classifier. Bashivan et al. (2015) emphasized that the challenge in modeling cognition signals from EEG was extracting the representation of signals across subjects/sessions and suggested that DNNs had an excellent ability for feature extraction. Therefore, they transformed the raw EEG signal into topology-preserving multi-spectral images as training sets in a mental load classification task. To address overfitting and weak generalization ability, they randomly added noise to the spectral images to generate training sets. However, this DA method did not significantly improve classification performance; it only strengthened the stability of the model. To comprehensively show the implementation, we summarize the details of the application of DA in EEG decoding in Table 2.

TABLE 2

Summary of data augmentation for EEG decoding based on DNNs.

EEG paradigms	Channel of EEG	Subjects	DA methods	Input form of DA	Classifier	Improvement of accuracy after DA	Datasets	References
Driving detection	30	27	EMD	Time-series data	Graph CNN	72–95%	Private dataset (27 subjects)	Kalaganis et al., 2020
SD	23	23	WGAN	Time-series data	CNN	72.11–95.89%	CHB-MIT (PhysioNet, 2010)	Wei et al., 2019
ER	14	18	GAN	Time-series data	DNN	NA to 98.4%	Private dataset (18 subjects)	Chang and Jun, 2019
ER	15/32	NA	NI	Differential entropy	SVM/ResNet	40.8–45.4%	SEED (Chang and Jun, 2019)/MAHNOB-HCI dataset (Zheng and Lu, 2015)	Wang et al., 2018
ER	15/32	15	cWGAN/sVAE	Power spectral density/Differential entropy	SVM/DNN	44.9–90.8%	SEED/DEAP (Chang and Lin, 2011)	Luo et al., 2020
ER	32	15	NI	Time-series data	3D-CNN	79.11–88.49% 79.12–87.44%	DEAP (Koelstra et al., 2012)	Shawky et al., 2018
ER	NA	NA	CycleGAN	Time-series data	CNN	Average improvement of 3.7%∼8%	FER2013/SFEW/JAFFE datasets	Zhu et al., 2018
ERP	31	37	Paired trial	Time-series data	DNN	Average improvement of 20∼30%	ERP datasets from Williams et al. (2020)	Williams et al., 2020
P300	32	44	Borderline-SMOTE	Time-series data	SVM/CNN	Average improvement of 5∼15%	Private dataset (44 subjects)	Lee et al., 2020
MI	14	1	Conditional DCGAN	Time-frequency representation	CNN	78–83%	BCIC II-dataset III (Johnson et al., 2015)	Schlögl, 2003
MI	22/44	9/NA	NI	Spectral image	CP-MixedNet	Average improvement of 1.1∼4.5%	BCIC IV-dataset 2a (Zhang and Liu, 2018) /HGD (Ang et al., 2012)	Li et al., 2019
MI	14	1/5	EMD	Time-series data	CNN/WNN	77.9–82.9% 88.0–90.1%	BCIC II-dataset III/Five subject’s experiments	Zhang Z. et al., 2019
MI	14/3	1/9	GT	Time-frequency representation	CNN	NA	BCIC II-dataset III/BCIC IV-dataset 2b (Leeb et al., 2007; Schirrmeister et al., 2017)	Shovon et al., 2019
MI	3	9	Feature transform	Time-series data	CNN	Average improvement of 5∼10%	BCIC IV-dataset 2b	Huang et al., 2020
MI	3/3	4/9	DCGAN	Spectral image	CNN	74.5–83.2% 80.6–93.2%	BCIC IV-dataset 1 (Huang et al., 2020) /BCIC IV-dataset 2b (BCI Competition, 2008)	Zhang et al., 2020
MI	22	9	NI/GT	Time-series data	LSTM	NA	BCIC IV-dataset 2a	Freer and Yang, 2019
MI	22/3	9/9	SW	Time-series data	RM classifier	NA to 80.4%/82.39%	BCIC IV-dataset 2a/BCIC IV-dataset 2b	Majidov and Whangbo, 2019
MI	62	14	Conditional DCGAN	Spectral image	CNN	Average improvement of 3.22∼5.45%	Private dataset (14 subjects)	Fahimi et al., 2020
MI	22/3	9/9	Feature transform	Time-series data	HS-CNN	85.6–87.6%	BCIC IV-dataset 2a/BCIC IV-dataset 2b	Dai et al., 2019
MI	22/60	9/3	Extended CSSP	Feature matrix	FLDA	Average improvement of 0.3∼31.2%	BCIC IV-dataset 2a/BCIC III -dataset IIIa (Blankertz and BCI Competition, 2005)	Gubert et al., 2020
RSVP	256	10	WGAN	Time-series data	EEGNet	NA	BCIT X2 dataset (Dong et al., 2016)	Lawhern et al., 2018
Sleep stage classification	2/2	20/25	Oversampling model	Time-series data	BLSTM	Average improvement of 0.1∼2.0%	SA dataset/Sleep-EDF database (Goldberger et al., 2000)	Sun et al., 2019
SSVEP	32	8	Randomly average	Time-series data	RNN	Average improvement of 3∼13%	Private dataset (8 subjects)	Yang et al., 2020
MW detection	4	8	NI	Time-series data	DBN	NA	Private dataset (8 subjects)	Yin and Zhang, 2017b
MW detection	64	15	NI	Spectral image	Multi-frame classifier	NA	Private dataset (8 subjects)	Bashivan et al., 2014, 2015
MW detection	11	7	NI	Spectral image	SAE	34.2–75%	Private dataset (7 subjects)	Yin and Zhang, 2017a
MW detection	64	22	NI	Spectral image	RNN + CNN	NA to 93%	Private dataset (22 subjects)	Kuanar et al., 2018
MI	3	5	NI	Time-series data	CNN	NA	BCIC IV-dataset 2b	Parvan et al., 2019
MW detection	1	30	GAN	Spectral image	Boast	90–95%	NA	Piplani et al., 2018
MI	1	1	GAN	Spectral image	Distance measurement	NA	NA	Hartmann et al., 2018
ER	32	32	GAN	Spectral image	CWGAN	Average improvement of 3∼20%	SEED	Luo and Lu, 2018
MI	3	1	GAN	Spectral image	CNN	77–79%	BCIC IV-dataset 2b	Zhang X. et al., 2018
ER	14	18	GAN	Time-series data	GANA	97–98%	CHB-MIT	Chang and Jun, 2019
MI	3	9	GAN	Time-series data	CNN + LSTM	NA to 76%	BCIC IV-dataset 2b	Yang et al., 2019
RSVP	256	10	GAN	Time-series data	CNN	Average improvement of 0.7∼2%	BCIT X2 RSVEP Dataset (Touryan et al., 2014)	Panwar et al., 2019
SD	100	18	SW	Spatio-temporal signal	CNN	NA to 97%	Clinic dataset (100 subjects)	O’Shea et al., 2017
SD	100	10	SW	Time-series data	CNN	NA	University of Bonn Dataset (Andrzejak et al., 2001)	Ullah et al., 2018
SD	16	2	SW	Spectral image	CNN	NA	Clinic dataset	Truong et al., 2018
SSC	4	NA	SW	Time-series data	CNN	82.9–85.7%	Sleep-EDF database	Mousavi et al., 2019
SD	18	29	SW	Time-series data	CNN	NA to 93%	Clinic dataset	Avcu et al., 2019
MI	3	9	SW	Spectral image	CNN	NA to 84%	BCIC IV-dataset 2b	Tayeb et al., 2019
SD	18	24	SW	Spatio-temporal signal	LSTM	70–78%	CHB-MIT	Tsiouris et al., 2018
SD	NA	2	SW	Time-series data	CNN	NA	Clinic dataset	Tang et al., 2017
RSVP	64	12	Oversampling model	Time-series data	CNN	83.99–86.96%	Private dataset (12 subjects)	Manor and Geva, 2015
Movement of eye	2	NA	Oversampling model	Feature matrix	MLP	NA to 82%	MAHNOB HCI-Tagging database (Soleymani et al., 2012)	Drouin-Picaro and Falk, 2016
SSC	20	62	Oversampling model	Time-series data	CNN	Average improvement of 1.7∼20%	Sleep-EDF	Supratak et al., 2017
SSC	20	62	Oversampling model	Time-frequency representation	Mixer networks	89.3–90.1%	NA	Dong et al., 2017
SSC	14	121	Oversampling model	Spectral image	CNN	NA	Private dataset (121 subjects)	Ruffini et al., 2018
SD	23	23	Oversampling model	Spatio-temporal signal	CNN + LSTM	NA	CHB-MIT	Thodoroff et al., 2016
SSC	NA	NA	Oversampling model	Spectral image	CNN	NA to 99%	Private dataset	Sengur et al., 2019
SSC	16	NA	Feature transform	Time-series data	CNN	72–75%	Private dataset	Schwabedal et al., 2018
ER	32	32	GT	Wavelet feature	SAE	NA to 68.75%	DEAP	Surrogates Said et al., 2017
ER	32	22	GT	Feature matrix	NN	40.8–45.4%	DEAP	Frydenlund and Rudzicz, 2015
SSC	19	155	Repeat sampling	Entropy feature	CNN	NA to 81.4%	Clinic dataset	Deiss et al., 2018
ER	32	32	GT	Wavelet feature	CNN	Average improvement of 2.2∼5%	DEAP	Mokatren et al., 2019
MI	3/22	9/9	NI	Time-series data	Inception NN	Average improvement of 3%	BCIC IV-dataset 2b BCIC IV-dataset 2a	Zhang et al., 2021

Abbreviations: BCIC, BCI Competition; BLSTM, Bidirectional Long Short-Term Memory network; CSSP, Common Spectral Spatial Patterns; DBN, Deep Belief Networks; EMD, Empirical Mode Decomposition; ER, Emotion recognition; FLDA, Fisher Linear Discriminant Analysis; HS-CNN, A CNN with Hybrid Convolution Scale; MW, Mental Workload; MLP, Multi-Layer Perceptron; LeNet, Deep neural network proposed by

Summary of data augmentation for EEG decoding based on DNNs. Abbreviations: BCIC, BCI Competition; BLSTM, Bidirectional Long Short-Term Memory network; CSSP, Common Spectral Spatial Patterns; DBN, Deep Belief Networks; EMD, Empirical Model Decomposition; ER, Emotion recognition; FLDA, Fisher Linear Discriminant Analysis; HS-CNN, A CNN with Hybrid Convolution Scale; MW, Mental Workload; MLP, Multi-Layer Perception; LeNet, Deep neural network proposed by

Discussion

The limited size of available datasets hinders the application of DL to EEG classification. Recently, DA has received widespread attention as a way to improve the performance of DNNs. However, several issues remain worth discussing.

From the studies surveyed above, we found that the inputs to DA models fall into three categories: time-series data, spectral images, and feature matrices. We also found that researchers preferred to convert EEG signals into images for subsequent processing in MI tasks. One possible reason is that the features of MI are often accompanied by changes in frequency-band energy, i.e., event-related desynchronization (ERD) and event-related synchronization (ERS; Phothisonothai and Nakagawa, 2008; Balconi and Mazza, 2009). This suggests that the features of MI-EEG are represented more distinctly in the time-frequency domain than in the time domain. In contrast, studies based on VEP paradigms prefer time-series signals as the input, because these signals are strictly time-locked and carry more obvious features in the time sequence (Basar et al., 1995; Kolev and Schurmann, 2009; Meng et al., 2014). The third form of input is the feature matrix, which can be extracted by wavelet analysis, entropy measures, STFT, power spectral density, and so on (Subasi, 2007; Filippo et al., 2009; Seitsonen et al., 2010; Lu et al., 2017; Lashgari et al., 2020).

In terms of implementation, DA can be divided into input space augmentation and feature space augmentation. The former has the advantages of interpretability and lower computational cost. However, based on the classification results presented in Table 2, we found that operations in feature space could obtain larger improvements than operations in input space. One explanation is that this type of DA model can extract an intrinsic representation of the data owing to its strong capacity for non-linear mapping and automatic feature extraction. Generative adversarial networks have become popular for generating EEG signals in recent years (Hung and Gan, 2021), although they have not yet been clearly demonstrated to be the most effective strategy across different EEG tasks. Due to the limited number of studies, it remains unclear which technique is the most popular. Consequently, researchers should select the appropriate DA method according to the paradigm type and feature representation of EEG.

Previous studies show that DA can improve the decoding accuracy of EEG to varying degrees in different EEG tasks. However, this improvement varies greatly across datasets and preprocessing modes. Several possible explanations deserve discussion. First, most studies have not discussed whether DA produces negative effects in the training stage of the classifier. As mentioned above, EEG signals are accompanied by strong noise and multi-scale artifacts, but existing DA methods are global operations that cannot effectively distinguish these irrelevant components. Meanwhile, EEG signals collected in specific BCI tasks (SSVEP, P300) exhibit time-locked and phase-locked features, so using GT to produce artificial data may lead to incorrect feature representations. In contrast, GT performs effectively in MI and ER tasks because these signals have no strict requirement for feature-locking. Therefore, the feature representation of EEG should be analyzed before GT is applied. Second, few studies discuss the boundary conditions of the feature distribution for generated data, even though this is one of the important guarantees of data validity.

Another important issue is how much generated data most effectively enhances the performance of a classifier. Researchers have explored the influence of different ratios of real data (RD) to generated data (GD) on classification performance and demonstrated that the enhancement effect does not increase with the size of GD (Zhang et al., 2020). Research on how the amount of training data affects classification performance when artificial data are used has indicated that improving performance requires at least doubling the size of GD (Zhang and Liu, 2018). Consequently, the size of the GD should be determined by multi-group trials with different mixing proportions.

Based on the above analysis, we believe that the following directions are worth exploring in further research. First, different DA methods can be combined to extend datasets, with augmentation executed in both input space and feature space. For example, data generated by GT can be fed into GANs to realize secondary augmentation, which may improve the diversity of generated data. Second, combining meta-learning with data augmentation might reveal why DA affects classification tasks, which may improve the interpretability of generated data. Meanwhile, GAN-based DA is a mainstream method at present, but how to improve the quality of generated data remains a valuable research question.
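
The recommendation that the size of GD be chosen through multi-group trials with different mixing proportions can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the procedure of any study cited here: the toy data, the Gaussian noise injection used as the generator, the nearest-centroid classifier, and all function names (`make_toy_data`, `noise_injection`, `nearest_centroid_accuracy`, `evaluate_mix_ratios`) are hypothetical stand-ins for a real EEG dataset, DA method, and DNN.

```python
import numpy as np


def make_toy_data(n_trials, n_features=8, seed=0):
    """Hypothetical stand-in for real EEG feature vectors (two balanced classes)."""
    rng = np.random.default_rng(seed)
    y = np.arange(n_trials) % 2  # alternate class labels 0 and 1
    X = rng.normal(size=(n_trials, n_features)) + y[:, None]  # class 1 shifted by +1
    return X, y


def noise_injection(X, y, n_new, sigma, rng):
    """Assumed generator: copy randomly chosen real trials and add Gaussian noise."""
    idx = rng.integers(0, len(X), size=n_new)
    X_gd = X[idx] + rng.normal(0.0, sigma, size=(n_new, X.shape[1]))
    return X_gd, y[idx]  # generated trials inherit the label of their source trial


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val):
    """Simple placeholder classifier; a DNN would be trained here in practice."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_val[:, None, :] - centroids[None, :, :], axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_val).mean())


def evaluate_mix_ratios(X_rd, y_rd, X_val, y_val, ratios, sigma=0.1, seed=0):
    """Run one trial per GD:RD ratio and return {ratio: validation accuracy}.

    The validation set contains real data only, so that generated trials
    never leak into the evaluation.
    """
    rng = np.random.default_rng(seed)
    results = {}
    for ratio in ratios:
        n_gd = int(round(ratio * len(X_rd)))  # amount of GD relative to RD
        X_gd, y_gd = noise_injection(X_rd, y_rd, n_gd, sigma, rng)
        X_train = np.concatenate([X_rd, X_gd])
        y_train = np.concatenate([y_rd, y_gd])
        results[ratio] = nearest_centroid_accuracy(X_train, y_train, X_val, y_val)
    return results


if __name__ == "__main__":
    X_rd, y_rd = make_toy_data(40, seed=1)    # small "real" training set
    X_val, y_val = make_toy_data(100, seed=2)  # held-out real data
    for ratio, acc in evaluate_mix_ratios(X_rd, y_rd, X_val, y_val,
                                          ratios=[0.0, 0.5, 1.0, 2.0]).items():
        print(f"GD:RD = {ratio:.1f} -> validation accuracy {acc:.3f}")
```

In practice, each ratio would be repeated over several random seeds or cross-validation folds, and the proportion with the best held-out performance would be selected. A ratio of 0.0 serves as the baseline without augmentation.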

Conclusion

Collecting large-scale EEG datasets is a difficult task due to the limitations of available subjects, experiment time, and operation complexity. Data augmentation has proven to be a promising approach to avoid overfitting and improve the performance of DNNs. Accordingly, this study discusses the state of research on DA for EEG decoding based on DNNs, analyzing studies from the past 5 years. Based on their results, we conclude that DA is able to effectively improve performance in EEG decoding tasks. This review presents current practical suggestions and performance outcomes. It may provide guidance for EEG research and help the field produce high-quality, reproducible results.

Author Contributions

CH was responsible for writing the manuscript. JL drew the related figure. YZ and WD guided the literature collection and the structure of the manuscript. All authors contributed to the article and approved the submitted version.

Conflict of Interest

CH and JL were employed by the company Shenzhen EEGSmart Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

52 in total

Review 1. A review of classification algorithms for EEG-based brain-computer interfaces.

Authors: F Lotte; M Congedo; A Lécuyer; F Lamarche; B Arnaldi
Journal: J Neural Eng Date: 2007-01-31 Impact factor: 5.379

2. Classification of imperfectly time-locked image RSVP events with EEG device.

Authors: Jia Meng; Lenis Mauricio Meriño; Kay Robbins; Yufei Huang
Journal: Neuroinformatics Date: 2014-04

3. Brain oscillations and BIS/BAS (behavioral inhibition/activation system) effects on processing masked emotional cues. ERS/ERD and coherence measures of alpha band.

Authors: Michela Balconi; Guido Mazza
Journal: Int J Psychophysiol Date: 2009-08-24 Impact factor: 2.997

4. EEG Data Augmentation for Emotion Recognition Using a Conditional Wasserstein GAN.

Authors: Yun Luo; Bao-Liang Lu
Journal: Annu Int Conf IEEE Eng Med Biol Soc Date: 2018-07

Review 5. Passive BCI beyond the lab: current trends and future directions.

Authors: P Aricò; G Borghini; G Di Flumeri; N Sciaraffa; F Babiloni
Journal: Physiol Meas Date: 2018-08-29 Impact factor: 2.833

6. Mixed Neural Network Approach for Temporal Sleep Stage Classification.

Authors: Hao Dong; Akara Supratak; Wei Pan; Chao Wu; Paul M Matthews; Yike Guo
Journal: IEEE Trans Neural Syst Rehabil Eng Date: 2017-07-28 Impact factor: 3.802

7. Data augmentation for self-paced motor imagery classification with C-LSTM.

Authors: Daniel Freer; Guang-Zhong Yang
Journal: J Neural Eng Date: 2020-01-31 Impact factor: 5.379

8. HS-CNN: a CNN with hybrid convolution scale for EEG motor imagery classification.

Authors: Guanghai Dai; Jun Zhou; Jiahui Huang; Ning Wang
Journal: J Neural Eng Date: 2020-01-06 Impact factor: 5.379

9. A Long Short-Term Memory deep learning network for the prediction of epileptic seizures using EEG signals.

Authors: Kostas M Tsiouris; Vasileios C Pezoulas; Michalis Zervakis; Spiros Konitsiotis; Dimitrios D Koutsouris; Dimitrios I Fotiadis
Journal: Comput Biol Med Date: 2018-05-17 Impact factor: 4.589

10. Changes in EEG power spectral density and cortical connectivity in healthy and tetraplegic patients during a motor imagery task.

Authors: Filippo Cona; Melissa Zavaglia; Laura Astolfi; Fabio Babiloni; Mauro Ursino
Journal: Comput Intell Neurosci Date: 2009-06-24
