Kai Zhang, Guanghua Xu, Xiaowei Zheng, Huanzhong Li, Sicong Zhang, Yunhui Yu, Renghao Liang.
Abstract
Current electroencephalography (EEG) decoding algorithms are mainly based on machine learning. One of the core assumptions of machine learning is that training and test data belong to the same feature space and follow the same probability distribution. However, this assumption is often violated in EEG processing: variations across sessions and subjects shift the feature distribution of EEG signals recorded for the same task, which reduces the accuracy of decoding models for mental tasks. Recently, transfer learning (TL) has shown great potential for processing EEG signals across sessions and subjects. In this work, we reviewed 80 related studies published from 2010 to 2020 on the application of TL to EEG decoding. We report which kinds of TL methods have been used (e.g., instance knowledge, feature representation knowledge, and model parameter knowledge), describe which types of EEG paradigms have been analyzed, and summarize the datasets that have been used to evaluate performance. Moreover, we discuss the state of the art and the future development of TL for EEG decoding. The results show that TL can significantly improve the performance of decoding models across subjects/sessions and can reduce the calibration time of brain-computer interface (BCI) systems. This review summarizes current practical suggestions and performance outcomes in the hope of providing guidance for future EEG research.
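The distribution shift described in the abstract is the motivation for feature-representation transfer. As an illustration only (not a method taken from this review), here is a minimal correlation-alignment sketch in NumPy: the source session's features are whitened with their own covariance and re-colored with the target session's statistics. The data and the `coral_align` helper are hypothetical.

```python
import numpy as np

def coral_align(Xs, Xt, eps=1e-3):
    """Map source features onto the target distribution (CORAL-style):
    whiten the source with its own covariance, then re-color with the
    target covariance and re-center on the target mean."""
    def mat_pow(C, p):  # symmetric matrix power via eigendecomposition
        w, V = np.linalg.eigh(C)
        return (V * w ** p) @ V.T
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    return (Xs - Xs.mean(0)) @ mat_pow(Cs, -0.5) @ mat_pow(Ct, 0.5) + Xt.mean(0)

# Synthetic "EEG features": the target session is shifted and rescaled.
rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, (200, 4)) * np.array([1.0, 2.0, 3.0, 4.0])
Xt = rng.normal(0.5, 1.0, (200, 4))
Xs_aligned = coral_align(Xs, Xt)  # second-order statistics now match the target
```

After alignment, a classifier trained on the source trials sees features whose mean and covariance match the target session, which is the basic mechanism behind many of the feature-transfer (FTL) methods surveyed below.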
Keywords: EEG; classification; decoding; review; transfer learning
Year: 2020 PMID: 33167561 PMCID: PMC7664219 DOI: 10.3390/s20216321
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Framework of an electroencephalography (EEG)-based brain–computer interface (BCI) system.
Inclusion and exclusion criteria.

| Inclusion Criteria | Exclusion Criteria |
|---|---|
| Published within the last 10 years (as transfer learning (TL) for EEG has been proposed and developed in recent years). | A focus on the processing of invasive EEG, electrocorticography (ECoG), magnetoencephalography (MEG), source imaging, fMRI, and so on, or joint studies with EEG. |
| A focus on non-invasive EEG signals (the object of discussion in this review). | No specific description of TL for EEG processing. |
| A specific explanation of how to apply TL to EEG signal processing. | \ |
Figure 2. The search method for identifying relevant studies.
Figure 3. The percentage of different EEG pattern strategies across collected studies.
Dataset.
| Datasets | Task | Subject | Channel | Amount of Data (Per Subject) | Sampling Rate | Reference |
|---|---|---|---|---|---|---|
| BCIC-II-IV | 2 MI classes | 1 | 28 | 3 sessions/416 trials | 1000 Hz | [ |
| BCIC-III-II | P300 | 2 | 64 | 5 sessions | 240 Hz | [ |
| BCIC-III-IVa | 2 MI classes | 5 | 118 | 4 sessions/280 trials | 1000 Hz | [ |
| BCIC-IV-2a | MI | 9 | 25 | 2 sessions/288 trials | 250 Hz | [ |
| BCIC-IV-2b | 2 MI classes | 9 | 6 | 720 trials | 250 Hz | [ |
| P300 speller | P300 | 8 | 8 | 5 sessions/20 trials | 256 Hz | [ |
| DEAP | ER | 32 | 40 | 125 trials | 128 Hz | [ |
| BCIC-III-IVc | MI | 1 | 118 | 630 trials | 200 Hz | [ |
| SEED | ER | 15 | 64 | 3 sessions/15 trials | 200 Hz | [ |
| OpenMIIR | Music Imagery | 10 | 64 | 5 sessions/12 trials in four tasks | 512 Hz | [ |
| CHB-MIT | ED | 22 | 23 | 844 hours’ collection | 256 Hz | [ |
Figure 4. Different approaches to transfer learning.
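Many of the model-transfer (MTL) rows in the table below fine-tune a network pre-trained on source subjects. A minimal sketch of that idea with plain logistic regression on synthetic data (all names, numbers, and the `train_logreg` helper here are illustrative, not taken from the reviewed studies): pre-train on a data-rich source subject, then seed the target subject's model with the source weights and continue training on a few calibration trials.

```python
import numpy as np

def train_logreg(X, y, w=None, lr=0.1, steps=200):
    """Logistic regression via gradient descent; passing a source-trained
    `w` seeds the parameters, i.e., model-parameter transfer."""
    Xb = np.hstack([X, np.ones((len(X), 1))])        # append bias column
    if w is None:
        w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.mean((Xb @ w > 0) == y)

rng = np.random.default_rng(2)
def subject(n, shift):  # synthetic two-class "EEG features" for one subject
    y = rng.integers(0, 2, n)
    X = rng.normal(0.0, 1.0, (n, 8)) + np.outer(2 * y - 1, np.ones(8)) + shift
    return X, y

Xs, ys = subject(500, 0.0)   # source subject: plenty of labeled data
Xt, yt = subject(20, 0.3)    # target subject: only a few calibration trials
Xe, ye = subject(500, 0.3)   # held-out target test set

w_src = train_logreg(Xs, ys)                            # pre-train on source
w_ft = train_logreg(Xt, yt, w=w_src.copy(), steps=20)   # fine-tune on target
```

The same pattern applies to the CNN-based entries: only the final training stage changes (few target trials, small number of steps), while the bulk of the parameters come from the source subjects.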
Summary of transfer learning for EEG decoding.
| Pattern | Reference | Type | Transfer Method | Feature Extraction | Datasets | Results |
|---|---|---|---|---|---|---|
| MI | [ | ITL | Similarity measurement with KL divergence | LR+CSP | 19 subjects’ data, BCIC-IV-2a, and BCIC-III-4a | 70.3%, 75%, and 75% |
| MI | [ | ITL | Informative subspace transferring and selective ITL with active learning | LDA | BCIC-IV-2b | \ |
| MI | [ | FTL | Ensemble learning and adaptive learning | LDA | NIPS | 68.1% |
| MI | [ | ITL | Similarity measurement with Jensen Shannon ratio and rule adaptation TL | CSP+LDA | BCIC-IV-2a | 77% |
| MRP | [ | ITL | MMD and regularized discriminative spatial pattern | Linear RR | BCIC-I-1 and BCIC-II-4 | \ |
| P300 | [ | ITL | Ensemble learning generic information | Bayesian LDA | 8 participants’ data | 62.5% |
| SSVEP | [ | ITL | Variability assessment Fisher’s discriminant ratios | Cluster | 8 subjects’ data | \ |
| P300 | [ | ITL | Dynamically adjusts weights of instances | Linear SVM | BCIC-III-2 and dataset of P300 speller | 74.9% |
| MI | [ | ITL | Selective informative with normalized entropy | LDA | BCIC-IV-2b | 75.8% |
| MI | [ | ITL | Selective informative expected decision boundary | LDA | BCIC-IV-2a | 75.6% |
| MI | [ | MTL | Domain adaptation parallel transport on the cone manifold of SPD | Linear SVM | BCIC-IV-2a | \ |
| DD | [ | ITL | Selective transfer learning | Linear regression | 15 subjects’ data | 66% |
| MI | [ | ITL | Composite local temporal correlation CSP Frobenius distance | Linear quadratic Mahalanobis | BCIC-III-IVa | 89.21% |
| SSVEP | [ | ITL | Ensemble learning and similarity measurement with mutual information | LDA | 10 healthy subjects’ data | \ |
| Cognitive detection | [ | ITL | Similarity measurement by Pearson’s correlation coefficient | SVM | 28 subjects’ data | 87.6% |
| ERP | [ | FTL | Probabilistic zero framework | Unsupervised adaptation | Akimpech dataset | \ |
| MI | [ | FTL | DA with power spectral density | CNN | BCIC-IV-2a | \ |
| MI | [ | FTL | Many-objective optimization | Linear SVM | BCIC-III-IVa | 75.8% |
| VEP | [ | FTL | Active semi-supervised TL | SVM | 14 subjects’ experiments | \ |
| MI | [ | FTL | Adaptive Selective CSP | Discriminant analysis | 6 participant experiments | 61% |
| MI | [ | FTL | CSA | Importance-weighted LDA | BCIC-III dataset | 79.1% |
| MI | [ | ITL | Instance TL based on p-hash | CNN | BCIC-IV-2b | \ |
| MI | [ | FTL | Informative TL with AL | LDA | BCIC-IV-2a | 67.7% |
| MI | [ | MTL | Modifications of CSP | SVM | BCIC-III-IVa | \ |
| MI | [ | MTL | \ | SVM+LDA | BCIC-IV-2b | \ |
| ER | [ | FTL | Transfer recursive feature elimination | Least-squares SVM | DEAP dataset | 78% |
| SSVEP | [ | MTL | Least-squares transformation | \ | 8 participant experiments | 82.1% |
| MI | [ | FTL | Domain transfer multiple kernel boosting | SVM | BCIC-III-IVa | 81.6% |
| SSVEP | [ | FTL | Spatial filtering transfer | TRCA | 10 subjects’ data | \ |
| SSVEP | [ | FTL | Reference template transfer | MestCCA | 10 subjects’ data | \ |
| SSVEP | [ | FTL | Reference template transfer | Transfer template-CCA | 12 subjects’ data | 85% |
| SSVEP | [ | FTL | Reference template transfer | Adaptive combined-CCA | 10 subjects’ data | 83% |
| MI | [ | FTL | Fuzzy TL based on generalized hidden-mapping RR | SVM | BCIC-IV-2a | 89.3% |
| MI | [ | MTL | Adaptive extreme learning machine | SVM+ELM | 12 subjects’ data | 71.8% |
| MI | [ | MTL | Classifier ensemble | LDA | BCIC-IV-2a | \ |
| MI | [ | FTL | Regularized CSP with TL | LDA | BCIC-III-IVa | 78.9% |
| Multitask | [ | FTL | Geometrical transformations on Riemannian Procrustes analysis | \ | 8 publicly available BCI datasets | \ |
| ERP | [ | FTL | Spectral transfer using information geometry | MDRM | 15 subjects’ data | 62% |
| MI | [ | FTL | Space adaptation | LDA | BCIC-IV-2a | 77.5% |
| MI | [ | FTL | Feature space transformation | LDA | PhysioNet datasets | 72% |
| MI | [ | FTL | Tangent space-based TL | LDA | BCIC-IV-2a | \ |
| ER | [ | FTL | Transfer component analysis and kernel principal component analysis | SVM | SEED | 77.96% |
| MI | [ | FTL | Transfer kernel CSP | SVM | BCIC-III-IVa | 81.14% |
| MI/ | [ | FTL | Affine transform | Minimum distance mean and Bayesian classifier | BCIC-IV-2a and Brain Invaders experiment | \ |
| SSVEP | [ | ITL | Riemannian similarities | Bootstrapping | 12 subjects’ data | 80.9% and 81.3% |
| MI | [ | FTL | Multitask learning | RR+SVM | 10 healthy subjects’ data and an ALS subject’s data | 85% |
| Imagined speech | [ | ITL | Inductive transfer learning | Naïve Bayesian classifier | 27 subjects’ data | 68.9% |
| MI | [ | FTL | Transferable discriminative | KNN+SVM | 5 subjects’ data | 74.4% |
| MI | [ | FTL | Nonstationary information transfers | LDA | 5 subjects’ data and BCIC-III-IVa | 80.4% |
| MI | [ | MTL | Fine-tuned based on VGG16 | CNN | BCIC-IV-2b | 74.2% |
| MI | [ | MTL | Fine-tuned based on pre-trained network | CNN | BCIC-IV-2a | 69.71% |
| ErrPs | [ | MTL | Fine-tuned based on pre-trained network | CNN | 15 epilepsy patients’ data | 81.50% |
| Music Imagination | [ | MTL | Fine-tuned based on AlexNet | LSTM | OpenMIIR dataset | 30.83% |
| ErrPs | [ | MTL | Fine-tuned based on pre-trained network | CNN | 31 subjects’ data | 84.1% |
| DD | [ | ITL | Source domain selection | Weighted adaptation regularization | 16 subjects’ data | \ |
| Attention | [ | ITL | Subject adaptation | CNN | 8 subjects’ data | 84.17% |
| MI | [ | ITL | Subject transfer | CNN | BCIC-IV-2a and | 0.56 and 0.65 |
| MI | [ | MTL | Fine-tuned based on pre-trained network | RBM | BCIC-IV-2a and | 88.9% |
| P300 | [ | MTL | Fine-tuned based on pre-trained network | CNN | BCIC-III-2 | 90.5% |
| MI | [ | MTL | Fine-tuned based on pre-trained network | CNN | BCIC-IV-2b | 0.57 (MK) |
| MI | [ | MTL | Fine-tuned based on pre-trained network | Conditional variational autoencoder | PhysioNet datasets | 73% |
| Music Imagination | [ | FTL | Cross-domain encoder | Attention decoder-RNN | OpenMIIR datasets | 37.9% |
| MI | [ | ITL | Covariate shift detection and adaptation | Linear SVM | BCIC-IV-2a | 73.8% and |
| MWA | [ | FTL | Cross-subject statistical shift | Random forest | 9 subjects’ data | \ |
| MI | [ | MTL | Fine-tuned based on multiple networks | CNN | BCIC-IV-2a and | \ |
| MI | [ | MTL | Four-strategy model transfer learning | Deep neural network | BCIC-IV-2a | \ |
| MI | [ | ITL | Subject–subject transfer | CNN | 3 subjects’ data | \ |
| P300 | [ | ITL | Subject–subject transfer | Linear SVM | 22 subjects’ data | 68.7% |
| ER | [ | ITL | Measurement on Riemannian geometry instance transfer | SVM | MDME and SDMN datasets | \ |
| DD | [ | FTL | Adaptation regularization based TL | Multiple classifier | 23 subjects’ data and | 89.59% |
| ED | [ | MTL | Fine-tuned based on pre-trained network | GoogLeNet and Inception v3 | TUH open-source database | 82.85% and |
| ED | [ | MTL | Fine-tuned based on pre-trained network | CNN and bidirectional LSTM | CHB-MIT EEG dataset | 99.6% |
| ED | [ | MTL | Fine-tuned based on pre-trained network | CNN | CHB-MIT EEG dataset | 92.7% |
| MI | [ | FTL | Spatial filtering transfer and matrix decomposition | ELM | BCIC-III-IVa and | 89% |
| SSVEP | [ | FTL | Spatial filtering transfer | Group TRCA | Benchmark dataset | \ |
| MI | [ | MTL | Adversarial inference | CNN | 52 subjects’ data | \ |
| ER | [ | FTL | Power spectral density feature | Polynomial/Gaussian kernels/ naïve Bayesian SVM | DEAP, MAHNOB-HCI, and DREAMER | \ |
| MWA | [ | FTL | Ensemble learning | Stacked denoising autoencoder | 8 subjects’ data | 92% |
| MI | [ | FTL | Data mapping and ensemble learning | LDA | BCIC-IV-2a | 0.58 (MK) |
| MI | [ | FTL | Center-based discriminative feature learning | CNN | BCIC-III-IVa | 76% |
“\” indicates that no specific result was reported, or that multiple different results were described.
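Several ITL entries in the table above weight or select source trials by how similar they are to the target distribution (e.g., KL-divergence or Pearson-correlation similarity measurements). A simplified stand-in for that idea in NumPy, weighting each source trial by its likelihood under a Gaussian fit to the target trials; the data and the `instance_weights` helper are hypothetical, not from any specific reviewed study:

```python
import numpy as np

def instance_weights(Xs, Xt, eps=1e-3):
    """Weight each source trial by its likelihood under a Gaussian fitted to
    the target trials, so source data resembling the target dominate."""
    mu = Xt.mean(0)
    C = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    Cinv = np.linalg.inv(C)
    d = np.einsum('ij,jk,ik->i', Xs - mu, Cinv, Xs - mu)  # squared Mahalanobis
    w = np.exp(-0.5 * d)
    return w / w.sum()                                    # normalize to sum to 1

rng = np.random.default_rng(1)
Xt = rng.normal(0.0, 1.0, (100, 3))    # target subject's trials
near = rng.normal(0.0, 1.0, (50, 3))   # source trials resembling the target
far = rng.normal(4.0, 1.0, (50, 3))    # source trials unlike the target
w = instance_weights(np.vstack([near, far]), Xt)
# Nearly all the weight lands on the source trials that match the target.
```

A weighted classifier (or a selective training set built by thresholding `w`) then uses the source subject's data roughly in proportion to its relevance, which is the core mechanism behind the instance-transfer (ITL) entries.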