Luca Pion-Tonachini, Ken Kreutz-Delgado, Scott Makeig.
Abstract
The ICLabel dataset comprises training and test sets of spatiotemporal features of electroencephalographic (EEG) independent components (ICs). The training set features were computed for over 200,000 EEG ICs from more than 6,000 existing EEG recordings. More than 8,000 of these ICs have accompanying crowdsourced IC labels across seven IC categories: Brain, Muscle, Eye, Heart, Line Noise, Channel Noise, and Other. The feature sets included in the ICLabel dataset are scalp topography images, channel-based scalp topography measures, power spectral density (PSD) measures (median, variance, and kurtosis), autocorrelation functions, equivalent current dipole (ECD) model fits for single and bilaterally symmetric dipole models, and features used in several published IC classifier approaches. The ICLabel test set comprises 130 ICs from 10 datasets not included in the training set. Each test set IC has an associated IC label estimated from labels provided by six ICA-EEG experts. Files necessary for adding to and amending the dataset are also included, along with a Python class containing useful methods for interacting with the dataset and IC classifications produced by several existing IC classifiers. These data are linked to the article "ICLabel: An automated electroencephalographic independent component classifier, dataset, and website" [1]. An active tutorial and crowdsourcing website is available at iclabel.ucsd.edu/tutorial/overview.
Keywords: Classification; Crowdsourcing; EEG; ICA
Year: 2019 PMID: 31294058 PMCID: PMC6595408 DOI: 10.1016/j.dib.2019.104101
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1. Graphical summary of an EEG independent component (IC). This is representative of what was shown to volunteer IC labelers who visited iclabel.ucsd.edu. The circle at the top left is a scalp topography. The time series at the top right shows IC activity, as does the plot at the bottom left. The bottom-center illustration shows the single-dipole and bilaterally symmetric dipole model fits. The bottom-right plot shows the IC power spectral density (PSD) on two different frequency scales. RV stands for “residual variance”, a measure of how well the dipole fit models the data. DMR stands for “dipole moment ratio”, the ratio of the stronger to the weaker dipole moment norm in the bilaterally symmetric dipole model.
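The two dipole-fit summaries named in the caption can be illustrated with a minimal sketch. It assumes RV is normalized in the way common in dipole fitting (unexplained squared scalp-map amplitude divided by total squared amplitude); the caption itself only says that RV reflects how well the dipole fit models the data. The function names `residual_variance` and `dipole_moment_ratio` are hypothetical and are not part of the ICLabel Python class.

```python
import numpy as np


def residual_variance(scalp_map, model_map):
    """Fraction of the IC scalp map left unexplained by the dipole model.

    Assumed definition: sum of squared residuals divided by the sum of
    squared scalp-map values (0 = perfect fit, 1 = nothing explained).
    """
    scalp_map = np.asarray(scalp_map, dtype=float)
    model_map = np.asarray(model_map, dtype=float)
    residual = scalp_map - model_map
    return float(np.sum(residual ** 2) / np.sum(scalp_map ** 2))


def dipole_moment_ratio(moment_a, moment_b):
    """Ratio of the stronger to the weaker dipole moment norm (always >= 1)."""
    norm_a = np.linalg.norm(moment_a)
    norm_b = np.linalg.norm(moment_b)
    return float(max(norm_a, norm_b) / min(norm_a, norm_b))
```

Under these assumptions, a DMR close to 1 means the two symmetric dipoles contribute about equally, while a large DMR means one dipole dominates the fit.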
"Handcrafted" IC features available in the ICLabel dataset.
| Feature | Origin | Description |
|---|---|---|
| Autocorrelation | SASICA | Autocorrelation with a lag of 20 ms |
| Focal scalp topography | SASICA | Interpolated scalp map showing IC projection polarity and relative strength across the scalp using EEGLAB |
| Signal to noise ratio | SASICA | Trial-based measure of evoked potentials (present in file |
| Signal variance | SASICA | Sample variance of the IC process activity |
| Temporal kurtosis | ADJUST | Sample kurtosis of the IC process activity |
| Spatial eye difference (SED) | ADJUST | Measure of anterior horizontal scalp projection distribution |
| Spatial average difference (SAD) | ADJUST | Difference between absolute projections to anterior and posterior scalp regions |
| Differential variance | ADJUST | Difference between squared projections to anterior and posterior scalp regions |
| Maximum epoch variance (MEV) | ADJUST | Ratio of maximum and mean trial variance |
| Median gradient value | FASTER | Median of first derivative of IC activity |
| Kurtosis of spatial map | FASTER | Spatial kurtosis of IC scalp projections |
| Hurst exponent | FASTER | Measure of time series “memory” |
| Channel count | – | Number of EEG electrode channels |
| IC count | – | Number of ICs in the decomposition |
| Scalp topography radius | – | Radius of the scalp topography image (using EEGLAB |
| Epoched dataset | – | Whether the IC activity is continuous or a series of trials |
| Sample rate | – | Sampling rate of the IC time series |
| Data points | – | Total number of sample points in the recording |
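Several of the time-domain features in the table reduce to short formulas. The sketch below is an illustration only, written from the one-line descriptions in the table; it is not the SASICA, ADJUST, or FASTER code, whose exact normalizations and epoch handling may differ. All function names are hypothetical, and the IC activity is assumed to be a 1-D NumPy array (or a trials-by-samples array for the epoch-based measure).

```python
import numpy as np


def signal_variance(activity):
    """Sample variance of the IC activity."""
    return float(np.var(activity, ddof=1))


def temporal_kurtosis(activity):
    """Sample kurtosis (non-excess): fourth central moment over squared variance."""
    x = np.asarray(activity, dtype=float)
    d = x - x.mean()
    return float(np.mean(d ** 4) / np.mean(d ** 2) ** 2)


def autocorrelation_at_lag(activity, srate, lag_ms=20.0):
    """Normalized autocorrelation of the IC activity at a lag given in ms."""
    x = np.asarray(activity, dtype=float)
    lag = int(round(lag_ms * srate / 1000.0))  # convert ms to samples
    if lag < 1 or lag >= x.size:
        raise ValueError("lag must be between 1 and len(activity) - 1 samples")
    d = x - x.mean()
    return float(np.dot(d[:-lag], d[lag:]) / np.dot(d, d))


def max_epoch_variance(epochs):
    """Ratio of maximum to mean trial variance; epochs is (n_trials, n_samples)."""
    trial_var = np.var(np.asarray(epochs, dtype=float), axis=1, ddof=1)
    return float(trial_var.max() / trial_var.mean())


def median_gradient(activity):
    """Median of the first difference (discrete derivative) of the IC activity."""
    return float(np.median(np.diff(np.asarray(activity, dtype=float))))
```

For example, at a sampling rate of 250 Hz the 20 ms lag corresponds to 5 samples, so `autocorrelation_at_lag(x, 250)` compares the activity with a copy of itself shifted by 5 samples.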
Specifications Table
| Subject area | |
| More specific subject area | |
| Type of data | |
| How data was acquired | |
| Data format | |
| Experimental factors | |
| Experimental features | |
| Data source location | |
| Data accessibility | |
| Related research article |
This dataset contains extensive summary statistics for over 200,000 independent components (ICs) of high-density EEG datasets, a subset of which are labeled. The data can be used to develop and evaluate EEG independent component classifiers. The EEG recordings included in this dataset encompass many experimental paradigms, recording environments, preprocessing recipes, and blind source separation algorithms. The data could be used in combination with other similar datasets. Meta-analysis can be performed on this dataset to learn common properties of EEG independent components including EEG effective brain sources.