Martin N. Hebart, Kai Görgen, John-Dylan Haynes.
Abstract
The multivariate analysis of brain signals has recently sparked a great amount of interest, yet accessible and versatile tools to carry out decoding analyses are scarce. Here we introduce The Decoding Toolbox (TDT) which represents a user-friendly, powerful and flexible package for multivariate analysis of functional brain imaging data. TDT is written in Matlab and equipped with an interface to the widely used brain data analysis package SPM. The toolbox allows running fast whole-brain analyses, region-of-interest analyses and searchlight analyses, using machine learning classifiers, pattern correlation analysis, or representational similarity analysis. It offers automatic creation and visualization of diverse cross-validation schemes, feature scaling, nested parameter selection, a variety of feature selection methods, multiclass capabilities, and pattern reconstruction from classifier weights. While basic users can implement a generic analysis in one line of code, advanced users can extend the toolbox to their needs or exploit the structure to combine it with external high-performance classification toolboxes. The toolbox comes with an example data set which can be used to try out the various analysis methods. Taken together, TDT offers a promising option for researchers who want to employ multivariate analyses of brain activity patterns.
Keywords: decoding; fMRI; multivariate pattern analysis; pattern classification; representational similarity analysis; searchlight
Year: 2015 PMID: 25610393 PMCID: PMC4285115 DOI: 10.3389/fninf.2014.00088
Source DB: PubMed Journal: Front Neuroinform ISSN: 1662-5196 Impact factor: 4.081
Figure 1. General structure of The Decoding Toolbox (TDT). (A) TDT in the view of basic users. All that is required are brain images (ideally preprocessed with SPM) and a configuration variable cfg that contains all decoding-relevant information. TDT will then generate results, including .mat-files and, if required, brain maps displaying the decoded information in space. (B) TDT in the view of intermediate users. Decoding design creation, type of analysis, type of classifier and type of output can be modified. All of these settings are necessary for any decoding analysis; if not specified by the user, they are set to defaults. This level of description already covers most scenarios that the typical user would encounter. (C) All TDT options. For the optional functions including feature selection, feature transformation, scaling, and parameter selection, TDT offers a number of preconfigured settings which can be customized. Expert users can extend the toolbox to include new methods (e.g., classifiers, feature selection methods) or can even create an interface to external machine learning packages.
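The layered design in the caption, in which defaults cover basic users while intermediate and expert users override individual settings, can be sketched as a configuration merge. This is a minimal Python illustration of that idea, not TDT code; the field names are hypothetical stand-ins loosely modeled on the cfg variable described above.

```python
def with_defaults(user_cfg):
    """Merge a user-supplied configuration over toolbox-style defaults.

    NOTE: field names are illustrative, not TDT's actual cfg fields.
    """
    defaults = {
        "analysis": "searchlight",   # or "roi", "wholebrain"
        "classifier": "svm",
        "output": "accuracy",
    }
    cfg = dict(defaults)
    cfg.update(user_cfg)             # user settings override defaults
    return cfg

# A basic user supplies almost nothing; defaults fill in the rest.
cfg = with_defaults({"analysis": "roi"})
```

The same pattern lets an expert user override every field while a basic user touches none of them.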
Important terminology for multivariate decoding with The Decoding Toolbox.
| Beta maps | After estimating the general linear model (GLM), beta maps contain the parameter estimates (beta weights) of each regressor for every voxel; these estimates are commonly used as input samples for decoding |
| Chunk | A unit that determines which data should remain together within a cross-validation fold, typically all images from one run (see Figure 3A) |
| Cross-classification | Not to be confused with cross-validation. Training a classifier on samples from one set of conditions and testing it on samples from a different set (e.g., training on one task and testing on another) |
| Cross-validation | Method to estimate how well a classifier generalizes to novel data. In leave-one-run-out cross-validation, data are partitioned by run: in each fold the classifier is trained on all runs but one and tested on the left-out run, until every run has served as test set once |
| Curse of dimensionality | In machine learning, the fact that classifier performance (i.e., the predictive power of a classifier) drops when the number of features (e.g., voxels) becomes much larger than the number of samples (e.g., brain images) |
| Decoding step | An iteration of a decoding analysis which is part of the cycle of evaluating the classifier; in cross-validation, each fold is one decoding step (cf. the horizontal axis in Figure 3A) |
| Feature selection | Methods that reduce the number of features (e.g., voxels) that enter the decoding analysis |
| General linear model (GLM) | A statistical model that incorporates analysis of variance, linear regression, and related parametric tests into a common framework. In brain imaging, the GLM is commonly used to explain each voxel's time series separately with multiple linear regressors each representing conditions of interest or nuisance variables (Friston et al.) |
| Hyperplane | The generalization of a plane to higher-dimensional spaces. Typically, the term refers to the decision boundary of a linear classifier, which separates the classes in feature space |
| c parameter | A regularization parameter which influences the complexity of a classification model. In most cases, an appropriate value is determined on the training data via nested parameter selection |
| Margin | For support vector machines, the distance between the separating hyperplane and the closest samples of either class (the support vectors) |
| Searchlight analysis | One of the three most common types of decoding analysis, the two others being whole-brain and region-of-interest analyses. A small sphere of voxels is moved across the brain, a separate decoding analysis is run at each position, and the result is stored at the sphere's center voxel, yielding a whole-brain information map (see Figure 2) |
| Support vector machine | A type of classifier that maximizes the margin between the separating hyperplane and the samples of both classes |
| Weight vector | Determines the contribution of each feature to the final classifier function. For most classifiers, the weight vector cannot be directly interpreted as reflecting the classified variable, because it is a filter that extracts a class signal while at the same time it suppresses correlated noise. Using the covariance of the data, the weight vector can be converted into an interpretable activation pattern |
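The covariance transformation mentioned in the Weight vector entry (proposed by Haufe et al., 2014) amounts, for a single linear classifier, to multiplying the data covariance matrix by the weight vector. A minimal pure-Python sketch of that computation, with illustrative function names (this is a conceptual demonstration, not TDT's implementation):

```python
def cov_matrix(X):
    """Sample covariance of a list of feature vectors (rows of X)."""
    n, d = len(X), len(X[0])
    means = [sum(x[j] for x in X) / n for j in range(d)]
    return [[sum((x[i] - means[i]) * (x[j] - means[j]) for x in X) / (n - 1)
             for j in range(d)] for i in range(d)]

def weights_to_pattern(X, w):
    """Convert a linear classifier's weight vector w into an activation
    pattern (up to scaling) by multiplying with the data covariance."""
    C = cov_matrix(X)
    return [sum(C[i][j] * w[j] for j in range(len(w))) for i in range(len(w))]
```

For perfectly correlated features, a weight vector like [1, -1] (a filter canceling shared noise) maps to a zero pattern, illustrating why raw weights and interpretable patterns can differ sharply.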
Figure 2. General analysis stream of a typical searchlight decoding analysis in The Decoding Toolbox (TDT). Top: Typical preprocessing of data is done prior to running TDT, for example with the common software SPM. Rather than submitting individual images to decoding analyses, it has become quite common to use temporal compression or statistical estimates of trials (trial-wise decoding) or of multiple trials within one run (run-wise decoding) as data for classification. Middle: The decoding stream of cross-validated searchlight decoding. After selecting voxels from a searchlight and extracting data from the preprocessed images, a leave-one-run-out cross-validation is performed. Data are partitioned into folds: in each fold, data from one run are used for testing and data from all other runs for training. In each fold, a classifier is trained and its performance is validated on the left-out test set. Finally, the performance of the whole cross-validation is evaluated, typically by calculating the mean accuracy across all cross-validation iterations. The accuracy is then stored at the center of the current searchlight. The whole procedure is repeated for all voxels in the brain, yielding a complete map of cross-validation accuracies. Bottom: Usually, these searchlight maps are post-processed using standard random effects analyses, for example using SPM's second-level routine.
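The leave-one-run-out scheme in the middle panel can be sketched in a few lines of pure Python. This is a conceptual illustration only: a simple nearest-centroid classifier stands in for the SVM, and all names are hypothetical rather than TDT functions.

```python
def nearest_centroid_fit(X, y):
    """Train: compute one mean pattern (centroid) per class."""
    cents = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def nearest_centroid_predict(cents, x):
    """Predict the class whose centroid is closest (squared Euclidean)."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(cents, key=lambda lab: dist(cents[lab], x))

def leave_one_run_out(X, y, runs):
    """Each run serves as test set once; accuracy is averaged over folds."""
    accs = []
    for test_run in sorted(set(runs)):
        train = [(x, lab) for x, lab, r in zip(X, y, runs) if r != test_run]
        test = [(x, lab) for x, lab, r in zip(X, y, runs) if r == test_run]
        cents = nearest_centroid_fit([x for x, _ in train],
                                     [lab for _, lab in train])
        hits = [nearest_centroid_predict(cents, x) == lab for x, lab in test]
        accs.append(sum(hits) / len(hits))
    return sum(accs) / len(accs)
```

In a searchlight analysis this whole routine would be repeated for every sphere of voxels, and the returned mean accuracy stored at the sphere's center.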
Figure 3. Decoding design matrices. (A) General structure of a decoding design matrix. The vertical dimension represents different samples that are used for decoding, typically brain images or data from regions of interest. If multiple images are required to stay together within one cross-validation fold (e.g., runs), this is indicated as a chunk. The horizontal axis depicts different cross-validation steps or iterations. If groups of iterations should be treated separately, these can be denoted as different sets. The color of a square indicates whether in a particular cross-validation step a particular brain image is used. (B) Example design for the typical leave-one-run-out cross-validation (function make_design_cv). (C) Example design for a leave-one-run-out cross-validation where the amount of data differs across runs. To balance the training data, bootstrap samples are drawn from each run without replacement (function make_design_boot_cv). (D) Example design for a cross-classification design which does not maintain the leave-one-run-out structure (function make_design_xclass). (E) Example design for a cross-classification design maintaining the leave-one-run-out structure (function make_design_xclass_cv). (F) Example design combining two leave-one-run-out designs as two different sets within the same decoding analysis. Results are reported combined or separately for each set, which can speed up decoding.
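The design matrix of panel (B) is essentially a mapping from chunk labels to train/test roles per decoding step. The following is a hypothetical Python analogue of what make_design_cv produces conceptually, not its actual implementation:

```python
def make_cv_design(chunks):
    """Build a leave-one-run-out design: one column (decoding step) per
    unique chunk; a sample is 'test' in the step that leaves its chunk
    out and 'train' in all other steps (cf. Figure 3B)."""
    steps = sorted(set(chunks))
    # design[i][k] is the role of sample i in decoding step k
    return [["test" if c == steps[k] else "train" for k in range(len(steps))]
            for c in chunks]

# Six samples in three runs (chunks) yield a 6 x 3 design.
design = make_cv_design([1, 1, 2, 2, 3, 3])
```

The variants in panels (C) through (F) would differ only in how the train/test roles are filled in per column (balanced subsampling, cross-classification, or multiple sets).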
Figure 4. Results of simulations. (A) On the left, the difference between the mean images of the two classes is shown. On the right, the results from Simulation 1 (searchlight analysis) are plotted. (B) Results from Simulations 2A and 2B. The left panel shows the weights of the SVM trained on the data in (A). On the right, the results from a recursive feature elimination are plotted. (C) Results from Simulation 3. The left panel shows the ROIs that were selected. The right panel shows decoding accuracies in the two ROIs depending on the SNR.
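Recursive feature elimination, as used in Simulation 2B, repeatedly retrains a classifier and discards the least informative feature. A pure-Python sketch of the loop, in which a difference-of-class-means "weight" stands in for the SVM weights an actual implementation would use (all names are illustrative):

```python
def train_weights(X, y):
    """'Weights' of a simple binary linear rule: per-feature difference
    of the two class means (a stand-in for SVM weights)."""
    d = len(X[0])
    m = {lab: [sum(x[j] for x, l in zip(X, y) if l == lab) /
               sum(1 for l in y if l == lab) for j in range(d)]
         for lab in set(y)}
    a, b = sorted(m)                      # the two class labels
    return [m[a][j] - m[b][j] for j in range(d)]

def recursive_feature_elimination(X, y, n_keep):
    """Retrain and drop the feature with the smallest absolute weight
    until n_keep features remain; returns surviving feature indices."""
    keep = list(range(len(X[0])))
    while len(keep) > n_keep:
        w = train_weights([[x[j] for j in keep] for x in X], y)
        worst = min(range(len(keep)), key=lambda i: abs(w[i]))
        del keep[worst]
    return keep
```

The crucial property, retraining after every removal, is what distinguishes recursive elimination from a one-shot feature ranking.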
Figure 5. Results of analyses on the Haxby et al. (2001) data set. (A) Confusion matrix reflecting the confusion of all eight classes in ventral temporal cortex, averaged across all 6 subjects. (B) Searchlight multiclass classification results of subject 1 (permutation p < 0.001, cluster-corrected).
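A confusion matrix like the one in panel (A) tabulates, for each true class (rows), how often each class was predicted (columns); overall accuracy is the diagonal mass divided by the total count. A minimal pure-Python sketch with illustrative names:

```python
def confusion_matrix(true, pred, classes):
    """Count matrix: rows index the true class, columns the predicted class."""
    idx = {c: i for i, c in enumerate(classes)}
    M = [[0] * len(classes) for _ in classes]
    for t, p in zip(true, pred):
        M[idx[t]][idx[p]] += 1
    return M

def accuracy_from_confusion(M):
    """Overall accuracy = sum of the diagonal / total number of samples."""
    total = sum(sum(row) for row in M)
    return sum(M[i][i] for i in range(len(M))) / total
```

For the eight-class analysis above, the matrix would be 8 x 8, and off-diagonal mass shows which categories are confused with one another.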