Literature DB >> 34806255

Early soft and flexible fusion of electroencephalography and functional magnetic resonance imaging via double coupled matrix tensor factorization for multisubject group analysis.

Christos Chatzichristos¹, Eleftherios Kofidis^2,3, Wim Van Paesschen⁴, Lieven De Lathauwer^1,5, Sergios Theodoridis^6,7, Sabine Van Huffel¹.

Abstract

Data fusion refers to the joint analysis of multiple datasets that provide different (e.g., complementary) views of the same task. In general, it can extract more information than separate analyses can. Jointly analyzing electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) measurements has been proved to be highly beneficial to the study of the brain function, mainly because these neuroimaging modalities have complementary spatiotemporal resolution: EEG offers good temporal resolution while fMRI is better in its spatial resolution. The EEG-fMRI fusion methods that have been reported so far ignore the underlying multiway nature of the data in at least one of the modalities and/or rely on very strong assumptions concerning the relation of the respective datasets. For example, in multisubject analysis, it is commonly assumed that the hemodynamic response function is a priori known for all subjects and/or the coupling across corresponding modes is assumed to be exact (hard). In this article, these two limitations are overcome by adopting tensor models for both modalities and by following soft and flexible coupling approaches to implement the multimodal fusion. The obtained results are compared against those of parallel independent component analysis and hard coupling alternatives, with both synthetic and real data (epilepsy and visual oddball paradigm). Our results demonstrate the clear advantage of using soft and flexible coupled tensor decompositions in scenarios that do not conform with the hard coupling assumption.

Entities: Chemical

Keywords: EEG; double CMTF; fMRI tensor decompositions; flexible; fusion; seizure onset; soft

Mesh：

Year: 2021 PMID： 34806255 PMCID： PMC8837580 DOI： 10.1002/hbm.25717

Source DB: PubMed Journal: Hum Brain Mapp ISSN： 1065-9471 Impact factor: 5.038

INTRODUCTION

To better understand a system as complex as the human brain, multimodal measurements can be beneficial since they are able to provide information on complementary aspects of the same system. Through jointly analyzing the data that result from different modalities, their individual advantages may be exploited and at the same time some of their disadvantages can be mitigated (Adalı, Levin‐Schwarz, & Calhoun, 2015a; Lahat, Adalı, & Jutten, 2015). In this way, a more accurate localization of the activated brain areas can be achieved. Two of the most commonly used neuroimaging modalities are the electroencephalography (EEG) and functional magnetic resonance imaging (fMRI). fMRI is a noninvasive brain imaging technique, which indirectly studies brain activity by measuring fluctuations of the blood‐oxygen‐level‐dependent (BOLD) signal (neurovascular coupling) (Lindquist, 2008). The first BOLD fluctuation occurs roughly 2–3 s after the onset of the neural activity, when the oxygen‐rich (oxygenated) blood starts displacing the oxygen‐depleted (deoxygenated) blood. This rises to a peak after 4–6 s, before falling back to the baseline (and typically undershooting slightly). The BOLD poststimulus neuronal response is consistent with brain processes that alter the coupling between metabolism (need of oxygen) and blood flow (Huster, Debener, Eichele, & Herrmann, 2012; Logothetis & Pfeuffer, 2004). The time course of the BOLD signal corresponding to a transient neural activity is called the hemodynamic response function (HRF). Although fMRI has a high spatial resolution, often at the millimeter scale, it is both an indirect and a “delayed” measure of the brain activity, with its temporal resolution being limited by the repetition time (TR) of the scanner, usually of the order of seconds (Lindquist, 2008). Other than the resolution limitation, fMRI has been reported to underestimate the cortical activity near the ischemic area in patients with stroke, or in cases where the neurovascular coupling has been affected due to pathological issues or due to some medication (pharmacological studies) (Vitali, Di Perri, Vaudano, Meletti, & Villani, 2015). EEG provides information about the neural electrical activity in the brain as a function of time. This is done with the aid of multiple electrodes that are placed at certain locations over the scalp or (in rarer cases, in intracranial EEG) over the cortex under the skull. The EEG signal results from the electrical measurement of the neuronal activation, realized through the movement of charged ions at the junction between the synapses of (the dendrites of) the neurons. This provides a more direct measure of the neuronal activity compared to fMRI and hence a better temporal resolution (sensitive to millisecond changes in neural processing). However, EEG has poor spatial resolution, limited by the number of electrodes employed and the resistive properties of the extracerebral tissues. Furthermore, due to the fact that electrodes are more sensitive to neural activations that occur closer to the scalp, determining the exact location of the activations that take place in deeper areas is more challenging (Sanei & Chambers, 2007; Vitali et al., 2015). For example, in epilepsy related applications, the EEG signal is being used in order to identify the exact time of occurrence of the seizure event while the higher spatial resolution of fMRI data helps to delineate the seizure onset zone. The complementary nature of their spatiotemporal resolution motivates the fusion of EEG and fMRI toward a better (in both time and space) localization of the brain activity (Karahan, Rojas‐Lopés, Bringas‐Vega, Valdés‐Hernandéz, & Valdes‐Sosa, 2015; Levin‐Schwartz, 2017). The factorization or decomposition of data is a versatile instrument for data fusion, which consists of factorizing all involved data matrices (or tensors) so that one or more factors are shared (fully or partially and in a hard or smooth sense) among them. We can couple (fuse) those different decompositions by adding extra constraints on the data. Fusing data is only meaningful in the presence of a common representation that allows to exploit a relationship among the different datasets. Decomposing the data can precisely offer such a common representation or latent space (Lahat et al., 2015). The goal of different types of decompositions is to estimate the different components (“causes” or sources) that underlie and/or generate the data, based only on the observations, that is, “blindly.” Hence, the localization of the activated brain areas in the fusion of EEG and fMRI is a challenging joint blind source separation (BSS) problem (Theodoridis, 2020), in which the sources consist of a combination of the activated areas (spatial maps in fMRI and topoplots in EEG) and time courses. BSS methods that use as input matrices are mostly based on independent component analysis (ICA) (Calhoun, Liu, & Adalı, 2009; Hunyadi, De Vos, Van Paesschen, & Van Huffel, 2015; Mijović et al., 2012; Swinnen, Hunyadi, Acar, Van Huffel, & De Vos, 2014) and rely on the concatenation of different modes. They have been, up to recently, the state of the art for jointly analyzing EEG and fMRI. ICA is a powerful tool for separating a multivariate signal into additive components based on the assumption that they are statistically independent. However, by definition such methods fall short in exploiting the inherently multiway nature of these data. fMRI and EEG datasets are inherently multidimensional, comprising information in time and along different voxels or channels, subjects, trials, etc. For EEG, in order to unveil more of the latent information, the signal can be expanded along additional modes. These extra modes can be used to represent features from the frequency domain (e.g., via a wavelet transform) or the repeated responses along a trial/event‐related potential (ERP) mode (ERP is the response immediately after a specific sensory, cognitive, or motor stimulus) (Cong et al., 2015). This multidimensional nature of the EEG and fMRI datasets points to the adoption of tensor models instead of the matrix ones. Several tensor decomposition models have been applied in fMRI and EEG BSS, including canonical polyadic decomposition (CPD) or PARAllel FACtor analysis (PARAFAC) (Acar, Aykut‐Bingo, Bingol, Bro, & Yener, 2007; Andersen & Rayens, 2004), and its generalizations known as PARAFAC2 (Chatzichristos, Kofidis, & Theodoridis, 2017; Spyrou, Parra, & Escudero, 2019) and block term decomposition (BTD) (Chatzichristos, Kofidis, Moreno, & Theodoridis, 2019; De Lathauwer, 2012). The tensor decomposition models can (a) improve the ability of extracting spatiotemporal patterns of interest (Andersen & Rayens, 2004; Helwig & Hong, 2013; Stegeman, 2007), (b) facilitate neurophysiologically meaningful interpretations (Andersen & Rayens, 2004), and (c) produce unique (modulo scaling and permutation ambiguities) representations under mild conditions (Sidiropoulos & Bro, 2000). Those mild conditions can, in general, be more relaxed in the case of coupled tensor decompositions compared to their single‐tensor counterparts. It has been demonstrated that coupling through one or more common modes, by sharing the corresponding factors among the tensor decompositions, can ensure uniqueness that is not possible when considering separate decompositions (Sørensen, Domanov, & De Lathauwer, 2015). Moreover, tensorial methods are able to make predictions more robustly in the presence of noise, compared to their two‐way counterparts (Andersen & Rayens, 2004; Chatzichristos, Kofidis, et al., 2019; Sidiropoulos et al., 2017). This is very important in view of the fact that the biomedical data are usually highly corrupted by noise (Andersen & Rayens, 2004). In the context of fusion, the multiway nature of EEG has been exploited in some earlier methods (Acar, Levin‐Schwartz, Calhoun, & Adalı, 2017a; Hunyadi, De Vos, Van Paesschen, & Van Huffel, 2016), but it has been so far mostly neglected for fMRI. Furthermore, most of the methods rely on preprocessing of the fMRI data in the general linear model (GLM) framework. A spatial map of interest (areas of activation) per subject is extracted from the fMRI data and all the spatial maps are vectorized and stacked in a matrix (space × subjects). That is the spatial correlations and the extra dimension of time are discarded and the analysis relies on features (instead of raw data) and coupled matrix tensor factorization (CMTF) to solve the joint BSS problem. In the GLM framework, a canonical HRF is assumed to be known (and be invariant in space and among subjects), the expected signal changes are defined as regressors of interest in a multiple linear regression task and the estimated coefficients are tested against a null hypothesis. The assumption of intrasubject and intersubject variability of HRF is not accurate. It is known that HRF varies among different areas, subjects, and even tasks (Aguirre, Zarahn, & D'Esposito, 1998; Handweker, Ollinger, & Esposito, 2004; Van Eyndhoven, Hunyadi, Dupont, Van Paesschen, & Van Huffel, 2019). Alternative methods have been proposed for the fMRI analysis, which accommodate possible discrepancies between the assumed HRF and the actual one, and cope with possible resulted distortions and inaccuracies, by the adoption either of flexible dictionaries (Morante, 2020) or semiblind ICA approaches (Calhoun, Stevens, Kiehl, & Pekar, 2005). The assumptions used by GLM may introduce bias on the size of activation and on the estimates of its variance, thus affecting test statistics, power, and false positive rates, and so forth (Monti, 2011; Olszowy, Aston, Rua, & Williams, 2019). Furthermore, as it has been reported in some studies (Vitali et al., 2015), the neurovascular coupling could be also altered in epileptic patients; hence, the commonly used assumptions for the HRF may not be valid anymore. The goal of this article is to demonstrate the gains from: Using raw data instead of features (early fusion), omitting any possible bias in the resulting estimates, as previously explained, in an effort to fully exploit the information that underlies the raw data (Lahat et al., 2015; Ramachandram & Taylor, 2017). Exploiting the multiway nature of both modalities. Using models that do not assume hard coupling among the different modes of the modalities, in order to avoid problems due to mismodeling. Two different approaches that do not assume hard coupling will be explored, namely flexible and soft coupling tensor models. In flexible coupling, the HRF is computed as part of the optimization, while in soft coupling the HRF is constrained to be similar (but not identical) to a reference model of it. Hence, we propose the use of soft coupled tensor–tensor decomposition in analyses, where all the subjects are expected (due to the experiment settings) to have similar time courses and soft (or flexible) double CMTF (DCMTF) otherwise. In both approaches, the existence of nonshared components is allowed in order to model the possible existence of neural sources or artifacts that are not reflected in both modalities. Furthermore, we propose a lighter alternative model for the HRF (with fewer parameters), which will be used as a reference in the flexible model, in an attempt to decrease the computational cost of optimizing the HRF. We compare the proposed methods with methods based on ICA, hard coupling and uncoupled CPD per modality. A shorter version of this article appears in Chatzichristos, Kofidis, De Lathauwer, Theodoridis, and Van Huffel (2020). The soft coupled tensor–tensor decomposition approach was first proposed in our earlier work (Chatzichristos, Davies, Escudero, Kofidis, & Theodoridis, 2018), where the adoption of tensor models for both modalities was proposed for the first time. This article is organized as follows. The adopted notation is introduced in the following subsection. In the next section, we will briefly discuss the categorization of different fusion approaches and we will provide a detailed overview of tensor‐based approaches to the fusion of EEG and fMRI. In Section 3, after introducing the reader to tensor decomposition and the statistical methods used in the fusion of EEG and fMRI (Sections 3.1 and 3.2), we will present the alternative model for the HRF (Section 3.3). In the remaining subsections, the two different approaches for the fusion of EEG and fMRI will be introduced, namely soft coupled tensor–tensor decomposition and soft (or flexible) DCMTF. In Section 4, we will present comparative results with both simulated and real data (epilepsy and visual oddball paradigm) analyzed. In the last part of this article, we will discuss our findings and draw our conclusions.

Notation

Vectors, matrices, and higher‐order tensors are denoted by bold lower‐case, upper‐case, and calligraphic upper‐case letters, respectively. For a matrix , ⊤ and † denote its transpose and pseudo‐inverse, respectively. An entry of a vector , a matrix , or a (third‐order) tensor is denoted by a , a , or a , respectively. :,1: is used to denote the columns 1 to j of a matrix and denotes the jth column of . The column‐wise Khatri–Rao product of two matrices, and is denoted by , with , being the jth columns of , , respectively and ⊗ denoting the Kronecker product. The outer product of two tensors is denoted by ∘. For an Nth‐order tensor, , is its mode–n unfolded (matricized) version (whose rank is known as mode–n rank), which results from mapping the tensor element with indices (i 1, i 2, …, i ) to a matrix element (i , j), with , . The Frobenius norm of the tensor is defined as: .

DATA FUSION

Categorization of data fusion approaches

EEG–fMRI fusion is only one of a multitude of examples of data fusion. Data fusion generally refers to the analysis of several datasets where they interact with and inform each other. Different types of fusion can be realized (Karahan et al., 2015; Lahat et al., 2015; Ramachandram & Taylor, 2017) but generally, the definition may differ with regard to the degree of generality and the specific research areas (Cocchi, 2019). Various types of applications, involving diverse sets of inter‐related data, have emerged. These include metabolomics (Acar, Bro, & Smilde, 2015), array processing (Sørensen & De Lathauwer, 2013; Sørensen & De Lathauwer, 2017), sentiment analysis (Zadeh, Chen, Poria, Cambria, & Morency, 2017), link prediction (Ermiş, Acar, & Cemgil, 2013) and, of course, biomedical applications (Adalı et al., 2015a; Adalı, Levin‐Schwarz, & Calhoun, 2015b; Calhoun & Adalı, 2008; Karahan et al., 2015; Lahat et al., 2015; Ramachandram & Taylor, 2017), among many others. Data fusion approaches can be categorized in various ways. The main categorizations are based on the (a) the level and (b) way the fusion is performed (Figure 1). Two main levels and two sublevels have been defined and become a reference classification (Llinas & Hall, 1998; Ramachandram & Taylor, 2017), namely, “early”/low‐level fusion and “late” fusion, which is subdivided into mid‐level fusion and high‐level fusion. In “early”/low‐level fusion (or observational level), raw datasets are used. Late mid‐level (feature level or state‐vector level) fusion is considered when the data fusion methods operate on features extracted from each dataset separately, so, instead of using raw data for the problem at hand (e.g., classification), features of the data are utilized. The late high‐level (decision/information level) fusion methods model each dataset separately and only decisions (model outcome) from processing of each data block are fused.

FIGURE 1

Different types of data fusion approaches

Different types of data fusion approaches The categorization based on the way the fusion is performed is twofold. The earliest approaches to EEG–fMRI fusion (and a large number of recent ones, for example, (Ferdowsi, Abolghasemi, & Sanei, 2015)) are essentially “integrative” (or “asymmetric” (Huster et al., 2012)) in nature (Lahat et al., 2015). The rationale behind them is to employ objective functions for the decomposition of the fMRI signal with constraints based on information previously derived from the EEG analysis (or vice versa), the so‐called EEG‐aided fMRI analysis (or fMRI‐aided EEG analysis, respectively). Recently, the emphasis has been turned to “true” (or “symmetric”) fusion, for example, (Acar et al., 2017a; Acar, Levin‐Schwartz, Calhoun, & Adalı, 2017b; Hunyadi et al., 2016; Karahan et al., 2015; Martínez‐Montes, Valdés‐Sosa, Miwakeichi, Goldman, & Cohen, 2004; Van Eyndhoven, Hunyadi, De Lathauwer, & Van Huffel, 2017), where the decomposition of the data from each modality can influence the other using all the common available information that may exist, and the decomposition of the datasets is simultaneous, allowing full interactions with each other. In order to fully exploit all the latent information that lies in multimodal datasets, approaches that allow direct integration of information across modalities in the context of true fusion (Huster et al., 2012) are needed. The true fusion is realized through coupling of the data across shared modes/ways. Detailed reviews of such methods can be found in, for example, (Adalı et al., 2015a; De Lathauwer & Kofidis, 2017). Coupling is realized in the corresponding optimization through imposing constraints that relate the shared components/factors. A unifying algorithmic framework that implements both hard and soft CMTF approaches recently appeared in Schenker, Cohen, and Acar (2021).

Fusion of EEG and fMRI

EEG–fMRI fusion has been employed in different applications. Presurgical evaluation of epileptic patients is one of its first and main applications, in search of an improved localization of the neural sources of epileptogenic activity (Adalı et al., 2015b; Hunyadi, Dupont, Van Paesschen, & Van Huffel, 2017; Huster et al., 2012). Furthermore, EEG and fMRI fusion has been used in order to address research questions in cognitive neuroscience in the context of classical cognitive experiments, such as oddball paradigms (Walz et al., 2013) or auditory detection tasks (Sadaghiani et al., 2010). Also, fusion has been used for the accurate identification of the functional connectivity or the default modes of the brain by assessing associations between spontaneous EEG oscillations and fluctuations of the fMRI signal in resting state (Huang, Long, & Lei, 2019; Vitali et al., 2015). Although fMRI is not commonly used for brain–computer interface (BCI) applications, due to its immobility, recently fusion of EEG–fMRI via tensor decompositions has also been used in a training phase of a BCI (Deshpande, Rangaprakash, Oeding, Cichocki, & Hu, 2017). Representational similarity analysis (Cichy, Pantazis, & Oliva, 2014) is a type of late high‐level fusion approach, which is being studied lately for magnetoencephalography–fMRI and EEG–fMRI fusion. In representational similarity analysis, the data from different modalities are not necessarily obtained simultaneously. Representational dissimilarity matrices are extracted from the activity patterns of each modality and are quantitatively related to each other by comparing them in a late‐fusion context (Hebart, Bankson, Harel, Baker, & Cichy, 2018; Salmela, Salo, Salmi, & Alho, 2016). Various ways to couple the two datasets have been proposed, depending on the coupled mode: (a) coupling in the spatial domain with the use of the so‐called lead‐field matrix, which summarizes the volume conduction effects in the head (by transforming the 2D spatial information of the EEG to the three‐dimensional (3D) spatial information of the fMRI) (Karahan et al., 2015); (b) coupling in the time domain using the convolution of the EEG time course with an HRF (Martínez‐Montes et al., 2004); and (c) coupling in the subject domain, using the assumption that the same neural processes are reflected in both modalities with the same covariation (Calhoun, Adalı, Pearlson, & Kiehl, 2006; Hunyadi et al., 2016; Swinnen et al., 2014). For the single‐subject case, the EEG dataset is most commonly represented by a 3D or four‐dimensional (4D) tensor with its modes being space (channels) × time × frequency/ERP whereas the fMRI signal is commonly represented as a matrix with its dimensions corresponding to space (voxels) × time. In the multisubject case, usually features instead of the raw data are used and hence the time mode can be replaced by a subjects mode (Hunyadi et al., 2016; Acar, Kolda, & Dunlavy, 2011). Their fusion relies on the coupling of the EEG tensor and the fMRI matrix along their common mode (space, time, or subjects), employing CMTF. In EEG‐aided fMRI analyses using GLM, the EEG signal (or part of it) is used as the regressor of interest. Intrasubject and intersubject variability of HRF is known to exist (Handweker et al., 2004); hence, a possible misspecification of the HRF may lead to biased estimates of widespread, delayed or elongated activity in the brain (Lindquist, 2008; Handweker et al., 2004). Moreover, the mismatch of the temporal resolution of EEG and fMRI further limits the potential of the GLM analysis (Hunyadi et al., 2015). The use of the spatial maps from GLM categorizes such CMTF‐based methods as late mid‐level fusion (Ramachandram & Taylor, 2017). In all of the approaches that were previously described, the coupling across the corresponding modes is “hard,” meaning that the shared factors are constrained to be equal (after any transformation applied, e.g., convolution with an HRF). Such an assumption is very restrictive, since it implies that the used transformation is valid for every area of the brain and any subject. Moreover, there are cases where the strong coupling between the two modalities is also violated by the physiology of the activation. A stimulus, in rare cases, might cause a significantly lower coupling ratio between the two modalities by triggering neuronal activity with a detectable electrophysiological response but with low metabolic influence in the blood flow, hence, causing low activation detected in fMRI (Logothetis & Pfeuffer, 2004). In order to alleviate such problems caused in the modeling by this constraint of hard coupling, a “softer” assumption of similarity (or with similar properties), that is, not necessarily a strong equality, can be adopted instead (Seichepine, Essid, Fevotte, & Cappe, 2014; Farias, Cohen, & Comon, 2016). Furthermore, different methods can be used to account for a possible misspecification of the HRF. Constraining the HRF to a class of “plausible” waveforms and estimating the optimal one from the data itself has been proposed in Van Eyndhoven et al. (2017) for the single‐subject case. Such approaches will be called “flexible.” In this work, we investigate early (Ramachandram & Taylor, 2017) fusion of fMRI and EEG via soft or flexible coupling. As explained previously, soft and flexible coupling are different ways to accommodate for a possible mismodeling of the HRF. Their main difference is that with soft coupling all the HRFs of the different subjects are assumed to be similar (and not equal) with an a priori known HRF, while in the flexible approach only the model of the HRF is a priori known and the variables of the model, which determine the exact shape of the HRF, are estimated through optimization. In other words, the “soft” coupling assumes similarity and not a strong/hard equality, the “flexible” coupling allows the variation of the HRF through space by the use of a data‐driven HRF model (which is to be also optimized) and the early fusion uses the raw data instead of features. These are the basic ingredients of the proposed approach, in addition, of course, to the adoption of tensor models for both modalities.

MATERIALS AND METHODS

Canonical polyadic decomposition

CPD (or PARAFAC) (Sidiropoulos et al., 2017) approximates a third‐order tensor, (naturally extended to tensors of higher order), by a sum of R (number of sources here) rank‐1 tensors, namely Equivalently, for the kth frontal slice of , where , and are similarly defined matrices, and is the diagonal matrix having the elements of the kth row of on its diagonal. The main advantage of the CPD, besides its simplicity, is the fact that it is unique (up to permutation and scaling) under mild conditions (Sidiropoulos et al., 2017). Uniqueness of CPD is crucial to its application in BSS problems. Its performance is, however, largely dependent on the correct estimation of the tensor rank, R.

Coupled matrix tensor factorization

CMTF was first proposed in Acar et al. (2011), and later extended in Acar, Rasmussen, Savorani, Naes, and Bro (2013). For a third‐order tensor and a matrix , which are coupled across their common mode, it can result from the following optimization problem: The factor B, with columns b (r = 1, 2 … R), is shared between the decomposition of the third order tensor, , and the decomposition of the matrix, M. The matrices A and C correspond to the nonshared modes of . Furthermore, as in the previous subsection, is the diagonal matrix having the elements of the kth row of on its diagonal. The decomposition of into low‐rank factors “transfers” its uniqueness properties via the factor matrix B to the matrix decomposition, removing any ambiguities in (provided that B is of full column rank) (De Lathauwer & Kofidis, 2017). In a more generic setting, the datasets share some of the components in a common mode, but possess individual variability as well. In such a case, a model in which also unshared columns of factors are present is better suited. The so‐called advanced CMTF (ACMTF) (Acar et al., 2017a; Acar et al., 2017b) model has also been proposed in order to allow the presence of both shared and unshared components in the coupled factor(s) and provides a way to automatically determine them. One of the weaknesses of ACMTF is that it forces equality of the coupled components, which may influence the result of the decomposition in the presence of a model inaccuracy. In an attempt to smear such effects, Relaxed ACMTF (RACMTF) properly modifies the ACMTF model, with the adoption of soft coupling (Rivet, Duda, Guerin‐Dugue, Jutten, & Comon, 2015). Following the same rationale in order to remove the strict assumption of identical shared components of ACMTF, another alternative approach is to maximize the correlation between the shared components of the tensor and the matrix in the common mode. This method is called Correlated CMTF (Mosayebi & Hossein‐Zadeh, 2020). A wide variety of constraints, loss functions and different linear couplings (couplings with linear transformations) can be added in the CMTF rationale (Schenker et al., 2021).

Statistical methods

Classical approaches for jointly analyzing fMRI and EEG include joint ICA (JICA) (using one (Calhoun et al., 2009; Calhoun et al., 2006) or multiple (Swinnen et al., 2014) electrodes for EEG), linked ICA (LICA) (Groves, Beckmann, Smith, & Woolrich, 2011), and parallel ICA (pICA) (Calhoun et al., 2009; Hunyadi et al., 2015). LICA and pICA are late fusion approaches. pICA first identifies components separately for each modality, performing a temporal ICA in EEG and a spatial ICA in fMRI. In a second step, the corresponding extracted components are identified based on their correlation in the temporal domain. pICA can be performed either at a single‐subject level (Hunyadi et al., 2015) or at a multisubject level using group ICA (GICA; Lei, Qiu, Xu, & Yao, 2007). LICA uses a Bayesian framework which determines automatically the optimal weighting of each modality, and can also detect nonshared components, in case they exist. JICA is a late mid‐level fusion approach and jointly analyzes data from the same subjects from both modalities simultaneously. To achieve this, it uses the features derived from the first‐level analysis of fMRI (spatial maps) and the averaged ERP epochs of EEG, hence JICA is also classified as a late fusion method. JICA assumes that a stronger ERP yields a stronger BOLD fluctuation in the same area (and vice versa), which supports the common assumption of having the same linear mixing system in the two modalities (in the subject domain). Furthermore, each pair of coupled components is assumed to be dependent between the modalities and at the same time statistically independent of the rest of the components (Mijović et al., 2012). Other matrix‐based methods that are also used in data fusion include the family of partial least squares (Martínez‐Montes et al., 2004) methods that maximize the cross‐covariance among the datasets and canonical correlation analysis which aims to maximize the correlation among datasets (Correa, Adalı, Li, & Calhoun, 2010). Combinations are also common, either of the two types of methods or with ICA (Adalı, Akhonda, & Calhoun, 2019; Akhonda, Levin‐Schwartz, Bhinge, Calhoun, & Adalı, 2018). Detailed reviews of such matrix‐based methods using statistical models for data fusion problems can be found, for example, in Adalı et al. (2015a) and Sui, Adalı, Qijgbao, and Calhoun (2012).

Modeling of the HRF

As mentioned in the introduction, it is the GLM framework that is most commonly adopted in fMRI analysis. Analysis within the GLM is rooted in the simple assumption that the variance in the fMRI BOLD signal can be modeled by the convolution of an (assumed to be known) HRF with a sequence of events/stimuli. The hemodynamic response is a (strictly speaking nonlinear) function with complex shape, resulting from the neuronal and vascular changes, and is known to vary among different subjects as well as among different areas of the same brain (intersubject and intrasubject variability) (Handweker et al., 2004). GLM‐based methods explicitly need an estimate of the functional shape of the HRF to infer the expected activation pattern from the experimental task. Among the different available models for the HRF, the one that is more widely used is the model based on the two gamma distributions (Lindquist, 2008; Handweker et al., 2004), usually referred to as double gamma HRF model: where Γ(·) is the gamma function, Γ(·)− = 1/Γ(·) , and z (1,2,3,4,5) are the parameters that control the functional shape of the HRF. The values z (1) = 6, are used to generate the canonical HRF used in GLM (Lindquist, Loh, Atlas, & Wager, 2009). Several other models have been proposed, such as those based on the cosine function (Zarahn, 2002), radial bases (Riera et al., 2004), and spectral basis function (Liao et al., 2002). Furthermore, neurophysiologically informed nonlinear models of the HRF have been proposed. Those models are trying to describe the dynamic changes in deoxyhemoglobin content as a function of blood oxygenation and blood volume (Buxton, Wong, & Frank, 1998; Buxton, Uludağ, Dubowitz, & Liu, 2004), such as the so‐called “balloon” model. All proposed models differ both in capturing (describing) the evoked changes in the BOLD signal as well as in their number of parameters. In this work, a new, lighter (in terms of its parameters) model for the functional shape of the HRF will be tested, based on the Lennard‐Jones potential (Wikipedia, 2019a). The latter is used in physics to model the repulsive and attractive forces among neutral atoms or molecules. Due to its computational simplicity, the Lennard‐Jones potential is used extensively in computer simulations even though more accurate potentials exist. This light model will be adopted in view of the smaller number of parameters used and the smoother partial derivatives which will be used in the optimization, under the data‐driven HRF flexible model. The Lennard‐Jones HRF model (as it will be henceforth referred to) is defined over the nonnegative real numbers and can be expressed as: where z (1,2,3) are the parameters that control the functional shape of the HRF. It has only three such parameters, compared to the five parameters of the double gamma distribution model above. Its time derivative can be obtained as follows: where ψ0 is the polygamma function (Wikipedia, 2019b) of order zero, also called digamma function. Furthermore, the partial derivatives of the function with respect to each of the parameters are given by: Parameter z (1): Parameter z (2): Parameter z (3): Note that these derivatives, that will be used in optimizing the HRF function, are easier to be computed than those of the double gamma HRF model.

Soft‐coupled tensor–tensor decomposition

Coupling through equality (hard coupling), which has been used both in CMTF‐based methods (Acar et al., 2017a; Hunyadi et al., 2016; Acar et al., 2017b) and in JICA (Mijović et al., 2012; Swinnen et al., 2014; Calhoun et al., 2006) approaches, arises from the assumption that the neural sources are reflected, exactly with the same power, in both modalities. However, this is restrictive. Even if the exact equality and the independence assumptions, used by JICA, are valid, still the result of the first‐level GLM analysis of fMRI (used as an initial step to compute the spatial maps—features (Mijović et al., 2012; Swinnen et al., 2014; Acar et al., 2017a; Hunyadi et al., 2016; Acar et al., 2017b; Calhoun et al., 2006)) is not taking into account the complementary information from EEG. Furthermore, as reported in Mijović et al. (2012), the result obtained with JICA is mostly influenced by the quality (in terms of noise and artifacts) of the EEG signal and less by the fMRI data. This could indicate that the preprocessing of the fMRI with GLM and the use of features, instead of raw data (as in the case of EEG), may fail to retrieve all the information “hidden” in the raw fMRI data, due to the assumptions of GLM (Lindquist, 2008). We therefore propose a framework for early fusion of fMRI and EEG using coupled CPD with soft coupling (Seichepine et al., 2014; Farias et al., 2016), which implies similarity and not exact equality (Figure 2). Fusion based on raw data, though potentially more challenging, may allow more accurate and insightful inference (Ramachandram & Taylor, 2017). The coupling could be attempted in any of the modes, depending on the problem at hand.

FIGURE 2

Schematic representation of coupled canonical polyadic decompositions (CPDs) with “soft” or “flexible” coupling

Schematic representation of coupled canonical polyadic decompositions (CPDs) with “soft” or “flexible” coupling It is highly likely that the ranks of the two tensors (R and , the ranks of fMRI and EEG tensor, respectively) will not be the same (different number of artifacts in the two modalities, deep sources not captured by EEG, etc.). R is the number of common components in the coupled mode(s), so there are R − R and distinct components of fMRI and EEG, respectively. In this way, different model orders can be assigned to the decompositions of the modalities as long as the number of common components remains the same (without loss of generality, in (10), we assume that the common components are the first R ones). Consider the third‐order fMRI tensor, (space × time × subjects), and the fourth‐order EEG tensor (distinguished from the fMRI one with tilde), (frequency × space × trials amplitude × subjects). The rank‐R CPD of the fMRI tensor can be written as ≈ ⊤, k = 1, 2, …, K in terms of its frontal slices. is a matrix that contains the weights of the R spatial components (I voxels), , contain the associated time courses (I ) and subject activation levels of fMRI (K), respectively, and is the diagonal matrix formed from the kth row of . For the EEG case, with being the mode‐1 matricization of the kth subtensor, (Chatzichristos et al., 2017), we can similarly write a CPD , where the matrices contain the weights of the associated frequencies (I ), electrodes , trials amplitude and the subject activation levels of EEG (K), respectively, and is the diagonal matrix formed from the kth row of . The proposed cost function to be minimized is given by with being an a priori known lead‐field matrix used for the EEG forward problem and standing for the matrix representing the convolution with the (a priori known) HRF and the down‐sampling (due to the different sampling rate of the two modalities) (Karahan et al., 2015). The values of the λ's quantify the degree of coupling (and indirectly tune the importance of the accuracy of the used HRF function or lead‐field matrix). A stochastic model with priors in all of the modes can be potentially used, in which case the values of the λ's can be connected to the SD of the prior of the coupled mode (Farias et al., 2016). All coupled tensor decompositions are burdened by the task of the selection of such hyperparameters. A similar burden exists in the selection of the weights of regularizers that are being added either for selecting the coupled components or for the soft coupling (Seichepine et al., 2014; Acar, Schenker, Levin‐Schwartz, Calhoun, & Adalı, 2019) but also for quantifying the contribution of each modality. Note that in our approach, the weights of the different modalities are set equal to unity, due to the fact that they have been both normalized to unit norm prior to the analysis (a really critical preprocessing step).

Double CMTF

In task‐related fMRI, currently, two major classes of fMRI experimental designs exist: block event designs and event‐related designs (Lindquist, 2008; Tie et al., 2009). In a block event design, a condition is presented continuously for an extended time interval (block) to maintain cognitive engagement, with different task conditions usually alternating in time. The time courses of the stimuli (both the sequence of the stimuli and the time intervals) remain stable among subjects. In an event‐related design, discrete and short‐duration events are presented with randomized timing and order (both in scanning of a single subject and among different subjects). Both designs have advantages and disadvantages. For example, the block event design provides more robust activations, since relatively large BOLD signal changes with increased statistical power are detected. Moreover, it is more straightforward to analyze data from such designs, in the sense that the exact shape of the HRF does not much influence the result of the analysis and hence can be assumed to be the same per subject (equal to the canonical) with smaller impact on the widespread activation in time. On the other hand, the predictability of the block event design makes it inappropriate for some cognitive tasks, such as in the “oddball” paradigm, where a reaction to an unexpected stimulus is examined. Furthermore, it also increases the chance of low‐frequency artifacts. On the other hand, on event‐related design can detect transient variations in hemodynamic response and allows for the analysis of individual responses to trials. In event‐related designs, the different subjects cannot be stacked in the same tensor since the multilinearity assumption, required for tensor decomposition models like CPD, will certainly not be valid. One can note that the studies employing tensor decompositions in multisubject fusion applications are using only block‐event designs for this reason (Chatzichristos et al., 2018; Jonmohamadi et al., 2020). Despite the fact that in event‐related designs no connection among the time courses of the different subjects exist, similar areas are probably activated by similar stimuli. This is also the case for seizure events from patients characterized by the same focal seizure type (e.g., temporal or frontal). Hence, it is still beneficial to retain the neighborhood information exploited by the tensor formulation. As it can be understood, the different signals of the subjects cannot be stacked in the same tensor (fourth‐order for EEG and third‐order for fMRI). In order to retain the multiway structure (but still respect the differences in time courses), we will employ a double coupling rationale (Gong, Lin, Cong, & De Lathauwer, 2018) and the formulation of the problem will be transformed to a Double (in time between EEG and fMRI and in space among subjects in fMRI) CMTF (DCMTF), as shown in Figure 3, where the vectors with the same color (other than the light blue‐gray) are coupled.

FIGURE 3

Schematic representations of double coupled matrix tensor factorization (DCMTF) for K subjects. (a) The vectors with the same color (other than the light blue‐gray) are coupled (b) The coupling between the electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) can be either soft or flexible In Figure 3, CPD factors of the third‐order subject‐specific EEG tensor, , describe the variation over the spatial , the temporal and the spectral ( ) modes, for K different subjects. The decomposition of the corresponding fMRI matrix, , shows the variation over the temporal ( ) and spatial ( ) modes. The parameter sets {z } describe the subject‐specific HRF convolution matrix, , which will be optimized using either the double gamma model (Van Eyndhoven et al., 2017) or the Lennard‐Jones model or any other appropriate model selected. The proposed cost function in this approach is written as: where the matrix comprises the weights of the R spatial components of the kth subject, is a matrix containing a subset of the spatial components of the kth subject and a spatial map with which all the subject spatial maps (that belong to ) are similar (this similarity being imposed through a regularization term). The subset of the “common” spatial components is predefined from the user based on the task at hand. For example, in a visual oddball experiment, one will expect that the visual spatial component will be similar among the subjects while any spatial components representing artifacts will not. Similarly, in other task‐related fMRI experiments, the spatial components that are related to the stimuli are expected to be similar among the different subjects, whereas sources that are caused by artifacts, noise or any other source of activation not connected to the task of interest, are expected to have different spatial signatures. The selection of this subset of spatial components that will be shared allows a higher flexibility than a tensor representation in which all spatial components are assumed to be shared among the subjects. For the coupling in the time domain, instead of using the flexible approximation with the subject‐specific HRF, another soft coupling can be used instead, with an extra regularizer. The cost function will then become Of course, tuning two instead of one λ parameters is more difficult, but a decomposition of each subject separately can provide information about the similarity of the spatial maps. A large value of λ 1 means that the fMRI spatial maps of all subjects are very similar, hence imposing the same constraint in the spatial domain as in Equation (10) (the assumption of the same spatial maps is implicitly made by the tensor–tensor decomposition introduced in Section 3.5). Despite the fact that matrices, not higher‐order tensors, are considered here for fMRI, the coupling of the spatial components respects the multiway nature of the multisubject fMRI case (keep in mind that a three‐way CPD can also be seen as the joint diagonalization of a set of matrices hard‐coupled across both of their modes (Sorber et al., 2013)), so the multiway nature of the data is still exploited. It must be noted that λ 2 can be tuned similarly with λ of (10). To sum up, the model proposed in Section 3.5 (Equation (10)) is appropriate to be used whenever the subjects are expected to have similar time courses (i.e., block event design experiments), while the model proposed in Section 3.6 (Equations (11) and (12)) is to be used in all other cases. In cases where multiple sessions per subject exist, the different sessions (or different trials per subject) can be handled as different subjects, which are hard‐coupled in the spatial mode. Hence, assuming that a subject has two sessions, the second session can be handled as an extra subject, under the constraint that the associated matrices coincide.

RESULTS

Synthetic data

A simulated dataset similar to the one employed in Lei et al. (2007) and Dong et al. (2014) has been used in our analysis. The brain has been simulated as a single disc with 2,452 voxels (dipoles). In order to compute the scalp signals in EEG, a concentric three‐sphere model was set to wrap the disc (as a simulation of the skull) and 128 electrodes with their position and the corresponding lead‐field matrix computed in Dong et al. (2014) have been used (a detailed description and schematic representations of the three‐sphere model used can be viewed in Lei et al. (2007) and Dong et al. (2014)). The temporal sampling rate of EEG was 1 kHz while the number of frequency bins has been set equal to 40. The fMRI spatial maps were simulated as 2D images of 70 × 70 voxels, with the aid of the SimTB (Erhardt, Allen, Wei, Eichele, & Calhoun, 2012) toolbox. In comparison to Lei et al. (2007) and Dong et al. (2014), the overlap in time for EEG and in space for fMRI has been increased, so that we will make the problem more realistic and more difficult. In Figure 4, the neurophysiological sources can be viewed, from left to right: “vision area” S1, “default mode network (DMN)” S2, “auditory cortex” S3, “sensory networks” S4, “cognition areas” S5 and “dorsal attention network” S6. The activity level at each active voxel was randomly sampled from a Uniform [0.8,1.2] distribution for each replication of each simulation condition. These assumed active neural sources (rows a, b) along with the assumed spectral profiles (row c) yield scalp distributions and single‐trial images in EEG and spatial maps and time courses of fMRI. Scalp potential distribution maps (topoplots, row d) are computed by solving the forward problem for each spatial map of row b. The fMRI BOLD signals (time courses, row e) were computed as the convolution of the trial amplitude (row a) with the canonical HRF. In all of the scenarios, we assume coupling of fMRI and EEG in the time domain only, hence λ = λ = 0 (Equation (10)). Similar conclusions can be reached if the coupling is assumed in one of the other modes. Coupling across multiple modes increases the difficulty of tuning extra regularization parameters.

FIGURE 4

Simulated sources in electroencephalography (EEG) and functional magnetic resonance imaging (fMRI). (a) Trials in EEG. (b) Simulated neural activity. (c) Spectral profiles. (d) Topoplots of the neural activity. (e) Blood‐oxygen‐level‐dependent (BOLD) time courses of the fMRI. (f) Spatial maps of fMRI. From left to right the simulated sources are: “vision area” S1, “default mode network” S2, “auditory cortex” S3, “sensory networks” (left and right) S4, “cognition areas” S5, and “dorsal attention network” S6 This section will be split in three subsections: We will exhibit the difference between the soft coupling approximation and the flexible approximation proposed in Van Eyndhoven et al. (2017) through a comparative study (based on Pearson correlation). Furthermore, in this subsection, we will investigate the tuning of the λ value in the soft coupling method. Alternative methods will be tested in the case where all the subjects have the same time course: pICA, uncoupled CPDs (separately decomposing each tensor), hard and soft coupling in the time domain with different λ values. The same methods will be tested also in the last subsection, but different time courses per subject will be considered, in order to point out the need of an alternative formulation of the problem in such a case. The proposed soft coupled decomposition and the DCMTF were implemented with the aid of the structured data fusion framework (Sorber, Van Barel, & De Lathauwer, 2015) of Tensorlab (Vervliet et al., 2016) and nonlinear least squares (NLS) was adopted as the optimization scheme. pICA was implemented (using GICA) as in Hunyadi et al. (2015) and Lei et al. (2007), using Infomax (Bell & Sejnowski, 1995) for the ICA step. All the experiments ran 30 times (with different initialization and instance of noise) and the average performance is presented. In order to estimate R , (EEG) and (fMRI) are separately decomposed, and pairwise correlation of all the components are computed. Components with similarity exceeding a predefined threshold t are considered shared (Genicot, Absil, Lambiotte, & Sami, 2016), a value of t = 0.70 can be a suggested value. Based on the application at hand this can be increased or decreased in order to restrict the shared set to more similar components or include components with weaker similarities, respectively. Note that the shared components are determined in the ACMTF framework via appropriate regularization on the gains of the corresponding CPD components (Acar et al., 2017a; Acar et al., 2017b). In this way, we can also get an indication for the appropriate λ values to be used: higher correlation indicates higher values for λ; hence, the λ's of the modes which will not be coupled will be set to zero (Acar et al., 2017a). It must be pointed out that the coupling is not always beneficial, as we will demonstrate in the following. Only in cases where we have significant correlation between the modalities the coupling should be used. If “phantom” connections are imposed, the performance will deteriorate (Lahat et al., 2015). A validation process is therefore essential to decide whether all sources of data (modalities) are relevant and necessary or whether any of those are redundant or may even deteriorate the quality of the analysis (Acar et al., 2013). It is really important that the data of both modalities must be normalized (to unit norm) beforehand (so that the first two terms in (11) have the same weight in the cost function) and preprocessed for removal of artifacts (Walach, Filzmoser, & Hron, 2018). The best initialization of the decomposition algorithm for each modality separately is not guaranteed to be the best possible for their combination; furthermore, the permutation issues must be taken into consideration. Hence, an initialization method designed specifically for coupled decompositions must be used. For the initialization of the coupled tensor decomposition, the generalized eigenvalue decomposition‐based method proposed in Sørensen et al. (2015) is used. When prior information is available for any of the modes (usually for time courses), or part of them, the respective columns can be excluded from the optimization function and set equal (or almost equal) to their known values (Morante, Kopsinis, Theodoridis, & Protopapas, 2020). The correlation values presented in the following figures and tables represent the mean Pearson correlation of all the obtained sources with the ground truth. Since the same algebraic initialization is used for every run, the SDs of all methods are relatively small.

Soft versus flexible coupling

Figure 5 visualizes the importance of the choice of the λ value for soft coupling. We can distinguish two cases. In the first case (the solid lines), where the coupling assumption is exact (the simulated data were generated with the use of the canonical HRF, ), it can be readily seen that the hard coupling is the best to use. However, the soft coupling analysis can reach the same performance with the appropriate tuning of λ . In the second case (dotted lines), the assumption of hard coupling is violated, as the time courses of the fMRI sources were generated through convolution with different HRFs (the five different HRFs presented in Morante et al. (2020) have been used, which have a mean correlation of 0.8 with the canonical HRF), while in Equation (6), the HRF matrix, , is assumed to be built from the canonical HRF. The fact that the time courses of the two modalities (after the transformation with ) are similar but not equal deteriorates the performance of hard coupling. It still performs better than the uncoupled version but it is outperformed by the soft coupling for λ > 0.1. In cases where we move the HRF farther from the canonical HRF, the hard coupled case can become even worse than the uncoupled one.

FIGURE 5

Mean correlation of the obtained sources with the ground truth, using uncoupled (two separate canonical polyadic decompositions [CPDs] in each modality), Hard coupled (the separate CPDs have hard coupled time modes) and soft coupled CPDs (Equation (10)) models, with different λ values To compare soft (Equation (12)) with flexible coupling (Equation (11)), we simulated a similar single‐subject scenario in order to evaluate the performance and the computational cost of each method. We have simulated four different scenarios. In each scenario, we slightly modify the HRF from which the data are generated. The canonical HRF is adopted in the first scenario, while an HRF with 0.9, 0.8, and 0.7 correlation with the canonical one was used in the other scenarios, respectively. We can see (Figure 6) that, if we manage to tune appropriately the λ value, then the soft coupling method (green) outperforms the other methods but its deviation is large and its performance can be even worse than that of the flexible methods, in case of a possible suboptimal selection of λ . The selection of λ is made via grid search from the set {0.01; 0.1; 10; 100; 1,000}. Regarding the choice of the HRF model (Lennard‐Jones or double gamma), it seems that there is a compromise between accuracy and time complexity. The difference between the variances of the performance of the soft coupling and the flexible approaches is only statistically significant, in the first and third case of Figure 6. This difference is driven by the values of λ that are far away from the optimal one (e.g., selection of λ = 1,000 when the optimal is 0.01); hence, if the user selects a value in the “neighborhood” of the optimal one, the deviation is decreased, and the difference is below the significance level. In the cases where the HRF is closer to the canonical one, the Lennard‐Jones model has similar performance with the double gamma model albeit with significantly lower complexity. In the cases where the HRF differs more from the canonical one, the performance of the Lennard‐Jones model deteriorates and the time needed to converge can also become longer (while also a higher SD is observed). It should be noted that the time needed for the selection of the λ value is not included since it depends on the intervals of the grid used in the grid search approach followed. Furthermore, we noted that the performance of the flexible approaches was very good in the single‐subject case but when extra subjects were added the increase in the computation time was not linear and the complexity was significantly increased.

FIGURE 6

Comparison, based on Pearson correlation and time till convergence, of the soft coupling method with the flexible coupling method using the double gamma hemodynamic response function (HRF) and Lennard‐Jones HRF models

Soft‐coupled tensor–tensor decomposition

To compare the soft‐coupled tensor–tensor decompositions (Equation (10)) with DCMTF (Equations (11) and (12)), multisubject scenarios were simulated. The data from each subject contained all the six sources shown in Figure 4 with different activation levels. The activation patterns have strengths randomly sampled from a uniform (Adalı et al., 2015a; Logothetis & Pfeuffer, 2004) distribution. Five different subjects are simulated, and for the simulations presented in this subsection each subject is assumed to have the same time course for every source (differing only in the noise) while in the next subsection differences are incorporated in the time courses and HRFs of some of the subjects. The mean correlation between the obtained sources and the ground truth per method and per modality at different signal‐to‐noise ratios (SNR = squared Frobenius norm of the signal over the squared Frobenius norm of the noise) can be seen in Figure 7. We have simulated two different cases: (a) the same noise level as in Lei et al. (2007) is used and (b) the methods are tested with higher level of noise. pICA exhibits inferior performance compared to both the uncoupled and soft coupling methods in both scenarios, due to the overlapping in the sources, which violates the independence assumption. The resulting spatial maps obtained by spatial ICA in case (a) can be viewed in Figure 8. Note that, in the overlapping areas, there is crosstalk between the maps. S4, which overlaps with most of the rest of the sources, cannot be identified (for comparison with the ground truth, observe row d of Figure 4). It can be seen that, in case (a), the correlation for EEG with uncoupled analysis is higher than with soft coupling. The slight decrease in the performance in EEG can be explained by the gain in the performance in the fMRI sources. Overall, the correlation is increased with soft coupling (mean correlation among sources and between the two modalities—diamond). In case (b), where the SNR is of similar level for both modalities, soft coupling yields better results for both.

FIGURE 7

FIGURE 8

Resulting functional magnetic resonance imaging (fMRI) spatial maps with (a) parallel independent component analysis (pICA). (b) Soft coupled canonical polyadic decompositions (CPDs), at signal‐to‐noise ratio (SNR) = 0.1

Accuracy of different decomposition methods (uncoupled canonical polyadic decompositions [CPDs], coupled CPDs, and parallel independent component analysis [pICA]) based on the Pearson correlation, with different signal‐to‐noise ratio (SNR) values Resulting functional magnetic resonance imaging (fMRI) spatial maps with (a) parallel independent component analysis (pICA). (b) Soft coupled canonical polyadic decompositions (CPDs), at signal‐to‐noise ratio (SNR) = 0.1

Flexible DCMTF

In this subsection, we will test the case where intersubject variability exists, in time (different acquisition protocol) or in space (different spatial maps per source). Three different scenarios and two subscenarios (for each) will be tested: in the first scenario, all the subjects have the same time courses (similar to the previous subsection). In the second scenario, each subject has a different HRF and a time shift in each of the time courses of the sources. The five different HRFs presented in Morante et al. (2020) have been adopted and also the time courses are shifted (the first subject has no shift while subjects 2–5 have time shifts at increments of 1 s with respect to the first subject, hence a shift of 4 s in the fifth subject). In the last scenario, the time courses of the subjects are the same but the spatial maps of every subject are different: subject variability was introduced in the spatial domain of two of the sources (2 and 4) and rotation (in increments of 4° per subject) of one of the sources (2) and voxel shifts at increments of two voxels with respect to the first subject in the other source (4). In every scenario, two subscenarios are also simulated: (a) only sources 2, 3, and 4 (Figure 4) with low spatial overlap are used, and (b) all the sources are used. We considered these subscenarios in order to assess the impact of overlapping, time shift and intersubject spatial variability, separately. The mean Pearson correlation of the obtained sources with the ground truth is given in Table 1 for the different scenarios. It can be noted that the pICA method outperforms the other methods in the case where no severe spatial overlap and the same time courses per subject exist. However (as mentioned previously), this method is also affected more severely by the spatial overlap of the sources (since the assumption of independence of the sources is then violated) and additionally it is affected by per subject differences in the time courses since it is based on the assumptions imposed by GICA (Beckmann & Smith, 2005). In the case of high overlap and same time course per subject, the soft coupled tensor decomposition exhibits the best performance but on the other hand this method is most affected by the differences in the time courses, since the assumption of multilinearity is violated in the time domain. The Uncoupled decompositions have similar behavior (among the different cases) since the difference in the time course per subject remains even when the EEG and fMRI tensors are decomposed separately. For the DCMTF model, we used the flexible approach with Lennard‐Jones HRF. We can note that DCMTF allows the successful estimation of the underlying sources and hence performs much better than the other methods in the case of different time courses and HRFs per subject. It should be noted that the performance of DCMTF is similar to that of soft‐coupled tensor decomposition in the case of the same HRF and time course per subject. In the case where different spatial maps per subject exist but the same time course per source, we can note that the ICA‐based method is affected less since the assumption of the same time course used by GICA is then valid. Concerning the SD of the Pearson correlation, we can observe that GICA is the more stable method with slightly higher SD in the cases where it fails.

TABLE 1

Mean and SD (30 runs) of the correlation with the ground truth of the sources obtained via different fusion methods, under different scenarios

	Same time and space		Different time courses		Different spatial maps
Methods	Low overlap	High overlap	Low overlap	High overlap	Low overlap	High overlap
Parallel ICA	0.95 ± 0.02	0.80 ± 0.02	0.82 ± 0.02	0.68 ± 0.11	0.88 ± 0.02	0.75 ± 0.02
Uncoupled	0.85 ± 0.02	0.82 ± 0.02	0.84 ± 0.6	0.81 ± 0.08	0.85 ± 0.09	0.80 ± 0.10
Soft coupled tensors	0.95 ± 0.02	0.92 ± 0.02	0.75 ± 0.02	0.65 ± 0.03	0.78 ± 0.02	0.70 ± 0.03
DCMTF	0.91 ± 0.03	0.90 ± 0.03	0.90 ± 0.04	0.90 ± 0.03	0.88 ± 0.04	0.87 ± 0.04

Note: The best correlation among the different algorithms (per use case) is annotated with bold letters.

Abbreviations: DCMTF, double coupled matrix tensor factorization; ICA, independent component analysis.

Mean and SD (30 runs) of the correlation with the ground truth of the sources obtained via different fusion methods, under different scenarios Note: The best correlation among the different algorithms (per use case) is annotated with bold letters. Abbreviations: DCMTF, double coupled matrix tensor factorization; ICA, independent component analysis. The performance of DCMTF presented in Table 1 in the first two scenarios is with a high value of λ 1 (λ 1 = 106), since the spatial maps per subject are the same, while for the last scenario the λ 1 = 1 value was selected, based on grid search. The tuning of λ 1 affects less the result than that of λ 2 (as we saw in the previous scenario). We used in this context the flexible approach (Equation (11)) in order to be able to evaluate the effect of λ 1.

Real data

Epilepsy use case

In this section, we will illustrate that DCMTF could be also potentially used in a special case of multisubject epileptic analysis. To employ tensor methods in multisubject epileptic studies, a main assumption must be made, namely that all subjects must have their ictal onset zone (IOZ) in similar areas (i.e., same lobe), an information that might not be available in all cases. Hence, such studies with multiple epileptic subjects analyzed do not target the identification of the IOZ of each subject but rather the understanding of the common epileptic networks of patients with similar IOZs (Hunyadi et al., 2016). As an example of such a use case, we consider the dataset of 10 temporal lobe epilepsy patients, that have been previously used in Hunyadi et al., 2016 and Van Eyndhoven et al. (2019). The study was carried out in accordance with the recommendations of the International Conference on Harmonization guidelines on Good Clinical Practice with written informed consent from all subjects and was approved by the Medical Ethics Committee of the University Hospital Leuven. All subjects were adults who (a) underwent a full presurgical evaluation for refractory focal epilepsy using EEG–fMRI; (b) had single‐photon emission computed tomography coregistered to MRI (SISCOM) images available, as well as post‐surgery MRI scans when patients were seizure‐free; (c) had interictal epileptic discharges (IEDs) recorded during the EEG–fMRI recording session. The clinical characteristics of the patients can be viewed in Table 2.

TABLE 2

Clinical characteristics of the patients and the number of IEDs annotated by the neurologists

Subject number	Gender	IOZ	Number of fMRI scans	Number of IEDs	IED rate per hour
1	F	Left temporal	540	15	40
2	F	Right parieto‐temporal	1,080	105	156
3	M	Right temporal	1,620	825	733
4	F	Left anterio‐temporal	1,080	117	156
5	F	Left anterio‐temporal	1,080	126	187
6	M	Left parieto‐temporal	1,080	11	15
7	F	Left temporo‐occipital	1,620	1,815	1,613
8	F	Right temporal	540	226	602
9	M	Left anterio‐temporal	1,080	6	8
10	F	Right temporal	1,350	966	1,171

Abbreviations: fMRI, functional magnetic resonance imaging; IEDs, interictal epileptic discharges; IOZ, ictal onset zone.

Clinical characteristics of the patients and the number of IEDs annotated by the neurologists Abbreviations: fMRI, functional magnetic resonance imaging; IEDs, interictal epileptic discharges; IOZ, ictal onset zone. Scalp EEG data were recorded with the use of an MR‐compatible EEG cap placed according to the international 10/20 system, sampled at 5 Hz and referenced to electrode Cz. fMRI data were acquired on a 3 T MR scanner (either on Achieva TX with a 32‐channel head coil or Intera Achieva with an eight‐channel head coil in Philips Medical Systems, Best, Netherlands) with an echo time (TE) of 33 ms, a TR either of 2.2 or 2.5 s, and a voxel size of 2.6 × 3 × 2.6 mm3.

Preprocessing

Band‐pass filtering in the EEG signal between 0.1 and 30 Hz has been applied, as a preprocessing step. Furthermore, gradient and ballistocardiographic (BCG) artifacts were removed using the Bergen plug‐in of EEGLAB (Delorme & Makeig, 2004) and Brain Vision Analyzer software, respectively. The signal of every channel was divided by its SD. Two neurologists subsequently inspected and annotated the EEG signals for IEDs. Motion correction was performed using Motion Correction FMRI's Linear Image Registration Tool (Jenkinson, Bannister, Brady, & Smith, 2002), followed by a slice‐timing correction using Fourier space time‐series phase shifting. Then, brain volumes were extracted using Brain Extraction Tool (Smith, 2002) and brain images were smoothed with a 6 mm full width at half maximum Gaussian Kernel (approximately twice the voxel size). Eventually, grand‐mean voxelwise intensity was normalized over the entire 4D dataset by a single multiplicative factor followed by a high‐pass temporal filtering (Gaussian‐weighted least‐squares straight line fitting, with SD σ = 50.0 s). Registration of the fMRI data to high‐resolution T1 images and to standard (MNI atlas) images was carried out using the FMRIB's Linear Imaging Registration Tool (Jenkinson et al., 2002). We have formed the per‐subject EEG tensor as , where I represents different frequencies. We created a spectrogram which was computed from the nonoverlapping EEG segments, with each segment having length equal to the TR of the fMRI acquisition of the specific subject, in order to obtain a spectrogram “synchronized” to the fMRI signal. The squared Fourier magnitudes are averaged into 1 Hz bins, from 1 to 40 Hz. In order to reduce the size of the fMRI images, the initial 79 × 95 × 68 images were vectorized and masked with the region of interest (ROI) mask created in Hunyadi et al. (2016). Hence, the fMRI data of the kth patient were organized into , with I = 11,923 voxels and I being equal to the number of fMRI scans per patient (fourth column of Table 2).

Selection of hyperparameters

In order to predict the optimal model for our dataset (similarly with Van Eyndhoven et al. (2019)), we constructed a “BOLD predictor,” that is, the regressor of interest that would have been used if the analysis were performed in the context of GLM. In order to construct the BOLD predictor, the timing of the annotations of the IED, provided by the neurologists, was used. At each annotation of an IED, a Dirac delta function was placed. The time course of the successive delta functions was convolved with the canonical HRF. For the selection of the rank per patient, we fitted a CPD model to the EEG data of each patient for R ranging from 1 to 7 and for every R we computed the maximum correlation between the resulting time courses and the BOLD predictor (Table A1). We have noted that, for R = 2, the CPD of the EEG tensor was not always providing a source of interest, meaning that the temporal mode of the resulting components was not always highly correlated with the BOLD predictor. In order to estimate the rank for every subject, along with the “BOLD predictor” we also employed the CORe CONsistence DIAgnostic (CORCONDIA) (Bro & Kiers, 2003). CORCONDIA reflects how close (in percentage) the core tensor of the Tucker decomposition with the CPD factors is to superdiagonal. The CORCONDIA score will decrease dramatically after the true rank since any new components will be mainly descriptive of noise leading to high off‐superdiagonal core values. As estimated rank we selected the rank corresponding to maximal correlation with the BOLD predictor and CORCONDIA percentage higher than 70% (Kinney‐Lang, Spyrou, Ebied, Chin, & Escudero, 2018) (Table A1).

TABLE A1

CORCONDIA percentage and maximum correlation with BOLD predictor for every subject for R ∈{1, 2, …, 7}

	CORe CONsistency DIAgnostic %							Correlation with BOLD predictor
Rank	1	2	3	4	5	6	7	1	2	3	4	5	6	7
Subj.1	—	100	99.4	97	90	76.8	69	0.35	0.9	0.44	0.93	0.97	0.9	0.92
Subj.2	—	100	95.7	69.4	29.4	28.2	24.1	0.29	0.94	0.92	0.92	0.91	0.93	0.9
Subj.3	—	100	87	74.6	37.5	12.6	−12	0.11	0.2	0.61	0.71	0.78	0.82	0.68
Subj.4	—	100	98.6	94.1	93.7	89.6	79.4	0.25	0.8	0.22	0.62	0.9	0.68	0.12
Subj.5	—	100	97.7	95.6	−1.4	−12	−74.5	0.09	0.8	0.74	0.75	0.76	0.7	0.33
Subj.6	—	100	99.7	70.5	−22.4	−67	−174	0.61	0.93	0.94	0.54	0.95	0.94	0.94
Subj.7	—	100	98.8	95.2	90.4	31.4	23.2	0.22	0.66	0.68	0.53	0.48	0.47	0.39
Subj.8	—	100	98	95	91.2	82.3	79.5	0.39	0.35	0.95	0.94	0.95	0.94	0.94
Subj.9	—	100	97.2	77.6	−23	−2	42.3	0.58	0.91	0.68	0.72	0.74	0.69	0.68
Subj.10	—	100	97.6	88.8	58.6	69.4	53.3	0.67	0.78	0.74	0.69	0.45	0.5	0.42

Note: The selected rank is highlighted with bold letters.

Abbreviations: BOLD, blood‐oxygen‐level dependent; CORCONDIA, CORe CONsistence DIAgnostic.

The number of common components (rank R of DCMTF) selected per patient can be viewed in the second column of Table 3.

TABLE 3

Results of the analysis in each patient

Subject number	Selected	Mean number of	λ ₂	Spatial map
	Rank R _c	Detected IEDS		Overlap with IOZ
1	5	9 (15)	1	No
2	2	98 (105)	10	Yes
3	4	694 (825)	10	Yes
4	5	95 (117)	100	Yes
5	2	94(126)	1,000	Yes
6	3	8 (11)	1	Yes
7	3	1,569 (1,815)	100	Yes
8	3	190 (226)	10	Yes
9	2	5 (6)	10	Yes
10	2	804 (966)	100	Yes

Abbreviations: IEDs, Interictal Epileptic Discharges; IOZ, Ictal Onset Zone.

Results of the analysis in each patient Abbreviations: IEDs, Interictal Epileptic Discharges; IOZ, Ictal Onset Zone. From the initial analysis of the temporal components of the CPD of the EEG tensor, we tried to find what is the number of components in each patient that were highly correlated (above 0.8) with the BOLD predictor. This number ranged from 0 to 3 (median = 1, mean = 1.4), hence we selected . The selection of the values of the λ's is more tricky. λ 2 was selected such that the correlation with the BOLD predictor is maximized. The different values of λ 2 per subject can be viewed in the penultimate column of Table 3. It must be noted that the selection of λ 2 was performed without the coupling in the spatial domain among subjects, hence, in a single‐subject analysis setting. The value of λ 1 = 1 (for the analysis of both left and right temporal subjects) was selected as the value that maximizes the overlap of the spatial maps with the IOZ. We must keep in mind that in other applications such information might not be available. The selection of the λ values is always problem‐dependent. Moreover, it has been observed that higher values of λ 1 decrease the overlap with the IOZ. This is a finding that makes sense, since the IOZ is subject‐specific and larger values of λ 1 tend to minimize the differences among the spatial maps and increase the similarity with the common map .

Results of the analysis

As one can see from the third column of Table 3, in most of the subjects, more than 80% of the IEDs are detected (compare to the fifth column of Table 2). We can observe (Table 3) that the spatial maps obtained for all subjects (except for Subject 1) show significant activation in areas that are concordant with the IOZ. For the visualization of the spatial components, the transformation to Z scores has been performed with the use of the “display” GUI of the GIFT toolbox (Medical Image Analysis Lab (MIALAB), 2017). Similarly with the simulated experiments, all the algorithms ran 30 times. The Z‐map of the common spatial matrix, , reveals activations in the right and left temporal lobe for the group of the patients with right and left temporal epilepsy, accordingly. It must be noted that for the patients with the left temporal epilepsy the activation is more on the anterior side of the temporal lobe. Furthermore, for those with the right temporal epilepsy we also observe activations in the occipital lobe, as well (red areas of Figure 9). Moreover, for both groups, deactivations (negative part of the fMRI Z images) are observed in areas associated with the DMN (blue areas of Figure 9).

FIGURE 9

Common spatial map of activation for patients with right temporal epilepsy, . With the red color, we can view values of Z > 0 and with blue color Z < 0

Common spatial map of activation for patients with right temporal epilepsy, . With the red color, we can view values of Z > 0 and with blue color Z < 0 We should also note that in some subjects (e.g., Subject 2) we observed a second component with high correlation with the BOLD predictor. The spatial Z‐map of this component includes ROIs of activation that are not common among the subjects (e.g., activation in parietal lobe for Subject 2). It seems that in subjects where the difference between the subject‐specific spatial map, , and the common one, , is significant, the spatial map of interest is split in two components. This is a very interesting and promising finding since in group analysis in the framework of GLM such splitting is not allowed. Comparing with the results obtained with the use of GLM as preprocessing step (presented in Hunyadi et al. (2016), Figure 2a), we can see that our results provide a denser and more concentrated activation in the right pole, while also the DMN is better discriminated without the need of extra analysis (in Hunyadi et al. (2016), the DMN appeared only if the GLM component was further decomposed with CMTF; see Figure 2b).

Visual oddball EEG–fMRI

The dataset used in this subsection is obtained from the OpenfMRI database (Stanford University, 2017). Its OpenfMRI accession number is ds000116 and it has been previously used in Walz et al. (2013). The 17 subjects (11 males; mean 27.7 years; range 20–40 years) performed different sessions of visual oddball tasks while EEG and fMRI were simultaneously recorded. The demographic characteristics of the subjects can be viewed in Table 4.

TABLE 4

Results of the analysis in each subject

Subject number	Sex	Age	Selected rank, R _c	λ ₂	Excluded bipolar pairs
1	M	34	4	100	8, 24
2	F	25	3	100	5, 30
3	M	25	5	10	29, 30, 34, 36
4	M	30	4	1,000	23, 24, 30
5	M	21	4	1,000	8, 30
6	M	26	6	100	9, 24, 30
7	M	35	5	1,000	30
8	M	23	3	100	28, 29, 30, 33, 36
9	M	33	3	1,000	29, 30, 33
10	M	26	3	1,000	24, 29, 33
11	F	40	4	100	16, 24, 30
12	M	32	3	100	30
13	F	23	6	1,000	30
14	F	20	3	1,000	30
16	F	25	3	1,000	16, 24, 28, 30, 37
17	F	27	7	1,000	7, 33

Results of the analysis in each subject In each session, 125 stimuli per task were presented for 200 ms each, with a 2–3 s uniformly distributed variable intertrial interval (VITI) and a target probability of 20% (target—oddball stimuli where the subject should press the button and 80% standard stimuli which should be ignored) (Walz et al., 2013). The first two stimuli for each run were constrained to be standard. As target and standard stimuli, a large red circle and a small green circle on gray backgrounds were used, respectively (3.45 and 1.15° visual angles). The fMRI data were acquired with the use of 3 Tesla Philips Achieva MRI scanner and a resolution of 3 × 3 × 4 mm3 was used. The TR used was 2,000 ms (smaller than VITI) and the TE was equal to 25 ms. Thirty‐two slices of 64 × 64 voxels were obtained. A single‐volume high‐resolution (2 × 2 × 2 mm3) echo‐planar imaging image and a 1 × 1 × 1 mm3 spoiled gradient recalled (SPGR) image were acquired from each subject for registration purposes. We must note that Subject 10 has been excluded from our analysis since it is the only subject for which an SPGR MRI image was not available. The EEG electrodes were arranged in 43 bipolar pairs. All electrode impedances were kept below 20 kΩ, including 10 kΩ resistors built into each electrode for subject safety. A more detailed and comprehensive description of the hardware, along with the acquisition of the data can be found in Walz et al. (2013). Gradient artifacts were removed by subtracting the mean artifact across all functional volume acquisitions and a 10 ms median filter was also applied in order to remove any residual spike artifacts. Band‐pass filtering of the data between 1 and 100 Hz has been performed while also a notch filter has been employed at 60 Hz for removal of the electrical line noise interference. The authors in Walz et al. (2013) have used principal component analysis (PCA) to remove the BCG artifact, but we noted that still the data were corrupted by remnants of the BCG artifact. On top of the PCA that has been used we also applied ICA in order to improve the result of the artifact removal, following the rationale of the optimal basis set + ICA method, proposed in Vanderperren et al. (2010). The BCG‐free data from the 43 bipolar channels were referenced to the 34‐electrode space, in each subject. Noisy channels (subjectively determined by visual inspection from Walz et al. (2013)) were excluded prior to re‐referencing due to oversampled system design. The excluded electrodes per subject can be viewed in the last column of Table 4. We computed a spectrogram for each subject from the nonoverlapping EEG segments, with each segment having length equal to the TR of the fMRI acquisition of the specific subject, in order to obtain a spectrogram “synchronized” to the fMRI signal. The squared Fourier magnitudes are averaged into 1 Hz bins, from 1 to 40 Hz. We have formed an EEG tensor as , where I = 40 represents different frequencies, electrodes, and is the number of EEG samples (not the same over all subjects). The fMRI data of the kth subject were organized into , with I = 131,072 voxels and I = 170 being equal to the number of fMRI scans per subject. The same preprocessing steps as those described in “Preprocessing” section were followed, with the only difference being that no slice time correction was performed. Furthermore, bias field correction has been performed using the FMRIB's Automated Segmentation Tool (Zhang, Brady, & Smith, 2001), which is available in FSL (Jenkinson, Beckmann, Behrens, Woolrich, & Smith, 2017). To determine the rank per subject, we employed CORCONDIA (Bro & Kiers, 2003). Since we were expecting at least three different sources (normal stimuli, odd stimuli, and response to odd stimuli) we selected as rank, R , the rank that was higher than 2 and exhibited the highest drop in CORCONDIA percentage when rank was increased beyond that (the selected rank per subject can be seen in the respective column in Table 4). The λ 2 per subject was selected so that the resulting sources in the single‐subject case for the flexible (Equation (11)) and soft version (Equation (12)) of the proposed algorithm have the maximum correlation with each other. As it was previously observed, the flexible scheme had a very good performance in the single‐subject analysis but, when the number of subjects is increased, significant increase of the complexity is noted. Hence, we will use the result of the flexible approach in the single‐subject case for tuning λ in the soft approach (which is considered equivalent to λ 2 in the multisubject scenario). In our first attempt to blindly extract the different sources, the method failed to successfully discriminate the target and normal stimuli and separate them into distinct sources. Instead, a single source with a combination of both stimuli was obtained. Unlike the previous case, where a big portion of the IEDs were even blindly discriminated from the background brain activity, in this use case, the two sources are highly correlated. Hence, we resorted to a semiblind approach (Morante et al., 2020). Namely, for every subject, we assumed that two of the components of the EEG trial mode ( and ) are known and equal to the timings of the normal and oddball stimuli (with delta functions placed at the exact times where the stimuli occurred). Both conditions (target and normal stimuli) elicited activation in regions associated with a response to visual stimulation (occipital cortex). In accordance to earlier findings, the source which was correlated with the target stimuli revealed activation in inferior and middle/superior frontal gyrus, anterior (mainly) and posterior cingulate cortex, inferior and superior parietal lobe and dorsolateral prefrontal cortex (Stevens, Skudlarski, Gatenby, & Gore, 2000; Brázdil, Mikl, Mareček, Krupa, & Rektor, 2007; Ardekani et al., 2002). In some of the subjects, we have also noted activity in the left thalamus, in the amygdala and in the right cerebellum (probably connected to pressing the button with the right finger). The main activated areas with the respective size in voxels, the exact position in MNI space, the maximum Z‐score and the respective Brodmann area can be viewed in Table 5.

TABLE 5

Significant (Z > 2.5) activation clusters from the contrast of the target stimuli versus baseline

Cluster size	Z‐score	MNI x	MNI y	MNI z	BA	Hemisphere	Anatomical labels
986	5.3	12	−92	2	17	Bilateral	Primary Vis., I. occipital L.
373	4.9	33	−58	−30	—	Bilateral	Cerebellum
323	6.2	7	2	36	24, 32	Bilateral	Anterior cingulate G.
295	5.6	−8	−12	67	6	Left	Premotor, SMA
257	4.8	54	24	23	9	Bilateral	Dorsolateral prefrontal C.
232	4.5	37	0	38	8	Bilateral	Front‐eye fields
							M. and S. Frontal G.
198	3.4	−48	−60	36	39, 40	Left	Angular G., M. parietal L.
174	3.6	50	−53	38	39, 40	Right	Angular G., M. parietal L.
169	4.3	62	−40	8	22	Right	S. and M. temporal G.
124	4.3	30	−65	52	7	Right	S. lateral occipital C.
123	3.5	−57	12	13	4	Left	Precentral G.
112	4.5	−57	−38	28	40	Left	Supramarginal G., I. parietal L.
103	4.3	37	−39	33	40	Right	Supramarginal G., I. parietal L.
98	5.2	52	10	28	44	Right	Brocca, I. prefrontal C.
85	3.9	−51	−41	8	22	Left	S. temporal G.
34	2.95	−7	−56	8	23	Left	P. cingulate G.
18	2.85	−14	2	24	8	Left	Caudate nucleus

Abbreviations: C, Cortex; G, Gyrus; I, Inferior; L, Lobe; M, Middle; P, posterior; S, Superior; SMA, Supplementary Motor Area; Vis, Visual cortex.

Significant (Z > 2.5) activation clusters from the contrast of the target stimuli versus baseline Abbreviations: C, Cortex; G, Gyrus; I, Inferior; L, Lobe; M, Middle; P, posterior; S, Superior; SMA, Supplementary Motor Area; Vis, Visual cortex. As expected, additional activation was observed in the contrast created target > normal stimuli. Widespread activation mainly in the left hemisphere and more specifically in the parietal and frontal regions and a frontal pole activation has been noted (Figure 10). It appears that three main networks are involved. The DMN (posterior cingulate cortex, precuneus, medial prefrontal cortex, and angular gyrus), the central executive network (lateral posterior parietal and dorsolateral prefrontal cortex) and the salience network (anterior cingulate gyrus and frontal insular cortex). Those networks have been reported in the literature in visual oddball paradigms when the target stimuli are compared with the standard one (Kiehl, Laurens, Timothy, Forster, & Liddle, 2000; Sbaihat et al., 2021).

FIGURE 10

Spatial map of activation for the contrast target > normal stimuli Z > 2.5

Spatial map of activation for the contrast target > normal stimuli Z > 2.5 In the analysis of the temporal component of the EEG, we have noted an increased theta 4–7 Hz and (especially) delta activity 0.5–4 Hz in the frequency component of the source related to the target stimuli. In the literature, the delta band in oddball paradigms has been most directly related to target P300 ERP produced during the target stimuli (Kolev, Demiralp, Yordanova, Ademoglu, & Isoglu‐Alkac, 1997; Bernat, Malone, Williams, Patrick, & Iacono, 2007). It has been reported that an increased delta activity is common for both hemispheres during the target stimuli in oddball paradigms and for theta coherences over the left hemisphere (Güntekin & Başar, 2010). Furthermore, the modulation of spectral activity in the delta band in response to the target stimuli indicates a dissociation in the activity of the neural networks that oscillate in delta and theta ranges and contribute to the generation of the P300, and can be used in order to identify disorders of consciousness (Rivera‐Lillo et al., 2018).

DISCUSSION

Our results suggest that, if the correct coupling strategy (fusion model) is selected (based on the type of the problem at hand), the use of “nonhard” (soft or flexible) coupling methods with raw data can lead to better performance, compared to the uncoupled or hard‐coupled counterparts. In any case, the selection of the method to be employed is heavily based on any a priori knowledge of the type of problem. A preliminary analysis of the data of both modalities separately is recommended. This can provide relevant information to the user on the model to be selected as well as hints on the values of the (hyper)parameters. As mentioned previously, also the number of coupled components, R , can be obtained by setting a threshold based on the correlation among the components of the two modalities in the initial “separate” decomposition. Furthermore, it should be kept in mind that the design of the experiment plays a significant role in the model selection. An experiment with different time courses per subject will lead to the adoption of DCMTF; this does not mean that the multiway nature of the data will still not be exploited (as previously noted), since the coupling across the spatial modes of the coupled fMRI data will enrich the optimization problem with spatial neighborhood information among the different subjects. Concerning reproducibility of the results, we have noted that all the methods have a low SD provided that they succeed to correctly separate the sources (high mean Pearson correlation). We have also noted that although the SD of the correlation is low in the uncoupled tensors method in cases where it fails, the SD of the correlation of the estimated sources in every run is high (can go up to 0.40). This means that, although the method produces (almost) equally bad results in every run, the intrarun variability is high. On the other hand, the coupled tensors method, though it also fails when different time courses exist, has low intrarun variability. This difference could be possibly explained from the coupling constraints which enhance the uniqueness properties of the decomposition. We have demonstrated the use of DCMTF in two different use cases, namely IOZ localization and visual oddball paradigm. In the epilepsy use case, DCMTF has retained, in the multisubject study, the good performance reported in Van Eyndhoven et al. (2021) with separately analyzing each subject. DCMTF is capable of blindly obtaining (most of) the IEDs annotated by the neurologists, while other multisubject methods using CMTF (Acar et al., 2017a; Hunyadi et al., 2016) can be applied only with the use of predefined IEDs (as a regressor for GLM). Hence, the automatic detection of the IEDs is not possible. Furthermore, in the oddball paradigm, our model obtains activated areas congruent with the literature and provides also frequency information that is not possible with the usual analysis in the framework of GLM. We need also to stress out that the use of tensor–tensor approaches (Chatzichristos et al., 2018; Jonmohamadi et al., 2020) could not have been applied in none of the use cases, since the subjects do not have the same time courses. Hence, DCMTF is the only method available (to the best of the authors' knowledge) that can be used in a (semi‐) blind context using raw data and not predefined features in both of the analyzed use cases. The flexible approach proposed in Van Eyndhoven et al. (2017) and Van Eyndhoven et al. (2021) performs very well in single‐subject cases and without the need of any parameter tuning. In multisubject cases (with the use of DCMTF), however, it becomes computationally expensive, and as the number of subjects increases, problems in the model computation might occur (nonconvergence of the model and high memory consumption). Hence, we suggest the use of the flexible two‐gamma model for single‐subject analysis (similar to this performed in Van Eyndhoven et al. (2021)) but in multisubject studies the use of the Lennard‐Jones model or the soft coupling DCMTF are proposed. Still, the more accurate double gamma HRF model can be used as an indirect way to tune the λ's of the soft coupling approach, as exhibited in Section 4.2.2. We have demonstrated that relying on raw data in the problem of fusion of EEG and fMRI, and provided the heterogeneity of the data variables (Walach et al., 2018) is carefully handled, facilitates accurate source identification. As it has been previously pointed out (Lahat et al., 2015; Adalı et al., 2015b), the use of raw data can improve the result of the decomposition by exploiting latent correlations between the different datasets, which might have been hidden by the use of intermediate feature extraction methods (such as GLM). Moreover, we have confirmed that ICA‐based methods fail to correctly decompose overlapped sources (Chatzichristos, Kofidis, et al., 2019; Stegeman, 2007).

CONCLUSIONS

After briefly introducing the reader to the problem of EEG–fMRI fusion, this article reports our recent results in this subject, which are based on the adoption of two different coupled decomposition models for jointly analyzing fMRI and EEG data in a true and early fusion context. This is an attempt to benefit from the multiway nature of both modalities, and to perform an early fusion, bypassing the need to rely on features. Performance gains have been reported compared to pICA as well as to the separate tensor decomposition‐based analyses of the datasets. A comparison between flexible (computation of the optimal HRF from data) and soft (similarity and not equality to a predefined HRF) coupling approaches has been presented while also an alternative HRF model, intended to reduce the complexity of the double gamma HRF model, has been tested for the first time. We have demonstrated in two real‐world applications (interictal onset localization and visual oddball paradigm) that DCMTF is capable of producing good results both in a semisupervised and an unsupervised setting. The results of the analysis are in accordance with previous studies and our findings point out the superiority of the use of raw data over features, when carefully preprocessed and with the selection of the appropriate model. Future work will include alternative decomposition models (e.g., BTD (Dron, Chin, & Escudero, 2021)) in the fusion model and comparisons with additional matrix‐based methods, notably those based on independent vector analysis (Adalı et al., 2015a). Ways of semiautomatically selecting the values of the λ parameters should also be sought for.

65 in total

1. Concurrent EEG/fMRI analysis by multiway Partial Least Squares.

Authors: Eduardo Martínez-Montes; Pedro A Valdés-Sosa; Fumikazu Miwakeichi; Robin I Goldman; Mark S Cohen
Journal: Neuroimage Date: 2004-07 Impact factor: 6.556

2. Linked independent component analysis for multimodal data fusion.

Authors: Adrian R Groves; Christian F Beckmann; Steve M Smith; Mark W Woolrich
Journal: Neuroimage Date: 2010-10-14 Impact factor: 6.556

3. A new informed tensor factorization approach to EEG-fMRI fusion.

Authors: Saideh Ferdowsi; Vahid Abolghasemi; Saeid Sanei
Journal: J Neurosci Methods Date: 2015-07-29 Impact factor: 2.390

Review 4. On the nature of the BOLD fMRI contrast mechanism.

Authors: Nikos K Logothetis; Josef Pfeuffer
Journal: Magn Reson Imaging Date: 2004-12 Impact factor: 2.546

Review 5. Modeling the hemodynamic response to brain activation.

Authors: Richard B Buxton; Kâmil Uludağ; David J Dubowitz; Thomas T Liu
Journal: Neuroimage Date: 2004 Impact factor: 6.556

6. The variability of human, BOLD hemodynamic responses.

Authors: G K Aguirre; E Zarahn; M D'esposito
Journal: Neuroimage Date: 1998-11 Impact factor: 6.556

7. Tensor-driven extraction of developmental features from varying paediatric EEG datasets.

Authors: E Kinney-Lang; L Spyrou; A Ebied; R F M Chin; J Escudero
Journal: J Neural Eng Date: 2018-05-21 Impact factor: 5.379

8. Simultaneous EEG-fMRI: trial level spatio-temporal fusion for hierarchically reliable information discovery.

Authors: Li Dong; Diankun Gong; Pedro A Valdes-Sosa; Yang Xia; Cheng Luo; Peng Xu; Dezhong Yao
Journal: Neuroimage Date: 2014-05-20 Impact factor: 6.556

9. Multiway analysis of epilepsy tensors.

Authors: Evrim Acar; Canan Aykut-Bingol; Haluk Bingol; Rasmus Bro; Bülent Yener
Journal: Bioinformatics Date: 2007-07-01 Impact factor: 6.937

10. Dynamics of task-induced modulation of spontaneous brain activity and functional connectivity in the triple resting-state networks assessed using the visual oddball paradigm.

Authors: Hasan Sbaihat; Ravichandran Rajkumar; Shukti Ramkiran; Abed Al-Nasser Assi; N Jon Shah; Tanja Veselinović; Irene Neuner
Journal: PLoS One Date: 2021-11-04 Impact factor: 3.240

2 in total

1. Multi-Subject Analysis for Brain Developmental Patterns Discovery via Tensor Decomposition of MEG Data.

Authors: Irina Belyaeva; Ben Gabrielson; Yu-Ping Wang; Tony W Wilson; Vince D Calhoun; Julia M Stephen; Tülay Adali
Journal: Neuroinformatics Date: 2022-08-24

2. Early soft and flexible fusion of electroencephalography and functional magnetic resonance imaging via double coupled matrix tensor factorization for multisubject group analysis.

Authors: Christos Chatzichristos; Eleftherios Kofidis; Wim Van Paesschen; Lieven De Lathauwer; Sergios Theodoridis; Sabine Van Huffel
Journal: Hum Brain Mapp Date: 2021-11-22 Impact factor: 5.038

2 in total