Yalda Mohsenzadeh1, Caitlin Mullin1, Benjamin Lahner1,2, Radoslaw Martin Cichy3, Aude Oliva1. 1. Computer Science and AI Lab., Massachusetts Institute of Technology, Cambridge, MA 02139, USA. 2. Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA. 3. Department of Education and Psychology, Freie Universität Berlin, Berlin 14195, Germany.
Abstract
To build a representation of what we see, the human brain recruits regions throughout the visual cortex in cascading sequence. Recently, an approach was proposed to evaluate the dynamics of visual perception in high spatiotemporal resolution at the scale of the whole brain. This method combined functional magnetic resonance imaging (fMRI) data with magnetoencephalography (MEG) data using representational similarity analysis and revealed a hierarchical progression from primary visual cortex through the dorsal and ventral streams. To assess the replicability of this method, we here present the results of a visual recognition neuro-imaging fusion experiment and compare them within and across experimental settings. We evaluated the reliability of this method by assessing the consistency of the results under similar test conditions, showing high agreement within participants. We then generalized these results to a separate group of individuals and visual input by comparing them to the fMRI-MEG fusion data of Cichy et al. (2016), revealing a highly similar temporal progression recruiting both the dorsal and ventral streams. Together, these results are a testament to the reproducibility of the fMRI-MEG fusion approach and allow for the interpretation of these spatiotemporal dynamics in a broader context.
To solve visual object recognition, the human brain has developed a particular cortical topology within the ventral and dorsal streams, recruiting regions in cascading sequence to quickly build a representation of what we see (i.e., [1,2,3,4,5,6,7,8,9]). In order to reveal the complex neural dynamics underlying visual object recognition, neural representations must be resolved in both space and time simultaneously [10,11,12,13,14,15]. Towards this aim, Cichy and collaborators proposed a novel approach to combine functional magnetic resonance imaging (fMRI) with magnetoencephalography (MEG), termed MEG-fMRI fusion [8,9,16,17,18]. The results revealed the dynamics of the visual processing cascade. Neural representations first emerge in the occipital pole (V1, V2, V3) at around 80 ms, and then progress in the anterior direction along the ventral (i.e., lateral occipital cortex LO, ventral occipital cortex VO, temporal occipital cortex TO, and parahippocampal cortex PHC) and dorsal (intraparietal sulcus regions) visual streams within 110–170 ms after image onset. The consistency of these results with established findings [19,20,21,22,23,24,25,26] suggests that fMRI-MEG fusion is an appropriate analytical tool to non-invasively evaluate the spatiotemporal mechanisms of perception. To assess the power of this technique to yield consistent results on visual recognition dynamics in the human ventral and dorsal streams [27,28,29], here we replicate the standard design and analysis protocol of fMRI-MEG fusion [9].
Specifically, we first evaluated the reliability of the fusion method at capturing the spatiotemporal dynamics of visual perception by assessing the neural agreement of visually similar experiences within individuals, asking: ‘Do similar visual experiences obey similar spatiotemporal stages in the brain?’ Given the known variation across brain regions for separate visual category input (e.g., [1,30,31,32,33,34,35]), our second objective was to determine the generalizability of these patterns: ‘For a given task, which spatiotemporal properties are reproducible across diverse visual input and independent observer groups?’ The current results reveal that the established fusion method is reliable and generalizable within and across image sets and participant groups. In addition, novel approaches to region-of-interest based analyses validate the replicability of the spatiotemporal dynamics for highly similar visual content, demonstrating the robustness of the technique for tracing representational similarity in the brain over space and time.
2. Materials and Methods
This paper presents two independent experiments in which fMRI and MEG data were acquired while observers looked at pictures of natural images. The fMRI and MEG data of Experiment 1 are original to the current work. The data of Experiment 2 were originally published in [9] (Experiment 2).
2.1. Participants
Two separate groups of fifteen right-handed volunteers with normal or corrected-to-normal vision participated in Experiment 1 (9 female, 27.87 ± 5.17 years old) and Experiment 2 (5 female, 26.6 ± 5.18 years old, see [9]). The participants signed an informed consent form and were compensated for their participation. Both studies were conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of the Massachusetts Institute of Technology (Approval code: 1510287948).
2.2. Stimulus Set
In Experiment 1, the stimulus set consisted of twin sets of 78 real-world natural images each (156 images total) from the LaMem dataset [36]. For each image in Twin-set 1, Twin-set 2 contained an image identifiable by the same verbal semantic description of the main object shown (based on consensus among the authors; see Figure 1a for examples). The sets were not significantly different on a collection of low-level image statistics [37,38,39] (see Appendix A for more information about the stimulus set). The stimulus set of Experiment 2 consisted of 118 natural images of objects [9] from the ImageNet dataset [40]. In both experiments, participants performed the same orthogonal vigilance task. See Appendix A for details on the experimental design.
Figure 1
Examples of stimuli of Experiment 1 and analysis scheme. (a) Examples of experimental condition: The stimulus set consisted of two sets (termed twin-sets 1 and 2) of 78 object images each ordered in pairs, such that each set was of the same semantic content but different in appearance. (b) MEG multivariate analysis. MEG data were extracted from −200 ms to 1000 ms with respect to stimulus onset. The 306-dimensional MEG sensor data at each time point t were arranged in vectors. For each pair of conditions, the performance of an SVM classifier in discriminating the conditions based on vector patterns was used as a measure of dissimilarity between the pair of conditions. A representational dissimilarity matrix at each time point t was populated using these pairwise dissimilarities; (c) fMRI multivariate analysis. At each voxel v, the activation patterns in a sphere with radius of 4 voxels were extracted and pairwise dissimilarities of conditions were computed (1-Pearson’s R) yielding a representational dissimilarity matrix assigned to the voxel v at the center of the sphere; (d) fMRI-MEG similarity-based fusion. The MEG RDM representation at each time point t were compared with the fMRI RDM representations at each voxel v by computing Spearman’s R correlations. These computations resulted in correlation maps over the whole brain and across time.
2.3. fMRI and MEG Acquisition
MEG and fMRI data for Experiment 1 were acquired in separate sessions, similar to Experiment 2 in [9]. Images were presented for 500 ms in all conditions. See Appendix B for data acquisition details for both experiments.
2.4. Data Analyses
We performed several data analyses to test the robustness and generalizability of our results. First, we performed a full-brain fMRI-MEG fusion [8,9,16,17], which uses representational similarity analysis [41,42] to map MEG and fMRI data into a common space (see Appendix C for details). To summarize, the MEG data were analyzed in a time-resolved manner with 1 ms resolution. MEG sensor data at each time point were arranged in pattern vectors for each stimulus condition and repetition. To reduce computational complexity and noise, trials per image condition were randomly sub-averaged in groups of 3 in Experiment 1 and groups of 5 in Experiment 2. These vectors were then used to train support vector machines to classify each pair of conditions. The performance of the binary SVM classifiers, computed with a leave-one-out cross-validation procedure, was interpreted as a pairwise dissimilarity measure (higher decoding accuracy indicates larger dissimilarity) and used to construct a condition-by-condition representational dissimilarity matrix (RDM) per time point (Figure 1b). The fMRI data were analyzed with a searchlight approach to construct representational dissimilarity matrices in a voxel-resolved fashion. At every voxel in the brain, the condition-specific voxel patterns in its vicinity were extracted and pairwise condition-specific dissimilarities (1-Pearson’s R) were computed to create a condition-by-condition fMRI RDM assigned to that voxel (Figure 1c). Then, in the similarity space of RDMs, the MEG and fMRI data were directly compared (Spearman’s R) to integrate the high spatial resolution of fMRI with the high temporal resolution of MEG. In detail, for a given time point the MEG RDM was correlated with the fMRI RDMs specified by the searchlight method, resulting in a 3D correlation map. Repeating the procedure for all time points, as depicted in Figure 1d, resulted in a 4D spatiotemporal correlation map per individual.
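The core of this pipeline can be illustrated with a few short functions. The sketch below is ours, not the authors' implementation: the function names, data shapes, and the use of scikit-learn's `SVC` are assumptions made for exposition only.

```python
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr
from sklearn.svm import SVC

def meg_rdm_at_t(patterns):
    """Pairwise SVM decoding accuracy as dissimilarity at one time point.
    patterns: dict {condition: (n_pseudotrials, n_sensors) array}."""
    conds = sorted(patterns)
    n = len(conds)
    rdm = np.zeros((n, n))
    for i, j in combinations(range(n), 2):
        a, b = patterns[conds[i]], patterns[conds[j]]
        accs = []
        # leave-one-pseudotrial-out cross-validation
        for k in range(len(a)):
            train_X = np.vstack([np.delete(a, k, 0), np.delete(b, k, 0)])
            train_y = np.r_[np.zeros(len(a) - 1), np.ones(len(b) - 1)]
            clf = SVC(kernel="linear").fit(train_X, train_y)
            accs.append(clf.score(np.vstack([a[k], b[k]]), [0, 1]))
        rdm[i, j] = rdm[j, i] = np.mean(accs)  # higher accuracy = more dissimilar
    return rdm

def fmri_rdm(voxel_patterns):
    """1 - Pearson correlation between condition patterns in a searchlight sphere.
    voxel_patterns: (n_conditions, n_voxels)."""
    return 1 - np.corrcoef(voxel_patterns)

def fusion(meg_rdm, fmri_rdm_):
    """Spearman correlation of the upper-triangular RDM entries."""
    iu = np.triu_indices_from(meg_rdm, k=1)
    return spearmanr(meg_rdm[iu], fmri_rdm_[iu]).correlation
```

Repeating `fusion` for every (time point, searchlight voxel) pair yields the 4D spatiotemporal correlation map described above.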
The correlation maps were averaged over individuals, and significant correlations were determined with permutation tests and multiple comparison corrections over time and space with the cluster correction method (N = 15 in each experiment; cluster-definition threshold of 0.001 and cluster size threshold of 0.01). Details on the statistical tests are presented in Appendix F. Second, to quantitatively compare the spatiotemporal neural dynamics within and across the experiments and assess their reliability, we performed two types of anatomically defined region-of-interest (ROI) based analysis along the ventral and dorsal streams (see Appendix D for details): (1) a spatially restricted searchlight voxel-wise analysis, in which the searchlight-based MEG-fMRI fusion correlation time series are averaged over voxels within an ROI; and (2) the conventional ROI-based analysis, in which the condition-specific voxel responses within an ROI are extracted, and the pairwise condition-specific response dissimilarities are computed to create an ROI RDM which is then compared with the time-resolved MEG RDMs, resulting in correlation time series (as in [9]).
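To make the statistical procedure concrete, the sketch below shows a simplified, time-only version of a sign-permutation test with cluster-mass correction. This is an illustrative reduction under our own assumptions (the published analysis operates over space and time and may differ in the cluster statistic used).

```python
import numpy as np

def sign_permutation_clusters(data, n_perm=1000, cluster_def_p=0.001, seed=0):
    """Sign-permutation test over subjects with cluster-mass correction in time.
    data: (n_subjects, n_timepoints) correlation time series; null: mean = 0."""
    rng = np.random.default_rng(seed)
    n_sub, n_t = data.shape

    def t_stat(x):
        return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(n_sub))

    # permutation distribution of per-timepoint statistics (random sign flips per subject)
    perm_t = np.array([t_stat(data * rng.choice([-1, 1], size=(n_sub, 1)))
                       for _ in range(n_perm)])
    # cluster-definition threshold from the pooled permutation distribution
    thresh = np.quantile(perm_t, 1 - cluster_def_p)

    def cluster_masses(tvals):
        masses, cur = [], 0.0
        for t in tvals:
            if t > thresh:
                cur += t
            elif cur:
                masses.append(cur); cur = 0.0
        if cur:
            masses.append(cur)
        return masses

    # null distribution of the maximum cluster mass per permutation
    max_null = np.array([max(cluster_masses(pt) or [0.0]) for pt in perm_t])
    clusters = cluster_masses(t_stat(data))
    p_values = [(max_null >= m).mean() for m in clusters]
    return clusters, p_values
```

Clusters whose mass exceeds most of the permutation maxima receive small corrected p-values, controlling the family-wise error over time points.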
3. Results
3.1. Reliability of Similarity-Based fMRI-MEG Fusion Method
We first assessed the reliability of the fMRI-MEG similarity-based data fusion method. For this, we applied fMRI-MEG fusion separately to the two sets of images making up the Twins-set (i.e., Twin-set 1 and Twin-set 2) and compared the results. Figure 2 and Figure 3 show the spatiotemporal dynamics of visual perception in the ventral and dorsal visual pathways, respectively, for the Twin-sets over the time course of 1000 ms. Qualitative examination of these results reveals that neural representations start at the occipital pole around 70–90 ms after stimulus onset, followed by neural representations in the anterior direction along the ventral stream (Figure 2a,c) and along the dorsal stream up to the inferior parietal cortex (Figure 3a,c).
Figure 2
Experiment 1 Twins-set spatiotemporal neural dynamics of vision in ventral stream: (a) An axial slice encompassing early visual cortex (EVC) and ventral ROIs, ventral occipital (VO) area and parahippocampal cortex (PHC). The significant correlation maps for Twin-set 1 in this axial slice depicted over time (n = 15, cluster-definition threshold P < 0.001, cluster threshold P < 0.01); (b) The correlation time series are computed based on spatially restricted searchlight voxel-wise fusion analysis (see Methods). The depicted curves from left to right compare these time series in EVC and ventral ROIs VO and PHC for Twin-set 1 and Twin-set 2. Significant time points depicted with color coded lines below the graphs are determined with sign-permutation tests (n = 15; P < 0.01 cluster-definition threshold, P < 0.01 cluster threshold); (c) The significant correlation maps for Twin-set 2 in the same axial slice as (a) depicted over time (n = 15, cluster-definition threshold P < 0.001, cluster threshold P < 0.01).
Figure 3
Experiment 1 Twins-set spatiotemporal neural dynamics of vision in dorsal stream: (a) An axial slice encompassing early visual cortex (EVC) and dorsal ROIs, inferior parietal sulcus (IPS0 and IPS1). The significant correlation maps for Twin-set 1 in this axial slice depicted over time (n = 15, cluster-definition threshold P < 0.001, cluster threshold P < 0.01); (b) The correlation time series are computed based on spatially restricted searchlight voxel-wise fusion analysis (see Methods). The depicted curves from left to right compare these time series in EVC and dorsal ROIs, IPS0 and IPS1 for Twin-set1 and Twin-set2. Significant time points depicted with color coded lines below the graphs are determined with sign-permutation tests (n = 15; P < 0.01 cluster-definition threshold, P < 0.01 cluster threshold); (c) The significant correlation maps for Twin-set 2 in the same axial slice as (a) depicted over time (n = 15, cluster-definition threshold P < 0.001, cluster threshold P < 0.01).
To quantitatively compare the spatiotemporal fusion maps between the Twin-sets, we performed spatially restricted voxel-wise fMRI-MEG fusion on five regions-of-interest (ROIs): early visual cortex (EVC), ventral occipital area (VO), parahippocampal cortex (PHC), and inferior parietal sulci (IPS0 and IPS1). We averaged the correlation values over the voxels within each ROI, resulting in one correlation time series per individual (see Appendix D, spatially restricted searchlight voxel-wise fMRI-MEG fusion method). Figure 2b and Figure 3b compare the ROI-specific time courses for Twin-sets 1 and 2. We observe that the two sets result in similar temporal dynamics within the regions of interest across the ventral and dorsal pathways. While the analyses reported in Figure 2b and Figure 3b test for similarity of time courses, we further investigated whether there are any significant dissimilarities. This explicit test revealed no significant differences (permutation tests, n = 15; cluster-definition threshold of P < 0.01, and cluster size threshold of P < 0.01), demonstrating the reliability of the fusion method in reproducing similar patterns of neural activity across similar visual experiences. Next, we performed an ROI-based fMRI-MEG fusion following the method described in [8,9] (see Figure 4a). The studied ROIs (see details in Appendix D) include EVC, ventral regions (VO, TO, PHC), and dorsal regions (IPS0, IPS1, IPS2, and IPS3). As depicted in Figure 4b–e, the ROI-based fMRI-MEG fusion results in visually similar time series. Comparison of peak latency times shows a peak of response around 120 ms in EVC and significantly later (two-sided hypothesis test, all P < 0.01, FDR corrected), around 140 ms, in ventral and dorsal ROIs. The peak latency and onset times with their corresponding 95% confidence intervals are reported in Table 1.
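Peak latencies and their bootstrap confidence intervals, as reported in Table 1, can be estimated along the following lines. This is a minimal sketch under our own assumptions (subject-level resampling of the mean time series; the function name and shapes are hypothetical).

```python
import numpy as np

def peak_latency_ci(ts, times, n_boot=1000, seed=0):
    """Peak latency of the subject-mean time series with a bootstrap 95% CI.
    ts: (n_subjects, n_timepoints) correlation time series; times: (n_timepoints,) in ms."""
    rng = np.random.default_rng(seed)
    n_sub = ts.shape[0]
    peak = times[np.argmax(ts.mean(0))]
    # resample subjects with replacement and recompute the peak latency
    boot = [times[np.argmax(ts[rng.integers(0, n_sub, n_sub)].mean(0))]
            for _ in range(n_boot)]
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return peak, (lo, hi)
```

Comparing the bootstrap distributions of two ROIs (e.g., EVC vs. PHC) then gives a two-sided test of peak-latency differences.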
Figure 4
Comparing dorsal and ventral stream neural representations through similarity-based ROI fMRI-MEG fusion. (a) The voxel patterns are extracted from each ROI to construct the fMRI ROI RDM. Then the ROI-specific fMRI RDM was compared with time-resolved MEG RDMs, resulting in correlation time series for each region of interest; (b,c) The fMRI-MEG fusion time series are depicted in EVC and ventral ROIs LO, VO, PHC, and TO for Twin-set 1 and Twin-set 2, respectively; (d,e) The fMRI-MEG fusion time series are depicted in EVC and dorsal ROIs IPS0–3 for Twin-set 1 and Twin-set 2, respectively. The color-coded lines below the curves indicate significant time points (n = 15, cluster-definition threshold P < 0.01, cluster threshold P < 0.01) and the dashed vertical line in each plot indicates the peak latency of the EVC time series. Peak latency times and their corresponding 95% confidence intervals for the correlation time series in (b–e) are illustrated with barplots and error bars, respectively. Black lines above the bars indicate significant differences between conditions. 95% confidence intervals were found with bootstrap tests and barplots were evaluated with two-sided hypothesis tests; false discovery rate corrected at P < 0.05.
Table 1
Peak and onset latency for fMRI-MEG fusion time series in EVC, ventral, and dorsal regions for Twin-Set 1 and Twin-Set 2 (Experiment 1).
| Region of Interest | Twin-Set 1 Peak Latency (ms) | Twin-Set 1 Onset Latency (ms) | Twin-Set 2 Peak Latency (ms) | Twin-Set 2 Onset Latency (ms) |
| --- | --- | --- | --- | --- |
| EVC | 120 (117–137) | 92 (85–99) | 123 (118–126) | 90 (51–94) |
| Dorsal regions | | | | |
| IPS0 | 134 (131–139) | 107 (97–116) | 136 (131–143) | 113 (107–118) |
| IPS1 | 134 (128–141) | 113 (102–123) | 135 (130–138) | 111 (105–116) |
| IPS2 | 134 (131–142) | 118 (95–156) | 136 (130–138) | 116 (107–126) |
| Ventral regions | | | | |
| LO | 136 (131–141) | 101 (95–107) | 134 (129–138) | 105 (100–108) |
| VO | 134 (130–139) | 100 (95–105) | 135 (130–138) | 104 (99–111) |
| TO | 133 (130–141) | 104 (92–111) | 131 (122–138) | 85 (73–101) |
| PHC | 134 (130–138) | 106 (99–113) | 135 (132–139) | 112 (107–116) |
Values are averaged over n = 15 subjects with 95% confidence intervals reported in parentheses.
To assess the reproducibility of the full-brain 4D spatiotemporal maps (time × voxel × voxel × voxel) between Twin-sets 1 and 2, we compared, for each participant separately, the voxel-wise fMRI-MEG correlation time series of Twin-set 1 with those of Twin-set 2. This resulted in one 3D reliability (correlation) map per participant (see Appendix E). We assessed significance with permutation tests and corrected for multiple comparisons using the cluster correction method (n = 15; cluster-definition threshold of P < 0.01, and cluster size threshold of P < 0.01). This revealed significant clusters in the reliability map across both the ventral and the dorsal stream (Figure 5).
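The per-participant reliability map amounts to a voxel-wise Pearson correlation between the two sets' fusion time series. A minimal vectorized sketch, assuming the 4D maps have been flattened to (n_voxels, n_timepoints) arrays (the function name and shapes are our own):

```python
import numpy as np

def reliability_map(fusion1, fusion2):
    """Correlate, at every voxel, the fusion time series of two stimulus sets.
    fusion1, fusion2: (n_voxels, n_timepoints) for one participant.
    Returns a (n_voxels,) array of Pearson r values."""
    a = fusion1 - fusion1.mean(1, keepdims=True)
    b = fusion2 - fusion2.mean(1, keepdims=True)
    num = (a * b).sum(1)
    den = np.sqrt((a ** 2).sum(1) * (b ** 2).sum(1))
    return num / den
```

Averaging these maps over participants and thresholding with the permutation/cluster procedure then yields the significant clusters shown in Figure 5.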
Figure 5
Reliability map of MEG-fMRI fusion time courses for Twin set 1 and Twin set 2. The figure shows significant clusters projected onto a standard T1 MNI brain. Significant correlations are determined with permutation tests and multiple comparison corrections using cluster correction method (n = 15; cluster-definition threshold of P < 0.01, and cluster size threshold of P < 0.01).
3.2. Generalizability of Similarity-Based fMRI-MEG Fusion Method
We evaluated the generalizability of the fMRI-MEG similarity-based data fusion method by comparing results across participant groups presented with different images (Twins-Set and ImageNet-Set from Experiment 2 of [9]). We applied the fMRI-MEG similarity-based data fusion method to the Twins-set and ImageNet-set separately. Figure 6 and Figure 7 display the spatiotemporal dynamics of visual perception for the two datasets along the ventral and dorsal visual pathways, respectively, over the first 1000 ms from stimulus onset. In both cases, significant signals emerge in EVC around 70–80 ms after stimulus onset and then progress in the anterior direction along the ventral stream (Figure 6a,c) and across the dorsal stream up to the inferior parietal cortex (Figure 7a,c). We determined significant spatiotemporal correlations with sign-permutation tests (n = 15; P < 0.01 cluster-definition threshold, P < 0.01 cluster threshold).
Figure 6
Spatiotemporal neural dynamics of vision in ventral stream: (a) An axial slice encompassing early visual cortex (EVC) and ventral ROIs, ventral occipital (VO) area and parahippocampal cortex (PHC). The significant correlation maps for Twins-set (Experiment 1) in this axial slice depicted over time (n = 15, cluster-definition threshold P < 0.001, cluster threshold P < 0.01); (b) The correlation time series are computed based on spatially restricted searchlight voxel-wise fusion analysis (see Methods). The depicted curves from left to right compare these time series in EVC and ventral ROIs VO and PHC for Twins-set (Experiment 1) and ImageNet-set (Experiment 2). Significant time points depicted with color coded lines below the graphs are determined with sign-permutation tests (n = 15; P < 0.01 cluster-definition threshold, P < 0.01 cluster threshold); (c) The significant correlation maps for ImageNet-set (Experiment 2) in the same axial slice as (a) depicted over time (n = 15, cluster-definition threshold P < 0.001, cluster threshold P < 0.01).
Figure 7
Spatiotemporal neural dynamics of vision in dorsal stream: (a) An axial slice encompassing early visual cortex (EVC) and dorsal ROIs, inferior parietal sulcus (IPS0 and IPS1). The significant correlation maps for Twins-set (Experiment 1) in this axial slice depicted over time (n = 15, cluster-definition threshold P < 0.001, cluster threshold P < 0.01); (b) The correlation time series are computed based on spatially restricted searchlight voxel-wise fusion analysis (see Methods). The depicted curves from left to right compare these time series in EVC and dorsal ROIs, IPS0 and IPS1 for Twins-set (Experiment 1) and ImageNet-set (Experiment 2). Significant time points depicted with color coded lines below the graphs are determined with sign-permutation tests (n = 15; P < 0.01 cluster-definition threshold, P < 0.01 cluster threshold); (c) The significant correlation maps for ImageNet-set (Experiment 2) in the same axial slice as (a) depicted over time (n = 15, cluster-definition threshold P < 0.001, cluster threshold P < 0.01).
Qualitatively, Figure 6b and Figure 7b show similar MEG-fMRI time courses for the two datasets (Twins-Set and ImageNet-Set) in EVC, ventral regions (VO and PHC), and dorsal regions (IPS0 and IPS1). Further explicit testing revealed no significant difference between the two sets (permutation tests, n = 15; cluster-definition threshold of P < 0.01, and cluster size threshold of P < 0.01). To further investigate the similarities and dissimilarities between these two datasets quantitatively, we performed ROI-based analyses. For each dataset, we correlated ROI-specific fMRI RDMs with time-resolved MEG RDMs, resulting in the correlation time courses shown in Figure 8a–d. We determined significant time points, illustrated with color-coded lines below the graphs, with sign-permutation tests (n = 15; P < 0.01 cluster-definition threshold, P < 0.01 cluster threshold). We observed that in both datasets, the correlation time series peak significantly earlier in EVC than in high-level regions of the ventral (VO and PHC) and dorsal (IPS0, IPS1, IPS2) pathways (two-sided hypothesis test, all P < 0.01, FDR corrected), reflecting the established structure of the visual hierarchy. Peak and onset latencies of the curves in Figure 8a–d and their corresponding 95% confidence intervals are reported in Table 2.
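The FDR correction applied to these multiple peak-latency comparisons follows the Benjamini-Hochberg procedure. A self-contained sketch (our own implementation for illustration, not the authors' code):

```python
import numpy as np

def fdr_bh(pvals, alpha=0.05):
    """Benjamini-Hochberg FDR: boolean mask of rejected hypotheses.
    pvals: sequence of p-values from the individual comparisons."""
    p = np.asarray(pvals, dtype=float)
    order = np.argsort(p)
    m = len(p)
    # step-up thresholds alpha * k / m for the sorted p-values
    thresh = alpha * (np.arange(1, m + 1) / m)
    below = p[order] <= thresh
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()       # largest k with p_(k) <= alpha*k/m
        reject[order[:k + 1]] = True          # reject all hypotheses up to rank k
    return reject
```

For example, with comparisons of EVC against each higher-level ROI, `fdr_bh` returns which latency differences survive correction at the stated level.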
Figure 8
Comparing dorsal and ventral stream neural representations through similarity-based ROI fMRI-MEG fusion. The voxel patterns are extracted from each ROI to construct the fMRI ROI RDM. Then the ROI-specific fMRI RDM was compared with time-resolved MEG RDMs, resulting in correlation time series for each region of interest; (a,b) The fMRI-MEG fusion time series are depicted in EVC and ventral ROIs LO, VO, PHC, and TO for Twins-set and ImageNet-set, respectively; (c,d) The fMRI-MEG fusion time series are depicted in EVC and dorsal ROIs IPS0–3 for Twins-set and ImageNet-set, respectively. The color-coded lines below the curves indicate significant time points (n = 15, cluster-definition threshold P < 0.01, cluster threshold P < 0.01) and the dashed vertical line in each plot indicates the peak latency of the EVC time series. Peak latency times and their corresponding 95% confidence intervals for the correlation time series in (a–d) are illustrated with barplots and error bars, respectively. Black lines above the bars indicate significant differences between conditions. 95% confidence intervals were found with bootstrap tests and barplots were evaluated with two-sided hypothesis tests; false discovery rate corrected at P < 0.05.
Table 2
Peak and onset latency for fMRI-MEG fusion time series in EVC, ventral, and dorsal regions for Twins-Set (Experiment 1) and ImageNet-Set (Experiment 2).
| Region of Interest | Twins-Set Peak Latency (ms) | Twins-Set Onset Latency (ms) | ImageNet-Set Peak Latency (ms) | ImageNet-Set Onset Latency (ms) |
| --- | --- | --- | --- | --- |
| EVC | 122 (119–125) | 89 (56–95) | 127 (110–133) | 84 (71–91) |
| Dorsal regions | | | | |
| IPS0 | 137 (131–141) | 111 (102–117) | 172 (137–204) | 133 (99–156) |
| IPS1 | 136 (132–140) | 111 (103–119) | 172 (142–205) | 137 (99–158) |
| IPS2 | 136 (132–138) | 112 (100–119) | 174 (108–263) | 103 (90–186) |
| Ventral regions | | | | |
| LO | 136 (131–141) | 102 (97–107) | 141 (115–169) | 96 (87–118) |
| VO | 136 (131–139) | 102 (95–109) | 166 (125–173) | 103 (95–125) |
| TO | 134 (128–140) | 90 (77–100) | 139 (135–154) | 115 (95–126) |
| PHC | 136 (132–139) | 109 (102–115) | 172 (156–210) | 121 (99–159) |
Values are averaged over n = 15 subjects with 95% confidence intervals reported in parentheses.
4. Discussion
One of the central tenets of science is reproducibility [43,44,45]. This is especially relevant in cognitive neuroscience, as the surge of new cutting-edge multivariate analysis tools has made it possible to address questions that were previously untestable [41,46]. Here, we have demonstrated the reproducibility of the fMRI-MEG fusion method through two separate experiments, testing the reliability and generalizability of the technique. The first analysis compared neural data within the same subject group across Twin image sets with equalized low-level visual features and highly similar semantic concepts (e.g., a giraffe for a giraffe, a flower for a flower). We observed that the analysis yielded reliably consistent spatiotemporal dynamics, with each subset showing responses first in the occipital pole, followed by signals in the anterior direction along a similar path in both the ventral and dorsal streams. The strong agreement between the full-brain spatiotemporal maps of the Twin-sets suggests that the signals detected by the fMRI-MEG data fusion method are likely to reliably reflect the representations of the stimulus. This sequence of representational signals followed the established spatial [47,48] and temporal [19,49] patterns associated with hierarchical visual processing. This high degree of within-subject reliability across separate but matched image sets has pragmatic consequences, reinforcing the power of and confidence in this method. Results of unimodal neuroimaging have been shown to have low test-retest reliability [29]. This realization has brought about a crisis of confidence in the replicability and reliability of published research findings within neuroscience [50,51]. To mitigate these concerns, the generation of new research discoveries requires that findings move beyond conclusions drawn solely on the basis of a single study.
As such, we sought to extend this replication by comparing the newly collected Twins-Set data to previously published independent fMRI-MEG fusion data [9]. This analysis focused on the generalizability of the method across subject groups and a wider range of stimuli. Across the two experiments, findings from the full-brain correlation maps and spatially restricted ROI analyses revealed that fMRI-MEG data fusion captures the spatiotemporal dynamic patterns common to visual processing in a manner that generalizes across subjects and natural images. The results of both experiments illustrate the spatiotemporal progression associated with hierarchical visual processing [6,52,53], with significant signals emerging first in early visual areas and then along the ventral and dorsal pathways. Additionally, comparing the correlation time series in Figure 2b and Figure 3b with those in Figure 6b and Figure 7b reveals that the response patterns of Twin-sets 1 and 2 are qualitatively more similar to each other than either is to the ImageNet set. This is most likely because Twin-sets 1 and 2 had similar low- and high-level features by design, and both differed from the features of the ImageNet set. Nevertheless, the similarity in spatiotemporal dynamics across the two experiments suggests that the neural dynamics of processing natural images are similar. Overall, this generalized response showed the expected spatiotemporal patterns, but with greater variability in high-level visual areas. Previous work examining the replicability of neuroimaging data has shown that low-level brain functions (motor and sensory tasks) generally show less variance than high-level cognitive tasks [54].
Moreover, between-subject comparisons show significantly higher variability than within-subject comparisons due to the increased sources of noise [55]. Cognitive neuroscientists have many methodological options for studying how experimental variables relate systematically to the spatiotemporal dynamics underlying a cortical process of interest. To date, much of our understanding has been advanced by methods that yield high spatial or temporal resolution, but not both. The limitations of each of these modalities for understanding cognition are well known. Here we show that combining them through representational similarity analysis provides a reliable and generalizable tool for studying visual perception. This confirmatory replication within and between image sets and subject groups demonstrates the replicability of the method in the study of visual processing. By establishing replicability, our study increases the trustworthiness of the method, and we hope this motivates further researchers to use it in the study of human brain function.
Table A1
Comparison of the spatial frequency information between the Twin Sets (Experiment 1).
| Spectral Energy Level (Spatial Frequency) | 10% | 30% | 70% | 90% |
| --- | --- | --- | --- | --- |
| t value | −0.43 | 0.87 | 0.37 | −0.37 |
| p value | 0.67 | 0.37 | 0.7 | 0.72 |
Table A2
Comparison of color distribution (RGB and Lab) between the Twin Sets (Experiment 1).