The rapid growth of the literature on neuroimaging in humans has led to major advances in our understanding of human brain function but has also made it increasingly difficult to aggregate and synthesize neuroimaging findings. Here we describe and validate an automated brain-mapping framework that uses text-mining, meta-analysis and machine-learning techniques to generate a large database of mappings between neural and cognitive states. We show that our approach can be used to automatically conduct large-scale, high-quality neuroimaging meta-analyses, address long-standing inferential problems in the neuroimaging literature and support accurate 'decoding' of broad cognitive states from brain activity in both entire studies and individual human subjects. Collectively, our results have validated a powerful and generative framework for synthesizing human neuroimaging data on an unprecedented scale.
The development of non-invasive neuroimaging techniques such as functional magnetic resonance imaging (fMRI) has spurred explosive growth of the human brain imaging literature in recent years. In 2010 alone, over 1,000 fMRI articles were published[1]. This proliferation has led to substantial advances in understanding of human brain and cognitive function; however, it has also introduced important new challenges. In place of too little data, researchers are now besieged with too much. Because individual neuroimaging studies are often underpowered and exhibit relatively high false positive rates[2-4], multiple studies are required to achieve consensus regarding even broad relationships between brain and cognitive function. Distilling the extant literature thus necessitates development of new techniques for large-scale aggregation and synthesis of human neuroimaging data[4-6].

Here we describe and validate a novel framework for brain mapping, NeuroSynth, that takes an instrumental step towards automated large-scale synthesis of the neuroimaging literature. NeuroSynth combines text-mining, meta-analysis and machine-learning techniques to generate probabilistic mappings between cognitive and neural states that can be used for a broad range of neuroimaging applications. Whereas previous approaches have relied heavily on researchers’ manual efforts (e.g., [7,8])—a constraint that limits the scope and efficiency of resulting analyses[1]—the present framework is fully automated, enabling rapid and scalable synthesis of the neuroimaging literature. We demonstrate the capacity of this framework to generate large-scale meta-analyses for hundreds of broad psychological concepts; support quantitative inferences about the consistency and specificity with which different cognitive processes elicit regional changes in brain activity; and decode and classify broad cognitive states in new data based solely on observed brain activity.
RESULTS
Our methodological approach includes several steps (Fig. 1a). First, we used text-mining techniques to identify neuroimaging studies that used specific terms of interest (e.g., ‘pain’, ‘emotion’, ‘working memory’, etc.) at a high frequency (>1 in 1,000 words) within the article text. Second, we automatically extracted activation coordinates from all tables reported in these studies. This approach produced a large database of term-to-coordinate mappings; we report results based on 100,953 activation foci drawn from 3,489 neuroimaging studies published in more than 15 journals (Online methods). Third, we conducted automated meta-analyses of hundreds of psychological concepts, producing an extensive set of whole-brain images quantifying brain-cognition relationships (Fig. 1b). Finally, we used a machine learning technique (naïve Bayes classification) to estimate the likelihood that new activation maps were associated with specific psychological terms, enabling relatively open-ended decoding of psychological constructs from patterns of brain activity (Fig. 1c).
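The first step, linking studies to terms by raw word frequency, can be sketched as follows. This is a minimal illustration of the >1-in-1,000-words criterion described above, not the actual NeuroSynth implementation; the function names are ours, and single-word terms are assumed.

```python
def term_frequency(text: str, term: str) -> float:
    """Occurrences of `term` per word of `text` (single-word terms only)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words)

def tag_study(text: str, term: str, threshold: float = 1.0 / 1000) -> bool:
    """True if the article uses `term` at high frequency (>1 per 1,000 words)."""
    return term_frequency(text, term) > threshold
```

A study mentioning 'pain' twice in a 100-word passage would be tagged for that term; a single mention buried in a 2,000-word article would not.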
Figure 1
Schematic overview of the NeuroSynth framework and applications. (a) Schematic of the NeuroSynth approach. The full text of a large corpus of articles is retrieved and terms of scientific interest are stored in a database. Articles are retrieved from the database on the basis of a user-entered search string (e.g., the word ‘pain’), and peak coordinates from the associated articles are extracted from tables. A meta-analysis of the peak coordinates is automatically performed, producing a whole-brain map of the posterior probability of the term given activation at each voxel (i.e., P(Pain|Activation)). (b) Two types of inference in brain imaging. Given a known psychological manipulation, one can quantify the corresponding changes in brain activity and generate a forward inference; however, given an observed pattern of activity, drawing a reverse inference about associated cognitive states is more difficult, because multiple cognitive states could have similar neural signatures. (c) Given meta-analytic posterior probability maps for multiple terms (e.g., working memory, emotion, pain), one can classify a new activation map by identifying the class with the highest probability P given the new data (in this example, pain).
Automated coordinate extraction
Our approach differs from previous work in its heavy reliance on automatically extracted information, raising several potential data quality concerns. For example, the software might incorrectly classify non-coordinate information in a table as an activation focus (i.e., a false positive); different articles report foci in different stereotactic spaces, resulting in potential discrepancies between the anatomical locations represented by the same set of coordinates; and the software did not discriminate activations from deactivations.

To assess the impact of these issues on data quality, we conducted an extensive series of supporting analyses (Supplementary Note). First, we compared automatically extracted coordinates with a reference set of manually entered foci in the SumsDB database[7,9], revealing high rates of sensitivity (84%) and specificity (97%). Second, we quantified the proportion of activation increases versus decreases reported in the neuroimaging literature; decreases constitute a small proportion of reported findings and have minimal effect on the results we report. Third, we developed a preliminary algorithm for automatically detecting and correcting (based on ref. [10]) for between-study differences in stereotactic space (Supplementary Fig. 1). Collectively, these results indicate that although automated extraction misses a minority of valid coordinates, and work remains to be done to increase the specificity of the extracted information, the majority of coordinates are extracted accurately, and a number of factors of a priori concern have relatively small influences on the results.

The database of studies and coordinates, software, and meta-analysis maps for several hundred terms used in the present study are available via a web interface (http://neurosynth.org) that provides instant access to, and visualization of, thousands of whole-brain images.
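For exposition, extraction of activation foci from table text can be approximated with a simple pattern match over each row. Real article tables vary widely in layout, and this regular expression and the function name are our assumptions for illustration, not the extraction code evaluated above.

```python
import re

# Three signed integers separated by commas, semicolons, tabs, or spaces,
# as in a typical "x, y, z" stereotactic coordinate triple.
COORD = re.compile(r'(-?\d+)\s*[,;\t ]\s*(-?\d+)\s*[,;\t ]\s*(-?\d+)')

def extract_foci(row: str):
    """Return a list of candidate (x, y, z) triples found in a table row."""
    return [tuple(int(v) for v in m.groups()) for m in COORD.finditer(row)]
```

A pattern this simple would also be prone to the false positives discussed above (e.g., matching non-coordinate numeric columns), which is why the manual validation against SumsDB matters.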
Large-scale automated meta-analysis
We used the database of automatically extracted activation coordinates to conduct a comprehensive set of automated meta-analyses for several hundred terms of interest. For each term, we identified all studies that used the term at high frequency anywhere within the article text[11] and submitted all associated activation foci to a meta-analysis. This approach generated whole-brain maps displaying the strength of association between each term and every location in the brain, enabling multiple kinds of quantitative inference (e.g., if the term ‘language’ was used in a study, how likely was the study to report activation in Broca’s area? If activation was observed in the amygdala, what was the probability that the study frequently used the term ‘fear’?).

To validate this automated approach—which rests on the assumption that simple word counts are a reasonable proxy for the substantive content of articles—we conducted a series of supporting analyses (Supplementary Note). First, we demonstrated that NeuroSynth accurately recaptured conventional boundaries between distinct anatomical regions by comparing lexically defined regions of interest (ROIs) to anatomically defined ROIs (Supplementary Fig. 2). Second, we used NeuroSynth to replicate previous findings of visual category-specific activation in regions such as the fusiform face area (FFA[12]) and visual word form area (VWFA[13]; Supplementary Fig. 3). Third, we demonstrated that more conservative meta-analyses restricting the lexical search space to only article titles yielded similar, though less sensitive, results (Supplementary Fig. 4).

Finally, we compared our results with those produced by prior manual approaches. Comparison of automated meta-analyses of three broad psychological terms (‘working memory’, ‘emotion’, and ‘pain’) with previously published meta- or mega-analytic maps[14-16] revealed marked convergence between approaches, both qualitatively (Fig. 2) and quantitatively (Supplementary Fig. 5). To directly test the convergence of automated and manual approaches when applied to similar data, we manually validated a set of 265 automatically extracted pain studies and performed a standard multilevel kernel density analysis (MKDA[15]) contrasting experimental pain stimulation with baseline (n = 66 valid studies). Direct comparison between automated and manual results revealed striking overlap (correlation across voxels = .84; Supplementary Fig. 6). Thus, at least for broad domains, an automated meta-analysis approach generates results comparable in sensitivity and scope to those produced more effortfully in previous studies.
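The quantitative convergence check reported here (correlation across voxels = .84) amounts to a Pearson correlation between two flattened whole-brain maps. A minimal, dependency-free sketch, with plain lists standing in for flattened brain volumes:

```python
from math import sqrt

def voxelwise_correlation(map_a, map_b):
    """Pearson correlation between two equal-length flattened brain maps."""
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    return cov / sqrt(var_a * var_b)
```

In practice one would restrict both maps to in-brain voxels before correlating, since empty background voxels would inflate the estimate.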
Figure 2
Comparison of previous meta-analysis results with forward and reverse inference maps produced automatically using the NeuroSynth framework. Meta-analyses were carried out for working memory (top row), emotion (middle row), and physical pain (bottom row), and mapped to the PALS-B12 atlas[45]. (a) Meta-analytic maps produced manually in previous studies[14-16]. (b) Automatically generated forward inference maps displaying the probability of observing activation given the presence of the term (i.e., P(Activation|Term)). (c) Automatically generated reverse inference maps displaying the probability of the term given observed activation (i.e., P(Term|Activation)). Thus, regions in (b) are consistently associated with the term, and regions in (c) are selectively associated with the term. To account for base-rate differences in term frequencies, reverse inference maps assume uniform priors (i.e., equal 50% probabilities of Term and No Term). Activation in orange/red regions implies a high probability that a term is present, and activation in blue regions implies a high probability that a term is not present. Values for all images are displayed only for regions significant in a test of association between Term and Activation, with whole-brain correction for multiple comparisons (FDR = .05). DLPFC = dorsolateral prefrontal cortex; dACC = dorsal anterior cingulate cortex; aI = anterior insula.
Quantitative reverse inference
The relatively comprehensive nature of the NeuroSynth database enabled us to address a long-standing inferential problem in the neuroimaging literature—namely, how to quantitatively identify cognitive states based on patterns of observed brain activity. This problem of ‘reverse inference’[17] arises because most neuroimaging studies are designed to identify neural changes that result from known psychological manipulations, not to determine what cognitive state(s) a given pattern of activity implies (Fig. 1b, ref. [17]). For instance, fear consistently activates the human amygdala, but this does not imply that people who show amygdala activation must be experiencing fear, because other affective and non-affective states have also been reported to produce amygdala activation[4,18]. True reverse inference requires knowledge of which brain regions and networks are selectively, and not just consistently, associated with particular cognitive states[15,17].

Because the NeuroSynth database contains a broad set of term-to-activation mappings, our framework is well suited for drawing quantitative inferences about mind-brain relationships in both the forward and reverse directions. We were able to quantify both the probability of observing activation in specific brain regions given the presence of a particular term (P(Activation|Term), or ‘forward inference’) and the probability of a term occurring in an article given the presence of activation in a particular brain region (P(Term|Activation), or ‘reverse inference’). Comparison of these two analyses provided a way to assess the validity of many common inferences about the relationship between neural and cognitive states.

For illustration purposes, we focused on the sample domains of working memory, emotion, and pain, which are of substantial basic and clinical interest and have been extensively studied using fMRI (for additional examples, see Supplementary Fig. 7).
These domains are excellent candidates for quantitative reverse inference, as they are thought to have somewhat confusable neural correlates, with common activation of regions such as the dorsal anterior cingulate cortex (dACC)[19] and anterior insula.

Results revealed important differences between the forward and reverse inference maps in all three domains (Fig. 2). For working memory, the forward inference map revealed the most consistent associations in dorsolateral prefrontal cortex (DLPFC), anterior insula, and dorsal medial frontal cortex (MFC), replicating previous findings[15,20]. However, the reverse inference map instead implicated anterior PFC and posterior parietal cortex as the regions most selectively activated by working memory tasks.

We observed a similar pattern for pain and emotion. In both domains, frontal regions broadly implicated in goal-directed cognition[21-23] showed consistent activation in the forward analysis but were relatively non-selective in the reverse analysis (Fig. 2). For emotion, the reverse inference map revealed much more selective activation in the amygdala and ventromedial PFC (Fig. 3). For pain, the regions of maximal pain-related activation in the insula and ACC shifted from anterior foci in the forward analysis to posterior foci in the reverse analysis (Fig. 3). This is consistent with nonhuman primate studies implicating the dorsal posterior insula as a primary integration center for nociceptive afferents[24], as well as with human studies demonstrating that anterior aspects of the so-called ‘pain matrix’ respond non-selectively to multiple modalities[25].
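The reverse inference values discussed in this section follow from Bayes' rule applied to the forward quantities, under the uniform prior (P(Term) = 0.5) used for the maps in Fig. 2. A small sketch, with illustrative variable names of our own choosing:

```python
def reverse_inference(p_act_given_term, p_act_given_no_term, prior=0.5):
    """P(Term | Activation) from forward-inference likelihoods via Bayes' rule.

    prior is P(Term); the paper's reverse inference maps set it to 0.5 to
    remove base-rate differences between terms.
    """
    numerator = p_act_given_term * prior
    denominator = numerator + p_act_given_no_term * (1.0 - prior)
    return numerator / denominator
```

If a voxel is equally likely to activate whether or not a study uses the term, the posterior simply equals the prior; only when activation is more likely under the term than without it does P(Term|Activation) rise above 0.5.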
Figure 3
Comparison of forward and reverse inference in selected regions of interest. (a) Labeled regions of interest displayed on lateral and medial brain surfaces. (b) Comparison of forward inference (i.e., the probability of activation given the term, P(A|T)) and reverse inference (the probability of the term given activation, P(T|A)) for the domains of working memory (top), emotion (middle), and pain (bottom). Bars with asterisks denote statistically significant effects (whole-brain FDR, q = .05). dACC = dorsal anterior cingulate cortex (coordinates: +2, +8, +50); aIns = anterior insula (+36, +16, +2); IFJ = inferior frontal junction (−50, +8, +36); pIns = posterior insula (+42, −24, +24); aPFC = anterior prefrontal cortex (−28, +56, +8); vmPFC = ventromedial prefrontal cortex (0, +32, −4). L and R refer to the left and right hemispheres, respectively.
Perhaps most strikingly, several frontal regions that showed consistent activation for emotion and pain in the forward analysis were actually associated with a decreased likelihood that a study involved emotion or pain in the reverse inference analysis (Fig. 3). This seeming paradox reflected the fact that even though lateral and medial frontal regions were consistently activated in studies of emotion and pain, they were activated even more frequently in studies that did not involve emotion or pain (Supplementary Fig. 8). Thus, the involvement of these regions in pain and emotion likely reflected their much more general role in cognition (e.g., sustained attention or goal-directed processing[22,23]) rather than pain- or emotion-specific processes.

These results demonstrate that without the ability to distinguish consistency from selectivity, neuroimaging data can produce misleading inferences. For instance, neglecting the high base rate of ACC activity might lead researchers in the areas of cognitive control, pain, and emotion each to conclude that the ACC plays a critical role in their particular domain. Instead, because the ACC is activated consistently in all of these states, its activation may not be diagnostic of any one of them—and, conversely, might even predict their absence. The NeuroSynth framework can potentially address this problem by enabling researchers to conduct quantitative reverse inference on a large scale.
Open-ended classification of cognitive states
An emerging frontier in human neuroimaging is brain ‘decoding’: inferring a person’s cognitive state based solely on their observed brain activity. The problem of decoding is essentially a generalization of the univariate reverse inference problem addressed above: instead of predicting the likelihood of a particular cognitive state given activation at a single voxel, one can generate a corresponding prediction based on an entire pattern of brain activity. The NeuroSynth framework is well positioned for such an approach: whereas previous decoding approaches have focused on discriminating between narrow sets of cognitive states and have required extensive training on raw fMRI datasets (e.g., refs. [26-28]), the breadth of cognitive concepts represented in the NeuroSynth database affords relatively open-ended decoding, with little or no training on new datasets.

To assess the ability of our approach to decode and classify cognitive states, we trained a naïve Bayes classifier[29] capable of discriminating between flexible sets of cognitive states given new images as input (Fig. 1c). First, we tested the classifier’s ability to correctly classify studies in the NeuroSynth database that were associated with different terms. In a 10-fold cross-validated analysis, the classifier discriminated between studies of working memory, emotion, and pain with high sensitivity and specificity (Fig. 4a), demonstrating that each of these domains has a relatively distinct neural signature (Fig. 4b).
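The classification scheme of Fig. 1c can be illustrated with a toy naïve Bayes decoder: each candidate term contributes a per-voxel likelihood of activation (as would come from meta-analysis), and a new binary activation map is assigned to the term with the highest log posterior. This is a sketch of the general technique under a uniform prior, not the trained classifier reported here, and the numbers in the test are invented.

```python
from math import log

def decode(activation, class_models):
    """Return the term whose model best explains a binary activation map.

    activation: list of 0/1 values, one per voxel.
    class_models: dict mapping term -> list of P(voxel active | term).
    A uniform prior over terms is assumed, so it drops out of the argmax.
    """
    best_label, best_score = None, float('-inf')
    for label, probs in class_models.items():
        score = 0.0
        for active, p in zip(activation, probs):
            p = min(max(p, 1e-6), 1.0 - 1e-6)  # guard against log(0)
            score += log(p) if active else log(1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With two toy 'voxels' and two term models, a map activating only the voxel favored by the 'pain' model decodes as pain; the independence assumption across voxels is what makes the classifier 'naïve'.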
Figure 4
Three-way classification of working memory (WM), emotion, and pain. (a) Naive Bayes classifier performance when cross-validated on studies in the database (left) or applied to entirely new subjects (right). Sens. = sensitivity; Spec. = specificity. (b) Whole-brain maximum posterior probability map; each voxel is colored by the term with the highest associated probability. (c) Whole-brain maps displaying the proportion of individual subjects in the three pain studies (total n = 79) who showed activation at each voxel (P < .05, uncorrected), averaged separately for subjects who were correctly (n = 51; top row) or incorrectly (n = 28; bottom row) classified. Regions are color-coded according to the proportion of subjects in the sample who showed activation at each voxel.
To assess the classifier’s ability to decode cognitive states in individual human subjects, we applied the classifier to 281 single-subject activation maps derived from contrasts between N-back working memory performance vs. rest (n = 94), negative vs. neutral emotional photographs (n = 108), and intense vs. mild thermal pain (n = 79). The classifier performed substantially above chance, identifying the originating study type with sensitivities of 94%, 70%, and 65%, respectively (chance = 33%), and specificities of 80%, 86%, and 98% (Fig. 4a). Moreover, there were systematic differences in activation patterns for correctly vs. incorrectly classified subjects. For example, incorrectly classified subjects in physical pain tasks (Fig. 4c) systematically activated lateral orbitofrontal cortex and dorsomedial prefrontal cortex, but not SII/posterior insula, suggesting that the discomfort due to noxious heat in these subjects may have been qualitatively different (e.g., emotionally generated vs. physically generated pain). Thus, these findings demonstrate the viability of decoding cognitive states in new subjects without training while suggesting novel hypotheses amenable to further exploration.

Next, to generalize beyond working memory, emotion, and pain, we selected 25 broad psychological terms used at high frequency in the database (Fig. 5). Classification accuracy was estimated in ten-fold cross-validated two-alternative and multi-class analyses. The classifier performed substantially above chance in both two-alternative classification (mean pairwise accuracy of 72%; Fig. 5) and relatively open-ended multi-class classification on up to ten simultaneous terms (Supplementary Fig. 9). Moreover, the results provide insights into the similarity structure of neural representations of different processes.
For instance, pain was highly discriminable from other psychological concepts (all pairwise accuracies > 74%), suggesting that pain perception may be a distinctive state that is grouped neither with other sensory modalities nor with other affective concepts like arousal and emotion. Conversely, conceptually related terms like ‘executive’ and ‘working memory’ could not be distinguished at a rate different from chance, reflecting their closely overlapping usage in the literature.
Figure 5
Accuracy of the naive Bayes classifier when discriminating between all possible pairwise combinations of 25 key terms. Each cell represents a cross-validated binary classification between the intersecting row and column terms. Off-diagonal values reflect accuracy (in %) averaged across the two terms. Diagonal values reflect the mean classification accuracy for each term. Terms were ordered using the first two factors of a principal components analysis (PCA). All accuracy rates above 58% and 64% are statistically significant at P < .05 and P < .001, respectively.
DISCUSSION
The advent of modern neuroimaging techniques such as fMRI has spurred dramatic growth in the primary cognitive neuroscience literature, but has also made comprehensive synthesis of the literature increasingly difficult. The NeuroSynth framework introduced here addresses this problem in several ways. First, we validated a novel approach for conducting large-scale automated neuroimaging meta-analyses of broad psychological concepts that are lexically well represented in the literature. A key benefit is the ability to quantitatively distinguish forward inference from reverse inference, enabling researchers to assess the specificity of mappings between neural and cognitive function—a long-standing goal of cognitive neuroscience research. Although considerable work remains to be done to improve the specificity and accuracy of the tools developed here, we expect quantitative reverse inference to play an increasingly important role in future meta-analytic studies.

Second, we demonstrated the viability of decoding broad psychological states in a relatively open-ended way in individual subjects—to our knowledge, the first application of a domain-general classifier that can distinguish a broad range of cognitive states based solely on the prior literature. Particularly promising is the ability to decode brain activity without prior training data or knowledge of the ‘ground truth’ for an individual. Our results raise the prospect that legitimate ‘mind reading’ of more nuanced cognitive and affective states might eventually become feasible given additional technical advances. However, the present NeuroSynth implementation provides no basis for such inferences, as it distinguishes only between relatively broad psychological categories.

Third, the platform we introduce is designed to support immediate use in a broad range of neuroimaging applications.
To name just a few potential applications, researchers could use these tools and results to define region-of-interest masks or Bayesian priors in hypothesis-driven analyses; conduct quantitative comparisons between meta-analysis maps of different terms of interest; use the automatically extracted coordinate database as a starting point for more refined manual meta-analyses; draw more rigorous reverse inferences when interpreting results by referring to empirically established mappings between specific regions and cognitive functions; and extract the terms most frequently associated with an active region or distributed pattern of activity, thus contextualizing new research findings within the literature.

Of course, the NeuroSynth framework is not a panacea for the many challenges facing cognitive neuroscientists, and a number of limitations remain to be addressed. We focus on two in particular here. First, the present reliance on a purely lexical coding approach, while effective, is suboptimal: it relies on traditional psychological terms that may fail to carve the underlying neural substrates at their natural joints, fails to capitalize on redundancy across terms (e.g., ‘pain’, ‘nociception’, and ‘noxious’ overlap closely but are modeled separately), and does not allow closely related constructs to be easily distinguished (e.g., physical vs. emotional pain). Future efforts could overcome these limitations by using controlled vocabularies or ontologies for query expansion, developing extensions for conducting multi-term analyses, and extracting topic-based representations of article text (Supplementary Note).

Second, while our automated tools accurately extract coordinates from articles, they are unable to extract information about fine-grained cognitive states (e.g., different negative emotions).
Thus, the NeuroSynth framework is currently useful primarily for large-scale analyses involving broad domains, and should be viewed as a complement to, not a substitute for, manual meta-analysis approaches. We are currently working to develop improved algorithms for automatic coding of experimental contrasts, which should substantially improve the specificity of the resulting analyses. In parallel, we envision a ‘crowdsourced’ collaborative model in which multiple groups participate in validating automatically extracted data, thereby combining the best elements of both automated and manual approaches. Such efforts should further increase the specificity and predictive accuracy of the decoding model, and will hopefully lead to the development of many other applications that we have not anticipated here.

To encourage further application and development of a synthesis-oriented approach, we have publicly released most of the tools and data used in the present study via a web interface (http://neurosynth.org). We hope that cognitive neuroscientists will use, and contribute to, this new resource, with the goal of developing next-generation techniques for interpreting and synthesizing the wealth of data generated by modern neuroimaging methods.