Peter Zeidman1, Amirhossein Jafarian2, Mohamed L Seghier3, Vladimir Litvak2, Hayriye Cagnan4, Cathy J Price2, Karl J Friston2. 1. Wellcome Centre for Human Neuroimaging, 12 Queen Square, London, WC1N 3AR, UK. Electronic address: peter.zeidman@ucl.ac.uk. 2. Wellcome Centre for Human Neuroimaging, 12 Queen Square, London, WC1N 3AR, UK. 3. Cognitive Neuroimaging Unit, ECAE, Abu Dhabi, United Arab Emirates. 4. Nuffield Department of Clinical Neurosciences, Level 6, West Wing, John Radcliffe Hospital, Oxford, OX3 9DU, UK.
Abstract
This paper provides a worked example of using Dynamic Causal Modelling (DCM) and Parametric Empirical Bayes (PEB) to characterise inter-subject variability in neural circuitry (effective connectivity). It steps through an analysis in detail and provides a tutorial style explanation of the underlying theory and assumptions (i.e, priors). The analysis procedure involves specifying a hierarchical model with two or more levels. At the first level, state space models (DCMs) are used to infer the effective connectivity that best explains a subject's neuroimaging timeseries (e.g. fMRI, MEG, EEG). Subject-specific connectivity parameters are then taken to the group level, where they are modelled using a General Linear Model (GLM) that partitions between-subject variability into designed effects and additive random effects. The ensuing (Bayesian) hierarchical model conveys both the estimated connection strengths and their uncertainty (i.e., posterior covariance) from the subject to the group level; enabling hypotheses to be tested about the commonalities and differences across subjects. This approach can also finesse parameter estimation at the subject level, by using the group-level parameters as empirical priors. The preliminary first level (subject specific) DCM for fMRI analysis is covered in a companion paper. Here, we detail group-level analysis procedures that are suitable for use with data from any neuroimaging modality. This paper is accompanied by an example dataset, together with step-by-step instructions demonstrating how to reproduce the analyses.
This paper provides a worked example of using Dynamic Causal Modelling (DCM) and Parametric Empirical Bayes (PEB) to characterise inter-subject variability in neural circuitry (effective connectivity). It steps through an analysis in detail and provides a tutorial style explanation of the underlying theory and assumptions (i.e, priors). The analysis procedure involves specifying a hierarchical model with two or more levels. At the first level, state space models (DCMs) are used to infer the effective connectivity that best explains a subject's neuroimaging timeseries (e.g. fMRI, MEG, EEG). Subject-specific connectivity parameters are then taken to the group level, where they are modelled using a General Linear Model (GLM) that partitions between-subject variability into designed effects and additive random effects. The ensuing (Bayesian) hierarchical model conveys both the estimated connection strengths and their uncertainty (i.e., posterior covariance) from the subject to the group level; enabling hypotheses to be tested about the commonalities and differences across subjects. This approach can also finesse parameter estimation at the subject level, by using the group-level parameters as empirical priors. The preliminary first level (subject specific) DCM for fMRI analysis is covered in a companion paper. Here, we detail group-level analysis procedures that are suitable for use with data from any neuroimaging modality. This paper is accompanied by an example dataset, together with step-by-step instructions demonstrating how to reproduce the analyses.
Neuroimaging studies typically have a hierarchical form. At the first (within subject) level, the neural responses of individual subjects are inferred from measurements (e.g. fMRI, EEG, MEG) by specifying and fitting suitable models. The ensuing model parameters are then taken to the second (between subject) level, where the commonalities and differences across subjects are assessed. There may be further levels of the hierarchy; for example, each group of subjects may have been sampled from different populations. This naturally suggests the use of a hierarchical model that links individual subjects to the population(s) from which they were sampled. In this paper, we address the practicalities of hierarchical modelling in brain connectivity studies.Dynamic Causal Modelling (DCM) is the predominant analysis framework for inferring effective connectivity; namely, the directed causal influences among neural populations that mediate an effect of one population on another (Friston et al., 2003). In this context, a model is a set of differential equations that describe the transformations from experimental stimulation through neural activity to observed data. The model has parameters, such as the strength of particular neural connections, which are estimated from the data using a Bayesian (probabilistic) method called Variational Laplace (Friston et al., 2007). This provides a probability density over the possible values of the parameters (e.g., connection strengths), as well as a score for the quality of the model, called the log-evidence or (negative) variational free energy.Having inferred every subject's connectivity strengths, the next challenge is how to quantify the commonalities and differences across subjects. For example, one may wish to test for differences between a patient group and a control group, or investigate whether the dose of a drug alters certain connections, or whether there is a relationship between connection strengths and behavioural measures. To enable these kinds of hypotheses to be tested efficiently, DCM was recently supplemented with a hierarchical model over parameters, called the Parametric Empirical Bayes (PEB) framework (Friston et al., 2016). Readers familiar with mass-univariate analysis in neuroimaging (Statistical Parametric Mapping, SPM) will be accustomed to the ‘summary statistic’ approach, which begins with quantifying effects at the first or within-subject level, followed by a second level analysis to test whether these effects are conserved over subjects. The PEB framework enables a similar workflow for DCM, illustrated schematically in Fig. 1 in the context of the example fMRI experiment presented here. For each subject, a DCM is specified and a probability density over their parameters (e.g. connection strengths) is estimated from the data (Fig. 1, bottom). The parameters of interest are then collated and modelled at the second level using a General Linear Model (GLM), with any unexplained between-subject variability captured by a covariance component model. Therefore, individual differences in connection strengths are decomposed into hypothesised group-level effects, plus any unexplained random effects (RFX), illustrated in Fig. 1 (top). Having estimated the group-level parameters (e.g., group-average connection strengths), hypotheses can be tested by comparing the evidence for different mixtures of these parameters – a process called Bayesian model comparison. With the PEB approach, Bayesian model comparison is only introduced at the point at which a hypothesis is tested. In group studies, the hypothesis is, by definition, about between subject effects. Our focus is therefore on comparing different models of between-subject effects on (condition specific changes in) within-subject connectivity.
Fig. 1
The two-level Parameter Empirical Bayes (PEB) model used in our analysis. At the first level (bottom of the figure), a forward model (DCM) describes how neural activity causes the fMRI timeseries of each subject . The parameters from the neural part of the model are taken to the group level (upper part of the figure), where they are modelled as a linear combination of a group mean, a group difference due to laterality index, covariates of no interest (not shown) and random effects (RFX) variability. Image credits: “Brain image” by parkjisun and “CT Scan” by Vectors Market from the Noun Project.
The two-level Parameter Empirical Bayes (PEB) model used in our analysis. At the first level (bottom of the figure), a forward model (DCM) describes how neural activity causes the fMRI timeseries of each subject . The parameters from the neural part of the model are taken to the group level (upper part of the figure), where they are modelled as a linear combination of a group mean, a group difference due to laterality index, covariates of no interest (not shown) and random effects (RFX) variability. Image credits: “Brain image” by parkjisun and “CT Scan” by Vectors Market from the Noun Project.A simple alternative to using parametric empirical Bayes is to take the expected values of the estimated connectivity parameters from all subjects and run classical statistics such as ANCOVA or MANCOVA. The use of a Bayesian framework is distinct in several respects. As illustrated in Fig. 1, the estimated parameters in DCM are (multivariate normal) probability densities. Taking just the expected values – i.e. the peaks - from each subject to the group level would ignore the estimated covariance (uncertainty) of the parameters. By contrast, the PEB framework takes both the expected values and the covariance of the parameters to the group level (through conditional dependencies among random variables, detailed in Equation (4)). A useful implication is that more precise parameter estimates will have greater influence on the group-level result, so subjects with noisy data and uncertain parameter estimates will be down-weighted. Additionally, each level of the hierarchy serves as a prior on the parameters of the level below. This can, for example, enable better estimates at the single subject level, by incorporating knowledge of the experimental effects or connection strengths that are typical for the group. This could be important in settings where precise inferences about individual subjects are needed (e.g., when identifying potential biomarkers).The PEB framework may be contrasted with an established approach for group-level modelling of effective connectivity; namely, Random Effects Bayesian Model Selection (RFX BMS), introduced in Stephan et al. (2009). With RFX BMS, multiple DCMs are specified per subject, where each DCM corresponds to one hypothesis about the network architecture. These DCMs are fitted to each subject's data. A hierarchical model is then defined, which estimates the relative probability that any randomly selected person from the population would have had their data generated by each DCM. The ‘random effect’ in this context is which model generated each subject's data; i.e., some subjects may have had their data generated by model 1, others by model 2, etc. This sort of random effect on models is typically used when subjects vary in their neural architecture in terms of the presence or absence of particular connections, or to compare different model architectures (e.g. different DCM forward models). The RFX BMS method has also benefited from extensive validation (Rigoux et al., 2014).The PEB approach differs fundamentally, in that it is a hierarchical model with random effects on parameters rather than models. In other words, it assumes that all subjects have the same basic architecture, i.e. they can be explained by the same DCM forward model, but they differ in terms of the strength of connections within that model. For example, the PEB approach could be used to test for a decrease in effective connectivity due to the administration of a drug (i.e. a continuous difference in connectivity), or to test for a complete functional disconnection due to a lesion, in which case the relevant connectivity parameters would be expected to shrink towards zero. Practically, only one DCM is specified per subject and the estimated parameters are taken to the group level, where hypotheses about between subject effects are tested. With this approach, parameters may be treated as ‘fixed effects’ or ‘random effects’ (i.e., sampled from a wider population) by setting appropriate priors on between-subject variability in the hierarchical model. Unlike the RFX BMS approach, hypothesised between-subject effects can be continuous behavioural or clinical measures (e.g., a brain lateralisation index – LI – in this study), rather than the presence or absence of connections. Also, by comparing the free energy of group-level PEB models, hypotheses can be tested in terms of different mixtures of first level effects (connections) and second level effects (covariates), affording a flexibility in the questions that can be addressed. Crucially, by only inverting one ‘full’ DCM per subject in the PEB approach – rather than multiple DCMs per subject – one does not run into the problem of different DCMs falling into different local optima – a particular problem for neural mass models used with electrophysiological data.To illustrate PEB in this setting, this paper walks through the analysis of a previously published fMRI study on the lateralisation of semantic processing, using DCM and PEB. The first level DCM analysis is described in the first part of the tutorial (in the companion paper), which focused on the specifics of fMRI data analysis. Here, we address the more generic issue of how to model commonalities and differences among subjects in effective connectivity at the group level, regardless of the imaging modality. After introducing the example dataset, we describe the specification of a PEB model and demonstrate testing hypotheses using Bayesian Model Reduction (BMR) – a particularly efficient form of Bayesian model selection. We conclude by illustrating how predictive validity can be assessed using cross-validation. To help associate methods with their implementation in the SPM software (http://www.fil.ion.ucl.ac.uk/spm/software/), MATLAB function names are provided in bold text, such as (spm_dcm_fit.m), and are listed in Table 1. The example dataset and a step-by-step guide to running these analyses are available from https://github.com/pzeidman/dcm-peb-example.
Table 1
SPM software functions in the PEB framework.
Function name
Analysis level
Description
spm_dcm_fit
1st
Fits first level DCMs, arranged in a GCM array*, to the data using variational Laplace.
spm_dcm_peb_fit
1st
Iteratively re-estimates each subject's first level DCM using the group average parameter estimates as empirical priors.
spm_dcm_bmc
2nd
Performs fixed effects and random effects Bayesian model selection (RFX BMS) on DCMs (this is not required for PEB analysis).
spm_dcm_peb
≥2nd
Specifies and estimates a PEB model.
spm_dcm_peb_bmc
≥2nd
Compares an estimated PEB model against reduced models where particular combinations of parameters (relating to particular connections) are switched off. In the absence of pre-specified alternative models, an automatic search over reduced models is performed. (To compare different mixtures of covariates and pre-specified first level models, instead use spm_dcm_bmc_peb.)
spm_dcm_peb_bmc_fam
≥2nd
Performs family-wise model comparison on the output of spm_dcm_peb_bmc, to enable groups of pre-defined models to be compared.
spm_dcm_peb_review
≥2nd
Graphical user interface for reviewing a PEB model or model comparison result.
spm_dcm_loo
1st and 2nd
Performs leave-one-out cross validation to assess the predictive validity of DCM parameters.
* A GCM (Group DCM) array is a Matlab cell array of DCM structures or filenames, with one row per subject and one column per model. For most group analyses, the first column of the GCM is expected to contain each subject's ‘full’ model, which includes all parameters of interest, and subsequent columns contain reduced models with certain parameters fixed at their prior expectation (typically zero).
SPM software functions in the PEB framework.* A GCM (Group DCM) array is a Matlab cell array of DCM structures or filenames, with one row per subject and one column per model. For most group analyses, the first column of the GCM is expected to contain each subject's ‘full’ model, which includes all parameters of interest, and subsequent columns contain reduced models with certain parameters fixed at their prior expectation (typically zero).
Experimental design
We used data from a previously published fMRI experiment on language lateralisation with 60 subjects (Seghier et al., 2011). To recap, this experiment investigated how the left and right frontal lobes interact during semantic (relative to perceptual) processing of words. While language is typically thought to be left lateralised, the right hemisphere also responds in language tasks, and this experiment focussed on individual differences in the degree of lateralisation. At the within-subject level, it was a balanced 2x2 factorial design with factors: stimulus type (Words or Pictures) and task (Semantic or Perceptual matching). There was also a single factor at the between-subject level: Laterality Index (LI). This is a measure of functional brain asymmetry, derived from the number of activated voxels in each hemisphere in an initial SPM analysis. A positive LI (towards +1) indicated left hemisphere dominance, whereas a negative LI (towards −1) indicated right hemisphere dominance. The main question addressed by this experiment was: what neural circuitry underlies individual differences in LI?Analysing a group connectivity study begins by identifying which effects would be expected to occur at the within-subject level, and which would be expected at the between-subject level. Typically, within-subject effects consist of one measurement per trial, whereas between-subject effects consist of one measurement per subject. Here, the effects of words and pictures were at the within-subject level, so we included these in each subject's first level General Linear Model (GLM) and subsequent DCM, whereas LI was a between-subject measure, so we only introduced it in the second level PEB analysis. Other factors of no interest (e.g., age, gender, and handedness) were also included at the between-subject level.Each subject's DCM consisted of a network with four brain regions: left ventral frontal (lvF), left dorsal frontal (ldF), right ventral frontal (rvF) and right dorsal frontal (rdF) cortex. Each of these regions could be modulated by two experimental conditions: pictures and words. Therefore, as detailed in the next section, fitting DCMs to the fMRI data provided us with estimates of eight interesting parameters per subject: 1) modulation of region lvF by pictures, 2) ldF by pictures, 3) rvF by pictures, 4) rdF by pictures, 5) lvF by words, 6) ldF by words, 7) rvF by words and 8) rdF by words. Here, we asked which mixtures of these eight parameters best explained the commonalities across subjects, and which best explained individual differences due to brain laterality (LI). We hypothesised that the commonalities across subjects, and the LI differences, could be expressed in:the network's response to pictures and/or words,the network's response in dorsal and/or ventral regions,the network's response in left and/or right hemisphere regions.These hypotheses formed independent factors in our analysis. We will detail the specification and comparison of group level models that varied across these factors (Fig. 2), as well as illustrating the comparison of group level models in a less constrained manner, using an automatic search of the hypothesis or model space. First, we will briefly reprise the first level DCM analysis, which quantified the effects of pictures and words on each connection for each subject.
Fig. 2
Model space. Each picture is a network of four frontal brain regions. The grey circles in the top row of each picture are left dorsal (ldF) and right dorsal (rdF) frontal cortex, and the circles in the bottom row of each picture are left ventral (lvF) and right ventral (rvF) frontal cortex. The red curved lines indicate inhibitory self-connections that were allowed to be modulated by the task. Candidate PEB models varied according to the three factors shown. Additionally, a ‘null’ model (not shown) was specified with no modulation to serve as a baseline. This gave rise to models.
Model space. Each picture is a network of four frontal brain regions. The grey circles in the top row of each picture are left dorsal (ldF) and right dorsal (rdF) frontal cortex, and the circles in the bottom row of each picture are left ventral (lvF) and right ventral (rvF) frontal cortex. The red curved lines indicate inhibitory self-connections that were allowed to be modulated by the task. Candidate PEB models varied according to the three factors shown. Additionally, a ‘null’ model (not shown) was specified with no modulation to serve as a baseline. This gave rise to models.
First level analysis: DCM
Our connectivity analysis began by specifying one DCM for fMRI model for each subject (Fig. 1, bottom). Here, we focused on the subset of parameters from these models that quantified the change in each brain region's inhibitory self-connection due to words and pictures - these are the leading diagonal of matrix in the DCM for fMRI neural model (see Discussion for consideration of self-connections versus between-region connections). These parameters had values greater than zero if there was an increase in self-inhibition due to the stimuli, and so a reduction in sensitivity to inputs from the rest of the network. They had values less than zero if there was a reduction in self-inhibition (i.e., disinhibition) due to the stimuli, thus increasing sensitivity to inputs from the network.We shall write these parameters of interest as for subject , where the superscript number 1 indicates the first level of analysis. Notation details are provided in the footnote.1We then inverted (estimated) each subject's DCM. This involves finding the posterior density over parameters that maximises a score for the model, called the negative variational free energy . This approximates (a lower bound on) the log model evidence , which is the log of the probability of having observed the data given the model . In short, Bayesian model inversion provides an estimate of log evidence and a multivariate normal probability density over the parameters:where is a vector of the expected values of the parameters (e.g., the connection strengths) and is their posterior covariance matrix. Elements on the leading diagonal of are the posterior variance (uncertainty) of each parameter and the off-diagonal elements are their corresponding covariance.The estimation scheme, called variational Laplace (Friston et al., 2007), tunes the parameters (e.g. connection strengths) with the objective of making the predicted neuroimaging timeseries match the observed timeseries as closely as possible (maximising accuracy), while penalizing movement of the parameters away from their prior values (minimizing complexity). The ideal compromise between accuracy and complexity maximises the negative variational free energy . However, this process inherently involves uncertainty: multiple settings of the parameters could provide similarly good explanations for the data. Furthermore, there will typically be some collinearity between the parameters, meaning their effect on the predicted timeseries cannot be disambiguated. DCM quantifies this uncertainty by returning a full probability density over the parameters – both their expected values and covariance - as defined in Equation (1).
Second level analysis: Parametric Empirical Bayes (PEB)
Having inverted each subject's DCM, we ran a PEB analysis (spm_dcm_peb.m). The procedure begins by collating the estimated parameters of interest from all subjects . This consists of two quantities from each subject: the expected values of the parameters and their covariance matrices . Typically, only parameters relevant to the hypotheses being tested are taken to the group level, to improve statistical efficiency. For example, a priori, we expected laterality to effect differences in neuronal coupling, but we did not expect it to effect haemodynamics. This means including haemodynamic parameters in the PEB analysis could confound evidence for neural effects, especially in the presence of collinearity or conditional dependence. Effectively, parameters not included in the group level analysis (e.g. related to haemodynamics) are treated as fixed effects – meaning they are not treated as if they are sampled from a wider population – and are therefore released from the constraint of being concentrated around some group mean. Before detailing how we configured the PEB model for this experiment, we first offer a brief overview of the underlying theory.
Theory
The PEB framework specifies a hierarchical statistical model of connectivity parameters:Starting with the last line of Equation (2), the observed neuroimaging data for subject are modelled as having been generated by a DCM (or by any nonlinear model) – denoted by – with parameters . Any known uninteresting effects, such as the mean of the signal, are modelled by a GLM with design matrix and parameters . The observation noise is modelled as residuals . The second line of Equation (2) is the second level of the PEB model. It says that the vector of all subjects’ neural parameters , ordered by subject and then by parameter, can be described by a GLM with design matrix and group-level parameters .The group-level design matrix encodes the hypothesised sources of variation across subjects, with one column for each experimental effect (called regressors or covariates). For every column in the design matrix there is a corresponding entry in parameter vector which is estimated from the data. Each of these parameters is the group-level effect of one covariate (e.g. LI, handedness or gender) on one connection. Any differences between subjects not captured by this model are defined as zero-mean (I.I.D.) additive noise . These are the random effects (RFX). In other words, from the point of view of a generative model, to generate a single subject's data, one would first sample a parameter vector from the prior distribution (level 3), and add a random effect for this subject (level 2). Finally, one would generate data using the DCM and add some observation noise (level 1). More levels could be added to the hierarchy; for example, to model nested groups of subjects. However, Equation (2) stops at two levels and specifies that parameters would have fixed prior expected value and residuals , expressed in the first line of Equation (2). In summary, this hierarchical model partitions the variability in connectivity parameters across subjects into hypothesized group-level effects and uninteresting between-subject variability .To turn this into a statistical (i.e. probabilistic) model, we first define the probability density over the error terms:Where the covariance matrices , and are I.I.D. and define the zero-mean additive noise at each level of the hierarchy. The model can then be written in terms of probability densities:Equation (4) specifies the joint probability of all three quantities of interest; namely, data from all subjects , the DCM parameters of all subjects and the group-level parameters . The last equality states that the timeseries for subject is generated by their DCM, with covariance (uncertainty) determined by their observation noise. The penultimate equality defines the transition from single subjects to the group level. The second equality defines the priors on the group-level parameters. Each level acts as a constraint on the level below – meaning that group level parameters constrain the estimates of the individual subjects.Configuring the PEB model for an experiment only requires specifying the second level design matrix , detailed in the next section. The software provides default prior probability densities for the between-subject variability and the second level neural parameters In brief, the prior variances used for inverting subject-specific DCMs are also used as the prior variance around the group mean, while random effects at the between subject level are set to 1/16 of this variance. This assumes that people only acquire data from multiple subjects when the random effects from subject to subject are less than (i.e., have a quarter of the standard deviation of) the range of values that one expects a priori. For details, please see Appendix 1: PEB connectivity priors specification and Appendix 2: PEB random effects specification.
PEB: design matrix specification
The group-level design matrix defines the hypotheses about between-subject variability. It is specified in two parts: between-subject effects and within-subject effects . The between-subject design matrix (Fig. 3, left) encodes covariates or regressors in each column, with one row per subject. The software implementation in SPM expects the first column to be ones (to model the commonalities across subjects; i.e., constant or group mean) and the second column is usually the effect of interest. Other covariates are placed in the subsequent columns. For this experiment, we included covariates encoding the group mean, LI score, handedness, gender and age. (We computed the LI score by taking the first principal component of the four different measures used in Seghier et al. (2011).)
Fig. 3
The group-level design matrix. Left: the between-subjects design matrix, with regressors (covariates) modelling the group mean, laterality index, handedness, gender and age. Middle: the within-subjects design matrix, where the diagonal encodes which DCM parameters receive group-level effects. This was set to the identity matrix to include all between-subject effects on all DCM parameters. Right: The resulting design matrix, computed according to Equation (5). Each row corresponds to one DCM parameter from one subject and each column corresponds to one group-level effect (e.g. gender) on one DCM parameter. For display purposes only, the regressors were z-scored and thresholded to produce colours in a consistent range.
The group-level design matrix. Left: the between-subjects design matrix, with regressors (covariates) modelling the group mean, laterality index, handedness, gender and age. Middle: the within-subjects design matrix, where the diagonal encodes which DCM parameters receive group-level effects. This was set to the identity matrix to include all between-subject effects on all DCM parameters. Right: The resulting design matrix, computed according to Equation (5). Each row corresponds to one DCM parameter from one subject and each column corresponds to one group-level effect (e.g. gender) on one DCM parameter. For display purposes only, the regressors were z-scored and thresholded to produce colours in a consistent range.An important decision – when preparing the between-subjects design matrix – is whether to mean-centre regressors which follow the first constant term. If they are mean-centred, then the first regressor (the column of ones) would correspond to the mean experiment-related changes in connectivity over subjects, and the between-subject effects would add to or subtract from this. If they were not mean-centred, then the first regressor would correspond to the baseline or intercept of the model. Here, we did mean-centre, endowing the first regressor with the interpretation of the group mean effective connectivity.The within-subjects design matrix (Fig. 3, middle) defines which DCM connectivity parameters can receive between-subject effects (via setting each element on the leading diagonal to 1 or 0). Here, these were ordered into the effects of picture stimuli on each of the four regions, followed by the effects of words on each of the four regions. We set this to be the identity matrix (ones on the leading diagonal, zeros elsewhere), enabling the between-subject effects to influence all connections. The two parts of the design matrix are combined by the software (spm_dcm_peb.m) according to:Where is the Kronecker tensor product, which replicates matrix for each element of . This is a subtle aspect of PEB that makes things slightly more complicated than simply modelling experimental effects on some response variable. Here, we have to specify what the experimental (between-subject) effects actually act upon; namely, which (within-subject) parameters. For example, the effect of laterality may be expressed in all connectivity parameters – or just one parameter. By default, the software assumes that every experimental (between-subject) effect can be expressed on every (within-subject) parameter.A part of the resulting design matrix is shown in Fig. 3 (right). The first eight columns encode the group mean of each connectivity parameter. Columns 9–16 encode the effect of LI on each connectivity parameter, and the remaining columns encode the effects of handedness, gender and age. The rows are ordered subject-wise, corresponding to DCM parameters of interest from subject one, then from subject two, etc.
PEB: model estimation
Having specified the design matrix we inverted the PEB model (spm_dcm_peb.m), which furnished two useful quantities: the estimated group-level parameters and the group-level free energy. The parameter estimates correspond to a multivariate Gaussian density:Where vector are the estimated betas or weights on the covariates in the design matrix and the covariance matrix specifies uncertainty over these weights. (This should not be confused with in Equation (3), which specifies the between-subject variability.)Fig. 4 (left) shows the estimated second level parameters pertaining to the effects of interest; namely, the group average and the effect of LI (spm_dcm_peb_review.m). The heights of the bars are in Equation (6) and the error bars were computed from the leading diagonal of covariance matrix . Parameters 1–8 are the commonalities (group average connectivity parameters) across all subjects and parameters 9–16 are the differences in connectivity due to LI score. Each parameter quantifies the change in an inhibitory self-connection due to words or pictures.
Fig. 4
Parameters of the group-level Bayesian General Linear Model (GLM). Left: the posterior parameter estimates, corresponding to the first 16 regressors in design matrix . These represent the commonalities (average) across subjects (GLM parameters 1–8) and the difference in connectivity due to Laterality Index (LI, GLM parameters 9–16). The units are the same as the parameters in the underlying DCMs; i.e., unitless log scaling parameters that scale the default self-connection strength of −0.5Hz. Dotted lines indicate parameters relating to the modulation of region rdF by words, discussed in the text. Right: the estimated between-subject variance of each connectivity parameter, after accounting for known effects. Parameters have the same order in both plots: 1 = Modulation of lvF by pictures, 2 = ldF by pictures 3 = rvF by pictures, 4 = rdF by pictures, 5 = lvF by words, 6 = ldF by words, 7 = rvF by words, 8 = rdF by words.
Parameters of the group-level Bayesian General Linear Model (GLM). Left: the posterior parameter estimates, corresponding to the first 16 regressors in design matrix . These represent the commonalities (average) across subjects (GLM parameters 1–8) and the difference in connectivity due to Laterality Index (LI, GLM parameters 9–16). The units are the same as the parameters in the underlying DCMs; i.e., unitless log scaling parameters that scale the default self-connection strength of −0.5Hz. Dotted lines indicate parameters relating to the modulation of region rdF by words, discussed in the text. Right: the estimated between-subject variance of each connectivity parameter, after accounting for known effects. Parameters have the same order in both plots: 1 = Modulation of lvF by pictures, 2 = ldF by pictures 3 = rvF by pictures, 4 = rdF by pictures, 5 = lvF by words, 6 = ldF by words, 7 = rvF by words, 8 = rdF by words.To focus on a specific example – which will be relevant to the analyses which follow – consider the DCM parameter quantifying the effect of words on the inhibitory self-connection of region rdF. This was represented by two PEB GLM parameters, indicated in Fig. 4 (left) with dashed lines. Parameter number eight is the average effect of words on rdF across subjects and parameter number 16 is the difference in this effect between subjects due to LI. From this plot, we can see that words increased region rdF's inhibitory self-connection. The effect size was 0.41 in the units of the underlying DCM. Parameter number 16 shows that the effect of words on rdF also correlated positively with LI score. The effect size was 1.79, meaning that the contribution of LI was to increase the rdF self-connection by 1.79 times the LI score. Note that these are just unstandardized effects sizes; to formally test hypotheses about where laterality effects are manifest, model comparisons are required, which we will return to shortly. Fig. 4 (right) shows the estimated between-subject variability (i.e., random effects) of the eight DCM connectivity parameters, corresponding to the leading diagonal of matrix in Equation (3).Inversion of the full PEB model also returned a score for its quality – the (negative) variational free energy – which approximates the log model evidence:This is the log of the probability of observing the neuroimaging data (from all subjects) given the entire hierarchical model . Note there is a distinction between the free energy at the first level (relating to individual subjects' DCMs) and the second level (relating to the entire group and our model of between subject effects). At the first level, the free energy of a DCM is its accuracy minus its complexity; where complexity is the difference (KL-divergence) between the estimated parameters and the priors. At the second level, the free energy is the sum of all subjects’ DCMs accuracies, minus the complexity induced by fitting the DCMs and the second-level GLM (see Equation (10) of Friston et al. (2016)). By comparing the free energy of PEB models – with different sets of parameters switched on and off2 – and selecting the PEB models with the greatest free energy, one can find the optimal explanation for the dataset as a whole.
Inference: comparing reduced PEB models
Having estimated the parameters of the group-level GLM, we next tested hypotheses, specifically to find out whether there was an effect of laterality and, if so, where it was expressed. In the PEB framework, this is done by comparing the evidence for reduced GLMs that have certain combinations of parameters switched off (fixed at their prior expectation of zero). Comparing full and reduced models in this way is conceptually similar to performing an F-test in classical statistics. In practice, the evidence and parameters of the reduced models can be derived analytically from the full model in a matter of milliseconds, using a procedure referred to as Bayesian Model Reduction (BMR, see Appendix 3: Bayesian Model Comparison and Reduction).We defined a set of candidate models to identify the best explanation for the commonalities across subjects and the best explanation for the LI differences, in terms of the three experimental factors defined above (Fig. 2). Each factor had three levels, plus a null hypothesis, necessitating candidate models to evaluate all combinations of the factors, shown schematically in Fig. 5A. This is referred to as a factorial model space, because each of the three factors can be imagined as an axis on a graph, with each of the models situated somewhere in the space formed by those axes.
Fig. 5
Comparison of PEB models in a pre-defined model space. A. Connectivity parameters switched on (white) and off (black) in each model. See the legend of Fig. 4 for the identity of each parameter. B. Joint probability of all candidate models. The axes list the 28 candidate models (combinations of connections) in terms of commonalities across subjects and differences between subjects due to Laterality Index (LI). In other words, the PEB model in row column had parameters relating to commonalities across subjects set according to model , and parameters relating to LI set according to model . The best model was number 4 for the commonalities and 15 for laterality differences, with 56% posterior probability. C. The same result shown in part B, summed over the columns and re-normalized, to give the posterior probability for the commonalities across subjects. D. The same result shown in panel B, summed over the rows and re-normalized, to give the posterior probability for the models of laterality index (LI). E-F. Schematic diagrams of models 4 and 15. Curved lines are self-connections modulated by words and/or pictures. G. Bayesian Model Average (BMA) of the parameters over all models. H. The BMA thresholded at posterior probability >95% for clarity. The schematic illustrates the parameters which survived thresholding. P=Pictures, W=Words, LI = Laterality Index.
Comparison of PEB models in a pre-defined model space. A. Connectivity parameters switched on (white) and off (black) in each model. See the legend of Fig. 4 for the identity of each parameter. B. Joint probability of all candidate models. The axes list the 28 candidate models (combinations of connections) in terms of commonalities across subjects and differences between subjects due to Laterality Index (LI). In other words, the PEB model in row column had parameters relating to commonalities across subjects set according to model , and parameters relating to LI set according to model . The best model was number 4 for the commonalities and 15 for laterality differences, with 56% posterior probability. C. The same result shown in part B, summed over the columns and re-normalized, to give the posterior probability for the commonalities across subjects. D. The same result shown in panel B, summed over the rows and re-normalized, to give the posterior probability for the models of laterality index (LI). E-F. Schematic diagrams of models 4 and 15. Curved lines are self-connections modulated by words and/or pictures. G. Bayesian Model Average (BMA) of the parameters over all models. H. The BMA thresholded at posterior probability >95% for clarity. The schematic illustrates the parameters which survived thresholding. P=Pictures, W=Words, LI = Laterality Index.We asked which of the 28 models (patterns of switched on/switched off parameters) was the best explanation for the commonalities across subjects and which was the best explanation for the LI differences. This meant comparing the evidence for a total of reduced GLMs (spm_dcm_peb_bmc.m). Fig. 5B shows the resulting matrix of posterior probabilities, where element is the probability of the GLM, which had parameters relating to commonalities across subjects configured according to model , and parameters relating to laterality (LI) configured according to model . It is clear from Fig. 5B that one GLM stood out (56% posterior probability). Its commonalities parameters were deployed according to model 4 and its LI parameters were deployed according to model 15. This result is shown more clearly by marginalising (summing) over the columns and rows and re-normalising, to give the probability for each model of the commonalities (Fig. 5C) and LI differences (Fig. 5D). Model 4, which was the best explanation for the commonalities across subjects, had modulation by words and pictures in dorsal regions (Fig. 5E). Model 15, which was the best explanation for the differences across subjects due to LI, had words specifically modulating the self-connection of rdF (Fig. 5F).Despite the relatively strong probability for these two models, no single model could be described as an overall winner (>95% probability), which is unsurprising given the large number of models. To summarise the parameters across all models, we computed the Bayesian Model Average (BMA). This is the average of the parameters from different models, weighted by the models’ posterior probabilities (Hoeting et al., 1999; Penny et al., 2006). The averaged parameters are shown in Fig. 5G and the values are listed in Table 3. To focus our results on just those parameters that had the greatest evidence of being non-zero, we thresholded the BMA to only include parameters that had a 95% posterior probability of being present vs absent (thresholding based on the free energy is detailed in Appendix 3). As illustrated in Fig. 5H, in common across subjects, pictures increased self-inhibition in region ldF and dis-inhibited rdF; shifting the balance of activation towards the right hemisphere with no differences across subjects due to LI score. By contrast, words increased self-inhibition in rdF, which was further increased in subjects with more positive LI scores; i.e. with greater left-lateralisation of brain responses. We may conclude, therefore, that individual differences in LI score could be explained by differences in the gain or excitability of region rdF, when presented with words.
Table 3
BMA of PEB parameters: Expected values of estimated commonalities and LI differences.
Commonalities
Laterality Index
Region
Pictures
Words
Pictures
Words
lvF
−0.02
0.02
0
0
ldF
0.40
0.06
−0.001
−0.02
rvF
−0.07
0
0
0.01
rdF
−0.23
0.38
−0.02
1.74
*‘Pictures’ and ‘Words’ are the log of scaling parameters that multiply up or down the default self-connection (−0.5Hz). Positive values indicate greater self-inhibition than the default and negative values indicate less self-inhibition than the default.
Symbols used in this paper.BMA of PEB parameters: Expected values of estimated commonalities and LI differences.*‘Pictures’ and ‘Words’ are the log of scaling parameters that multiply up or down the default self-connection (−0.5Hz). Positive values indicate greater self-inhibition than the default and negative values indicate less self-inhibition than the default.
Inference: family analysis
In the model comparison above, there were three factors with three levels each, plus a ‘null’ model, giving models of the commonalities across subjects and 28 models of the LI differences, thus candidate models in total. Rather than conclude the analysis by plotting 784 values, it is clearer to report a single statistic for each of the three factors. To do this, we grouped the 784 models into ‘families’ according to each of the experimental factors and compared the pooled evidence of each family (Penny et al., 2010). The results (computed using spm_dcm_peb_bmc_fam.m) are shown in Fig. 6 and are described in the sections which follow.
Fig. 6
Family-wise analyses for each of three experimental factors. Each plot is a separate analysis with the same 784 PEB models grouped into different families. For each plot, element represents the pooled probability for models in which the commonalities parameters were set according to family and the Laterality Index (LI) parameters were set according to family . For example, for Factor 1, element is the pooled probability for all models with commonalities parameters set according to family 1 and Laterality Index parameters set according to family 2. The families numbered 1–4 for each analysis were as follows. Factor 1 (task): 1 = pictures and words, 2 = words, 3 = pictures, 4 = none. Factor 2 (dorsal/ventral): 1 = dorsal and ventral, 2 = dorsal, 3 = ventral, 4 = none. Factor 3 (left/right): 1 = left and right, 2 = left, 3 = right, 4 = none. See Fig. 2 for an illustration of each factor.
Family-wise analyses for each of three experimental factors. Each plot is a separate analysis with the same 784 PEB models grouped into different families. For each plot, element represents the pooled probability for models in which the commonalities parameters were set according to family and the Laterality Index (LI) parameters were set according to family . For example, for Factor 1, element is the pooled probability for all models with commonalities parameters set according to family 1 and Laterality Index parameters set according to family 2. The families numbered 1–4 for each analysis were as follows. Factor 1 (task): 1 = pictures and words, 2 = words, 3 = pictures, 4 = none. Factor 2 (dorsal/ventral): 1 = dorsal and ventral, 2 = dorsal, 3 = ventral, 4 = none. Factor 3 (left/right): 1 = left and right, 2 = left, 3 = right, 4 = none. See Fig. 2 for an illustration of each factor.
Factor 1: whether the network was modulated by words and/or pictures
We grouped the 784 GLMs into four families according to the Task factor (see Fig. 2), plus the null model with no modulation. To recap, the families differed according to which experimental conditions modulated the self-connections: 1) pictures and words, 2) words, 3) pictures and 4) none. The evidence for the models within each family was pooled, and the families were compared under the prior that each family was equally likely. There was 86% probability that family 1 (pictures and words both modulating) was the best explanation for the commonalities and family 2 (only words) was the best explanation for LI differences. Therefore, we found that the frontal network was modulated by words and pictures, but LI differences were specifically explained by the response to words.
Factor 2: whether the stimuli modulated dorsal and/or ventral regions
We next grouped the same 784 models into families defined by factor 2 (dorsal vs ventral). The families were 1) dorsal and ventral, 2) dorsal, 3) ventral and 4) none. The probability that family 2 (dorsal) best explained both the commonalties and the laterality differences was 69%. The second best family, with 20% probability, was family 1 (both dorsal and ventral regions) for the commonalities. We can conclude that task effects were mainly expressed in the dorsal regions, and the effect of LI was specific to dorsal regions.
Factor 3: whether the stimuli modulated left and/or right hemisphere
Finally, we organised the same models according to factor 3 (left vs right). The families were 1) left and right, 2) left, 3) right and 4) none. There was 84% probability for family 1 (both left and right regions), whereas family 3 (right hemisphere) was the best explanation for differences due to LI. Therefore, differences in hemispheric asymmetry across subjects could be explained specifically by differences in the right hemisphere.
Interim summary: hypothesis-based analyses
To summarise the analyses so far, we first estimated a single DCM for each subject with all DCM parameters of interest switched on. We then formed a ‘full’ second level GLM with parameters representing the group average connection strengths (commonalities), the differences between subjects due to LI, and covariates of no interest (handedness, age and gender). We then compared 784 reduced versions of this GLM with different combinations of parameters – representing the commonalities and laterality differences across subjects – switched off. This provided free energy scores quantifying the evidence for each candidate model. By grouping the reduced GLMs into families – and pooling their evidence – we were able to quantify the effect of between-subject factors on within-subject parameters (i.e., condition specific changes in connectivity).
Inference: search over reduced PEB models
In the analyses above, large numbers of pre-defined models were compared (in seconds) using a technology called Bayesian Model Reduction (BMR). If there were no strong hypotheses about between-subject effects on connectivity, and the objective was simply to ‘prune’ any GLM parameters that did not contribute to the model evidence, BMR could be used to automatically search over reduced models. This more exploratory approach is conducted under the assumption that all reduced models are equally probable a priori, and thus the ‘full’ model only contains parameters that are biologically plausible.For this experiment, the GLM contained 40 parameters in total (five between subject effects on eight DCM parameters – Fig. 3, right) and the objective was to find the best reduced GLM with certain parameters switched off. Evaluating all possible reduced GLMs would have required evaluating models, which is not possible in a reasonable amount of time. Therefore, to reduce the computational demand, an automatic search procedure is available (spm_dcm_peb_bmc.m), originally referred to as ‘post-hoc’ DCM analysis (Friston and Penny, 2011; Rosa et al., 2012). This procedure compares the evidence for reduced models using Bayesian Model Reduction, iteratively discarding parameters that do not contribute to model evidence. The iterative procedure stops when discarding any parameter starts to decrease model evidence. Technically, this is known as a greedy search and allows thousands of models to be compared quickly and efficiently.A Bayesian Model Average (BMA) is then calculated over the 256 models from the final iteration of the greedy search, and the result for our data is shown in Fig. 7. The middle and right plots show the parameters with a posterior probability of being non-zero greater than 0.95. It is clear that the estimated response to words and the difference in the response to words due to LI were remarkably similar to the analysis above; however, in this illustrative analysis we did not have to declare any specific a priori models. The estimated response to pictures was slightly different, modulating region rvF rather than rdF. If this difference was of experimental interest, these two explanations could be compared in terms of their free energy and the relative probability of each explanation quantified.
Fig. 7
BMA performed on the final 256 models of an automatic parameter search. Left: Posterior parameter estimates based on the BMA. Only parameters relating to commonalities and Laterality Index are shown. Middle: Thresholded BMA, where parameters with probability greater than 95% of being present vs. absent were retained. Each parameter's individual probability was assessed by comparing the evidence for all models (from the final iteration of a greedy search) which had the parameter switched on, versus all models which had the parameter switched off. Right: The schematic shows effects of Pictures (P), Words (W) and Laterality Index (LI).
BMA performed on the final 256 models of an automatic parameter search. Left: Posterior parameter estimates based on the BMA. Only parameters relating to commonalities and Laterality Index are shown. Middle: Thresholded BMA, where parameters with probability greater than 95% of being present vs. absent were retained. Each parameter's individual probability was assessed by comparing the evidence for all models (from the final iteration of a greedy search) which had the parameter switched on, versus all models which had the parameter switched off. Right: The schematic shows effects of Pictures (P), Words (W) and Laterality Index (LI).
Inference: network structure
Although our hypotheses were about changes in effective connectivity due to words and pictures (parameter matrix from the DCM neural model), it is useful to present these results in the context of the average effective connectivity across experimental conditions (parameter matrix ). This is because the parameters encode the deviation from the average connection strengths for each experimental condition. To bring these parameters to the group level, we specified and estimated a separate PEB model for the average connectivity , and performed an automatic search over reduced models. While these analyses could have been done in the same PEB model, including a large number of parameters can cause a dilution of evidence effect, as well as inducing a much larger search space. Examining each set of parameters separately can therefore help to keep the analysis focused and tractable. The parameter estimates are listed in Table 4. There were two non-trivial effects of Laterality Index: lvF- > ldF (−0.35Hz) and rdF- > ldF (0.45Hz). Thus, in subjects with greater left hemisphere dominance as scored by a more positive LI, region ldF was more inhibited by lvF and more excited by rdF, independent of whether the stimuli were pictures or words.
Table 4
BMA of PEB parameters: Average connectivity across experimental conditions (A-matrix).
Source
Target
Commonalities
Laterality index
lvF
lvF
−0.20
0
lvF
ldF
0.16
−0.35
lvF
rvF
0.08
0
ldF
lvF
0.08
0
ldF
ldF
−0.18
0
ldF
rdF
0.06
0
rvF
lvF
0.10
0
rvF
rvF
−0.30
0
rvF
rdF
0.08
0
rdF
ldF
0.18
0.45
rdF
rvF
0
0
rdF
rdF
−0.32
0
* Between-region connections are in units of Hz. Self-connections, where the source and target are the same, are the log of scaling parameters that multiply up or down the default value −0.5Hz.
BMA of PEB parameters: Average connectivity across experimental conditions (A-matrix).* Between-region connections are in units of Hz. Self-connections, where the source and target are the same, are the log of scaling parameters that multiply up or down the default value −0.5Hz.
Prediction: cross-validation
In the previous steps, we identified a laterality effect on the response of region rdF to words. Our final question was whether the size of this effect was large enough to be interesting; i.e., whether we could predict a subject's LI from their neural response. Questions of this sort – assessing predictive validity – are particularly important for studies determining the clinical significance of model parameters. To address this we used leave-one-out cross validation (spm_dcm_loo.m). A PEB model was fitted to all but one subject, and covariates for the left-out subject were predicted. This was repeated with each subject left out and the accuracy of the prediction was recorded. For the technical details of this procedure see Appendix 4: Leave-one out cross-validation with PEB.We assessed whether we could predict subjects' LI based on their modulation of rdF by words, as well as their known covariates - handedness, age and gender. The red line in Fig. 8 (left) shows the predicted (mean-centred) LI score for each left out subject. The shaded area is the 90% credible interval of the prediction and the dashed black line is the actual mean-centred LI. In this example, 44 out of 60 of subjects had their true LI within the estimated 90% credible interval (shaded area). Fig. 8 (right) plots the out-of-samples correlation of the actual LI against the (expected value of) the predicted LI for each left-out subject. The Pearson's correlation coefficient was 0.34, p = 0.004. Therefore, we can conclude that the effect size estimated by DCM was sufficiently large to predict the left-out subjects' LI with performance above chance, although there was still a lot of variability to be explained.
Fig. 8
Leave-one-out cross validation. Left: the out-of-samples estimate of (mean-centred) Laterality Index for each subject (red line) with 90% confidence interval (shaded area). The black dashed line is the actual group effect. Right: The actual subject effect plotted against the expected value of the estimated subject effect.
Leave-one-out cross validation. Left: the out-of-samples estimate of (mean-centred) Laterality Index for each subject (red line) with 90% confidence interval (shaded area). The black dashed line is the actual group effect. Right: The actual subject effect plotted against the expected value of the estimated subject effect.
Analysis summary
The above analyses characterised the effective connectivity underlying individual differences in lateralisation of neural responses, in the context of a semantic processing task. We asked whether the commonalities and LI differences across subjects were expressed in condition specific changes in connectivity due to picture and/or word stimuli, in dorsal and/or ventral regions, and in left and/or right hemispheres (Fig. 2). Comparisons of PEB models showed that in common across subjects, the frontal network was modulated by pictures and words, in dorsal regions of both hemispheres (Fig. 6). The parameters revealed a double dissociation between picture and word stimuli: pictures inhibited responses in region ldF, whereas words inhibited responses in rdF (Fig. 5H). Differences between subjects – due to LI score – were specifically expressed in the response to words in the dorsal region of the right hemisphere (Figs. 5H and 6). This effect was positive, meaning that having a higher LI score (being more left lateralized) was associated with greater inhibition of region rdF. Cross-validation showed that this effect was sufficiently large to predict left-out subjects’ LI.
Discussion
We have illustrated an empirical (i.e. hierarchical) Bayesian procedure for conducting group connectivity analyses, using DCM and PEB. This begins with a first level analysis, modelling within-subject effects using neural models (DCMs). The connectivity parameters of interest are then taken to a second level analysis, where they are modelled using a Bayesian GLM. Hypotheses are tested by comparing the evidence for the full GLM against reduced GLMs, where certain parameters are switched off (i.e., fixed at zero). Finally, the predictive validity of the model parameters is assessed using cross-validation.The first level analysis (detailed in the companion paper) involves designing an architecture for each subject's model, i.e. deciding which brain regions to include, selecting which connections should be informed by the data, and selecting which connections can be modulated by the experimental manipulations. The second level analysis involves deciding which parameters to treat as random effects, i.e. sampled from the population, and including these in the PEB model. Typically, only the parameters that are relevant to the hypotheses will be included in the group analysis, and for task-based DCM studies these will primarily be matrix of the DCM neural model, quantifying the modulations of connections by each experimental condition. In this example study, we only allowed modulations of the self-connections on each region, and it was these parameters that we took to the group level. These self-connection parameters provide gain control – i.e. they determine the sensitivity of each region to input coming from the rest of the network. Limiting task effects to these parameters can facilitate a more identifiable model (i.e. eliciting parameters that can be distinguished from one another with higher confidence), thereby maximising the chances of identifying task-related effects. Furthermore, the self-connections lend themselves to a straightforward biological interpretation, namely modifying the inhibitory-excitatory balance within each region (Bastos et al., 2012). Nevertheless, the same analysis pipeline could be applied if the interesting hypotheses were about between-region connections, or if they were about the average or baseline connectivity rather than task effects (parameter matrix of the neural model). If in doubt, the Bayesian model comparison procedure described here could be used to decide which parameters to modulate by the task (by comparing a full PEB model against reduced models), as well as which parameters to treat as random effects (by comparing PEB models with different within-subject design matrices).Hierarchical Bayesian modelling is an established approach in the context of functional imaging data (Woolrich et al., 2004; Penny et al., 2005). The PEB framework (Friston et al., 2016) introduced hierarchical modelling of the connectivity parameters of DCMs, thereby linking individuals' connection strengths to the group(s) from which they were sampled. This has so far proved useful for modelling fMRI data (Dijkstra et al., 2017; Park et al., 2017; Jung et al., 2018; Zhou et al., 2018a, 2018b) and MEG/EEG/LFP data (Pinotsis et al., 2016; Fardo et al., 2017; Papadopoulou et al., 2017; Penny et al., 2018). The PEB framework has been validated in terms of reproducibility in the context of ERP data (Litvak et al., 2015) and – going forward – will benefit from further validation to assess the suitability of certain priors in novel applications. While this study has demonstrated the main features of the framework, there are others we did not cover for brevity. For instance, to pull subjects’ parameter estimates out of local minima, the first level model estimation can be re-initialized multiple times, using the group-level posteriors from the PEB model as empirical priors for each subject (Friston et al., 2015). This may be particularly useful where highly non-linear DCMs are in play. Furthermore, PEB analyses do not need to be limited to two levels. In a recent study investigating changes in the motor system following thalamic surgery for tremor, resting state fMRI data from multiple time points were modelled using separate DCMs (Park et al., 2017). The DCM parameters from each time point were collated into a subject-specific PEB model, forming the second level of the hierarchy. The second level parameters were then taken up to the group level, modelled using a third-level GLM, to capture commonalities and differences among subjects. By using this PEB-of-PEBs approach, the deep hierarchies implicit in certain kinds of experimental designs can be modelled explicitly.Here, we presented two methods for testing hypotheses at the group level – either designing a set of models/families explicitly, or automatically searching over potentially thousands of reduced models to ‘prune’ parameters that did not contribute to model evidence. While the two approaches gave similar results on the dataset presented here, this is not guaranteed to be the case in general, and there are disadvantages to using an automatic (greedy) search. The large number of possible models means the search cannot usually be exhaustive, and so a better solution may be found by a well thought-out set of models. Furthermore, an automatic model search introduces the temptation to construct post-hoc explanations for the surviving parameters. The number of possible reduced models might emphasise the utility of a priori hypotheses when using Bayesian model reduction.Specifying the PEB model in this study involved setting the between-subjects design matrix a priori. A complementary application of hierarchical modelling would be to discover group membership; i.e., to perform unsupervised classification or clustering of subjects based on their connectivity. A recently developed approach for this combines DCM with finite mixture models in a single hierarchical model (Raman et al., 2016). A complementary approach is possible with the PEB scheme, by searching over candidate group assignments using Bayesian Model Reduction (BMR). It would be interesting to compare this against sampling approaches, which are generally slower but constitute the gold standard in terms of robustness.As touched on in the introduction, the analysis workflow presented here differs from that of the random effects Bayesian model selection (RFX BMS) method of group analysis (Stephan et al., 2009). With RFX BMS, multiple DCMs are specified and estimated per subject, and their log model evidence (free energy) is used as the basis for computing a probability for each model. There is no restriction on the form of the models that can be compared, provided that a free energy can be computed. With the PEB scheme, only one DCM is specified and estimated per subject, and hypotheses are tested about the commonalities and differences in the parameters (connection strengths) across subjects. The PEB workflow was not designed for comparing different forms of DCM forward model per subject, e.g. to compare a linear versus nonlinear model of neural activity. (A workaround could found, however, by defining separate PEB models for each type of DCM and comparing their evidence). Despite this restriction, the PEB workflow introduces potential advantages that are still to be fully assessed. For instance, because a full posterior probability density over the DCM parameters is taken to the group level, subjects with noisier data should automatically contribute less to the group result. Furthermore, we speculate that estimating only a ‘full’ model with all parameters in play will typically engender a ‘smoother’ free energy landscape than multiple reduced models per subject with fewer parameters. In other words, having a high dimensional parameter space could, counterintuitively, ensure more robust convergence; in the sense that there are more opportunities to escape from local minima in high dimensional parameter spaces. This may be particularly prescient for electrophysiological DCMs, where the models are more non-linear.To get started with the PEB framework, we refer readers to the illustrated step-by-step guide and example data accompanying this article at https://github.com/pzeidman/dcm-peb-example.
Table 2
Symbols used in this paper.
Variable
Dimension
Meaning
α
1×1
Ratio of prior uncertainty about group-level effects to prior uncertainty about first-level effects.
β
1×1
Ratio of within-subject prior variance (uncertainty) to between-subject prior RFX variance (variability).
β0
C0×1
Parameters for null effects
C
1×1
Total number of group level covariates
C0
1×1
Total number of first level covariates of no interest
εi(1)
V×1
Observation noise for subject i
ε2
(N⋅P(1))×1
Between-subject variability or random effects (RFX)
ε3
(C⋅P(1))×1
Residuals (uncertainty) about group-level parameters θ(2)
η
(C⋅P(1))×1
Expected values of group-level parameters θ(2)
F(n)
1×1
Negative variational free energy for level n of the model
Γi
−
First-level model (e.g. DCM) for subject i
J
1×1
Number of experimental conditions at the first level
γj
1×1
Log scaling parameter for precision component Qj
L
1×C
Norm of each covariate (regressor)
m
−
Generative model (likelihood and priors)
μi,μi(1)
P(1)×1
Expected values of the DCM parameters for subject i
μ0(1)
P(1)×1
Prior expectation of the DCM parameters
N
1×1
Number of subjects (not to be confused with the multivariate normal distribution N(μ,Σ)).
P(1)
1×1
Total number of free parameters per DCM taken to the group level
Π2
(C⋅P(1))×(C⋅P(1))
RFX precision matrix
Qj
P(1)×P(1)
Precision component j=1…J
R
1×1
Number of modelled brain regions
Σi,Σi(1)
P(1)×P(1)
Covariance matrix of the DCM parameters for subject i
Σ0(1)
P(1)×1
Prior covariance of the DCM parameters
θ1
NP(1)×1
All DCM parameters from all subjects
θi1
P(1)×1
All DCM parameters for subject i
Σ2
(N⋅P(1))×(N⋅P(1))
Covariance of between-subject variability ε(2)
Σ3
(C⋅P(1))×(C⋅P(1))
Covariance of residuals (uncertainty) ε(3)
θ2
(C⋅P(1))×1
Group-level parameters
V
1×1
Total measurements (volumes) per subject
vj
1×1
Prior variance of DCM parameter j
X
(N⋅P(1))×(C⋅P(1))
Design matrix
XB
N×C
Between-subjects design matrix
XW
P(1)×P(1)
Within-subjects design matrix
X0
V×C0
Design matrix for null effects
Yi
V×R
Observed timeseries from subject i from all regions
Authors: Mark W Woolrich; Timothy E J Behrens; Christian F Beckmann; Mark Jenkinson; Stephen M Smith Journal: Neuroimage Date: 2004-04 Impact factor: 6.556
Authors: Will D Penny; Klaas E Stephan; Jean Daunizeau; Maria J Rosa; Karl J Friston; Thomas M Schofield; Alex P Leff Journal: PLoS Comput Biol Date: 2010-03-12 Impact factor: 4.475
Authors: Basil C Preisig; Lars Riecke; Matthias J Sjerps; Anne Kösem; Benjamin R Kop; Bob Bramson; Peter Hagoort; Alexis Hervais-Adelman Journal: Proc Natl Acad Sci U S A Date: 2021-02-16 Impact factor: 11.205
Authors: Amirhossein Jafarian; Peter Zeidman; Vladimir Litvak; Karl Friston Journal: Philos Trans A Math Phys Eng Sci Date: 2019-10-28 Impact factor: 4.226
Authors: Agnes Norbury; Sarah B Rutter; Abigail B Collins; Sara Costi; Manish K Jha; Sarah R Horn; Marin Kautz; Morgan Corniquel; Katherine A Collins; Andrew M Glasgow; Jess Brallier; Lisa M Shin; Dennis S Charney; James W Murrough; Adriana Feder Journal: Neuropsychopharmacology Date: 2021-07-31 Impact factor: 7.853
Authors: Maria G Veldhuizen; Michael C Farruggia; Xiao Gao; Yuko Nakamura; Barry G Green; Dana M Small Journal: J Neurosci Date: 2020-05-05 Impact factor: 6.167
Authors: Natalie E Adams; Laura E Hughes; Holly N Phillips; Alexander D Shaw; Alexander G Murley; David Nesbitt; Thomas E Cope; W Richard Bevan-Jones; Luca Passamonti; James B Rowe Journal: J Neurosci Date: 2020-01-08 Impact factor: 6.167
Authors: Gregory A Fonzo; Madeleine S Goodkind; Desmond J Oathes; Yevgeniya V Zaiko; Meredith Harvey; Kathy K Peng; M Elizabeth Weiss; Allison L Thompson; Sanno E Zack; Steven E Lindley; Bruce A Arnow; Booil Jo; Barbara O Rothbaum; Amit Etkin Journal: Biol Psychiatry Date: 2020-12-08 Impact factor: 13.382
Authors: Gennady G Knyazev; Alexander N Savostyanov; Andrey V Bocharov; Pavel D Rudych Journal: Front Hum Neurosci Date: 2021-06-29 Impact factor: 3.169