
Sub-graph entropy based network approaches for classifying adolescent obsessive-compulsive disorder from resting-state functional MRI.

Bhaskar Sen¹, Gail A Bernstein², Bryon A Mueller², Kathryn R Cullen², Keshab K Parhi³.

Abstract

This paper presents a novel approach for classifying obsessive-compulsive disorder (OCD) in adolescents from resting-state fMRI data. Currently, the state-of-the-art for diagnosing OCD in youth involves interviews with adolescent patients and their parents by an experienced clinician, symptom rating scales based on Diagnostic and Statistical Manual of Mental Disorders (DSM), and behavioral observation. Discovering signal processing and network-based biomarkers from functional magnetic resonance imaging (fMRI) scans of patients has the potential to assist clinicians in their diagnostic assessments of adolescents suffering from OCD. This paper investigates the clinical diagnostic utility of a set of univariate, bivariate and multivariate features extracted from resting-state fMRI using an information-theoretic approach in 15 adolescents with OCD and 13 matched healthy controls. Results indicate that an information-theoretic approach based on sub-graph entropy is capable of classifying OCD vs. healthy subjects with high accuracy. Mean time-series were extracted from 85 brain regions and were used to calculate Shannon wavelet entropy, Pearson correlation matrix, network features and sub-graph entropy. In addition, two special cases of sub-graph entropy, namely node and edge entropy, were investigated to identify important brain regions and edges from OCD patients. A leave-one-out cross-validation method was used for the final predictor performance. The proposed methodology using differential sub-graph (edge) entropy achieved an accuracy of 0.89 with specificity 1 and sensitivity 0.80 using leave-one-out cross-validation with in-fold feature ranking and selection. The high classification accuracy indicates the predictive power of the sub-network as well as edge entropy metric.


Keywords: Classification; Functional network; Obsessive compulsive disorder; Psychiatry; Sub-graph entropy

Year: 2020 PMID: 32065968 PMCID: PMC7025090 DOI: 10.1016/j.nicl.2020.102208

Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881

Introduction

Obsessive-compulsive disorder (OCD) is a serious psychiatric illness that affects about 2-3% of the population worldwide (Rasmussen and Eisen, 1992). Onset of OCD is associated with excessive intrusive, unwanted thoughts (obsessions) and repetitive behaviors (compulsions). This paper addresses discriminating and classifying the brain state of adolescents with OCD from healthy controls using resting-state functional magnetic resonance imaging (fMRI). Identifying features for discriminating OCD in adolescents is an important topic of research that could lead to useful clinical tools to aid in the diagnosis and management of youth with OCD. The current psychiatric diagnostic practices for adolescents include interviews with youths and their parents by experienced clinicians, symptom rating scales and behavioral examinations (Grabill et al., 2008; King et al., 1998). However, there are no evidence-based neurobiological markers available to aid clinicians in the diagnosis. Finding the brain regions and functional connections that are affected by OCD in adolescents remains an active area of research. Past research has shown that fMRI is a useful tool for understanding psychiatric disorders (Bassett et al., 2012; Mitterschiffthaler et al., 2006). Broadly, fMRI provides a non-invasive way to measure activity of the brain during resting-state (rs-fMRI) or task (t-fMRI) using the change in the blood-oxygen-level-dependent (BOLD) signal (Sen and Parhi, 2017). The fMRI scans of a person can be used to generate the functional connectivity network of the brain. In particular, rs-fMRI measures the spontaneous fluctuation of the BOLD signal during awake rest. For many psychiatric diseases, the rs-fMRI signal is particularly informative because it is not confounded by task performance (Dragomir et al., 2018; Greicius et al., 2003).
The rs-fMRI signal from a given brain region can yield particular properties of neurobiological interest such as amplitude and signal entropy (Bassett et al., 2012). In this paper, we are concerned only with rs-fMRI analysis of adolescents with OCD and healthy controls. The human brain is a network of interconnected nodes, or regions, that are functionally connected with each other (Atluri et al., 2016). The activity of each region is described by a time-series calculated by averaging the time-series of all voxels in that region, and the functional connection between two regions is measured by the correlation coefficient between their time-series (Bullmore and Sporns, 2009; Sporns, 2003). Recent advancements in neuroimaging and network theory offer tools that may be useful to advance understanding of psychiatric illnesses such as OCD. For example, entropy, a concept that originated in physics and was later adopted by information theory, indexes unpredictability and complexity. A univariate measure of entropy can be calculated from any time-series data from a particular brain region. A multivariate measure of entropy can also be calculated to describe the complexity of a graph. We recently introduced the approach of examining entropy of sub-graphs, which was capable of classifying brain states with high accuracy (Sen et al., 2019). Furthermore, sub-graph entropy can be used to rank regions and edges within a brain network with respect to their differential values between two groups, and the ranking carries useful information about the disruption in the brain network due to a disorder. The goal of this study was to apply advanced analytical approaches to rs-fMRI data to classify adolescents with OCD vs. healthy controls. We used time-series, absolute correlation coefficients and graph-theoretic properties of the functional network for these classifications.
Although many brain network statistics exist in the literature and have been used to rank regions (nodes) or functional connections (edges), the proposed approach has never been applied to brain networks for diagnosing psychiatric disorders. First, we used univariate (e.g., Shannon wavelet entropy of each time-series), bivariate (absolute value of Pearson correlation) and multivariate (network features, e.g., local efficiency, global efficiency (Latora and Marchiori, 2001; Sporns, 2003), clustering coefficient (Rubinov and Sporns, 2010), betweenness centrality (Freeman et al., 1979), modularity (Newman, 2006), graph and sub-graph entropy (Sen et al., 2019)) features for predicting OCD. These measures describe segregation or integration within the network. Next, network measures such as sub-graph entropy (Sen et al., 2019), and specifically those of nodes and edges, were used to rank important brain regions and edges in each group of brain networks. In addition, the nodes and edges were also ranked by the extent to which they differentiated groups. This led to extracting a sub-network containing 120 edges, which was used for classification of OCD vs. controls. Some of the above network measures have been used previously for comparing characteristics between two groups (e.g., schizophrenia vs. healthy (Bassett et al., 2012; Huang et al., 2018), Alzheimer’s disease vs. healthy (Armañanzas et al., 2016; Dragomir et al., 2018), borderline personality disorder vs. healthy (Xu et al., 2016), and obsessive-compulsive disorder vs. healthy (Armstrong et al., 2016)). While these traditional association studies can extract statistically significant neural correlates for a disorder, in this paper we extract a sub-network from resting-state fMRI data from two groups of adolescents that can predict whether a subject has OCD.

Materials and methods

Dataset and preprocessing

We describe the dataset and the preprocessing steps following the procedure outlined in Bernstein et al. (2016). Fifteen adolescents with OCD and 13 matched healthy control subjects were enrolled in the study. There were no significant differences between the OCD and control groups in terms of age, gender, socioeconomic status, IQ, ethnicity and handedness (Bernstein et al., 2016). The Children’s Yale-Brown Obsessive-Compulsive Scale (CY-BOCS) checklist (Rosario-Campos et al., 2006) was used to calculate scores on 4 factor-analyzed symptom dimensions (Bernstein et al., 2013). OCD patients had a mean CY-BOCS score of 19.7 (standard deviation SD = 3.5, max 27, min 12), whereas the healthy controls had a mean CY-BOCS score of 0.1 (SD 0.3). The mean age at onset of the disease was 9.5 ± 4.0 years, and the mean duration of illness was 5.8 ± 4.2 years. Twelve out of 15 OCD subjects were on psychotropic medication. Details about age at assessment and clinical information for the adolescents are given in Table 1. More specifically, the mean ages at assessment for OCD subjects and controls were 15.3 (maximum 19, minimum 12.3) and 16 (max 18.8, min 12.3) years, respectively. The numbers of males were 8 and 7 in the OCD and control groups, respectively. Resting-state fMRI scans (2 sessions, each of 12 minutes duration) were taken using a novel multiband EPI sequence that acquires multiple slices simultaneously (Feinberg et al., 2010). The rs-fMRI scans were acquired with the following parameters: TR = 1.15 s, TE = 30 ms, voxel size = 2 mm isotropic, 60 slices, multiband factor = 4, echo spacing = 0.57 ms and number of 3D volumes = 600. During the scans, the subjects were asked to remain awake with eyes closed and not to think about anything in particular. More details about the data can be found in Bernstein et al. (2016).
The experimental procedures involving human subjects described in this paper were approved by the University of Minnesota Institutional Review Board.

Table 1

Demographic information	OCD	Control	t or χ²	p-value
# of samples (n)	15	13	_	_
Age at onset - mean (SD)	9.5 (4.0)	_	_	_
Age at assessment - mean(SD, maximum, minimum)	15.3 (2.1, 19, 12.3)	16 (1.8, 18.8, 12.3)	t = 0.98	0.34
Male n(%)	8 (53)	7 (54)	χ2=0.001	0.98
Clinical Information - CY-BOCS	OCD	Control	t or χ²	p-value
Obsessions, mean (SD)	9.4 (2.2)	0.1(0.3)	t=15.9	< 0.001
Compulsions, mean (SD)	10.3 (1.7)	0.0 (0.0)	t=22.1	< 0.001
Total, mean (SD, maximum, minimum)	19.7 (3.5, 27, 12)	0.1 (0.3)	t=19.9	< 0.001

Demographic and clinical characteristics of the OCD and control groups. CY-BOCS: Children’s Yale-Brown Obsessive-Compulsive Scale (Bernstein et al., 2013; Rosario-Campos et al., 2006).

The FSL toolbox (Jenkinson et al., 2012) was used for the pre-processing of the fMRI data. The steps included skull removal, distortion correction, motion correction and registration to Montreal Neurological Institute (MNI) space. Using the FSL program MELODIC, we analyzed the independent components, and those corresponding to artifacts (heart rate, respiration, movement, white matter or cerebrospinal fluid) were removed. More details of the preprocessing can be found in Bernstein et al. (2016). After the preprocessing step, the Desikan atlas (Desikan et al., 2006) was used to extract the mean time-series for each of the 85 cortical and sub-cortical regions available in the atlas. The list of the regions used for the analysis is given in Supplementary Information Table A1. Each time-series was then filtered by a band-pass filter with a pass-band of 0.01–0.15 Hz, followed by a decomposition into two frequency bands using a db-4 wavelet filter: lower (B1: 0.01–0.08 Hz) and higher (B2: 0.08–0.15 Hz) (Xu et al., 2016).

Univariate features

The complexity of rs-fMRI time-series was first investigated by using a univariate approach following Bassett et al. (2012). Here we estimated the complexity of the mean time-series extracted from each brain region through Shannon wavelet entropy (Rosso et al., 2001). We restricted our investigation to the two frequency bands described above. Lower frequency band oscillations of the BOLD signal have been shown previously to be adversely affected during a psychiatric disease state (Lynall et al., 2010). The Shannon wavelet entropy was estimated using techniques developed in Coifman and Wickerhauser (1992) and Donoho et al. (1994) and implemented in the MATLAB Wavelet Toolbox (function wentropy.m). In short, the Shannon wavelet entropy is defined as:

$$E(s) = -\sum_{i} s_i^{2} \log\left(s_i^{2}\right)$$

Here $s$ is the mean time-series of a brain region for an individual, and the $s_i$ are the coefficients of $s$ in the orthonormal wavelet basis.
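As a rough illustration of the entropy definition above, the following minimal sketch evaluates the non-normalized Shannon entropy of a list of wavelet coefficients. It assumes the natural logarithm and the convention that zero coefficients contribute nothing; the function name and the toy coefficients are hypothetical and are not taken from the study, which used the MATLAB Wavelet Toolbox.

```python
import math


def shannon_wavelet_entropy(coeffs):
    """Return -sum(c^2 * log(c^2)) over the coefficients.

    Assumption: natural log, and terms with c == 0 contribute 0.
    """
    total = 0.0
    for c in coeffs:
        energy = c * c
        if energy > 0.0:
            total -= energy * math.log(energy)
    return total


# Toy coefficients (hypothetical values, not data from the paper).
toy_coeffs = [0.5, -0.5, 0.5, 0.5]
print(shannon_wavelet_entropy(toy_coeffs))
```

In practice the coefficients would come from a wavelet decomposition of each regional time-series; that step is omitted here.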

Bivariate features

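The Pearson correlation coefficient ($\rho^{f}_{ij}$) was calculated for each frequency band $f$ and each pair of regions ($r_i$ and $r_j$), where $i, j \in \{1, \ldots, 85\}$, as described in Sen et al. (2016). Let $x^{f}_{i}(t)$ be the mean time-series from frequency band $f$ and region $i$ from fMRI. Then the Pearson correlation coefficient between two regions $i$, $j$ is calculated by

$$\rho^{f}_{ij} = \frac{\sum_{t}\left(x^{f}_{i}(t) - \bar{x}^{f}_{i}\right)\left(x^{f}_{j}(t) - \bar{x}^{f}_{j}\right)}{\sqrt{\sum_{t}\left(x^{f}_{i}(t) - \bar{x}^{f}_{i}\right)^{2}}\,\sqrt{\sum_{t}\left(x^{f}_{j}(t) - \bar{x}^{f}_{j}\right)^{2}}}$$

where $\bar{x}^{f}_{i}$ is the estimated mean of the time-series from frequency band $f$ and region $i$. Thus two adjacency matrices were created for each subject, one per frequency band, with one entry for each pair of regions. These matrices can also be seen as edges in a graph where each node corresponds to one region. The total number of features for each frequency band was $85 \times 84 / 2 = 3570$. Absolute values of Pearson correlations were used as features.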
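The construction of the bivariate features can be sketched as follows. This is an illustrative pure-Python version, assuming each region's band-limited time-series is a plain list of floats; the helper names are hypothetical, and setting the diagonal to zero (no self-connections) is an assumption rather than something stated in the text.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def abs_correlation_matrix(series):
    """Symmetric matrix of |Pearson r| between every pair of regions."""
    r = len(series)
    c = [[0.0] * r for _ in range(r)]  # diagonal left at 0 (assumption)
    for i in range(r):
        for j in range(i + 1, r):
            c[i][j] = c[j][i] = abs(pearson(series[i], series[j]))
    return c


def n_pair_features(n_regions):
    """Number of unordered region pairs, i.e. upper-triangular entries."""
    return n_regions * (n_regions - 1) // 2


print(n_pair_features(85))  # 85 * 84 / 2 = 3570 features per band
```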

Multivariate features

For each subject, two 85 × 85 adjacency matrices (Sen et al., 2016) containing absolute Pearson correlations were extracted. Suppose that a group of N brain networks, one per subject and each defined over R regions, is specified by the adjacency matrices $C_k$, where $k \in \{1, \ldots, N\}$. Each adjacency matrix has size R × R. Here $C_k(i, j)$ is the connectivity between two regions (i, j) for subject k. Ideally $C_k$ is a binary matrix where existence of an edge is given by 1 and non-existence is given by 0. However, connectivity matrices extracted from fMRI contain correlation values, which are non-binary. In order to calculate the network features, the matrices were first binarized by keeping {5%, 20%, 35%, 50%} of the edges. The sparsity of the networks for all subjects remained the same for a specific density.
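One common way to binarize a weighted matrix at a fixed density is to keep the strongest edges, sketched below. The text does not say which thresholding routine was used, so keeping the top-weighted fraction of upper-triangular entries is an assumption, and the function name is hypothetical.

```python
def binarize_by_density(c, density):
    """Keep the strongest `density` fraction of undirected edges as 1s.

    Assumption: edges are ranked by weight and the top fraction is kept;
    the result is a symmetric 0/1 matrix with a zero diagonal.
    """
    r = len(c)
    pairs = [(c[i][j], i, j) for i in range(r) for j in range(i + 1, r)]
    pairs.sort(reverse=True)  # strongest edges first
    keep = int(round(density * len(pairs)))
    a = [[0] * r for _ in range(r)]
    for _, i, j in pairs[:keep]:
        a[i][j] = a[j][i] = 1
    return a


# Toy 4-region matrix (hypothetical weights, not data from the paper).
toy = [
    [0.0, 0.9, 0.1, 0.8],
    [0.9, 0.0, 0.7, 0.2],
    [0.1, 0.7, 0.0, 0.3],
    [0.8, 0.2, 0.3, 0.0],
]
for row in binarize_by_density(toy, 0.5):
    print(row)
```

Because every subject keeps the same number of edges at a given density, all binarized networks have the same sparsity, as the paragraph above notes.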

Network features

Adjacency matrices were used to calculate network characteristics using the Brain Connectivity Toolbox (BCT). At a local (node) level in the network, three features, namely local efficiency (LE), clustering coefficient (CC) and betweenness centrality (BC), were computed (Sen et al., 2016; Xu et al., 2016). At a global level, we calculated two features: modularity and global efficiency (Sen et al., 2016). The local and global features in the network represent complementary viewpoints of the network for segregation and integration of nodes, respectively. Hence from each subject, we extracted 85 × 3 × 2 (for 3 features at each node) + 2 × 2 (for modularity and global efficiency) = 514 network features corresponding to two frequency bands. An overview of the network-based features is given next, as described in Sen et al. (2019). Local efficiency is computed using the summation of the inverse of the shortest paths to the neighbors of a node. This metric is used to understand how efficient a node is for transferring information between two neighboring nodes. Clustering coefficient is calculated as the number of triangles created around a node out of all possible triangles. Betweenness centrality of a node is calculated as the percentage of shortest paths that contain the node. The modularity metric measures how a network is sub-divided into smaller dense sub-networks with sparse inter-connections. Global efficiency describes the efficiency of information transfer within the whole graph. In addition, Network Based Statistics (NBS) (Zalesky et al., 2010) is used to extract important edges between two groups. NBS is a popular method for testing hypotheses about the edges in a network using a t-test. It is used to identify connections and networks comprising the connectome that are associated with a between-group difference.
These network measures were also previously used for classifying OCD vs. healthy from fMRI data as described in Sen et al. (2016). More details of the network measures can be found in Rubinov and Sporns (2010).
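To make one of the local measures concrete, the sketch below computes the clustering coefficient of a node in a binary undirected graph (closed triangles around the node over all possible triangles) and reproduces the feature count quoted above. It is an illustrative stand-in written in plain Python, not the Brain Connectivity Toolbox routine; the function name and toy graph are hypothetical.

```python
def clustering_coefficient(a, v):
    """Fraction of neighbor pairs of node v that are themselves connected."""
    neighbors = [u for u in range(len(a)) if u != v and a[v][u]]
    k = len(neighbors)
    if k < 2:
        return 0.0  # no triangle is possible around v
    links = sum(
        1
        for x in range(k)
        for y in range(x + 1, k)
        if a[neighbors[x]][neighbors[y]]
    )
    return 2.0 * links / (k * (k - 1))


# Toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 0.
toy_graph = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print([clustering_coefficient(toy_graph, v) for v in range(4)])

# Feature count from the text: 3 local features per region and 2 global
# features, each computed for 2 frequency bands.
n_regions, n_local, n_global, n_bands = 85, 3, 2, 2
n_features = n_regions * n_local * n_bands + n_global * n_bands
print(n_features)
```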

Sub-graph entropy

We have recently shown that sub-graph entropy can effectively classify brain states for two different tasks (Sen et al., 2019). Sub-graph entropy captures the interaction of the local neighborhood of a brain region or functional connection in an entropy formulation. Although sub-graph entropy of a network (G) can be computed in multiple ways (e.g., see Körner, 1973; Li and Pan, 2016), the scope of this paper is restricted to only the complexity associated with the entries in the adjacency matrix for the network. Within this limit, the current model achieves high accuracy (0.89 under leave-one-out cross-validation) for classifying a common psychiatric disorder. In particular, the sub-graph entropy is defined as the number of bits required to encode the adjacency structure of the network (Freeman et al., 1979; Mackenzie, 1966). A brief description of sub-graph entropy (node and edge entropy) is presented next following Sen et al. (2019). A normalized adjacency matrix from one subject is defined as

$$q_k(i, j) = \frac{C_k(i, j)}{\sum_{i, j} C_k(i, j)} \tag{2}$$

This definition normalizes the entries in the adjacency matrix so that their summation is 1. For bi-directional networks, we just normalize the upper triangular part of C. We also define $0 \log_2 0 = 0$. In this case, the graph entropy is calculated by (Mackenzie, 1966)

$$H(G_k) = -\sum_{i, j} q_k(i, j) \log_2 q_k(i, j)$$

In our work we calculated the matrix consisting of the indices q as in Eq. (2). Graph entropy for a subset of a network was also computed and is referred to as sub-graph entropy.

Node entropy

Node entropy was calculated based on the sub-graph containing a region and its neighbors. We defined neighbors based on the 1-hop distances from the node, i.e., the sub-graph containing nodes that are one edge away. Experiments involving 2-hop neighbors were also carried out, and results from this analysis are presented in the Supplementary Information. Only the entries signifying the interaction between one region and its neighbors were kept. All the other entries were forced to be zero. Let the node entropy of node v for subject k be denoted by $H_k(v)$; then

$$H_k(v) = -\sum_{m=1}^{M} q_m \log_2 q_m$$

where $q_m$ denotes the normalized edge weight for the neighborhood, and M represents the number of edges in the sub-graph. After calculating node entropy for individual subjects, the node entropy from a group of N subjects can be estimated as follows:

$$H(v) = \frac{1}{N} \sum_{k=1}^{N} H_k(v) \tag{5}$$

An algorithm for ranking important regions based on node entropy is given in Supplementary Information (SI) (Algorithm S1).

Edge entropy

Suppose region i is connected to region j through edge e, where i ≠ j. Similar to node entropy, the edge neighbors were also defined based on the 1-hop distances from the edge e. In this case, we computed the edge entropy of subject k as (Sen et al., 2019)

$$H_k(e) = -\sum_{m=1}^{M} q_m \log_2 q_m$$

where $q_m$ denotes the normalized edge weight for the neighborhood, and M is the number of edges in the sub-graph. After calculating edge entropy for individual subjects, the group edge entropy from an ensemble of graphs can be estimated as follows:

$$H(e) = \frac{1}{N} \sum_{k=1}^{N} H_k(e) \tag{7}$$

An algorithm for ranking important edges based on edge entropy is given in SI (Algorithm S2).

Differential sub-graph (node and edge) entropy

Between two groups of subjects, if the local segregation of brain regions changes, the change in pattern can be captured using differential entropy (Sen et al., 2019). In this case, the regions or edges that have the largest change in entropy between two groups are the most affected by the change in brain state. Suppose, for region i, the node entropy for subjects belonging to group G1 (where G1 ∈ {Healthy, OCD, etc.}) is given by $H^{G_1}(i)$ and for group G2 by $H^{G_2}(i)$. The difference between these two values gives the differential entropy between the two groups of subjects for region i. The change in entropy is calculated as

$$\Delta H(i) = \left| H^{G_1}(i) - H^{G_2}(i) \right|$$

where |x| is the absolute value of x. This metric is called differential node entropy (Sen et al., 2019). In our experiment, differential node entropy was calculated for each brain region, and the regions were ranked by differential entropy in decreasing order. The same argument and ranking procedure was applied to differential edge entropy as well. The algorithm is given in Algorithm 1. This process extracted the nodes and edges that had changed the most due to the functional alteration during the disease.
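The node and edge entropy computations described in this section might be sketched as below. Several choices here are assumptions rather than details confirmed by the text: the weights are renormalized within each 1-hop sub-graph so that they sum to 1, the logarithm is base 2 (entropy in bits), and the neighborhood of an edge is taken to be that edge plus every other edge touching either endpoint. All function names are hypothetical.

```python
import math


def subgraph_entropy(weights):
    """Entropy in bits of edge weights normalized to sum to 1 (0 log 0 = 0)."""
    total = sum(weights)
    h = 0.0
    for w in weights:
        if w > 0.0:
            q = w / total
            h -= q * math.log2(q)
    return h


def node_entropy(c, v):
    """Entropy of the 1-hop sub-graph: edges between node v and its neighbors."""
    return subgraph_entropy([c[v][u] for u in range(len(c)) if u != v])


def edge_entropy(c, i, j):
    """Entropy of edge (i, j) together with edges sharing one of its endpoints."""
    r = len(c)
    weights = [c[i][j]]
    weights += [c[i][u] for u in range(r) if u not in (i, j)]
    weights += [c[j][u] for u in range(r) if u not in (i, j)]
    return subgraph_entropy(weights)


# Toy 3-region network with equal weights (hypothetical, not study data).
toy_net = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
print(node_entropy(toy_net, 0))
print(edge_entropy(toy_net, 0, 1))
```

With equal weights the entropy reduces to the base-2 logarithm of the number of edges in the sub-graph, which gives a quick sanity check on the implementation.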

Algorithm 1

Ranking of Regions and Edges for Two Groups.
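The body of Algorithm 1 is not reproduced in this text. A minimal sketch of a ranking step consistent with the description of differential entropy is given below; it assumes that each group's entropy is the mean over its subjects and that items (regions or edges) are sorted by the absolute difference of the two group means. The function names and toy values are hypothetical.

```python
def group_mean(values_per_subject):
    """Average each item's entropy over the subjects of one group."""
    n_subjects = len(values_per_subject)
    n_items = len(values_per_subject[0])
    return [
        sum(subject[i] for subject in values_per_subject) / n_subjects
        for i in range(n_items)
    ]


def rank_by_differential_entropy(group1, group2):
    """Return item indices sorted by |H_G1 - H_G2| (largest first) and the diffs."""
    h1 = group_mean(group1)
    h2 = group_mean(group2)
    diff = [abs(a - b) for a, b in zip(h1, h2)]
    order = sorted(range(len(diff)), key=lambda i: diff[i], reverse=True)
    return order, diff


# Toy entropies: rows are subjects, columns are regions (hypothetical values).
ocd = [[1.0, 2.0, 3.0], [1.0, 2.0, 5.0]]
healthy = [[1.5, 0.0, 4.0], [1.5, 0.0, 4.0]]
order, diff = rank_by_differential_entropy(ocd, healthy)
print(order, diff)
```

The same function applies unchanged to edges if each column holds an edge entropy instead of a node entropy.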

Extracting predictive sub-network based on edge entropy

In order to understand if sub-graph entropy measures contain predictive information, we used the edge entropy values for classification between two groups of subjects. Brain network measures have been used to discriminate between two groups in a number of previous papers to understand the significance, e.g., see Sen et al. (2016) and Richiardi et al. (2013). In this framework, starting with the top-ranked edge, additional edges were added to the network in an iterative manner until the classification accuracy started dropping. At each iteration, the edge entropies of the sub-network were used in a leave-one-out fashion (Bao et al., 2019) for classifying OCD vs. healthy controls. Leave-one-out is a common cross-validation method for alleviating overfitting in case of small sample size classification problems which is often the case in psychiatry domain (Huang et al., 2018). The classifier used in this case was support vector machine (SVM) with radial basis kernel based on testing different models where SVM resulted in a better performance. The procedure for finding the predictive sub-network is shown in Fig. 1.
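The iterative procedure above can be illustrated with the following sketch. It deliberately swaps in a nearest-centroid classifier for the RBF-kernel SVM so that the example needs only the standard library, and it ranks features inside each fold by the absolute difference of the group means, which stands in for differential edge entropy. Function names and the toy data are hypothetical; this is not the authors' implementation.

```python
def centroid(rows, cols):
    """Mean of each selected column over the given rows."""
    return [sum(r[c] for r in rows) / len(rows) for c in cols]


def loo_accuracy(x, y, k):
    """Leave-one-out accuracy using the top-k features ranked inside each fold."""
    correct = 0
    all_cols = range(len(x[0]))
    for t in range(len(x)):
        train = [(x[i], y[i]) for i in range(len(x)) if i != t]
        g0 = [row for row, label in train if label == 0]
        g1 = [row for row, label in train if label == 1]
        m0, m1 = centroid(g0, all_cols), centroid(g1, all_cols)
        # In-fold ranking: the held-out subject never influences the ranking.
        ranked = sorted(all_cols, key=lambda c: abs(m0[c] - m1[c]), reverse=True)
        cols = ranked[:k]
        d0 = sum((x[t][c] - m0[c]) ** 2 for c in cols)
        d1 = sum((x[t][c] - m1[c]) ** 2 for c in cols)
        predicted = 0 if d0 <= d1 else 1
        correct += predicted == y[t]
    return correct / len(x)


def grow_subnetwork(x, y):
    """Add ranked features one at a time until accuracy starts to drop."""
    best_k, best_acc = 1, loo_accuracy(x, y, 1)
    for k in range(2, len(x[0]) + 1):
        acc = loo_accuracy(x, y, k)
        if acc < best_acc:
            break
        best_k, best_acc = k, acc
    return best_k, best_acc


# Toy data: column 0 separates the classes, column 1 is noise (hypothetical).
toy_x = [[0.0, -6.0], [1.0, 0.0], [2.0, 6.0], [4.0, -6.0], [5.0, 0.0], [6.0, 6.0]]
toy_y = [0, 0, 0, 1, 1, 1]
print(grow_subnetwork(toy_x, toy_y))
```

Ranking inside each fold matters: if the ranking used all subjects, information about the held-out subject would leak into feature selection and inflate the estimated accuracy.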

Fig. 1

Procedure for extracting a predictive sub-network for OCD vs. healthy. Edges with the highest differential entropy are selected to identify the sub-network based on leave-one-out accuracy. The sub-network’s sub-graph entropy is compared between two groups using a t-test for validation.

Additionally, two more sub-graphs containing the union and intersection of statistically different regions and edges were tested for predictive performance (Sen et al., 2019). An anatomically defined sub-graph containing regions from the cortico-striato-thalamo-cortical (CSTC) network (Bernstein et al., 2016) was also used for baseline comparison of performance. In order to address the challenge of overfitting and establish the stability of the predictive sub-network, the ranking procedure was run 28 times in a leave-one-out fashion. At each iteration, ranking was performed on the 27 subjects that remained after leaving one subject out. A histogram was then plotted to show how many times the top edges were extracted in the predictive sub-network.
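The stability check described above amounts to counting, for each edge, the number of leave-one-out folds in which it was selected. A minimal sketch, assuming each fold yields a list of selected edges given as (region, region) tuples; the function name and toy folds are hypothetical.

```python
from collections import Counter


def selection_counts(selected_per_fold):
    """Count in how many folds each edge appears among the selected edges."""
    counts = Counter()
    for edges in selected_per_fold:
        counts.update(set(edges))  # count an edge at most once per fold
    return counts


# Toy example with 3 folds (hypothetical edges, not results from the paper).
folds = [
    [(0, 1), (2, 3)],
    [(0, 1), (1, 2)],
    [(0, 1), (2, 3)],
]
for edge, n in selection_counts(folds).most_common():
    print(edge, n)
```

Edges selected in nearly all 28 folds would indicate a ranking that is stable with respect to which subject is left out.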

Statistical analysis

The univariate, bivariate and multivariate features along with node entropy and edge entropy values were compared using a 2-sided t-test across different groups. Their corresponding p-values were also calculated and shown. Furthermore, leave-one-out classification accuracy, specificity and sensitivity were calculated to validate our approach.

Results

Univariate analysis

The classification performance of regional features using Shannon wavelet entropy is shown in the last row of Table 2. In healthy controls and OCD patients, Shannon wavelet entropy was heterogeneously distributed throughout the brain, with the lowest values found in the postcentral, cuneus, superior parietal and paracentral regions and the highest found in the entorhinal cortex, hippocampus, pallidum and thalamus. Significant differences in Shannon wavelet entropy (p < 0.05) were found in the caudate, putamen, paracentral, postcentral and precuneus regions for band B1 (lower frequency band). Frequency band B2 (higher frequency band) did not show any statistically significant difference in Shannon wavelet entropy. The regions are shown on a brain template in Supplementary Information Fig. S1.

Table 2

Leave-one-out classification results.

Method	Number of features	Accuracy	Specificity	Sensitivity
Proposed (edge) sub-network	120	0.89	1	0.80
Union sub-graph	145	0.89	1	0.80
Intersection sub-graph	114	0.86	0.92	0.80
CSTC sub-network	120	0.71	0.85	0.60
Node entropy (Sen et al., 2019)	85	0.71	0.62	0.80
Correlation (Sen et al., 2016)	5	0.71	0.85	0.60
Network features (Sen et al., 2016)	5	0.75	0.77	0.73
Correlation + Network features (Sen et al., 2016)	10	0.78	0.85	0.73
NBS (Zalesky et al., 2010)	95	0.64	0.54	0.73
Shannon wavelet entropy (Bassett et al., 2012)	85	0.54	0.38	0.66


Bivariate analysis

Using minimum redundancy maximum relevance (mRMR) in-fold feature selection (Peng et al., 2005), the 5 features that were most important according to the criterion were selected. The selected features belong to the lower frequency sub-band (B1). Using the absolute Pearson correlation coefficients, the leave-one-out classification accuracy was 0.71 with specificity 0.85 and sensitivity 0.60 (shown in Table 2). The results showed that the Pearson correlation between superior temporal gyrus and temporal pole was selected 26 out of 28 times during the mRMR feature selection and leave-one-out classification. Additionally, the edge weight between precuneus and precentral gyrus was selected 25 out of 28 times. The other important edge connections (absolute Pearson correlation) are inferior frontal gyrus - mid frontal gyrus, pars operculum - insular cortex and inferior temporal gyrus - mid temporal gyrus (Sen et al., 2016).

Multivariate analysis

Following the bivariate experiment as described in previous section, an in-fold mRMR feature selection was employed (with 5 most significant features) using network based features extracted from the functional brain graphs containing 85 regions. The leave-one-out classification accuracy using these five features was 0.75 with 0.73 sensitivity and 0.77 specificity (Table 2). Note that, clustering coefficient of putamen (at sparsity 35%) and hub property of cingulate gyrus (at sparsity 50%) were selected in each fold by the feature selection algorithm. The features and corresponding values for different sparsity levels for two groups are shown in SI Fig. S2. The selected features belong to lower frequency sub-band (B1).

Ranking of brain regions

The lower frequency band (B1) at network sparsity level 35% was used for calculating sub-graph entropy based on the classification performance. All results presented in the paper and the SI correspond to frequency band B1 and network sparsity level 35%. Using the strategy for group ranking as formulated in Eq. (5) and Algorithm S1 in Supplementary Information, we identified important regions for the OCD and healthy groups separately (Fig. 2). The regions identified among the top-25 in each group are shown in Fig. 2(a) and 2(c), respectively. Most of the regions identified in this process belong to the default mode areas that are well known to be active during the resting condition (Greicius et al., 2003; Raichle et al., 2001). A more interesting ranking is that of the regions affected during OCD. This information is captured using the differential node entropy. The regions extracted using this procedure (Algorithm 1) are illustrated in two ways: (1) visualizing the nodes corresponding to highly ranked regions as shown in Fig. 2(b), and (2) listing the top-25 regions in Table 3. Furthermore, the corresponding differential entropy value and p-value for each region’s node entropy are also shown in the same table. There are 13 regions with statistically significant difference (p < 0.05) in node entropy among the 25 regions.

Fig. 2

Table 3

Top-25 regions extracted using differential node entropy for OCD vs. healthy controls.

Rank	Region/Hemisphere	Diff. Entropy	p-value
1	Parsopercularis - R	1.5099	0.0002
2	Thalamus Proper - R	0.9660	0.0129
3	Parsorbitalis - L	0.9252	0.0459
4	Cuneus - L	0.9098	0.0153
5	Accumbens Area - L	0.9095	0.0159
6	Postcentral - L	0.9021	0.0065
7	Parsorbitalis - R	0.8944	0.0427
8	Pallidum - L	0.8514	0.0944
9	Medial Orbitofrontal - L	0.8303	0.0476
10	Parstriangularis - R	0.8233	0.0500
11	Medial Orbitofrontal - R	0.8090	0.0392
12	Amygdala - R	0.7317	0.1534
13	Hippocampus - L	0.7103	0.1503
14	Lateral Orbitofrontal - L	0.6737	0.1846
15	Caudate - R	0.6620	0.0234
16	Rostral Anterior Cingulate - R	0.6574	0.1239
17	Rostral Anterior Cingulate - L	0.6378	0.1657
18	Lateral Orbitofrontal - R	0.6201	0.2251
19	Pericalcarine - R	0.6192	0.0188
20	Caudal Anterior Cingulate - R	0.6165	0.2049
21	Entorhinal - L	0.6147	0.1431
22	Frontal Pole - L	0.5782	0.2752
23	Insula - L	0.5692	0.1392
24	Putamen - R	0.5565	0.0318
25	Accumbens Area - R	0.5419	0.1184

Visualization of important regions that have differences in entropy between OCD and healthy groups corresponding to frequency band B1 at network sparsity 35%. (a) OCD, red: regions that have higher entropy. (b) Differentiating regions between OCD vs. healthy, red: regions that have higher entropy for OCD, blue: regions that have higher entropy for healthy. (c) Healthy, blue: regions that have higher entropy for healthy. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Note that the ranking can be validated for OCD by considering the regions from the CSTC circuitry (Bernstein et al., 2016). Our ranking methodology is able to capture the regions from that circuit in the top 15% of these most important nodes. We illustrate the differential node entropy values for the regions in sorted order in Fig. S3 (SI). In addition, we also show the mean differential entropy value and standard deviation (SD) in the same figure. There are 13 regions with differential entropy more than 1 SD above the mean. Also, 38 regions have differential entropy values above the mean. Another ranking scheme using 2-hop neighbors (here the neighborhood of a node is defined based on a distance of 2 edges from the node) revealed the most important regions to be similar to those found with 1-hop neighbors (Table S1 in SI). The leave-one-out classification accuracy using all node entropy values is 0.71, as shown in Table 2.

Ranking of edges

The highest-ranked edges are visualized by overlaying them on an MNI brain using the BrainNet toolbox (Xia et al., 2013) (Fig. 3). The group ranking procedure based on edge entropy (Eq. (7) and Algorithm S2 in SI) extracts the top edges from the OCD and healthy groups separately, as shown in Fig. 3(a) and (c), respectively. The top 100 edges were identified using the ranking process for each group. Additionally, the ranking based on differential edge entropy is visualized in Fig. 3(b). A close inspection of the results reveals several observations. First, for each group, the ranking procedure reveals edges that are distributed throughout the lateral and medial cortical parts of the brain, and some of them belong to the default mode network. Second, differential entropy elevates the edges that belong to fronto-parietal and frontal-subcortical areas. The regions that are connected by these edges belong to the frontal lobe, parietal lobe, anterior and posterior cingulate gyrus, thalamus proper, default mode network (DMN), accumbens (striatal area) and amygdala. The corresponding differential entropy values for the edges are shown in sorted order in Fig. S4 (SI).

Fig. 3

Visualization of important edges that have differences in entropy between OCD and healthy groups corresponding to frequency band B1 at network sparsity 35%. (a) OCD, (b) Differentiating edges between OCD vs. healthy. (c) Healthy.

Extracting predictive sub-network

Accuracy of predictive sub-network

Following the method outlined in Fig. 1, we ran our edge ranking procedure (Algorithm 1) 28 times, once per leave-one-out fold, each time selecting the top edges for classification. The change in classification accuracy with the number of edges, using an SVM with a radial basis function kernel trained only on the selected edges, is shown in Fig. 4. The classification accuracy behaves as in traditional feature selection algorithms: it improves up to 120 edges and then starts to decrease. To assess the significance of the sub-network, the result is compared with a number of baseline network-based features in terms of classification accuracy. The leave-one-out classification results are compared in Table 2. In addition, results for repeated random sub-sampling validation (cross-validation run 100 times, each time randomly splitting the dataset into 23 training subjects and 5 testing subjects) are shown in Supplementary Information Table S3. Edge entropies of the proposed predictive sub-network achieved an accuracy of 0.89, correctly classifying all 13 healthy subjects and 12 out of 15 OCD subjects. Using the 2-hop neighbor scheme for classification did not improve the accuracy (see Table S2 in SI). In our study, SVM was chosen based on the performance of multiple classifiers (e.g., support vector machine, linear discriminant analysis, artificial neural network and random forest). Their performance is compared in Supplementary Information Table S4.
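The key point of the procedure above is that feature ranking is repeated inside every leave-one-out fold, so the held-out subject never influences which edges are selected. A minimal sketch of that control flow follows. It is not the authors' implementation: ranking by the absolute difference of group means and a nearest-centroid rule are simple stand-ins for the differential edge-entropy ranking and the RBF-kernel SVM, and all function names and data are hypothetical.

```python
from statistics import mean

def rank_features(X, y):
    """Rank feature indices by absolute difference of group means
    (stand-in for the differential edge-entropy ranking)."""
    def gap(j):
        g1 = [x[j] for x, lab in zip(X, y) if lab == 1]
        g0 = [x[j] for x, lab in zip(X, y) if lab == 0]
        return abs(mean(g1) - mean(g0))
    return sorted(range(len(X[0])), key=gap, reverse=True)

def nearest_centroid_predict(X_train, y_train, x_test, feats):
    """Assign x_test to the class with the closer centroid (stand-in for the SVM)."""
    def centroid(label):
        rows = [x for x, lab in zip(X_train, y_train) if lab == label]
        return [mean(r[j] for r in rows) for j in feats]
    def dist2(c):
        return sum((x_test[j] - cj) ** 2 for j, cj in zip(feats, c))
    return 1 if dist2(centroid(1)) < dist2(centroid(0)) else 0

def loo_accuracy(X, y, k):
    """Leave-one-out accuracy with the top-k feature ranking redone in every fold."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = rank_features(X_tr, y_tr)[:k]  # held-out subject is not used here
        correct += nearest_centroid_predict(X_tr, y_tr, X[i], feats) == y[i]
    return correct / len(X)

# Synthetic data: 15 "OCD" (1) and 13 "healthy" (0); only feature 0 is informative.
y = [1] * 15 + [0] * 13
X = [[10.0 * y[i] + (i % 3), (i * 7) % 5, (i * 3) % 4] for i in range(28)]
acc = loo_accuracy(X, y, k=1)
print(acc)
```

Ranking on the full dataset before splitting would leak information from the test subject into feature selection, which is the overfitting risk the in-fold design is meant to avoid.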

Fig. 4

Average Leave-one-out accuracy vs. number of edges in sub-network.

Sub-network visualization

The identified sub-network comprising 33 regions and 120 edges is shown in Fig. 5. In Table S5 (SI), we list the regions from the predictive sub-network as well as the CSTC network. The top-50 edges from the sub-network are listed in Table S6 (SI). In both tables, we also indicate whether each region or edge belongs to the CSTC network.

Fig. 5

Predictive sub-network extracted using differential edge entropy and leave-one-out analysis. This network corresponds to frequency band B1 and density threshold of 35%.

Statistical analysis

We selected the top-120 features in each leave-one-out training iteration and plotted their occurrence as a histogram, shown in Fig. S5 in SI. The top-ranked edges shown in Fig. 5 are also important for classification, as they are selected as top edges in most iterations. In addition, the predictive sub-network achieves a p-value of 0.0071 for a t-test of sub-graph entropies between OCD and healthy subjects, as reported in Table 4. Similarly, CSTC sub-graph entropy is significantly reduced (p = 0.0077) for OCD. The box plot for the t-test is shown in Fig. 6. To validate that the classifier performs significantly better than chance, permutation tests were performed; the results are shown in Fig. 7.
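The occurrence histogram mentioned above amounts to counting, for each edge, the number of folds in which it was selected. A small sketch under that reading is given below; the function name and the region pairs are hypothetical placeholders, and three toy folds stand in for the 28 leave-one-out folds.

```python
from collections import Counter

def selection_frequency(selected_per_fold):
    """Count in how many folds each edge appears among the selected top edges."""
    counts = Counter()
    for edges in selected_per_fold:
        counts.update(set(edges))  # each edge counts at most once per fold
    return counts

# Placeholder selections for three folds (illustration only).
folds = [
    [("operculum", "putamen"), ("amygdala", "cuneus")],
    [("operculum", "putamen"), ("frontal pole", "accumbens")],
    [("operculum", "putamen"), ("amygdala", "cuneus")],
]
freq = selection_frequency(folds)
print(freq.most_common(1))
```

Edges whose count is close to the number of folds are the stable ones; those are the edges the text describes as being selected most of the time.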

Table 4

Statistical analysis of predictive sub-network and CSTC network using sub-graph entropy.

Sub-network	# of nodes	# of edges	Entropy mean (healthy)	Entropy SD (healthy)	Entropy mean (OCD)	Entropy SD (OCD)	p-value
Proposed sub-network	33	120	6.9061	0.0004	6.9028	0.0039	0.0071
CSTC sub-network	16	120	6.9030	0.0032	6.8951	0.0090	0.0077

Fig. 6

Box-plot of sub-graph entropy values for OCD vs. healthy, (a) CSTC sub-network, (b) predictive sub-network.

Fig. 7

Results of permutation test on OCD data. The labels for healthy and OCD are permuted and a SVM classifier is fitted to each new dataset. Histogram of accuracies and accuracy on actual data is shown.

Discussion

This paper investigates the feasibility of automated classification of adolescent OCD vs. matched healthy controls. It proposes the use of information-theoretic sub-graph entropy for ranking regions and edges from a group of brain scans and extracting a predictive sub-network for differentiating adolescent OCD patients from healthy. The predictive frequency band was found to be B1 (low frequency BOLD oscillation) at network sparsity level 35%. Key observations from different classification approaches are discussed in this section.

Regions and edges

The present findings advance resting-state fMRI analysis of neural networks in youth with OCD by applying network analysis; all prior studies had used standard, traditional approaches to examining functional connectivity (Bernstein, Mueller, Schreiner, Campbell, Regan, Nelson, Houri, Lee, Zagoloff, Lim, Yacoub, Cullen, 2016, Fitzgerald, Welsh, Stern, Angstadt, Hanna, Abelson, Taylor, 2011, Fitzgerald, Welsh, Stern, Angstadt, Hanna, Abelson, Taylor, 2011, Gruner, Anticevic, Lee, Pittenger, 2016, Weber, Soreni, Noseworthy, 2014). Fitzgerald et al. (2011) found lower functional network connectivity in the cortico-striato-thalamo-cortical circuitry in youth with OCD aged 8–12. Fitzgerald et al. (2011) also reported a decreased number of functional connections in the anterior cingulate cortex, striatal and thalamic areas. Gruner et al. (2016) and Weber et al. (2014) used independent component analysis (ICA), a multivariate blind source separation technique, to analyze the whole-brain functional network of adolescent OCD subjects. Functional connectivity scores corresponding to each component were observed to be significantly higher among OCD patients than among matched healthy controls in anterior/posterior cingulate cortical areas and significantly sparser in visual cortical areas. Weber et al. (2014) also reported reduced functional connections in the cingulate cortex. Bernstein et al. (2016) found decreased connectivity between the putamen and lateral prefrontal cortex, as well as lower connectivity between the putamen and the right insula and operculum. Using the methods described here, we were able to capture previously known regions for resting-state fMRI scans from healthy humans (Greicius et al., 2008). Notably, all of the top-ranked regions that were identified to have high node entropy values for each group are default mode network nodes associated with resting-state.
In addition, the OCD group showed higher regional entropy for regions such as the thalamus, hippocampus, accumbens, putamen, anterior cingulate, postcentral gyrus, amygdala, pars-orbitalis and pars-opercularis, as shown in Fig. 2. Interestingly, the high ranking of the thalamus and hippocampus regions by the proposed differential entropy approach is consistent with the traditional hypothesis that these regions are affected in subjects suffering from OCD. The regions and edges that differed most between healthy and OCD subjects include the accumbens, amygdala, thalamus and pallidum, which are part of the CSTC circuit (Bernstein et al., 2016). This circuit is shown in Fig. S6 in Supplementary Information. The intersection between the CSTC and predictive sub-networks is illustrated in Fig. S7 (SI). In Figs. 3 and 5, we show the top-ranked predictive edges between important brain nodes for the two groups (OCD vs. controls). The predictive network contains many regions that act as hubs (Bullmore and Sporns, 2009), i.e., they are incident to a number of edges that differentiate the OCD group from healthy controls. Some of these edges connect the operculum to the putamen, accumbens and thalamus. Furthermore, edges from the frontal pole to striatal regions form a substantial part of the predictive network. The edge between the amygdala and occipital lobe (cuneus) also distinguished OCD patients from healthy subjects, which is notable as it represents a connection linking the default mode network and the limbic network. Edges that are not common between the CSTC and predictive networks are shown in Figs. S8 and S9, respectively. These plots illustrate how the proposed sub-network differs from the CSTC network. While the edge entropies of the CSTC sub-network achieved an accuracy of 0.71, those of the proposed sub-network achieved an accuracy of 0.89. This emphasizes the importance of including whole-brain data in the analysis, rather than focusing on just the CSTC.
Although classifying OCD with high accuracy has been known to be difficult, the proposed sub-network improves the achievable accuracy significantly compared to all prior approaches. At a group level, the top discriminating links between OCD and controls are also statistically significant (p < 0.05). The individual edge entropy measures are useful metrics for classification, as the proposed classifier significantly outperforms classification with Pearson correlation coefficient values. However, the sub-graph entropy is significantly reduced in OCD patients for both the CSTC network and the predictive sub-network, as shown in Fig. 6. Fig. 2 shows a number of nodes that differentiated OCD from controls; they are mostly red, signifying that they had higher entropy in OCD than in controls. Fig. 6, however, shows that sub-graph entropy is lower in OCD for both the CSTC network and the predictive network. These observations are not contradictory: although the nodes in OCD subjects tend to have higher node entropy, the sub-graphs used to calculate node entropy consider only the neighbors the node is connected with. The predictive sub-graph is a different sub-graph, extracted from the most predictive edges based on edge entropy. The apparent difference between the two indicates that, although the nodes (that might be part of the predictive sub-network) may have higher node entropy, the overlap of the edges (containing the nodes) that belong to the predictive sub-network is lower. In addition, it indicates that the edge weights in the predictive sub-network are relatively evenly distributed in healthy controls (leading to higher sub-graph entropy) and skewed in OCD patients (leading to lower sub-graph entropy).
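The link between how edge weights are distributed and the resulting entropy can be illustrated numerically. The sketch below assumes that sub-graph entropy is the base-2 Shannon entropy of the absolute edge weights normalized to sum to one; the formal definition is the one given in the methods, and the function name and weights here are hypothetical.

```python
from math import log2

def subgraph_entropy(weights):
    """Shannon entropy (bits) of absolute edge weights normalized to sum to one."""
    total = sum(abs(w) for w in weights)
    probs = [abs(w) / total for w in weights if w != 0]
    return -sum(p * log2(p) for p in probs)

uniform = [1.0] * 120               # evenly distributed edge weights
skewed = [10.0] * 12 + [1.0] * 108  # a few edges dominate

h_uniform = subgraph_entropy(uniform)
h_skewed = subgraph_entropy(skewed)
print(round(h_uniform, 4))   # 6.9069, i.e. log2(120)
print(h_skewed < h_uniform)  # True
```

Under this assumption the maximum for a 120-edge sub-graph is log2(120) ≈ 6.9069 bits, and the group means in Table 4 (6.8951 to 6.9061) sit just below it, which is at least consistent with small departures from evenly distributed weights.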

Misclassified subjects

The OCD severity (CY-BOCS) of the patient group ranged from 12 to 27, with a mean of 19.7, a standard deviation of 3.5 and a median of 20. The CY-BOCS scores of the three misclassified subjects were 12, 15 and 19. Two of the misclassified subjects were on medication and one was non-medicated. All three scores lie below the group mean and median, i.e., in the lower part of the CY-BOCS range for this sample. The inability of the classifier to distinguish these patients may therefore reflect their relatively mild OCD symptoms, as indicated by their CY-BOCS scores; the classifier assigned them to the healthy group. A histogram illustrating the OCD severity of subjects and the CY-BOCS scores of the misclassified subjects is shown in Supplementary Information Fig. S10.
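The reported accuracy, sensitivity and specificity follow directly from these counts: three OCD subjects were misclassified as healthy, and all healthy subjects were classified correctly. A short worked check, treating OCD as the positive class (the function name is hypothetical):

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # fraction of OCD subjects detected
    specificity = tn / (tn + fp)  # fraction of healthy subjects detected
    return accuracy, sensitivity, specificity

# 15 OCD subjects (12 correct, 3 missed) and 13 healthy subjects (all correct).
acc, sens, spec = binary_metrics(tp=12, fn=3, tn=13, fp=0)
print(round(acc, 2), round(sens, 2), round(spec, 2))  # 0.89 0.8 1.0
```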

Validation

The binomial test on the classification using the proposed sub-network also shows statistically significant predictive performance (p = for OCD vs. healthy with respect to a naive classifier and p = 0.0009 with respect to Sen et al. (2016)). To further validate the information-theoretic model, a sub-network containing only regions from the CSTC circuit was extracted, as shown in Fig. S6, and its sub-graph entropy was compared between the patient and control groups, as shown in Fig. 6. CSTC sub-graph entropy is significantly reduced (p = 0.0077, Table 4) for OCD. To avoid potential overfitting, leave-one-out classification with in-fold feature selection was used in every experimental setting. This ensures that features are selected and the classifier is trained using the training set only, so that the test accuracy reflects generalization to a previously unseen subject. Although the classification used in this paper is leave-one-out, the result is robust under 5-fold cross-validation and repeated random sub-sampling validation. For 5-fold CV, the dataset was divided into 5 sets of 5, 5, 6, 6 and 6 subjects. In each fold, one set was used for testing and the others for training. Edge entropy with 120 features achieves the same accuracy, specificity and sensitivity as leave-one-out. For repeated random sub-sampling validation, the cross-validation process was run 100 times, each time randomly splitting the dataset into training (23 subjects) and testing (5 subjects) sets. The results of this procedure for each type of feature are shown in Supplementary Information Table S3. Edge sub-network and Union sub-graph achieved an accuracy of 0.87, specificity of 0.96 and sensitivity of 0.79. All these values are within 0.04 of the corresponding leave-one-out values (0.89, 1 and 0.80) and better than the performance of other features.
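The two alternative validation schemes described above differ only in how the 28 subjects are partitioned. The sketch below generates both kinds of split; it is an illustration with hypothetical function names, it does not stratify by diagnosis (the text does not say whether the authors did), and the seeds are arbitrary.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and cut them into k folds whose sizes differ by at most one."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    base, extra = divmod(n, k)
    sizes = [base + (1 if i >= k - extra else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(idx[start:start + size])
        start += size
    return folds

def random_subsample_splits(n, n_test, repeats, seed=0):
    """Repeated random sub-sampling: draw n_test test subjects anew in each repeat."""
    rng = random.Random(seed)
    splits = []
    for _ in range(repeats):
        test = set(rng.sample(range(n), n_test))
        train = [i for i in range(n) if i not in test]
        splits.append((train, sorted(test)))
    return splits

folds = k_fold_indices(28, 5)
splits = random_subsample_splits(28, n_test=5, repeats=100)
print([len(f) for f in folds])                  # [5, 5, 6, 6, 6]
print(len(splits[0][0]), len(splits[0][1]))     # 23 5
```

In k-fold CV every subject is tested exactly once, whereas in repeated random sub-sampling a subject may appear in several test sets or in none; reporting both is a way to check that the result does not depend on one particular partition.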
To reinforce our conclusion that the classification performance of the extracted predictive network is better than that of baseline models, permutation tests were performed. A ‘naive’ baseline model was created by permuting the labels in the training and test sets. The baseline model thus removed all signal associated with OCD from the labels; the leave-one-out classifier then learned the relationships between the edge entropy and the permuted labels. For the OCD vs. healthy dataset, 1000 iterations were performed; in each iteration the labels were permuted, and an SVM model was trained on the training subset and tested on the rest. Fig. 7 shows the distribution of accuracy values for the dataset. There is a significant distance between the centre of the accuracy distribution and the accuracy produced by the predictive sub-network, i.e., the actual learned model performed significantly better than the baseline, which suggests that the classification result is statistically significant.
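The permutation test described above can be sketched as follows. This is a simplified illustration rather than the authors' code: a one-dimensional nearest-centroid rule stands in for the SVM on edge entropies, the data are synthetic, and the empirical p-value formula (with the +1 correction) is a common convention that the paper does not state explicitly.

```python
import random

def loo_accuracy(x, y):
    """Leave-one-out accuracy of a 1-D nearest-centroid rule (stand-in for the SVM)."""
    correct = 0
    for i in range(len(x)):
        ones = [v for j, (v, lab) in enumerate(zip(x, y)) if j != i and lab == 1]
        zeros = [v for j, (v, lab) in enumerate(zip(x, y)) if j != i and lab == 0]
        c1, c0 = sum(ones) / len(ones), sum(zeros) / len(zeros)
        pred = 1 if abs(x[i] - c1) < abs(x[i] - c0) else 0
        correct += pred == y[i]
    return correct / len(x)

def permutation_test(x, y, n_perm=1000, seed=0):
    """Compare the observed accuracy with accuracies obtained under shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(x, y)
    null = []
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # breaks any link between features and diagnosis
        null.append(loo_accuracy(x, y_perm))
    p_value = (1 + sum(a >= observed for a in null)) / (1 + n_perm)
    return observed, null, p_value

# Synthetic, well-separated feature for 15 "OCD" (1) and 13 "healthy" (0) subjects.
y = [1] * 15 + [0] * 13
x = [2.0 + 0.1 * (i % 5) for i in range(15)] + [0.1 * (i % 5) for i in range(13)]
observed, null, p_value = permutation_test(x, y)
print(observed, round(sum(null) / len(null), 2), p_value < 0.05)
```

The histogram of `null` plays the role of the accuracy distribution in Fig. 7, and `observed` corresponds to the accuracy on the actual labels.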

Predictive sub-network as part of other known networks

Our results implicate regions and edges from a number of large-scale brain networks. Here we discuss potential interpretations for the involvement of these networks in adolescent OCD.

Default mode network (DMN)

The default mode network of the human brain consists of regions that are active when a person is awake but resting and not doing any particular task (Biswal, Zerrin Yetkin, Haughton, Hyde, 1995, Greicius, Krasnow, Reiss, Menon, 2003). The DMN regions are typically associated with mind wandering, daydreaming, thoughts about the self, and rumination. In our analysis, the top regions, edges and predictive network extracted using sub-graph entropy include regions from the DMN. This may indicate that adolescents suffering from OCD can get “stuck” in repetitive thought patterns. Previous works also support the hypothesis that OCD involves abnormalities of the DMN (Beucke, Sepulcre, et al., 2014, Hou, Song, Zhang, Wu, et al., 2013, Stern, Fitzgerald, Welsh, Abelson, Taylor, 2012).

Dorsal attention + salience

The dorsal attention network contains regions that show increased activation when a subject is proactively deciding where to focus attention. The salience network provides the mechanism for attending to important cues that stand out from the background. In our results, the predictive sub-network contains regions (entorhinal) from the dorsal attention network (Bell, Shine, 2015, Yuan, Di, Taylor, Gohel, Tsai, Biswal, 2016). Moreover, it contains the hippocampus, which is part of the salience network (Bressler, Menon, 2010, Heine, Soddu, Gómez, Vanhaudenhuyse, Tshibanda, Thonnard, Charland-Verville, Kirsch, Laureys, Demertzi, 2012, Riedl, Utz, Castrillón, Grimmer, Rauschecker, Ploner, Friston, Drzezga, Sorg, 2016, Yeo, Krienen, Sepulcre, et al., 2011). The involvement of regions and edges from the salience and dorsal attention networks indicates that OCD subjects may deploy attention and attend to visual cues differently from healthy adolescents (Chen, Li, Lv, Zhu, Wang, Meng, Hu, Li, Zhang, Chu, et al., 2018, Fasching, Walczak, Bernstein, et al., 2016, Nestadt, Kamath, Maher, Others, 2016).

Executive network

The executive network, which consists of fronto-parietal brain regions, is responsible for high-level cognitive functions such as problem solving and decision-making. The predictive sub-network implicated for adolescent OCD includes many regions from the frontal part of the brain (Fig. 1), e.g., orbitofrontal, frontal pole and parsorbitalis, and also from the parietal part of the brain, e.g., postcentral gyrus. Their involvement in prediction performance may indicate a change in executive functions in OCD patients compared with healthy subjects (Bannon, Gonsalvez, Croft, Boyce, 2006, Kashyap, Kumar, Kandavel, Reddy, 2013, Stern, Fitzgerald, Welsh, Abelson, Taylor, 2012).

Limbic network

Finally, the predictive sub-network contains the following regions from the limbic system: hippocampus and amygdala. Limbic regions play a critical role in mediating emotional responses and forming new memories. This functional network has been shown to be affected in OCD patients in a number of previous studies (Modell, Mountz, Curtis, Greden, 1989, Saxena, Rauch, 2000, Sheth, Neal, Tangherlini, Mian, Gentil, Cosgrove, Eskandar, Dougherty, 2013). Specifically, the amygdala had a higher node entropy value in OCD patients than in controls, which may indicate its utility for relieving the elevated anxiety level for OCD patients.

Conclusion and future work

Using an information-theoretic network approach, this paper has identified a predictive sub-network of the brain that can be used to discriminate brain networks of adolescents with OCD from those of healthy controls. The regions and edges found to be most important based on differential entropy are also statistically significant. The predictive sub-network contains brain regions from well-known large-scale functional brain networks. Their involvement implies possible impairment of brain function in adolescents suffering from OCD. However, one limitation of the current work is the small number of participants in each group. Therefore, future work needs to be directed towards analysis of datasets with larger sample sizes. In addition, features and classifier models need to be developed to predict the onset of OCD and other psychiatric disorders such as depression. Future work may also be directed towards classification based on frequency-domain features.

Declaration of Competing Interest

The authors declare no competing financial interests.

52 in total

1. Network-based statistic: identifying differences in brain networks.

Authors: Andrew Zalesky; Alex Fornito; Edward T Bullmore
Journal: Neuroimage Date: 2010-06-25 Impact factor: 6.556

2. Modularity and community structure in networks.

Authors: M E J Newman
Journal: Proc Natl Acad Sci U S A Date: 2006-05-24 Impact factor: 11.205

3. The Dimensional Yale-Brown Obsessive-Compulsive Scale (DY-BOCS): an instrument for assessing obsessive-compulsive symptom dimensions.

Authors: M C Rosario-Campos; E C Miguel; S Quatrano; P Chacon; Y Ferrao; D Findley; L Katsovich; L Scahill; R A King; S R Woody; D Tolin; E Hollander; Y Kano; J F Leckman
Journal: Mol Psychiatry Date: 2006-05 Impact factor: 15.992

4. Automatically Evaluating Balance: A Machine Learning Approach.

Authors: Tian Bao; Brooke N Klatt; Susan L Whitney; Kathleen H Sienko; Jenna Wiens
Journal: IEEE Trans Neural Syst Rehabil Eng Date: 2019-01-04 Impact factor: 3.802

5. Executive functions in obsessive-compulsive disorder: state or trait deficits?

Authors: Shelley Bannon; Craig J Gonsalvez; Rodney J Croft; Philip M Boyce
Journal: Aust N Z J Psychiatry Date: 2006 Nov-Dec Impact factor: 5.744

Review 6. Applications of functional magnetic resonance imaging in psychiatry.

Authors: Martina T Mitterschiffthaler; Ulrich Ettinger; Mitul A Mehta; David Mataix-Cols; Steve C R Williams
Journal: J Magn Reson Imaging Date: 2006-06 Impact factor: 4.813

Review 7. Practice parameters for the assessment and treatment of children and adolescents with obsessive-compulsive disorder. AACAP.

Authors:
Journal: J Am Acad Child Adolesc Psychiatry Date: 1998-10 Impact factor: 8.829

8. A preliminary study of functional connectivity of medication naïve children with obsessive-compulsive disorder.

Authors: Alexander Mark Weber; Noam Soreni; Michael David Noseworthy
Journal: Prog Neuropsychopharmacol Biol Psychiatry Date: 2014-04-12 Impact factor: 5.067

Review 9. Assessment of obsessive-compulsive disorder: a review.

Authors: Kristen Grabill; Lisa Merlo; Danny Duke; Kelli-Lee Harford; Mary L Keeley; Gary R Geffken; Eric A Storch
Journal: J Anxiety Disord Date: 2007-02-03

10. Ranking Regions, Edges and Classifying Tasks in Functional Brain Graphs by Sub-Graph Entropy.

Authors: Bhaskar Sen; Shu-Hsien Chu; Keshab K Parhi
Journal: Sci Rep Date: 2019-05-20 Impact factor: 4.379
