
Early prediction of cognitive deficits in very preterm infants using functional connectome data in an artificial neural network framework.

Lili He¹, Hailong Li², Scott K Holland³, Weihong Yuan³, Mekibib Altaye⁴, Nehal A Parikh⁵.

Abstract

Investigation of the brain's functional connectome can improve our understanding of how an individual brain's organizational changes influence cognitive function and could result in improved individual risk stratification. Brain connectome studies in adults and older children have shown that abnormal network properties may be useful as discriminative features and have exploited machine learning models for early diagnosis in a variety of neurological conditions. However, analogous studies in neonates are rare and have yielded few significant findings. In this paper, we propose an artificial neural network (ANN) framework for early prediction of cognitive deficits in very preterm infants based on functional connectome data from resting-state fMRI. Specifically, we conducted feature selection via a stacked sparse autoencoder and outcome prediction via a support vector machine (SVM). The proposed ANN model was trained in an unsupervised manner using brain connectome data from 884 subjects in the Autism Brain Imaging Data Exchange database, and the SVM was cross-validated on 28 very preterm infants (born at 23-31 weeks of gestation and without brain injury; scanned at term-equivalent postmenstrual age). Using 90 regions of interest, we found that the ANN model applied to functional connectome data from very premature infants can predict cognitive outcome at 2 years of corrected age with an accuracy of 70.6% and an area under the receiver operating characteristic curve of 0.76. We also noted that several frontal lobe and somatosensory regions contributed significantly to the prediction of cognitive deficits 2 years later. Our work can be considered a proof of concept for utilizing ANN models on functional connectome data to capture the individual variability inherent in the developing brains of preterm infants. The full potential of ANN will be realized and more robust conclusions drawn when applied to much larger neuroimaging datasets, as we plan to do.

Keywords: Artificial neural network; Cognitive deficit; Functional MRI; Stacked sparse autoencoder; Support vector machine; Very preterm infants

Substances：
Oxygen

Year: 2018 PMID： 29876249 PMCID： PMC5987842 DOI： 10.1016/j.nicl.2018.01.032

Source DB: PubMed Journal: Neuroimage Clin ISSN： 2213-1582 Impact factor: 4.881

Introduction

The high risk of neurodevelopmental impairments is a major concern for parents and clinicians caring for premature babies, especially for those born very preterm (Jarjour, 2015). Up to 40% of very preterm infants (i.e. ≤32 weeks gestational age) in the United States are diagnosed with cognitive deficits at 2 years of age (Hamilton et al., 2016). Unfortunately, cognitive impairments cannot be accurately diagnosed until 3 to 5 years of age (Hack et al., 2005; Ment et al., 2003; Spencer-Smith et al., 2015). While recent studies demonstrate the importance of genetic factors in premature birth (Zhang et al., 2017) and outcome, there remains a gap in our knowledge about early identification of infants at high-risk for cognitive deficits. This gap limits our ability to target early interventions (Nordhov et al., 2010; Spittle et al., 2012) to the highest risk infants during periods of optimal neuroplasticity in the first 3 years after birth to enhance their ability to reach their full intellectual potential. The human brain is a highly interactive and organized system that exhibits functional units. Each brain unit is connected to multiple other units. Resting-state functional connectivity MRI (fcMRI) has made possible quantitative mapping of the connections within and between these units. The architecture conveys intrinsic information about the connectivity of the brain, referred to as the brain connectome (Glasser et al., 2016; Sporns, 2013), which has opened a window for observing the human mind (Sporns, 2013; Sporns et al., 2005). Mathematically, a connectome is a graph, representing the brain connectivity (described as a set of edges) between pairs of brain regions of interest (ROI) (described as a set of nodes). The connectome can also be encoded as an adjacency matrix, in which each entry represents the brain connectivity between each pair of ROIs. 
Research supports the notion that cognitive deficits may result from a perturbation of neural connection and communication (Fei et al., 2014). The brain connectome also shows a high degree of individual or inter-subject variability (Finn et al., 2015). Investigation of the brain connectome will improve our understanding of how individual brain organizational changes influence cognitive function, resulting in an improved individual risk stratification. Brain connectome studies in adults and older children have shown that abnormal network properties may be useful as discriminative features for early diagnosis in a variety of neurological conditions. Many of these studies have exploited machine learning models using brain connectome data for such early prediction (Arbabshirani et al., 2013; Barkhof et al., 2014; Fei et al., 2014; Finn et al., 2015; Jie et al., 2014a; Jie et al., 2014b; Khazaee et al., 2015; Khazaee et al., 2016; Prasad et al., 2015; Sacchet et al., 2015; Vanderweyen et al., 2015; Wee et al., 2012; Wee et al., 2016; Zhan et al., 2015; Zhu et al., 2014). This progress has now begun to be extended to the neonatal population (Kawahara et al., 2017; Smyser et al., 2016; Ziv et al., 2013). Brain connectome data are inherently complicated and high-dimensional, which makes it very challenging to effectively extract the intrinsic information embedded in the data. The most popular method is principal component analysis (PCA); however, it is a linear method, and the complex patterns embedded in brain connectome data may not be explained linearly. In addition, it is unclear how many components are needed to reconstruct the data to a reasonable approximation, as many of the components are trivial. On the other hand, significant progress has been made on learning high-level representations of raw data using artificial neural network (ANN) models (Hinton and Salakhutdinov, 2006). 
In this paper, we propose a Stacked Sparse Autoencoder (SSAE) based ANN framework for early prediction of cognitive deficits in very preterm infants based on functional connectome data. Specifically, we build an unsupervised SSAE model using functional connectome data from 884 subjects in the Autism Brain Imaging Data Exchange (ABIDE) database to discover low-dimensional latent representations/features from the original high-dimensional data. Data from 28 very preterm infants are then used to cross-validate a support vector machine (SVM) classifier that predicts cognitive deficit. We hypothesize that our proposed ANN framework analyzing functional brain connectome data at birth can accurately predict cognitive deficits at 2 years corrected age at an individual level in very preterm infants.

Methods

Overview

The proposed ANN framework for early prediction of cognitive deficits consists of three components: 1) construct whole brain functional connectome; 2) implement SSAE to take functional connectome as input and extract its high-level connectome features (these features capture the embedded salient information that is useful for differentiating a single subject); and 3) implement SVM (Arbabshirani et al., 2017; Chang and Lin, 2001; Levman and Takahashi, 2015) classifier to conduct 2-class classification (i.e. high risk of cognitive deficits vs. low risk) using extracted functional brain connectome features. This research design is summarized in Fig. 1.

Fig. 1

Overview of proposed ANN framework for early prediction of cognitive deficits.

Subjects and cognitive assessments

The Nationwide Children's Hospital Institutional Review Board approved this study and written parental informed consent was obtained for every subject. The data for this study are from a cohort of 28 very preterm infants, ≤32 weeks gestational age, cared for in the neonatal intensive care unit at Nationwide Children's Hospital. Infants with known structural congenital central nervous system anomalies, congenital chromosomal anomalies, or congenital cyanotic cardiac defects were excluded. In addition, parents were not approached for consent if their infant remained on persistently high mechanical ventilator support (e.g., peak inspiratory pressure > 30 and/or fraction of inspired oxygen >50%) within the first 28 days after birth. All 28 infants have now reached 2 years corrected age and completed their standardized Bayley Scales of Infant and Toddler Development III test. The Bayley-III normative cognitive scores are on a scale of 50 to 150, with a mean of 100 and standard deviation (SD) of 15. We grouped our cohort using a cut-off of 85 (one SD below the normative mean) into those at high vs. low risk for cognitive deficits (i.e. two classes). A child with a cognitive score of <85 is considered to have moderate to severe deficit and is comparable to a child with a mental developmental index <70 on the Bayley-II (Johnson et al., 2014). The demographic information for these infants is provided in Table 1. We conducted two-sided t-tests (assuming unequal variance) and found that between the high and low risk groups, there were no significant differences in birth weight (p = 0.08), gestational age at birth (p = 0.28) and postmenstrual age at scan (p = 0.34). There was a significant difference in cognitive scores (p < 0.0001) between the two groups.

Table 1

Demographic summary of all very preterm infants.

Group	N	Sex	BW (g)	GA at birth (weeks)	PMA at scan (weeks)	Cognitive score
Low-risk subjects	14	9 M, 5 F	1080 ± 295.4	27.3 ± 2.0	39.6 ± 1.5	92.6 ± 4.2
High-risk subjects	14	5 M, 9 F	878 ± 283.5	26.4 ± 2.2	39.1 ± 0.9	77.4 ± 9.7
All subjects	28	14 M, 14 F	979 ± 302.1	26.8 ± 2.1	39.4 ± 1.3	85.0 ± 10.7

N = Number; F=Female; M = Male; BW=Birth weight; GA = Gestational age; PMA = Postmenstrual age. All ± values are mean ± standard deviation.
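The unequal-variance (Welch) t-test reported above for birth weight can be roughly checked from the summary statistics in Table 1 alone. The sketch below is an illustration rather than the authors' analysis code; the function name `welch_t_from_summary` is hypothetical, and only the t statistic and the Welch-Satterthwaite degrees of freedom are computed, because the p-value would need a t-distribution CDF that the Python standard library does not provide.

```python
import math

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations and sizes."""
    v1 = sd1 ** 2 / n1   # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2   # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Birth weight (g) from Table 1: low-risk vs. high-risk group, 14 infants each.
t_bw, df_bw = welch_t_from_summary(1080, 295.4, 14, 878, 283.5, 14)
print(f"t = {t_bw:.2f}, df = {df_bw:.1f}")
```

With these inputs the statistic is about 1.85 on roughly 26 degrees of freedom, which corresponds to a two-sided p-value between 0.05 and 0.10 and is therefore in line with the reported p = 0.08.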


MRI acquisition

Infants were scanned on a 3T GE HDx scanner equipped with an eight-channel infant head coil (Lammers Medical Technology, Germany). Functional images were collected using a single-shot echo planar imaging sequence sensitized to T2*-weighted blood oxygenation level dependent (BOLD) signal changes. Acquisition parameters were: repetition time TR = 3000 ms, echo time TE = 35 ms, flip angle FA = 90°, resolution 2.8 × 2.8 × 3.0 mm³. A total of 100 frames were collected in 5.2 min. This acquisition time was chosen because it was more clinically feasible without compromising data quality (Van Dijk et al., 2010). Anatomical scans were conducted with a Proton Density/T2-weighted sequence (TR/TE1/TE2 = 11,000/14/185 ms, FA = 90°, resolution 0.35 × 0.35 × 2 mm³). All subjects were scanned during natural sleep without the use of any sedation after being fed and swaddled. A 3T MRI-compatible transport incubator (Nomag 3.0IC, Lammers Medical Technology, Germany) was used for the inpatient scans. MRI noise was minimized using Insta-Puffy Silicone Earplugs (E.A.R. Inc., Boulder, CO) and Natus Mini Muffs (Natus Medical Inc., San Carlos, CA).

Whole-brain functional connectome construction

A four-dimensional fcMRI dataset requires extensive preprocessing before resting-state network analyses can be conducted (Glasser et al., 2013; Smith et al., 2013). We developed a neonatal-optimized pipeline (He and Parikh, 2015) that can be briefly summarized as follows: 1) Reorientation: acquired scans are aligned with the anterior commissure (AC) - posterior commissure (PC) line into a standard image plane; 2) Skull stripping: non-brain parts of the image are removed; 3) Realignment: each time point's frame is aligned to the mean frame, reducing the effects of subject head motion during the acquisition; 4) Normalization: fcMRI frames are aligned to the same subject's high-resolution structural image using rigid body registration, and this structural image is aligned to a neonatal template (Shi et al., 2011) using affine transformation, which places both fcMRI and structural images in the same “standard space” reference coordinate system; 5) Spatial smoothing: an isotropic Gaussian filter with a 6 mm kernel is applied to improve the signal-to-noise ratio and ameliorate the effects of functional misalignments across subjects; 6) Band-pass filtering (0.008 < f < 0.09 Hz): the lowest and highest temporal frequencies are removed from the data; 7) Motion artifact reduction: corrupted time points are detected using motion scrubbing (Power et al., 2012) and this confounding factor is regressed out of the data (Behzadi et al., 2007). The above preprocessing steps are carried out using the FMRIB Software Library (FSL, Oxford University, UK), Statistical Parametric Mapping software (SPM, University College London, UK) and Artifact Detection Tools (ART, MIT, Cambridge, US). Ninety ROIs are defined based on a neonatal automated anatomical labeling (AAL) atlas (Shi et al., 2011). The edges in the functional connectome describe the degree of functional connectivity, defined as the partial correlation between two ROIs. 
This results in a 90 × 90 adjacency matrix symmetric about the diagonal, in which each entry represents the brain connectivity between a pair of ROIs. To compute the partial correlation between two BOLD time signals, the effect of every other time signal is first removed (via regression) and then Pearson's correlation is computed. We then use Fisher's r-to-z transform to convert correlation values into Z-scores in order to prevent bias from being introduced in subsequent steps (Marrelec et al., 2006; Whitfield-Gabrieli and Nieto-Castanon, 2012). This is implemented using the functional connectivity toolbox (CONN) (Marrelec et al., 2006; Whitfield-Gabrieli and Nieto-Castanon, 2012).
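The connectome construction described above can be sketched in a few lines of NumPy. This is only an illustration on synthetic data, not the CONN implementation used in the study. As an assumption, the partial correlations are obtained from the inverse covariance (precision) matrix, which is mathematically equivalent to regressing out all other signals before computing Pearson's correlation. The function names and the random time series are hypothetical.

```python
import numpy as np

def partial_correlation(ts):
    """Partial correlation between the columns of ts (frames x ROIs),
    computed from the precision matrix."""
    cov = np.cov(ts, rowvar=False)
    prec = np.linalg.pinv(cov)        # pseudo-inverse guards against ill-conditioning
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)    # r_ij = -p_ij / sqrt(p_ii * p_jj)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

def connectome_features(ts):
    """Fisher z-transformed partial correlations of the unique edges."""
    pcorr = partial_correlation(ts)
    iu = np.triu_indices(pcorr.shape[0], k=1)     # upper triangle, diagonal excluded
    r = np.clip(pcorr[iu], -0.999999, 0.999999)   # keep arctanh finite
    return np.arctanh(r)                          # Fisher r-to-z transform

rng = np.random.default_rng(0)
ts = rng.standard_normal((100, 90))   # synthetic stand-in: 100 frames, 90 ROIs
features = connectome_features(ts)
print(features.shape)  # (4005,)
```

Because the adjacency matrix is symmetric, only the upper triangle is kept, so 90 ROIs yield 90 × 89 / 2 = 4005 features per subject.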

Stacked sparse autoencoder (SSAE) for high-level connectome feature learning

An autoencoder (AE) is a class of ANN. It aims to learn a better feature representation of high-dimensional input data, one that is sufficiently useful for a classifier to distinguish individual subjects.

SSAE model

An SSAE is a neural network consisting of a stack of multiple sparse AEs (SAEs). We build the SSAE in an unsupervised, sequential fashion, where each SAE is optimized separately to increase the likelihood of finding the global optimum. An SAE consists of a neural net with one input layer, one hidden layer and one output layer (Fig. 2). A sparsity constraint is imposed on the nodes in the hidden layer to reduce overfitting. Nodes in adjacent layers of an SAE are fully connected. The SAE is trained to reproduce the input patterns on the output layer through the intermediate hidden layer, so the numbers of nodes in the input layer and the output layer are equal. We denote by $x = [x_1, \ldots, x_{n-1}, x_n]$ an n-dimensional input vector of the SAE (i.e. original unlabeled connectome data, in this work), by $h = [h_1, \ldots, h_{m-1}, h_m]$ the activations of the m hidden nodes (m < n), and by $\hat{x}$ the reconstructed input. The input x is encoded to h by a deterministic mapping with encoding weights $W$ and bias $b$:

$$h = f(Wx + b),$$

where f is the sigmoid logistic function, $f(z) = 1/(1 + e^{-z})$. Similarly, the latent representation h in the hidden layer is then mapped to the reconstructed input $\hat{x}$ with decoding weights $W'$ and bias $b'$:

$$\hat{x} = f(W'h + b').$$

Fig. 2

Basic architecture of a SAE. The input layer transforms high-dimensional features x to the corresponding representation h, and the hidden layer h can be seen as a new low-dimensional representation of the input data. The output layer is a decoder which can reconstruct an approximation of the input from the hidden representation h.

Initializing with random values, we define an unsupervised optimization problem to iteratively determine the values in $W$, $b$, $W'$, $b'$ that minimize the reconstruction error $L(x, \hat{x})$, a difference between the input x and its reconstruction $\hat{x}$. The cost function can then be modeled as:

$$J(W, b, W', b') = L(x, \hat{x}) + \beta \sum_{j=1}^{m} \mathrm{KL}(\rho \,\|\, \hat{\rho}_j).$$

The first part of the cost function is the reconstruction error, and the second part is the sparsity penalty term. For the sparsity penalty term, we adopted the Kullback-Leibler (KL) divergence $\mathrm{KL}(\rho \,\|\, \hat{\rho}_j)$ (Shin et al., 2013), defined as:

$$\mathrm{KL}(\rho \,\|\, \hat{\rho}_j) = \rho \log\frac{\rho}{\hat{\rho}_j} + (1 - \rho) \log\frac{1 - \rho}{1 - \hat{\rho}_j},$$

where $\hat{\rho}_j$ is the average activation of the hidden node j over the training dataset. The sparsity parameter ρ is a pre-defined small constant, and β is the weight control parameter for the sparsity penalty term. Using the stochastic gradient descent algorithm (Bishop, 1995), the optimal weights $W$, $W'$ and biases $b$, $b'$ can be obtained for a single SAE. In other words, the input layer transforms high-dimensional features x to the corresponding representation h, and the hidden layer h can be seen as a new low-dimensional representation of the input data. The output layer is a decoder which can reconstruct an approximation of the input from the hidden representation h. Considering the amount of available data, we apply a 2-layer SSAE in this work. Stacking more SAEs may learn more complex patterns, but requires more training data. In an SSAE, the nodes in the hidden layer of each SAE are wired to the input nodes of the successive SAE (Bengio, 2009; Bengio and LeCun, 2007). For the first layer of the SSAE, we optimize the first SAE, SAE(1), using the original brain connectome features as input $x^{(1)}$, so that weights $W^{(1)}$ and bias $b^{(1)}$ can be obtained. 
Then, the activation vector of hidden nodes $h^{(1)}$ in the first layer is used as input $x^{(2)}$ to optimize the second SAE, SAE(2), for the second layer of the SSAE, where weights $W^{(2)}$ and bias $b^{(2)}$ can be obtained. Finally, we connect the two SAEs into a 2-layer SSAE (Fig. 3) with the parameters optimized in the two individual SAEs.
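To make the sparse autoencoder cost and the stacking step concrete, the following NumPy sketch evaluates the cost of one SAE and feeds its hidden activations into a second one. It is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the squared-error form of the reconstruction term, the values `rho=0.05` and `beta=1.0`, the toy layer sizes and all function names are assumptions, and no training loop is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_divergence(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat), elementwise."""
    return rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))

def sae_cost(X, W, b, W_dec, b_dec, rho=0.05, beta=1.0):
    """Cost of one sparse autoencoder on data X (samples x features).
    Returns the cost and the hidden activations."""
    H = sigmoid(X @ W.T + b)                # hidden activations h = f(Wx + b)
    X_hat = sigmoid(H @ W_dec.T + b_dec)    # reconstruction x_hat = f(W'h + b')
    recon = 0.5 * np.mean(np.sum((X - X_hat) ** 2, axis=1))  # assumed squared error
    rho_hat = H.mean(axis=0)                # average activation of each hidden node
    return recon + beta * np.sum(kl_divergence(rho, rho_hat)), H

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(50, 20))        # toy data: 50 samples, 20 features
n, m1, m2 = 20, 8, 3                        # toy sizes (the paper uses 4005, 500, 10)

# First SAE: randomly initialized encoder and decoder parameters.
W1, b1 = 0.1 * rng.standard_normal((m1, n)), np.zeros(m1)
W1d, b1d = 0.1 * rng.standard_normal((n, m1)), np.zeros(n)
cost1, H1 = sae_cost(X, W1, b1, W1d, b1d)

# Second SAE: its input is the hidden activation of the first SAE.
W2, b2 = 0.1 * rng.standard_normal((m2, m1)), np.zeros(m2)
W2d, b2d = 0.1 * rng.standard_normal((m1, m2)), np.zeros(m1)
cost2, H2 = sae_cost(H1, W2, b2, W2d, b2d)
print(H2.shape)  # (50, 3)
```

In a full implementation each SAE would be optimized (for example by stochastic gradient descent) before its activations are passed to the next one; after training, only the two encoders are kept and chained to produce the low-dimensional features.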

Fig. 3

The architecture of the 2-layer SSAE. By adjusting the weight $W^{(1)}$, the first SAE projects raw data x (i.e. original brain connectome features) onto primary features $h^{(1)}$. Following this, by adjusting the weight $W^{(2)}$, the primary features are fed into the second SAE to obtain secondary features $h^{(2)}$ (i.e. extracted high-level brain connectome features).

Independent data set for SSAE model optimization

One of the challenges of SSAE model optimization in this study is the relatively small data set (fcMRI in preterm infants is not routinely performed in clinical practice). Adopting a transfer learning concept (Gupta et al., 2013; Pan and Yang, 2010; Raina et al., 2007), we use 884 ABIDE subjects, an independent data set, to train and optimize the SSAE model in an unsupervised fashion. This strategy reduces the risk of model overfitting and also ensures that independent data sets are used for model training and testing, respectively.

Support vector machine (SVM) classifier for outcome prediction

SVM is a supervised classifier based on the concept of decision planes (Cortes and Vapnik, 1995). An SVM classifier with a linear kernel is used for outcome prediction (Cristianini and Shawe-Taylor, 2000). The performance of the SVM is assessed using 10-fold cross-validation. More specifically, the data set is divided into 10 subsets, and the holdout method is repeated 10 times. Each time, one of the 10 subsets is used as the test set and the other 9 subsets are put together to form a training set. Then the average performance across all 10 trials is computed. To reduce variability, we implement a bootstrapping technique (Varian, 2005): 100 rounds of cross-validation are performed using different partitions and the validation results are averaged over the rounds. Classification performance is assessed using the conventional metrics of accuracy, specificity, sensitivity and area under the receiver operating characteristic curve (AUC) (Arbabshirani et al., 2017), and p-values for binomial probabilities. Sensitivity is defined as the percentage of truly high-risk infants who are correctly classified as high risk (true positive rate), whereas specificity is defined as the percentage of truly low-risk infants who are correctly classified as low risk (true negative rate). The p-values represent the probability of observing the reported accuracy (number of correct classification trials) by chance based upon the binomial distribution for the given sample size (Smyser et al., 2016). To select the soft margin parameter C in the linear SVM, we tuned the model using training data via a linear search (i.e., $C = 2^{-10}, 2^{-9}, \ldots, 2^{9}, 2^{10}$). The optimal parameter C of the SVM classifier was selected when AUC was maximal.
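The evaluation metrics and the candidate values of C described above can be illustrated with the standard library alone. This sketch is not the authors' code: the predictions are hypothetical, the function names are made up for illustration, and the binomial p-value is assumed to be a one-sided tail probability (at least the observed number of correct classifications under a chance rate of 0.5).

```python
from math import comb

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity with 1 = high risk, 0 = low risk."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

def binomial_p_value(n_correct, n_total, chance=0.5):
    """Probability of at least n_correct correct classifications by chance."""
    return sum(comb(n_total, k) * chance ** k * (1 - chance) ** (n_total - k)
               for k in range(n_correct, n_total + 1))

# Candidate soft-margin values C = 2^-10, 2^-9, ..., 2^9, 2^10.
C_GRID = [2.0 ** k for k in range(-10, 11)]

# Hypothetical predictions for 28 infants (14 high risk, 14 low risk).
y_true = [1] * 14 + [0] * 14
y_pred = [1] * 10 + [0] * 4 + [0] * 10 + [1] * 4
metrics = classification_metrics(y_true, y_pred)
p_value = binomial_p_value(20, 28)
print(metrics, p_value)
```

In the hypothetical example, 20 of 28 infants are classified correctly, so accuracy, sensitivity and specificity all equal 10/14 ≈ 0.71.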

Identification of discriminative functional connections for prediction of cognition deficit

We adopt a feature ranking approach (Simonyan et al., 2013) designed for deep learning algorithms to unveil which functional connections are learned by our proposed ANN framework to be most predictive of cognitive outcome. Specifically, we calculate the partial derivatives of the SVM output with respect to the connectivity weights from the brain connectome. For the SVM output r, the partial derivative $\partial r / \partial w_{ij}$, where i ≠ j, i ∈ [1, 2, …, 90], j ∈ [1, 2, …, 90], is computed for each brain functional connection $w_{ij}$ between ROIs i and j. A higher absolute value of the partial derivative of a connection indicates a higher level of importance for cognitive function prediction.
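The partial derivatives described above follow from the chain rule once the feature extractor and classifier are fixed. The sketch below assumes a 2-layer sigmoid encoder followed by a linear SVM decision function, with random toy parameters in place of trained ones; the function names and sizes are hypothetical, and the input vector x stands for the vectorized connections $w_{ij}$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def decision_value(x, W1, b1, W2, b2, w_svm, b_svm):
    """Linear SVM decision value r on the second-layer features of x."""
    h1 = sigmoid(W1 @ x + b1)
    h2 = sigmoid(W2 @ h1 + b2)
    return w_svm @ h2 + b_svm

def saliency(x, W1, b1, W2, b2, w_svm):
    """Gradient of r with respect to each input connection (chain rule)."""
    h1 = sigmoid(W1 @ x + b1)
    h2 = sigmoid(W2 @ h1 + b2)
    grad_h1 = W2.T @ (w_svm * h2 * (1 - h2))   # back through the second layer
    return W1.T @ (grad_h1 * h1 * (1 - h1))    # back through the first layer

rng = np.random.default_rng(0)
n, m1, m2 = 12, 5, 3                           # toy sizes, not the paper's
W1, b1 = rng.standard_normal((m1, n)), rng.standard_normal(m1)
W2, b2 = rng.standard_normal((m2, m1)), rng.standard_normal(m2)
w_svm, b_svm = rng.standard_normal(m2), 0.1
x = rng.standard_normal(n)

grad = saliency(x, W1, b1, W2, b2, w_svm)
top = np.argsort(-np.abs(grad))[:4]            # most influential inputs first
print(top)
```

Ranking by the absolute gradient, and averaging the gradients across subjects before ranking, gives one importance score per connection.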

Results

Optimization of SSAE architecture

The optimal architecture (i.e., number of layers and nodes) of an SSAE for a specific classification application depends on several factors, including sample size, feature dimension and distribution (Bengio, 2009; Bengio and LeCun, 2007). We stacked two SAEs in the SSAE, considering the small sample size of our cohort. The number of nodes in the input layer was set to the original dimension of the functional connectome data, i.e. the 4005 unique weights of the undirected brain connectome edges (90 × 89 / 2). The number of nodes in the two hidden layers was optimized via a grid search. Specifically, we tested numbers of nodes in the 1st hidden layer from 100 to 600 in increments of 100, and candidate numbers of nodes in the 2nd hidden layer from 5 to 20 in increments of 5, which were selected to be smaller than our sample size for dimensionality reduction. For each architecture setting, we repeated 10-fold cross-validation 100 times to evaluate prediction performance. Table 2 shows the mean AUC for the various SSAE architectures. Based on the highest mean AUC, we set the numbers of nodes in the first and second hidden layers to 500 and 10, respectively.
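The input dimension and the size of the architecture grid follow directly from the numbers given above; a short sketch (variable names are illustrative only):

```python
from itertools import product

N_ROIS = 90
n_edges = N_ROIS * (N_ROIS - 1) // 2       # unique undirected edges, diagonal excluded

first_layer = list(range(100, 601, 100))   # 100, 200, ..., 600
second_layer = list(range(5, 21, 5))       # 5, 10, 15, 20
architectures = list(product(first_layer, second_layer))

print(n_edges)             # 4005
print(len(architectures))  # 24
```

Each of the 24 candidate architectures is then scored by its mean cross-validated AUC.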

Table 2

A grid search for the optimal number of nodes in SSAE's two hidden layers. Each row stands for the number of nodes in the first hidden layer, and each column indicates the number of nodes in 2nd hidden layer. The highest mean AUC of 0.76 was achieved when the numbers of nodes in the hidden layers were 500 and 10 respectively.

Cognitive deficit prediction

We compared the prediction performance of SVM among several approaches, including using raw functional connectome features (noted as Raw + SVM), high-level connectome features extracted via principal component analysis (PCA; noted as PCAₖ + SVM, where k is the number of top components) (Zhou et al., 2009) and via our proposed SSAE approach (noted as SSAE + SVM). In addition, as a baseline, we calculated the prediction accuracy of perinatal clinical variables (including sex, birth weight, gestational age at birth and postmenstrual age at MRI scan, noted as Clinical + SVM). As shown in Table 3, the limited clinical factors we considered were not able to provide discriminative information for risk stratification, as their accuracy was only moderately higher than 50%. As expected, the prediction using raw connectome features performed worse than random guessing on a two-class classification problem. This poor prediction was caused by the “curse of dimensionality”, because the feature size was far greater than the sample size. To overcome this problem, we proposed to use SSAE to reduce the dimensionality of functional connectome features and compared it with the conventional PCA approach. Since it is not known a priori how many PCA components are required to reconstruct the data to a reasonable approximation, we conducted PCA with different numbers of top components. Overall, the PCA method performed poorly at capturing the salient information that was useful for differentiating a single subject. Using SSAE-extracted high-level brain connectome features, we accurately classified 70.6% of very preterm infants at high risk of cognitive deficits with 70.1% sensitivity, 71.2% specificity and 0.76 AUC, all at p < 0.00001. Our proposed ANN framework improved prediction accuracy by over 12% with respect to the PCA-based model. 
The improvements (as compared with PCA₁₀, which produced the best AUC among the PCA models) in sensitivity, specificity and AUC were 10.1%, 18.3% and 0.11, respectively. We ran each approach in the Matlab environment (MathWorks, R2016a) on a workstation with an Intel Xeon CPU E5-1620 and 128 GB RAM. No GPU was used for training acceleration. The execution time of our proposed approach was 11 min, which was about 5 min longer than the other approaches (Table 3). The training of an ANN model involves iteratively optimizing many parameters and thus requires more time than PCA modeling.

Table 3

Performance of different approaches for prediction of cognitive deficits. As a baseline, we calculated the prediction accuracy of perinatal clinical variables (including sex, birth weight, gestational age at birth and postmenstrual age at MRI Scan, noted as Clinical + SVM).

	Accuracy (%)	Sensitivity (%)	Specificity (%)	AUC	Execution time (mins)
Clinical + SVM	60.0 ± 3.9	62.9 ± 9.3	57.1 ± 5.1	0.63 ± 0.03	4.1
Raw + SVM	48.6 ± 2.0	35.7 ± 7.1	61.4 ± 6.4	0.51 ± 0.02	6.5
PCA₃ + SVM	59.3 ± 6.0	59.1 ± 8.9	59.4 ± 9.2	0.58 ± 0.03	5.3
PCA₅ + SVM	58.1 ± 4.2	49.3 ± 6.0	67.0 ± 8.2	0.59 ± 0.03	6.3
PCA₁₀ + SVM	56.4 ± 6.4	60.0 ± 8.1	52.9 ± 10.8	0.65 ± 0.03	6.2
PCA₁₅ + SVM	52.1 ± 5.4	46.4 ± 7.8	57.7 ± 10.1	0.52 ± 0.03	5.7
Proposed SSAE + SVM	70.6 ± 4.9	70.1 ± 8.2	71.2 ± 6.2	0.76 ± 0.03	10.8

All ± values are mean ± standard deviation.

Most discriminative brain functional connections

To visualize which brain connections and regions are most predictive of cognitive outcome, we calculated averaged partial derivatives of cognitive outcome for all 4005 functional connections; connections with larger partial derivative values are more important in the prediction. We plotted the top 40 most predictive connections as line segments connecting the centroids of brain atlas regions in BrainNet Viewer (Xia et al., 2013), as shown in Fig. 4. While several brain functional connections from the frontal lobe in this list were predictive of cognitive deficits, several somatosensory regions were also identified. For example, connectivity involving bilateral postcentral gyri and thalamic nuclei appeared to be the most prominent for prediction of cognitive deficits. When we eliminated the top 40 most discriminative connections, the classification accuracy, sensitivity, specificity and AUC decreased by about 10%, 14%, 5% and 0.12, respectively.

Fig. 4

Top 40 most discriminative brain functional connections learned by the proposed ANN model. The width of each segment (functional connection) indicates the predictive strength (i.e., more predictive regions are wider). The size of each node/region indicates the importance of that node/region in the prediction (i.e., more important regions are larger).

Discussion

Several systematic reviews have highlighted growing interest in studies that are developing neuroimaging-based single subject prognostic models to discriminate adult patients with brain disorders from healthy controls (Calhoun and Arbabshirani, 2012; Dai et al., 2012; Dazzan, 2014; Demirci et al., 2008; Dyrba et al., 2015; Kambeitz et al., 2015; Kloppel et al., 2012; Levman and Takahashi, 2015; Retico et al., 2014; Veronese et al., 2013; Wee et al., 2012; Wolfers et al., 2015; Zarogianni et al., 2013). In the surveyed studies, the most common features are volume and cortical thickness from anatomical MRI, functional connectivity from fMRI data, and apparent diffusion coefficient from dMRI data. In parallel, brain connectome studies in adults and older children have shown that abnormal network properties may be useful as discriminative features for early diagnosis in a variety of neurological conditions (Arbabshirani et al., 2013; Fei et al., 2014; Jie et al., 2014a; Jie et al., 2014b; Khazaee et al., 2015; Khazaee et al., 2016; Prasad et al., 2015; Sacchet et al., 2015; Vanderweyen et al., 2015; Wee et al., 2012; Wee et al., 2016; Zhan et al., 2015; Zhu et al., 2014). Although still very limited, this progress has now begun to be extended to the preterm population, especially with regard to neonatal encephalopathy (Ziv et al., 2013), brain maturity prediction (Smyser et al., 2016), and the prediction of motor and cognitive deficits (Kawahara et al., 2017), by analyzing either the structural or functional connectome. In this work, we developed an ANN framework to analyze the brain functional connectome using a multi-layer ANN to improve prediction of cognitive deficits in individual very preterm infants soon after birth. We used functional connectomes and identified the most discriminative networks and connections that presumably support cognitive function. 
Further development of this line of research could facilitate early risk stratification following neonatal intensive care unit discharge for early intervention and novel neuroprotective therapies during critical periods of brain development. We found several somatosensory regions, including multiple connections to bilateral regions of the postcentral gyrus, thalamus, superior temporal gyrus, supramarginal gyrus and paracentral lobule, that significantly contributed to prediction of cognitive deficits on the Bayley-III at 2 years corrected age. We previously reported that somatosensory and subcortical gray matter networks exhibit the strongest inter-hemispheric connectivity, even as early as 32 weeks postmenstrual age in very preterm infants (He and Parikh, 2016). In comparison, the connectivity strength of long-range networks such as the executive function, default mode, and frontoparietal networks is considerably weaker during the first 6 months after very preterm birth. Furthermore, the somatosensory and subcortical network connectivity significantly increases between 32 and 52 weeks postmenstrual age (He and Parikh, 2016). We also observed several connections to the superior temporal gyrus (bilateral) and Heschl's gyrus that are involved in auditory processing. These findings suggest that sensory systems have a strong influence on learning and cognition during early infancy. Both the somatosensory cortex and supramarginal gyrus are thought to be part of the mirror neuron system. Infants likely learn a great deal through observation of others, anticipating and mirroring their activities. Mirror neurons are thought to play a critical role in action understanding, anticipation, imitation, and social behavior (Acharya and Shukla, 2012). Learning through observation and imitation of others plays a key role in developing cognitive functions for motor learning and goal prediction (Meltzoff et al., 2009). 
The few identified connections to motor regions, including the precentral gyrus, putamen, and globus pallidus, likely support these learning behaviors. Connectivity to several well-established regions that support cognitive/executive functions in children/adults, such as the orbitofrontal cortex, superior frontal gyrus, and middle temporal gyrus, was also highlighted by our ANN algorithm within the top 40 most discriminative connections as predictive of cognitive deficits. Our findings highlight some of the key functional brain regions that are involved in cognitive development in preterm infants and further suggest that the ANN's process of functional network selection for prediction is mostly grounded in well-established brain structure-function relationships. The biggest concern for our proposed connectome-based ANN prediction is that the number of subjects is markedly smaller than the number of brain connectome features. To mitigate this issue and to prevent model overfitting, we 1) employed transfer learning techniques to use an independent large ABIDE dataset for SSAE optimization; 2) conducted feature dimensionality reduction via SSAE; and 3) set the architecture of SSAE to be “shallow” (two layers). With the increasing availability of connectome datasets from premature infants with subsequent cognitive outcome measures, multi-layer ANN is expected to improve the modeling fidelity and the prediction performance in the same way as it has revolutionized other fields (e.g., natural image classification and retrieval and natural language processing) (LeCun et al., 2015). The current study has several limitations. First, we calculated the accuracy of only a handful of clinical variables for prediction. We recognize that this baseline performance could be improved by including other known perinatal clinical variables in the model. 
Examples include maternal (e.g., chorioamnionitis), neonatal (e.g., medical morbidities and therapies), and social/environmental (e.g., socioeconomic status) factors. In future analyses, we plan to collect and incorporate such variables in the proposed ANN model to further improve prediction performance. Second, the current study only predicted cognitive deficits as a categorical label assigned to each subject, using classification techniques. We are also interested in designing a pattern regression model to estimate cognitive scores on a continuous scale. Third, we defined ROIs based on an anatomical labeling atlas in this work. We are aware that functional connectivity estimation can be affected by within-ROI signal heterogeneity. We are also aware of a recent publication (Shi et al., 2017) that derived a set of anatomically constrained, infant-specific functional brain parcellations using functional connectivity-based clustering. The results from Shi et al. revealed significantly higher levels of signal homogeneity within the newly defined functional parcellations compared with other schemes. To date, no studies have compared the use of automated anatomical labeling with this newly developed functional atlas. We plan to employ this functional atlas in future work and anticipate improved estimation of ROI-based functional connectivity and, as a result, higher classification accuracy. Finally, due to the small sample size, our work can only be considered a proof of concept for utilizing ANN models on connectome data to capture the individual variability inherent in the developing brains of neonates. The full potential of ANN will only be achieved, and more reliable conclusions drawn, when it is applied to much larger neuroimaging datasets, work that is currently under way.
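To make the imbalance between features and subjects concrete, the following is a minimal sketch, not the authors' pipeline. It assumes that each connectome feature is the Pearson correlation between the mean time series of two ROIs and that only unique ROI pairs are kept; neither choice is confirmed in this section. The 90 ROIs and 28 subjects come from the paper, while `N_TIMEPOINTS`, the random time series, and the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def connectome_features(roi_series):
    """Flatten the upper triangle of the ROI-by-ROI correlation matrix."""
    n_roi = len(roi_series)
    return [
        pearson(roi_series[i], roi_series[j])
        for i in range(n_roi)
        for j in range(i + 1, n_roi)
    ]


N_ROI = 90          # number of ROIs reported in the paper
N_SUBJECTS = 28     # very preterm infants used for cross-validation
N_TIMEPOINTS = 100  # hypothetical scan length, for illustration only

# Synthetic ROI time series standing in for one subject's resting-state data.
rng = random.Random(0)
series = [[rng.gauss(0.0, 1.0) for _ in range(N_TIMEPOINTS)] for _ in range(N_ROI)]

features = connectome_features(series)
n_features = len(features)
print(n_features)                       # 90 * 89 / 2 = 4005
print(round(n_features / N_SUBJECTS))   # about 143 features per subject
```

Under these assumptions, each subject contributes 4005 features, roughly 143 times the number of subjects, which illustrates why dimensionality reduction precedes classification.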
In summary, we demonstrated that an ANN model applied to functional connectome data alone from very premature neonates can predict cognitive outcome at 2 years of corrected age with an accuracy of 70.6% and an AUC of 0.76. Extending the ANN approach to structural connectome data based on diffusion tensor and anatomical MRI, and augmenting the final classifier with clinical data in the same model, is likely to improve performance considerably, as shown in previous studies. This approach defines a path to precise prediction of risk for poor outcomes in infants born prematurely, which will provide critical data for guiding early intervention. As outcome prediction improves with a larger dataset and expanded model, it will also be possible to define more precisely the specific brain regions and connections that are the most important determinants of cognitive outcome. Relating these functional and structural connections to specific genes that are now being identified as factors in premature birth will be a critical step toward fully understanding premature birth and minimizing its impact on the population.
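As a reminder of how the two reported metrics are defined, the sketch below computes accuracy and AUC from classifier outputs. The labels and decision scores are made up for illustration and are not the study's data; the function names and the 0.5 decision threshold are likewise assumptions. AUC is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of subjects whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = cognitive deficit) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.6, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

print(accuracy(y_true, y_pred))  # 0.625
print(roc_auc(y_true, scores))   # 0.875
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two numbers are reported together.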

Conclusions

In this study, we 1) constructed brain functional connectomes using neonatal-optimized image processing and analysis methods; 2) explicated the brain connectome using SSAE to capture the embedded salient information that is useful for differentiating a single subject; and 3) accurately predicted cognitive deficits/function in individual very preterm infants soon after birth using SVM. Our study shows that functional brain connectome data are useful as prognostic biomarkers. It also provides a proof of concept for using ANN on connectome data to capture individual variability. Our study holds promise as a means of characterizing brain connectome disturbances before the onset of significant overt cognitive differences. A larger study is needed to validate our approach.
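For readers unfamiliar with the SSAE step, the objective commonly used to train a single sparse autoencoder layer is shown below. This is the standard textbook formulation and is given only as an assumption about the general form; the exact loss, hyperparameters, and notation used by the authors are not stated in this section. Here $x_i$ is the connectome feature vector of subject $i$, $\hat{x}_i$ its reconstruction, $h_j(x_i)$ the activation of hidden unit $j$, $W$ the layer weights, and $\lambda$, $\beta$, $\rho$ are hypothetical weight-decay, sparsity-weight, and target-sparsity parameters.

```latex
J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left\lVert \hat{x}_i - x_i \right\rVert^2
        + \frac{\lambda}{2} \left\lVert W \right\rVert_F^2
        + \beta \sum_{j=1}^{k} \mathrm{KL}\!\left(\rho \,\Vert\, \hat{\rho}_j\right),
\qquad
\hat{\rho}_j = \frac{1}{m} \sum_{i=1}^{m} h_j(x_i),

\mathrm{KL}\!\left(\rho \,\Vert\, \hat{\rho}_j\right)
  = \rho \log \frac{\rho}{\hat{\rho}_j}
  + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_j}.
```

The first term rewards faithful reconstruction, the second discourages large weights, and the third pushes the average activation of each of the $k$ hidden units toward the small target $\rho$, so that only a few units respond to any one subject. Stacking means that the hidden activations of one trained layer become the input to the next.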

16 in total

1. Early brain abnormalities in infants born very preterm predict under-reactive temperament.

Authors: Leanne Tamm; Meera Patel; James Peugh; Beth M Kline-Fath; Nehal A Parikh
Journal: Early Hum Dev Date: 2020-03-09 Impact factor: 2.079

2. A Multichannel Deep Neural Network Model Analyzing Multiscale Functional Brain Connectome Data for Attention Deficit Hyperactivity Disorder Detection.

Authors: Ming Chen; Hailong Li; Jinghua Wang; Jonathan R Dillman; Nehal A Parikh; Lili He
Journal: Radiol Artif Intell Date: 2019-12-11

3. A self-training deep neural network for early prediction of cognitive deficits in very preterm infants using brain functional connectome data.

Authors: Redha Ali; Hailong Li; Jonathan R Dillman; Mekibib Altaye; Hui Wang; Nehal A Parikh; Lili He
Journal: Pediatr Radiol Date: 2022-09-22

4. Artificial Intelligence in NICU and PICU: A Need for Ecological Validity, Accountability, and Human Factors.

Authors: Avishek Choudhury; Estefania Urena
Journal: Healthcare (Basel) Date: 2022-05-21

5. ConCeptCNN: A novel multi-filter convolutional neural network for the prediction of neurodevelopmental disorders using brain connectome.

Authors: Ming Chen; Hailong Li; Howard Fan; Jonathan R Dillman; Hui Wang; Mekibib Altaye; Bin Zhang; Nehal A Parikh; Lili He
Journal: Med Phys Date: 2022-03-14 Impact factor: 4.506

Review 6. Machine Learning in Neuroimaging: A New Approach to Understand Acupuncture for Neuroplasticity.

Authors: Tao Yin; Peihong Ma; Zilei Tian; Kunnan Xie; Zhaoxuan He; Ruirui Sun; Fang Zeng
Journal: Neural Plast Date: 2020-08-24 Impact factor: 3.599

7. Objective and Automated Detection of Diffuse White Matter Abnormality in Preterm Infants Using Deep Convolutional Neural Networks.

Authors: Hailong Li; Nehal A Parikh; Jinghua Wang; Stephanie Merhar; Ming Chen; Milan Parikh; Scott Holland; Lili He
Journal: Front Neurosci Date: 2019-06-18 Impact factor: 4.677