Maria Jalbrzikowski1, Fuchen Liu2, William Foran1, Lambertus Klei1, Finnegan J Calabro1,3, Kathryn Roeder2,4, Bernie Devlin1, Beatriz Luna5,6,7. 1. University of Pittsburgh, Pittsburgh, Pennsylvania, USA. 2. Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA. 3. Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania, USA. 4. Department of Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA. 5. Department of Psychiatry, University of Pittsburgh, Pittsburgh, Pennsylvania, USA. 6. Department of Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA. 7. Department of Pediatrics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Abstract
Pioneering studies have shown that individual correlation measures from resting-state functional magnetic resonance imaging studies can identify another scan from that same individual. This method is known as "connectotyping" or functional connectome "fingerprinting." We analyzed a unique dataset of individuals aged 12-30 years (N = 140) who had two distinct resting-state scans on the same day and again 12-18 months later to assess the sensitivity and specificity of fingerprinting accuracy across different time scales (same day, ~1.5 years apart) and developmental periods (youths, adults). Sensitivity and specificity for identifying one's own scan were high (average AUC = 0.94), although they were significantly higher on the same day (average AUC = 0.97) than 1.5 years later (average AUC = 0.91). Accuracy in youths (average AUC = 0.93) was not significantly different from that in adults (average AUC = 0.96). Multiple statistical methods revealed that select connections from the Frontoparietal, Default, and Dorsal Attention networks enhanced the ability to identify an individual. Identification of these features generalized across datasets and improved fingerprinting accuracy in a longitudinal replication dataset (N = 208). These results provide a framework for understanding the sensitivity and specificity of fingerprinting accuracy in adolescents and adults at multiple time scales. Importantly, distinct features of one's "fingerprint" contribute to one's uniqueness, suggesting that cognitive and default networks play a primary role in the individualization of one's connectome.
Accumulating evidence from resting‐state functional magnetic resonance imaging (rsfMRI) indicates that brain network architecture is highly individualized (Gordon et al., 2017; Gratton et al., 2018; Laumann et al., 2015; Laumann et al., 2015). Improving our understanding of individual‐level brain activity is leading to a mechanistic understanding of how neural activity contributes to individual differences. Furthermore, using individual‐level neuroimaging markers to reflect indicators of pathology could, in the future, significantly improve our ability to make informed clinical decisions. A first step toward achieving this “personalized neuroscience” is to understand normative variation in individual‐level features of brain activity, particularly through a developmental period when psychiatric illness typically emerges.
In participants with high‐quality, densely sampled neuroimaging information, within‐subject variability explained approximately one third of the variance in the data and is similar in magnitude to the explanatory power of the average canonical network structure (Gratton et al., 2018; Power et al., 2011; Yeo et al., 2011). Using alternative methods, others have obtained similar estimates of intra‐ and inter‐subject variation in rsfMRI data (Betzel et al., 2019; Mueller et al., 2013). While these studies have produced important insights, such large amounts of neuroimaging data (~5–84 hr of data per individual; see Gratton et al., 2019, for a review) may be infeasible for clinical samples.
In a related line of research, several studies show that individual patterns from one scan can identify another scan from that same individual at a high level of accuracy (Finn et al., 2015; Horien, Shen, Scheinost, & Constable, 2019; Miranda‐Dominguez et al., 2014). This method is known as “connectotyping” (Miranda‐Dominguez et al., 2014) or functional connectome “fingerprinting” (Finn et al., 2015). 
This analytic framework is effective with smaller amounts of rsfMRI data (3–15 min). Understanding the individual‐level network measurement characteristics in a normative sample could provide a foundation for later parsing out meaningful differences in behavior and/or pathology.
Here, we aimed to build on initial findings indicating individualized brain functional architectonics (Finn et al., 2015; Miranda‐Dominguez et al., 2014; Waller et al., 2017; Xu et al., 2016) to characterize how and which brain patterns are unique to an individual and to what extent this pattern is stable or changes over time. First, we assessed the sensitivity of fingerprinting measures, that is, the likelihood of obtaining true positives, and the specificity of these measures, that is, the probability of distinguishing true negatives (Florkowski, 2008; Youngstrom, 2014). To determine the specificity and sensitivity of fingerprinting, we applied a classification procedure. Scans from the same individual were viewed as positive pairs while all the others were considered negative pairs. The area under the receiver operating characteristic (ROC) curve (AUC) was used as a performance measure to summarize the sensitivity (true positive rate) and specificity (1 − false positive rate) of fingerprinting across thresholds and time scales. To measure group‐level influences, we first identified predictive edges from a discovery sample and then examined to what extent these edges improved fingerprinting in a replication sample.
The majority of psychiatric disorders emerge during adolescence (Paus, Keshavan, & Giedd, 2008), a period of remarkable neuroplasticity and change (Larsen & Luna, 2018; Luna, Marek, Larsen, Tervo‐Clemmens, & Chahal, 2015; Murty, Calabro, & Luna, 2016). Thus, it is important to establish individualized brain markers of fingerprinting accuracy in this period and contrast them to those of adults. 
The extent to which there are age‐associated differences in fingerprinting measures is an open question, given inconsistent findings in the literature (Demeter et al., 2019; Horien et al., 2019; Kaufmann et al., 2017). One study used a combination of resting‐state and task‐based MRI scans to assess fingerprinting accuracy from same day scans and found that fingerprinting accuracy significantly improved from late childhood through adulthood (Kaufmann et al., 2017). However, others found that fingerprinting accuracy is not affected by age when scans were months to years apart (Horien et al., 2019); furthermore, fingerprinting accuracy in pediatric and adult samples is quite similar within a 5–18 month time frame (Demeter et al., 2019). Insofar as we are aware, there has not yet been a direct comparison of scans completed on the same day versus those completed much later (i.e., a year), nor an assessment of the extent to which this comparison changes or remains the same across adolescence. It is possible that the neuroplasticity observed in resting state scans during adolescence (Calabro, Murty, Jalbrzikowski, Tervo‐Clemmens, & Luna, 2019; Jalbrzikowski et al., 2017) and/or motion artifact known to be more predominant in youth (Power et al., 2011; Satterthwaite et al., 2012) reduces the ability to accurately identify an individual's scan. Alternatively, “functional fingerprinting” of an individual's resting state scan could be robust to these changes.
In our discovery sample, we leveraged a unique, two‐time point data set that had two resting state scans from an individual conducted on the same day (Visit 1, V1), and two resting state scans from the same individual collected on the same day 12–18 months later (Visit 2, V2; henceforth 1.5 years). We then assessed the level of sensitivity and specificity of fingerprinting accuracy and compared whether it is as stable for the same day as it is 1.5 years later. 
We also determined if sensitivity and specificity at these different time scales were similar in youths and adults. We used multiple statistical methods to determine connections that are “predictive” of individuals' scans, reflecting one's uniqueness. Finally, we explored how these edges performed in a replication sample.
Participants
The final discovery sample consisted of 140 participants (1–2 visits, mean time between visits: 18 months, range of time between visits: 17–25 months). To test the generalizability of the predictive edges identified in our discovery sample, we tested the extent to which the previously identified features from each method improved fingerprinting accuracy in a replication sample with longitudinal data (N = 208, 1–3 visits, mean time in between visits: 20 months, range of time between visits: 12–50 months).
All participants were recruited from the greater Pittsburgh metro area. Participants and their first‐degree relatives did not have a psychiatric disorder, as determined by phone screen and a clinical questionnaire. Exclusion criteria for all participants included any drug use within the last month, history of alcohol abuse, medical illness affecting the central nervous system function, IQ lower than 80, a first‐degree relative with a major psychiatric disorder, or any MRI contraindications. There are previous publications using rsfMRI data from individuals within the replication sample that address separate questions (Jalbrzikowski et al., 2019; Jalbrzikowski, Murty, Tervo‐Clemmens, Foran, & Luna, 2019; Marek, Hwang, Foran, Hallquist, & Luna, 2015).
Demographic information for both samples is reported in Table 1. Detailed inclusion/exclusion criteria are reported in the Supplemental Material.
TABLE 1
Participant information for discovery and replication samples
Discovery sample
| Visit | Pre‐task: N | Pre‐task: M/F | Pre‐task: Mean age in years (SD) | Pre‐task: Age range (years) | Pre‐task: Mean FD in mm (SD) | Post‐task: N | Post‐task: M/F | Post‐task: Mean age in years (SD) | Post‐task: Age range (years) | Post‐task: Mean FD in mm (SD) |
| Visit 1 | 140 | 67/73 | 19.7 (5.0) | 12.0–31.0 | 0.24 (0.1) | 126 | 57/69 | 20.1 (4.9) | 12.0–31.0 | 0.25 (0.1) |
| Visit 2 | 93 | 48/45 | 22.2 (5.0) | 13.5–32.6 | 0.22 (0.1) | 86 | 45/41 | 22.3 (4.9) | 13.5–32.6 | 0.23 (0.1) |
Note: Detailed exclusion criteria are provided in Figure S1.
MR data acquisition: Discovery sample
Data were acquired using a Siemens 3 Tesla mMR Biograph with a 12‐channel head coil. Subjects' heads were immobilized using pillows placed inside the head coil, and subjects were fitted with earbuds for auditory feedback to minimize scanner noise. For each rsfMRI run, we collected 8 min of resting‐state data, eyes open. Resting state data were collected using an echo‐planar sequence sensitive to BOLD contrast (T2*). rsfMRI parameters were Repetition Time/Echo Time (TR/TE) = 1500/30.0 ms; flip angle = 50°; voxel size = 2.3 × 2.3 × 2.3 mm. Structural images were acquired using a T1‐weighted magnetization‐prepared rapid gradient‐echo (MPRAGE) sequence (TR/TE = 2300/2.98 ms; flip angle = 9°; voxel size = 1.0 × 1.0 × 1.0 mm).
Participants completed a unique two‐visit scan protocol. In the first visit, individuals participated in an MRI protocol (Visit 1, V1) that included two rsfMRI runs (Pre‐Task, Post‐Task), with an fMRI reward learning task (~40 min) conducted between these two runs. Approximately 1.5 years later, the same individuals returned and completed an identical MRI protocol (Visit 2, V2), which also included two rsfMRI runs (Pre‐Task, Post‐Task) separated by the same fMRI task. A visual depiction of the scan protocol, along with the respective names given to each run or scan, is presented in Figure 1.
FIGURE 1
Visualization of protocol set‐up. The colors refer to comparisons that are made throughout the manuscript (oranges: same day comparisons, blues: 1.5 year comparisons)
MR data acquisition: Replication sample
Scan parameters for the replication sample are detailed in Supplementary text.
rsfMRI processing
First, we performed simultaneous slice‐timing and motion correction of all functional images. Second, we implemented wavelet despiking to remove nonstationary events in the fMRI time series (Patel & Bullmore, 2015). Third, functional images were warped into MNI standard space using a series of affine and nonlinear transforms. After normalization, functional images were spatially smoothed using a 5‐mm full width at half maximum Gaussian kernel. ICA‐AROMA was then implemented to remove any remaining motion artifacts (Pruim et al., 2015; Pruim, Mennes, Buitelaar, & Beckmann, 2015). Finally, to control nuisance‐related variability (Hallquist, Hwang, & Luna, 2013), we conducted simultaneous multiple regression of nuisance variables and bandpass filtering at 0.009 Hz < f < 0.08 Hz. Nuisance regressors were nonbrain tissue (NBT) signal, average white matter signal, average ventricular signal, six head realignment parameters obtained by rigid body head motion correction, and the derivatives of these measures. NBT, average white matter, and average ventricular signal nuisance regressors were extracted using MNI template tissue probability masks (>95% white matter, >98% cerebrospinal fluid) (Fonov, Evans, McKinstry, & Almli, 2009).
For all subjects, we calculated a quality control measure with respect to head motion, namely volume‐to‐volume framewise displacement (FD). Consistent with recent developmental cognitive neuroscience publications (Bathelt, Johnson, Zhang, & Astle, 2019; Calabro et al., 2019; Hafeman et al., 2019; Li et al., 2019), subjects were removed from rsfMRI analyses if the average framewise displacement across the run was >0.5 mm (N = 5).
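The motion exclusion rule above can be illustrated with a minimal sketch. It assumes the commonly used FD definition (sum of absolute volume-to-volume changes in the six realignment parameters, with rotations converted to millimeters on a 50-mm sphere); the sphere radius, the function names, and the choice to count the first volume as zero displacement are assumptions for illustration, not details taken from the processing pipeline used here.

```python
HEAD_RADIUS_MM = 50.0   # assumed sphere radius for converting rotations (radians) to mm
FD_EXCLUSION_MM = 0.5   # run-level mean FD cutoff stated in the text


def framewise_displacement(params):
    """Return FD per volume.

    params: one [tx, ty, tz, rx, ry, rz] row per volume, with translations
    in mm and rotations in radians. The first volume has FD = 0 by convention.
    """
    fd = [0.0]
    for prev, curr in zip(params, params[1:]):
        translation = sum(abs(c - p) for p, c in zip(prev[:3], curr[:3]))
        rotation = sum(abs(c - p) * HEAD_RADIUS_MM
                       for p, c in zip(prev[3:], curr[3:]))
        fd.append(translation + rotation)
    return fd


def exclude_run(params):
    """True when the mean FD across the run exceeds the 0.5 mm cutoff."""
    fd = framewise_displacement(params)
    return sum(fd) / len(fd) > FD_EXCLUSION_MM
```

In this sketch, a run whose realignment parameters never change has a mean FD of zero and is retained, whereas a run with repeated 1-mm shifts between volumes is flagged for exclusion.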
Functional network parcellation
We applied a previously‐defined, functional connectome parcellation of 333 functional regions of interest (ROIs) across cortical structures (Gordon et al., 2016) to each participant's rsfMRI data (Figure 2a). This parcellation consists of 13 reliable rsfMRI networks, many of which have been identified in other studies, including the Frontoparietal, Default, and Visual networks (Glasser et al., 2016; Power et al., 2011; Shen, Tokoglu, Papademetris, & Constable, 2013). See Table S1 for a list of 13 networks and details about them (for each network: number of nodes, number of within‐connectivity edges, and number of between‐connectivity edges).
FIGURE 2
(a) After resting‐state fMRI data were processed, we extracted the time series from an established parcellation and (b) calculated a correlation matrix for each individual and their respective scan visit. (c) For each scan at each visit, we stacked a vector from the upper diagonal of the correlation matrix. Each stacked vector represents a scan from one person. The stacked vectors could be a separate scan from the same individual or a separate scan from a different individual. We computed correlations between each vector for all possible pairs and nonpairs. (d) By varying the threshold of the correlation values to determine what was a true or false positive, we developed ROC curves for each comparison, with a total of four comparisons: same day visit 1, same day visit 2, 1.5 years apart pre‐task, and 1.5 years apart post‐task. We then used DeLong's method to compare the ROC curves. (e) For each of the four comparisons, we compared the ROC curves of youths versus adults
For each participant, we computed Pearson's correlation of each ROI's time series with that of every other ROI, producing a 333 × 333 correlation matrix (Figure 2b). The upper diagonal of the correlation matrix for each individual was stacked into a vector (55,278 edges), and each vector was normalized to a mean of zero and variance of one (Figure 2c). We performed this procedure for each subject's rsfMRI run, resulting in four normalized vectors for the majority of participants. The correlation between any two of these normalized vectors, $x$ and $y$, was their dot product scaled by the number of edges:
$$r(x, y) = \frac{1}{K} \sum_{k=1}^{K} x_k y_k$$
where $K$ is the total number of edges and $k$ is an individual edge.
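The vectorization and scan-to-scan correlation described above can be sketched as follows. This is a minimal, standard-library illustration rather than the analysis code: the function names are hypothetical, and it assumes each vector is z-scored with the population variance, in which case the dot product divided by the number of edges equals the Pearson correlation between the two vectors.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length time series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def n_edges(n_roi):
    """Number of unique ROI pairs in the upper triangle (no diagonal)."""
    return n_roi * (n_roi - 1) // 2


def connectome_vector(timeseries):
    """Stack the upper triangle of the ROI-by-ROI correlation matrix and z-score it.

    timeseries: one list of samples per ROI.
    """
    n_roi = len(timeseries)
    edges = [pearson(timeseries[i], timeseries[j])
             for i in range(n_roi) for j in range(i + 1, n_roi)]
    mean = sum(edges) / len(edges)
    sd = math.sqrt(sum((e - mean) ** 2 for e in edges) / len(edges))
    return [(e - mean) / sd for e in edges]


def scan_similarity(x, y):
    """Correlation between two normalized edge vectors (dot product / K)."""
    return sum(a * b for a, b in zip(x, y)) / len(x)
```

With 333 ROIs, `n_edges(333)` gives the 55,278 edges reported above, and the similarity of any normalized vector with itself is 1.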
Identification accuracy
Next, we sought a classifier to identify resting state fMRI‐measured connectomes that “match”; ideally these would be connectomes from the same subject. To do so, we first computed the correlation between the normalized vectors for all possible pairs of subjects, as described above (Figure 2c). The classifier seeks a threshold t for these correlation values that yields a high rate of true positive identification, namely connectomes from the same subject, while minimizing the number of false positive identifications, that is, connectomes from different subjects labeled as coming from the same subject. We varied t over the interval from zero to one to create a receiver operating characteristic (ROC) curve. We then estimated the area under the curve (AUC) to determine the accuracy of the classifier (Figure 2d). The t that maximized the difference between the true positive rate and the false positive rate (TPR − FPR) was chosen for reporting. ROC curves were generated for the entire sample for (a) same day identification accuracy (Pre‐Task vs. Post‐Task) and (b) identification accuracy 1.5 years apart (V1 vs. V2). To compare identification accuracy for same day versus 1.5 years later, we compared ROC curves to each other using DeLong's test for two ROC curves (DeLong, DeLong, & Clarke‐Pearson, 1988; Robin et al., 2011; Figure 2d).
To determine whether fingerprinting accuracy was affected by age, we split the entire sample of 12–30 year olds by the median age (20.4 years), considering the participants “youths” if they were under the median age (V1 N = 70) or “adults” if they were over the median age (V1 N = 70). We then calculated ROC curves for youths and adults separately for (a) same day fingerprinting (Pre‐Task vs. Post‐Task) and (b) fingerprinting 1.5 years apart (V1 vs. V2). To test for significant differences in identification between youths and adults, we used DeLong's test for two ROC curves (Figure 2e). 
We also assessed the effects of sex on fingerprinting accuracy by calculating ROC curves for each sex separately for each visit and session.
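The threshold sweep described above can be sketched as follows. This is an illustrative, standard-library implementation with hypothetical function names; it sweeps the observed correlation values rather than a fixed grid between zero and one, uses the trapezoidal rule for the AUC, and picks the threshold that maximizes TPR − FPR. DeLong's test for comparing two ROC curves is not reproduced here.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr, thresholds) for thresholds swept from high to low.

    scores: scan-to-scan correlations; labels: 1 for same-subject pairs,
    0 for nonpairs. A pair is called a match when score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    fpr, tpr, thresholds = [0.0], [0.0], [float("inf")]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
        thresholds.append(t)
    return fpr, tpr, thresholds


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))


def best_threshold(fpr, tpr, thresholds):
    """Threshold maximizing TPR - FPR, with its sensitivity and specificity."""
    j = [t - f for f, t in zip(fpr, tpr)]
    i = max(range(len(j)), key=j.__getitem__)
    return thresholds[i], tpr[i], 1 - fpr[i]
```

When several thresholds give the same TPR − FPR, this sketch returns the first one encountered; how ties were handled in the reported analysis is not specified.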
Identification of predictive edges
Next, we sought to determine which edges contribute most to fingerprinting accuracy. We made four comparisons to assess identifiability of each subject (V1, pre vs. post; V2, pre vs. post; V1 pre vs. V2 pre and V1 post vs. V2 post) and the same comparisons among nonpairs, resulting in 98,790 comparisons (same subject pairs = 532, nonpairs = 98,258). Because there was an overrepresentation of nonpairs (two scans, one from individual s and another from individual j), we used the synthetic minority over‐sampling technique to reduce bias when selecting the predictive features (Chawla, Bowyer, Hall, & Kegelmeyer, 2002) and selected 532 pairs and 1,596 nonpairs. This method uses Euclidean distance to select nonpairs closest to the pairs (Chawla et al., 2002). We then split the discovery sample into a training set (2/3 of the data: 355 pairs, 1,064 nonpairs) and a test set (1/3 of the data: 177 pairs, 532 nonpairs). The terms of the dot product (i.e., from 55,278 edges) for each comparison were the input features.
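To make the feature construction concrete, the sketch below shows how the edgewise terms of the dot product can serve as features and how a two-thirds/one-third split reproduces the set sizes given above. The function names, the random shuffle, and the seed are assumptions for illustration; the sampling step that selects nonpairs is not reproduced.

```python
import random


def edge_features(x, y):
    """Edgewise terms of the scaled dot product between two normalized vectors.

    Summing the returned list gives the scan-to-scan correlation, so each
    term is that edge's contribution to the overall similarity.
    """
    k = len(x)
    return [a * b / k for a, b in zip(x, y)]


def split_train_test(items, train_fraction=2 / 3, seed=0):
    """Shuffle and split items into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

Applied separately to the 532 pairs and the 1,596 nonpairs, this split gives 355/177 pairs and 1,064/532 nonpairs, matching the counts reported above.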
Finn method
We used a recently developed method to calculate the most predictive edges (Finn et al., 2015). We then ranked all edges from most predictive to least predictive and successively decreased the threshold of “most predictive” edges to develop the ROC curve. Through cross‐validation in the training set, we determined that the optimal TPR − FPR was obtained when we included the 5% “most predictive” edges. Then, in the test data set, we selected only these predictive edges and calculated a correlation between each scan and all other scans (sum of the dot product). We then used the correlation threshold to develop the ROC curve in the test data set.
Support vector‐machine learning and elastic net regression methods
Both of these methods are common tools for model selection. Here, the goal was to choose a set of predictive edges that differentiated same‐subject pairs from other pairs. For both methods, the input information was the terms of the dot product between two scans. We developed optimal tuning parameters in the training data set and obtained weights for the selected edges. Elastic net regression was implemented using the R package glmnet (Friedman, Hastie, & Tibshirani, 2009). The support‐vector machine (SVM) analysis was implemented in the R package sparseSVM (Yi & Zeng, 2018), with an elastic net penalty. For each method, we used 10‐fold cross validation to identify the number of edges that gives the highest AUC, and the tuning parameters were chosen from this procedure. We fixed the elastic net mixing parameter and chose the penalty parameter (λ) with the minimal cross‐validation error. Within the discovery sample, we then applied the model developed in the training set to the test data set. By weighting the individual products of the selected edges, we obtained a predicted value that ranges from −1 to 1. We developed ROC curves by changing the threshold on this predicted value to determine what constituted a “pair” in the test data.
Assessing over‐representation of network connectivity in predictive edges
To identify network representation in the predictive edges for each method identified above, we conducted χ² tests to assess whether there was over‐representation of (a) between‐network (e.g., Frontoparietal‐Default edges; the off‐diagonal edges) or within‐network connectivity (e.g., Frontoparietal‐Frontoparietal edges; the block structure on the diagonal), (b) specific within‐network connectivity networks, and/or (c) distinct between‐network connections. We used the standardized residuals (>3.0) to determine networks that contributed to one's uniqueness.
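The over-representation test above can be illustrated with a short sketch. It treats the data as a 2 × C contingency table (predictive vs. non-predictive edges across C edge categories) and returns the χ² statistic together with adjusted standardized residuals for the predictive row. The residual formula is the usual adjusted one; whether the reported analysis used adjusted or unadjusted residuals is an assumption, and the counts in the usage note are hypothetical.

```python
import math


def chi_square_residuals(predictive, total):
    """Chi-square test of predictive-edge counts across edge categories.

    predictive[i]: number of predictive edges in category i.
    total[i]: number of all edges in category i.
    Returns (chi2, residuals), where residuals[i] is the adjusted
    standardized residual for predictive edges in category i.
    """
    n = sum(total)
    n_pred = sum(predictive)
    n_non = n - n_pred
    chi2 = 0.0
    residuals = []
    for p, t in zip(predictive, total):
        q = t - p                      # non-predictive edges in this category
        exp_p = n_pred * t / n         # expected predictive count
        exp_q = n_non * t / n          # expected non-predictive count
        chi2 += (p - exp_p) ** 2 / exp_p + (q - exp_q) ** 2 / exp_q
        residuals.append(
            (p - exp_p) / math.sqrt(exp_p * (1 - n_pred / n) * (1 - t / n))
        )
    return chi2, residuals
```

For example, with hypothetical totals of 1,000, 1,000, and 8,000 edges and 100, 50, and 350 predictive edges, the first category has a residual above 3.0 (over-represented), the second is at its expected count, and the third is under-represented.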
Performance of predictive edges in replication sample
To test the generalizability of the predictive edges identified in our discovery sample, we tested the extent to which the previously identified features from each method improved fingerprinting accuracy in a replication sample with longitudinal data.
Effects of possible confounds
Because motion is a well‐known confound in rsfMRI studies of development (Satterthwaite et al., 2012), we calculated multiple measures of motion: framewise displacement (FD) as described by Power et al. (Power, Barnes, Snyder, Schlaggar, & Petersen, 2012), as well as mean head displacement, maximum head displacement, the number of micromovements (N > 0.1 mm), and head rotation as described by Van Dijk et al. (Van Dijk, Sabuncu, & Buckner, 2012). We assessed whether motion differed between sessions or visits by running within‐subjects t tests of the motion variable between session (pre‐, post‐task) and visit (V1, V2). To assess the effects of age on motion, we conducted an independent samples t test between groups (youths vs. adults) for each session and visit.
Because “same day” scans were acquired within the same MRI session, improved same day fingerprinting accuracy could be due to better‐quality registration from the same scan session. A subset of the participants (ages 18–30 years, N = 76) participated in an additional scan on the same day, but in a different scan session (i.e., the participant came out of the MRI scanner and a few hours later participated in another MRI session). This subset of participants participated in a positron emission tomography study that included an additional MRI scan with each visit. Classification as detailed previously was performed, using the separate scans at each visit to predict identification accuracy from this additional MRI session. See Table S2 for details on this subset of participants.
RESULTS
Participant information for both visits is presented in Table 1. Participant information for youth and adult groups is reported in Table S3.
Same day versus 1.5 years later
Across the entire sample, identifiability on the same day was quite high (Average AUC: 0.97). Identification accuracy between scans 1.5 years apart was also high (Average AUC: 0.91, Figure 3a). However, same day accuracy was significantly higher than identification of scans 1.5 years apart (Figure 3a, Table 2A). Differences in fingerprinting accuracy on the same day compared to scans 1.5 years apart were observed for both youth (Figure 3b, Table 2B) and adult groups (Figure 3c, Table 2C).
FIGURE 3
(a) Across the entire sample, identification accuracy was higher for same day (orange) versus 1.5 years later (blue). This pattern of results remained when youth (b) and adults (c) were assessed separately. The asterisk on each curve marks the point of optimal trade‐off between sensitivity and specificity. AUC, area under the curve; Thr, threshold at which optimal true positive rate was obtained; Sen, sensitivity; Spec, specificity
TABLE 2
Identification accuracy same day and 1.5 years apart
| Group | Same day | 1.5 YR pre‐task: D | 1.5 YR pre‐task: p | 1.5 YR post‐task: D | 1.5 YR post‐task: p |
| A. Entire sample | Same day V1 | 4.1 | 2.70E−05 | 4.1 | 3.80E−05 |
| A. Entire sample | Same day V2 | 2.7 | .02 | 2.9 | .003 |
| B. Youths | Same day V1 | 3.2 | .001 | 2 | .04 |
| B. Youths | Same day V2 | 2.7 | .008 | 1.1 | .24 |
| C. Adults | Same day V1 | 2.8 | .008 | 3.2 | .001 |
| C. Adults | Same day V2 | 1.8 | .07 | 2 | .04 |
Note: (A) Across the entire sample, same‐day identification accuracy was significantly higher than identification accuracy 1.5 years apart. The D‐value represents the test statistic for the respective comparison. This pattern remained when youth (B) and adults (C) were assessed separately.
Fingerprinting accuracy of youth versus adults
Same day accuracy was not statistically different between youths and adults. The two groups exhibited similar levels of same‐day identification accuracy (Table 3A). The same pattern of results emerged when comparing youths and adults on fingerprinting accuracy 1.5 years apart for three out of four of the comparisons (Table 3B). ROC curves are presented in Figure 3. We also found a similar pattern of results when we split youths and adults by the mean age (20.8 years, 5 participants changed age group membership) and when we considered individuals as “adults” at age 18 (21 participants changed age group membership). Finally, we obtained a comparable pattern of results when we tested fingerprinting accuracy in males and females separately (Table S4).
TABLE 3
Identification accuracy comparison between youths and adults
| Comparison | Adults versus youth: D | Adults versus youth: p |
| A. Same day V1 | −1.7 | .08 |
| A. Same day V2 | 0.16 | .87 |
| B. 1.5 YR: Pre‐task V1 vs. pre‐task V2 | −2.7 | .006 |
| B. 1.5 YR: Pre‐task V1 vs. post‐task V2 | −0.42 | .67 |
| B. 1.5 YR: Post‐task V1 vs. post‐task V2 | −0.7 | .46 |
| B. 1.5 YR: Post‐task V1 vs. pre‐task V2 | −1.7 | .09 |
Note: (A) Same‐day fingerprinting accuracy is similar for youth and adults. (B) Fingerprinting accuracy 1.5 years apart is similar for youth and adults in three of four comparisons.
All model selection methods tested improve identification accuracy in the test portion of the discovery sample
As seen in Figure 4a, the fingerprinting accuracy significantly improved when we used predictive edges selected by the Finn method, SVM, or Elastic Net to predict identification accuracy. The three model selection techniques performed similarly to one another (Table S5). When we used only the predictive edges identified by these methods to re‐assess fingerprinting accuracy in all previous comparisons, each method improved fingerprinting accuracy but did not change reported results. When we calculated an average, method‐driven threshold (average threshold of all methods) to determine identification accuracy, sensitivity and specificity were similar for Elastic Net and SVM, but sensitivity of the Finn method was lower (Table S6).
FIGURE 4
(a) When edge selection was performed via various methods (Finn method (green), Elastic Net (orange), and SVM (purple)), identification accuracy was significantly improved in comparison to using all edges (pink). (b) When we applied predictive edges previously identified in the training sample to an independent sample, all methods significantly improved identification accuracy. The asterisk on each curve marks the point of optimal trade‐off between sensitivity and specificity
Predictive edges from discovery sample improve accuracy in replication sample
Similar to previous results (Finn et al., 2015; Waller et al., 2017), we found lower identification accuracy in a replication sample with more “standard” MRI parameters (e.g., longer TR, shorter scan length, fewer head coils; Figure 4b, pink ROC curve). When we applied the most predictive edges from previously applied model selection techniques, however, fingerprinting accuracy significantly improved (Figure 4b, Table S7). When we omitted pairs from scans conducted on the same day in the discovery sample and focused our analyses on identifying predictive edges from pairs that were 1.5 years apart, we obtained the same pattern of results (Table S8). The weights for predictive edges from Elastic Net and SVM are provided in Tables S9 and S10.
Optimal thresholds were determined separately in the discovery and validation samples prior to generating an AUC. We made this choice because of differences in the nature of the data. As shown in Figure S2, the distributions of the correlations between pairs were quite different in the two samples, precluding use of the selected discovery threshold as the validation threshold. Despite these differences, when the predictive edges derived from the discovery sample were applied to the validation sample, fingerprinting accuracy improved, suggesting that edges important for prediction are similar across samples.
Predictive edges are over‐represented in Frontoparietal, Default, and Dorsal Attention networks
Figure 5a–c shows the relative contribution, normalized for number of edges in each network (or off‐diagonal edge group) and number of edges determined to be predictive by each method. Across all methods, in comparison to between‐connectivity edges (e.g., Frontoparietal‐Visual connections, Table S11), within‐connectivity edges (e.g., Frontoparietal‐Frontoparietal connections) were relatively more important in predicting identification accuracy. More specifically, within‐connectivity edges from the Frontoparietal, Default, and Dorsal Attention networks drove this finding (Figure 5b). These networks had standardized residuals greater than 3.0 in all comparisons. With SVM and Elastic Net, edges from the Ventral Attention network were also important predictors for fingerprinting accuracy. A similar pattern emerged when we examined same‐day and 1.5‐year comparisons separately (Table S12) and in youth and adults separately (Tables S13 and S14).
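The over‐representation test described above can be sketched as a χ2‐style comparison of observed and expected counts of predictive edges per network group. This is a minimal illustration, not the analysis pipeline: the counts and group names are hypothetical, and the residual is assumed to be the Pearson form, (observed − expected) / √expected, since the text does not state which variant of "standardized residual" was used.

```python
import numpy as np

def standardized_residuals(predictive_counts, total_counts):
    """Pearson residuals for predictive-edge counts across edge groups.

    Builds a 2 x K table (predictive vs. nonpredictive edges by group) and
    compares observed counts with those expected if predictive edges were
    spread across groups in proportion to group size.
    """
    predictive = np.asarray(predictive_counts, dtype=float)
    total = np.asarray(total_counts, dtype=float)
    observed = np.vstack([predictive, total - predictive])
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    expected = row_totals @ col_totals / observed.sum()
    residuals = (observed - expected) / np.sqrt(expected)
    # Return only the "predictive" row: positive means over-represented.
    return residuals[0], expected[0]

# Hypothetical counts, for illustration only.
groups = ["Frontoparietal-within", "Default-within", "Visual-within", "all between"]
total_edges = [200, 300, 250, 5000]
predictive_edges = [30, 40, 5, 100]

resid, expected = standardized_residuals(predictive_edges, total_edges)
for name, r, e in zip(groups, resid, expected):
    print(f"{name:>22}: expected {e:6.1f}, residual {r:+.2f}")
```

Normalizing by group size matters because the number of possible between‐network edges is far larger than the number of within‐network edges, so raw counts alone would favor large groups.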
FIGURE 5
(a) The ratio of predictive edges to nonpredictive edges in each network connection, normalized for total number of edges in each connection for the three different methods. Warmer colors on the heatmap indicate that edges from a particular network are more important for identification accuracy. (b) Within‐network connections that are particularly important for identification accuracy in all three methods examined. In all methods, within connectivity edges in Frontoparietal (yellow), Default (red), and Dorsal Attention (bright green) networks are considered predictive. In the Finn method and SVM, within connectivity connections in the Ventral Attention network were also predictive of identification accuracy. (c) Between‐network connections that are particularly important for identification accuracy. The colors around the circle reflect the different networks examined. Thicker bands of color indicate a greater number of edges from that particular network were considered predictive. The color of each between connectivity edge (lines going across the circle) was randomly chosen from one of the two connected networks (e.g., between network connectivity between the Default and Cingulo‐Opercular network is red, but between network connectivity between the Default and Ventral Attention network is green)
We also examined which between‐network connectivity edges were relatively important for fingerprinting accuracy. Across all three methods, connections between the Frontoparietal‐Default, Frontoparietal‐Dorsal Attention, Ventral Attention‐Cingulo‐opercular, and Frontoparietal‐Ventral Attention networks were overrepresented in comparison to other connections (Figure 5c). See Table S15 for standardized residuals from a χ2 test of between‐network connectivity and Table S16 for standardized residuals from a χ2 test of over‐representation of all network connections.
Similar patterns emerged when we examined same‐day and 1.5‐year comparisons separately (Table S17) and youth and adults separately (Table S18).
Testing effects of possible confounds
To ensure that the improved same‐day accuracy was not because individuals were in the same scan session and benefiting from improved MRI registration, we examined a subset of individuals (N = 76) who completed an additional MRI session on V1 and V2 after being taken out of the scanner and repositioned. Table S19 reports similar AUC, threshold levels, and sensitivity/specificity for identification accuracy within the same MRI session (V1 Pre‐Task predicting accuracy of V1 Post‐Task) and different MRI sessions on the same day (e.g., V1 Pre‐Task predicting identification of extra session V1).
When we compared head motion between sessions, one of the two session comparisons (pre‐task V1 vs. post‐task V1) was statistically significant. Individuals had higher FD during post‐task V1 in comparison to pre‐task V1 (t = −2.5, p = .01). One of the two visit comparisons (post‐task V1 vs. post‐task V2) was also statistically significant. FD at post‐task V2 was lower in comparison to FD at post‐task V1 (t = 2.8, p = .006). Notably, while FD was elevated in post‐task V1 in comparison to other sessions/visits, the impact of visit and session on fingerprint accuracy was similar in other comparisons, suggesting that motion differences were not driving our results. A similar pattern of findings was observed in other motion metrics (Table S19).
There were no statistically significant differences between youths and adults at any session or visit in FD, mean head displacement, maximum head displacement, or number of micromovements (all p ≥ .05). Results are reported in Table S19.
To ensure that our results were not driven by parcellation choice, we also re‐ran all analyses after extracting ROIs from two separate parcellations (Power et al., 2011; Shen et al., 2013) and obtained a similar pattern of results.
DISCUSSION
For neuroimaging to have clinical utility, it is essential to understand what to expect from an individual's network characteristics in multiple contexts. Here we show that identification accuracy of one's resting state scan—how much it reflects a “functional fingerprint”—depends on the timespan between assessments. We provide supporting evidence that adolescents have similar levels of fingerprinting accuracy to adults when visits are years apart (Horien et al., 2019; Miranda‐Dominguez et al., 2018) and extend this literature to show that this finding is consistent on a same‐day visit. Furthermore, we used multiple methods to identify a small number of edges consistently predictive of an individual's scan. These edges are more likely to be in the Frontoparietal, Default, and Dorsal Attention networks. We identified these edges in a discovery sample and then used these edges to improve identification accuracy in a replication sample. We propose that particular edges in the Frontoparietal, Default, and Dorsal Attention networks contribute to an individual's “uniqueness” and are similar in youths and adults. These results bring us a fuller understanding of functional networks in the human brain.
Stability of identification accuracy across time
Our results indicate a high level of subject identification accuracy, even after 18 months. However, greater time intervals between scans incurred a significant decrease in identification accuracy. This result provides compelling evidence that there are foundational properties of resting state network organization that are persistent and specific to each individual. The significant degradation in identification after 18 months was not driven by registration. Reduced fingerprinting accuracy with time could reflect greater noise between two scans; alternatively, or in addition, the small degradation in identification accuracy could reflect inherent plasticity in network organization in both youths and adults.
Networks that underlie prediction
We found evidence that identification was driven by edges particularly in the Frontoparietal, Dorsal Attention, and Default mode networks. These networks were consistently identified by different analytical approaches. This is a striking result that identifies networks critical for higher‐order cognitive processing and endogenous self‐referential processing. Thus, these results provide suggestive evidence that how we engage in foundational cognitive and endogenous processes contributes to individuality. This finding also has the potential to inform our understanding of how altered development contributes to the onset of psychopathology, as these higher‐order cognitive networks are often altered in psychiatric disorders.
Predictive edges from the discovery sample improve fingerprinting accuracy in the validation sample
We also show that predictive edges identified in one sample can be applied to an independent sample to improve identification accuracy, even when the validation sample has somewhat different properties than the discovery sample (Figure S2). Sets of edges making dominant contributions to fingerprinting accuracy vary among individuals. Though the original goal of fingerprinting accuracy was, in part, to identify the specific predictive edges to identify specific individuals (Finn et al., 2015), we implemented a different approach. In dermal fingerprinting, ~30 islets or forks on the ridges or “identification points” are used to demonstrate uniqueness (Galton, 1892; Stigler, 1995). It is highly unlikely that any other individual will have the same combination of these distinct ridges. By showing predictive edges from one data set improve fingerprinting accuracy in an independent data set, we demonstrate that a similar phenomenon is taking place with rsfMRI fingerprinting.
Multiple methods performed quite well in the discovery sample; however, as is consistent with the literature, a reduction in sensitivity and specificity in the replication sample suggests some over‐fitting in the discovery sample. On the other hand, our replication sample is a strength of our study, because the majority of fingerprinting accuracy studies have not tested the utility of predictive edges in another data set. The robustness of these results demonstrates that particular edges carry the most information for identifiability. This is a first, important step for understanding how fingerprinting accuracy can be used in a clinical context. However, as observed in the two separate samples, the distributions of correlations for pair identification between the two samples are remarkably different (Figure S2). In line with other fields of medicine that use a biological measure to detect risk or disease status (Greenwood et al., 2012), in the future, it will be important to determine the quality of the rsfMRI scans to ensure valid analyses.
The lower sensitivity and specificity in our replication sample is consistent with reports of fingerprinting accuracy in lower quality rsfMRI protocols (Finn et al., 2015; Waller et al., 2017). A number of factors could be contributing to the degradation in fingerprinting accuracy, including a shorter scan time (Finn et al., 2015) or decreased resolution of the rsfMRI scan. Additionally, the discovery sample had higher resolution structural scans to assist with more accurate registration of the rsfMRI data. Partial volume effects could also be contributing to this degradation in fingerprinting accuracy. In the future, it will be informative to test if incorporating methods to correct for partial volume effects (Dukart & Bertolino, 2014) will improve rsfMRI fingerprinting accuracy in less optimized rsfMRI protocols.
Identification accuracy is similar in youths and adults
We did not observe differences in the fingerprinting accuracy between youth and adults, for both same day and 1.5‐year comparisons. Indeed, others have recently found that identification accuracy is similar in youth and adults (Demeter et al., 2019; Horien et al., 2019). We extend these findings by showing accuracy is high both for same day and longer‐term (1.5 years) intervals in the same sample. Furthermore, because identification accuracy in both youth and adults was lower across a longer period of time and similar edges contributed to same day and 1.5‐year identification accuracy, our results suggest that this reduction is not solely due to known developmental changes. Significant cognitive development occurs through adolescence (Larsen & Luna, 2018; Luna et al., 2015; Steinberg, 2005), in the context of evidence for stability at the group level in network properties (Hwang, Hallquist, & Luna, 2013; Jalbrzikowski, Murty, et al., 2019; Marek et al., 2015; Marek et al., 2019). The stability in identification accuracy across development further supports the implication that network properties contain individualized foundational properties that define uniqueness.
In one case, we did find that youths had significantly worse identification accuracy in comparison to adults (Table 3B: 1.5 YR: Pre‐task V1 vs. Pre‐task V2) when the scans were 1.5 years apart. However, in three of four similar relevant comparisons, we did not observe differences in fingerprinting accuracy between youths and adults. In this particular comparison (i.e., Pre‐Task V1 to Pre‐Task V2), reduced accuracy in these youths was driven by the lower identification accuracy in the pre‐task scans: we speculate that youths are more variable and excitable when first getting in a scanner, perhaps because they have had less experience with life events akin to an MRI scan than adults have. Furthermore, we know that identification accuracy is reduced with increased head motion (Horien et al., 2018), and youth have greater levels of head motion in comparison to adults (Satterthwaite et al., 2012). However, in our sample, we did not see a statistical difference between head motion in these two groups, and our results remained stable when we used more conservative framewise displacement thresholds (FD < .3).
An improved statistical framework for identification accuracy
We show that edges important for identification accuracy are similar across the different methods used to identify them. Furthermore, predictive edges identified in one sample can be applied to an independent sample to improve identification accuracy. The robustness of these results demonstrates that particular edges carry the most information for identifiability.
We also believe this statistical framework improves upon the majority of previous methods used in this area. Previous methods, for instance, presume there is a “match” for each respective scan in the data set (i.e., each individual has at least two scans in the pool of available data) and do not consider false positives. Furthermore, given that identifiability of an individual decreases as the sample size increases (Waller et al., 2017), it is important to account for sample size in the model. Additionally, while many studies show that the identification test metric for individual identifiability is significantly greater than would be expected by chance, it is difficult to know how meaningful this metric is when identification accuracy is in the range of ~40–60% (e.g., Horien et al., 2019). Moreover, we were interested in understanding how feature selection approaches (e.g., Elastic Net and SVM) compare with the original method presented by Finn et al. (2015). Similar to our approach, SVM was recently applied to better understand genetic similarity in fingerprinting accuracy (Demeter et al., 2019). Finally, we ensured that our statistical procedures were both replicable and generalizable to a replication data set. We (a) used a portion of our data to identify predictive edges (75% of the discovery sample, the training data), (b) assessed the performance of these edges in a held‐out portion of the discovery sample (25% of the discovery sample, the test data), and (c) determined the generalizability of our results in an independent replication sample.
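The train/test logic in steps (a) and (b) can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the data are simulated, the function names are hypothetical, and edges are ranked by a simple per‐edge test–retest correlation across training subjects, which stands in for the Finn method, Elastic Net, and SVM.

```python
import numpy as np

def pair_correlations(a, b):
    """Pearson correlation between every row of a and every row of b."""
    za = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    zb = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    return za @ zb.T / a.shape[1]

def fingerprint_auc(scan1, scan2):
    """AUC for separating same-person from different-person scan pairs."""
    corr = pair_correlations(scan1, scan2)
    same = np.eye(len(corr), dtype=bool)
    pos, neg = corr[same], corr[~same]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * neg.size)

def edge_retest_scores(scan1, scan2):
    """Per-edge correlation across subjects between two scans (stand-in score)."""
    z1 = (scan1 - scan1.mean(axis=0)) / scan1.std(axis=0)
    z2 = (scan2 - scan2.mean(axis=0)) / scan2.std(axis=0)
    return (z1 * z2).mean(axis=0)

# Simulated discovery sample (hypothetical): only the first 15 edges carry
# a stable individual signal; the remaining edges are pure scan noise.
rng = np.random.default_rng(1)
n_subjects, n_edges, n_informative = 60, 600, 15
trait = np.zeros((n_subjects, n_edges))
trait[:, :n_informative] = 1.5 * rng.normal(size=(n_subjects, n_informative))
scan1 = trait + rng.normal(size=trait.shape)
scan2 = trait + rng.normal(size=trait.shape)

# (a) 75% of subjects are used to select edges; (b) 25% are held out.
order = rng.permutation(n_subjects)
n_train = int(0.75 * n_subjects)
train, test = order[:n_train], order[n_train:]

scores = edge_retest_scores(scan1[train], scan2[train])
selected = np.argsort(scores)[-n_informative:]  # top-scoring edges

auc_all = fingerprint_auc(scan1[test], scan2[test])
auc_selected = fingerprint_auc(scan1[test][:, selected], scan2[test][:, selected])
n_recovered = int(np.sum(selected < n_informative))
print(f"all edges AUC={auc_all:.3f}, selected edges AUC={auc_selected:.3f}, "
      f"recovered {n_recovered}/{n_informative} informative edges")
```

Step (c) would repeat the final evaluation with `selected` held fixed on scans from an independent sample, so that no information from that sample influences which edges are chosen.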
A framework for future investigations in psychiatric research
We also provide a statistical framework that can later be used to assess the clinical utility of identification accuracy. Viewing identification accuracy of rsfMRI scans as a classification problem is useful for other relevant questions in neuropsychiatric research. To improve early identification and detection of those at risk for psychiatric disorders, we need to answer questions such as, do people with similar connectivity profiles share common psychiatric features? Does reduced accuracy in functional connectome fingerprinting indicate increased risk for developing a psychiatric disorder? These questions all fall within the realm of a classification problem, and the framework we present can be applied to relevant data to answer these questions.
There is work showing that distinct fingerprinting patterns map onto particular phenotypes or outcomes of interest. While two studies report that fingerprinting accuracy is reduced in psychiatric samples (Kaufmann et al., 2017; Kaufmann et al., 2018), these were cross‐sectional and relied upon comparison of already established clinical phenotypes (i.e., level of fingerprinting accuracy did not determine a future variable of interest). In the future, it will be important to use a sensitivity‐specificity analytic framework to test our ability to identify sub‐groups of individuals, such as adolescents who eventually go on to develop a psychiatric disorder, or to predict treatment response in a group of patients.
Limitations
As with any study, there are limitations in our present design. First, we split our sample by the median age and identified the two groups as “youths” and “adults.” This approach could obscure subtle differences in identification accuracy that occur across adolescent development. We chose this approach because identification accuracy increases with smaller sample sizes, and our sample size would be quite small if we split our sample into more age groups, as we have done in previous publications. Another approach could be to view age as a continuous variable; however, this imposes a strong linear assumption, when we know that development through adolescence is curvilinear in nature as stability is reached. In the future, examining identification accuracy in large samples of youth (e.g., the Adolescent Brain Cognitive Development study; Casey et al., 2018) in comparison to large samples of adults (e.g., the Human Connectome Project; Van Essen et al., 2013) may prove to be the most fruitful in terms of more fully understanding identification accuracy across development. Additionally, although we made every attempt to remove noise in our processing steps (e.g., wavelet despiking and ICA‐AROMA), we did not adjust for physiological noise (e.g., cardiac and respiratory cycles) in our analyses. Thus, we cannot rule out the possibility that this source affected our results. Physiological noise could be obscuring results in our data; it is even possible it could be generating part of the signal. In the future, it will be important to test fingerprinting accuracy after regressing out physiological influences or implementing analytic approaches to account for this noise in the data (Aslan, Hocke, Schwarz, & Frederick, 2019).
CONCLUSIONS AND FUTURE DIRECTIONS
In this study, we showed that identification accuracy is high in both youths and adults at both short (same‐day) and extended (1.5 years apart) periods of time. Importantly, our results suggest that network properties have an individualized foundational characterization that is inherent to individuation, with some room for flexibility in expression. In the future, we predict that combined use of group‐ and individual‐level data will become the sine qua non for identifying meaningful relationships between brain and behavior.
Appendix S1. Supporting Information
Table S9. See text file, Elastic Net weights obtained from training sample.
Table S10. See text file, SVM weights obtained from training sample.
Table S15. See text file, Standardized residuals when testing over‐representation of all between‐network connections.
Table S16. See text file, Standardized residuals when testing over‐representation of all network connections.