Aliza P Wingo1,2, Wen Fan3, Duc M Duong4, Ekaterina S Gerasimov3, Eric B Dammer4, Yue Liu3, Nadia V Harerimana3, Bartholomew White5, Madhav Thambisetty6, Juan C Troncoso5, Namhee Kim7, Julie A Schneider7, Ihab M Hajjar3, James J Lah3, David A Bennett7, Nicholas T Seyfried8, Allan I Levey9, Thomas S Wingo10,11. 1. Division of Mental Health, Atlanta VA Medical Center, Decatur, GA, USA. aliza.wingo@emory.edu. 2. Department of Psychiatry, Emory University School of Medicine, Atlanta, GA, USA. aliza.wingo@emory.edu. 3. Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA. 4. Department of Biochemistry, Emory University School of Medicine, Atlanta, GA, USA. 5. Department of Pathology, Johns Hopkins School of Medicine, Baltimore, MD, USA. 6. Clinical and Translational Neuroscience Section, Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA. 7. Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, Illinois, USA. 8. Department of Biochemistry, Emory University School of Medicine, Atlanta, GA, USA. nseyfri@emory.edu. 9. Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA. alevey@emory.edu. 10. Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA. thomas.wingo@emory.edu. 11. Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA. thomas.wingo@emory.edu.
Abstract
Cerebral atherosclerosis contributes to dementia via unclear processes. We performed proteomic sequencing of dorsolateral prefrontal cortex in 438 older individuals and found associations between cerebral atherosclerosis and reduced synaptic signaling and RNA splicing, and increased oligodendrocyte development and myelination. Consistently, single-cell RNA sequencing showed cerebral atherosclerosis associated with higher oligodendrocyte abundance. A subset of proteins and modules associated with cerebral atherosclerosis was also associated with Alzheimer's disease, suggesting shared mechanisms.
Cerebral atherosclerosis (CA) is the accumulation of cholesterol-laden plaques in the walls of large arteries in the brain and ranges in severity from minor arterial wall thickening to significant luminal stenosis causing reduced blood flow and metabolism[1]. Its prevalence increases with age and the presence of vascular risk factors such as hyperlipidemia, hypertension, diabetes, and smoking[1]. It is a common condition, present in 7% of asymptomatic middle-aged individuals and 82% of individuals over 80 years old[1,2]. CA and its consequences are among the earliest changes associated with development of Alzheimer’s disease dementia (henceforth referred to as AD)[3-6] and are associated with approximately three times higher risk for AD and nine times higher risk for vascular dementia[2,7-9]. However, we have limited insights into the molecular effects of CA on the human brain and how these contribute to dementia.
To identify brain proteomic changes associated with CA, we performed proteomic sequencing of the dorsolateral prefrontal cortex of 438 brains with detailed neuropathological assessment of nine age-associated brain pathologies – β-amyloid, neurofibrillary tangles, CA, gross infarcts, microinfarcts, cerebral amyloid angiopathy (CAA), TDP-43, Lewy body, and hippocampal sclerosis – to identify the proteomic signature of CA independently of the eight other measured pathologies. The discovery dataset comprised 391 subjects from the Religious Orders Study and Memory and Aging Project (ROS/MAP; Supplementary Table 1), and the replication dataset comprised 47 subjects from the Baltimore Longitudinal Study of Aging (BLSA; Supplementary Table 2). CA burden ranged from none to severe, with a median of mild in both datasets. CA was correlated with higher plasma low-density lipoprotein and triglyceride levels, lower high-density lipoprotein level, more gross infarcts, and lower β-amyloid burden (Supplementary Figure 1).
CA was associated with lower white matter integrity (β = −3.0; adjusted p=0.01; N=236) in the frontal inferior periventricular and subcortical white matter based on voxel-wise R2, an index of brain tissue integrity[10], in 236 subjects with imaging data who were representative of the 391 subjects with brain proteomes (Supplementary Table 3).
Brain proteomic signature of cerebral atherosclerosis
We performed a proteome-wide association study (PWAS) of CA adjusting for the presence of the other 8 measured pathologies in the discovery dataset and found 114 proteins differentially expressed at proteome-wide adjusted p<0.05 (Figure 1A, Supplementary Table 4). Proteins with positive effect sizes (e.g., QDPR) were more abundant, and conversely, those with negative effect sizes (e.g., UBXN7) were less abundant with greater CA. The 32 higher-abundance proteins in CA were enriched for oligodendrocyte cell-type specific markers and oligodendrocyte differentiation, development, and re-myelination (Figure 2). The 82 lower-abundance proteins in CA were enriched for RNA spliceosome and mRNA processing (Figure 2). Next, we examined whether these findings were sensitive to the presence of vascular risk factors (i.e., predisposing factors for CA including diabetes, hypertension, and smoking) or infarction (i.e., the potential consequence of CA). These sensitivity analyses indicated that the CA-associated proteomic changes above were independent of infarcts and vascular risk factors (Supplementary Tables 5–7 and Supplementary Figures 2–3).
Figure 1:
Differential protein expression in cerebral atherosclerosis (CA) and Alzheimer’s Disease (AD). This figure summarizes the proteome-wide association study (PWAS) of CA and AD in the discovery dataset (N=375) and gives the genes associated with both CA and AD. a) Volcano plot of the PWAS of CA plotting the effect sizes and p-values estimated from linear regression of each protein and CA adjusted for clinical covariates and 8 other measured pathologies. For each protein, points are either blue or red to reflect whether the protein was or was not significantly associated with CA at proteome-wide significant level, respectively. Labels are given as gene symbols and Uniprot IDs in parentheses for the top two associated proteins and for NEFL and NEFM. b) Volcano plot of the PWAS of AD plotting the effect sizes and p-values estimated from logistic regression of each protein and AD adjusted for clinical covariates. For each protein, points are either blue or red to reflect whether the protein was or was not significantly associated with AD at proteome-wide significant level, respectively. Labels are given as gene symbols and Uniprot IDs tested, in parentheses, for the top two associated proteins and for NEFL and NEFM. c) Gene symbols and the Uniprot ID of the proteins associated with both AD and CA at proteome-wide adjusted p <0.05.
Figure 2:
Enrichment analysis of proteins associated with cerebral atherosclerosis (CA). This figure shows the results of the gene set enrichment analysis (GSEA) for the 114 proteins associated with CA independently of the 8 other measured pathologies at proteome-wide adjusted p <0.05 in the discovery dataset (N=375). a) Proteins with significantly higher abundance in cerebral atherosclerosis (n=32) were used for GSEA, and one-sided Z-scores above 2 were plotted for results of each of the five annotation databases used. b) Proteins with significantly lower abundance in cerebral atherosclerosis (n=82) were used for GSEA, and one-sided Z-scores above 2 were plotted for results of each of the five annotation databases used.
Using weighted gene co-expression network analysis, we identified 31 protein co-expression modules in the discovery dataset (Figure 3A). Among these, five modules were associated with CA independently of the other 8 measured pathologies at adjusted p<0.05, and all five modules had lower abundance in greater CA (modules 3, 7, 9, 15, and 17; Figure 3C, Supplementary Table 8). Three modules were enriched for synaptic signaling, regulation, and plasticity (modules 3, 9, and 17), and two for mRNA splicing and mRNA processing (modules 7 and 15; Supplementary Table 8). Additionally, two modules were also enriched for oligodendrocyte cell-type specific markers (modules 3 and 17), two for neuron-specific markers (modules 7 and 15), and one for astrocyte-specific markers (module 15; Figure 3B, Supplementary Table 8).
Figure 3:
Protein co-expression network analysis for brain cell types and clinical and neuropathologic outcomes. This figure summarizes the protein co-expression modules and their associations with AD, β-amyloid, neurofibrillary tangles, and cerebral atherosclerosis in the discovery dataset (N=375). a) Dendrogram for 31 protein co-expression modules derived using WGCNA. b) Brain cell-type enrichment for each module. The heatmap shows the one-sided Z-test for each brain cell-type and module, and significant enrichment is indicated by an asterisk. c) Results of association testing between modules and AD dementia, β-amyloid, neurofibrillary tangles, and cerebral atherosclerosis, respectively. The heatmap gives the beta values of the regression models for each outcome using the module as the predictor adjusted for relevant covariates as described in the text. For the three neuropathologic outcomes (i.e., β-amyloid, neurofibrillary tangles, and cerebral atherosclerosis), linear regression adjusted for 8 other neuropathologies and relevant covariates was used. The Benjamini-Hochberg false discovery rate was used to adjust p-values, and associations with adjusted p<0.05 are indicated with an asterisk. No module was associated with β-amyloid after adjusting for all the other 8 measured pathologies. Nine modules were associated with tangles after adjusting for the 8 other measured pathologies at adjusted p<0.05. Five modules were associated with cerebral atherosclerosis after adjusting for the 8 other measured pathologies.
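The false discovery rate adjustment named in the caption can be illustrated with a minimal sketch of the standard Benjamini-Hochberg step-up procedure. The function name and the example p-values are hypothetical and are not taken from the study; the authors' actual software is not specified in this section.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five modules (illustration only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

In this toy input, two raw p-values below 0.05 (0.039 and 0.041) no longer pass the threshold after adjustment, which is the behavior the asterisks in the heatmap are meant to reflect.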
Together, the PWAS and protein co-expression network analysis highlight that CA was associated with i) more oligodendrocyte differentiation, development, and remyelination, ii) less RNA splicing and mRNA processing in neurons and astrocytes, and iii) decreased synaptic signaling, regulation, and plasticity, independently of the 8 other measured pathologies.
To validate our findings that CA was associated with increased oligodendrocyte differentiation, development, and re-myelination, we used human brain single-cell RNA sequencing data from non-overlapping ROS/MAP subjects[11] to test whether CA was associated with higher oligodendrocyte abundance. Consistent with our findings, persons with greater CA had greater proportions of oligodendrocytes in the dorsolateral prefrontal cortex after adjusting for sex, age, and the 8 other measured pathologies (p=0.003, N=48, Extended Data Figure 1).
Extended Data Fig. 1
Association between cerebral atherosclerosis and oligodendrocyte abundance
This figure summarizes the association between CA and oligodendrocyte abundance estimated from single-cell nuclear RNA sequencing of human dorsolateral prefrontal cortex in an independent dataset. Higher cerebral atherosclerosis was associated with higher oligodendrocyte abundance in the dorsolateral prefrontal cortex after adjusting for sex, age, and 8 other measured pathologies using linear regression (p=0.003, β=0.077, N=48). For each box plot, the box reflects the first and third quartiles, the dark horizontal line reflects the median, the vertical lines extending from the boxes show 1.5x the interquartile range, and points beyond the lines are outliers.
To replicate the proteomic findings for CA, we examined the 114 CA-associated proteins in a replication dataset (BLSA cohort). We performed a targeted replication due to sample size difference between the datasets (391 versus 47) and technical differences in proteomic profiling (i.e., trypsin versus LysC) leading to 51% of the discovery dataset proteins being quantified in both datasets (4254 of 8356 proteins). All analyses in the replication dataset were adjusted for β-amyloid, tangles, gross infarcts, microinfarcts, CAA, sex, age at death, and cell type composition in the same fashion as was done for the discovery dataset. Of the 114 CA-associated proteins, 79 proteins were measured in the replication dataset. Of these 79 proteins, 42% (33 / 79) were replicated at adjusted p<0.05 (Supplementary Table 9).
Shared molecular processes of cerebral atherosclerosis and AD
We performed PWAS and protein co-expression network analysis of AD and its hallmark pathologies – β-amyloid and neurofibrillary tangles – respectively (Figure 1B, 3C; Supplementary Tables 10–12; Supplementary Figures 4–6). Subsequently, we identified 23 proteins associated with both AD and CA independently of the other 8 measured pathologies at proteome-wide adjusted p<0.05 (Figure 1C; Supplementary Table 13). All 23 proteins consistently had either higher abundance in both AD and CA, or lower abundance in both, supporting the notion that CA has a detrimental effect on cognition. Of these 23 proteins, 48% were hub proteins of protein co-expression modules (Supplementary Table 13). Notably, seven were hub proteins of module 2, which had higher abundance in CA and was enriched for myelination and oligodendrocyte cell-type specific markers (Supplementary Table 8). Additionally, modules 3 and 9 were associated with both AD and CA independently of the other 8 measured pathologies, and both modules had reduced abundance in CA and AD. Module 3 was enriched for synaptic regulation and oligodendrocyte cell-type specific markers, and module 9 for synaptic signaling and plasticity (Supplementary Table 8). Together, these findings suggest CA likely contributes to AD through its effects linked to reduction in synaptic signaling, regulation, and plasticity, and increase in myelination within the grey matter, which was the brain region examined.
We examined evidence for physical interaction and common cellular localization for the proteins in the modules associated with both CA and AD. We found substantial physical interactions between and among the proteins in modules 3 and 9 and the 23 differentially expressed proteins (Supplementary Figure 7, Supplementary Table 14). Additionally, many CA-associated proteins were localized in oligodendrocytes and pre- and post-synaptic regions of neurons (Supplementary Figure 7).
Finally, we identified 13 non-redundant protein domains among these proteins suggesting shared physiological roles (Supplementary Table 15).
We did not find a statistical interaction between CA and β-amyloid or tangles, suggesting that CA contributes to AD independently of the two AD hallmark pathologies, consistent with prior studies showing that vascular and β-amyloid and tau pathologies are independent and additive risk factors of cognitive decline[12-15].
NEFL and NEFM proteins are associated with cerebral atherosclerosis and AD
Among the 23 proteins associated with both AD and CA independently of the other 8 measured pathologies, NEFL and NEFM were noteworthy (Figure 1C). They are essential structural scaffolding proteins of neurons and released into CSF in many pathological processes that cause axonal damage[16]. Several studies found an association between CSF or blood NEFL level and AD diagnosis and progression[17-19]. Thus, we asked whether altered protein levels of NEFL and NEFM were specific to CA by performing a PWAS of each of the remaining pathologies adjusting for the other 8 measured pathologies in the discovery dataset. Strikingly, NEFL and NEFM were only associated with CA and not with β-amyloid, tangles, microinfarcts, gross infarcts, CAA, Lewy body, hippocampal sclerosis, or TDP-43 (Extended Data Figure 2A-E). We replicated the association between NEFL and NEFM proteins and CA in the replication dataset (Supplementary Table 16). Likewise, we replicated the associations between NEFL and NEFM protein abundances and AD in the replication dataset (Supplementary Table 16). NEFL and NEFM proteins showed a step-wise increase in abundance with worsening cognition from controls to MCI to AD. Notably, the associations between NEFL and NEFM proteins and AD were no longer significant when we adjusted for CA in the regression models, suggesting that CA mediates the association between NEFL, NEFM and AD. Together, these findings suggest NEFL and NEFM likely contribute to AD through CA and further studies are needed to elucidate the underlying biology of this association.
Extended Data Fig. 2
Association between NEFL or NEFM brain protein and cerebral atherosclerosis or AD dementia.
a) Boxplot of dorsolateral prefrontal cortex NEFL protein level for each level of CA (i.e., none to severe). NEFL protein levels were identified as associated with CA using linear regression adjusted for relevant covariates and 8 other pathologies (N=375; p=0.0002; adjusted p=0.034; Figure 1). b) Boxplot of dorsolateral prefrontal cortex NEFL protein level by clinical diagnosis (i.e., cognitively normal control [control], mild cognitive impairment [MCI], and AD dementia). MCI is generally regarded as an intermediate stage between cognitively normal and AD. NEFL protein levels were identified as associated with AD dementia by logistic regression adjusted for relevant covariates (N=383; p=0.0025; adjusted p=0.0322; Figure 1). c) Boxplot of dorsolateral prefrontal cortex NEFM protein level for each level of CA (i.e., none to severe). NEFM protein levels were identified as associated with CA using linear regression adjusted for relevant covariates and 8 other pathologies (N=375; p=0.0006; adjusted p=0.045). d) Boxplot of dorsolateral prefrontal cortex NEFM protein level by clinical diagnosis (i.e., cognitively normal control [control], mild cognitive impairment [MCI], and AD dementia). NEFM protein levels were identified as associated with AD dementia by logistic regression adjusted for relevant covariates (N=383; p=0.0019; adjusted p=0.028). For each boxplot, the box reflects the first and third quartiles, the dark horizontal line reflects the median, the vertical lines extending from the boxes show 1.5x the interquartile range, and points beyond the lines are outliers. e) Proteome-wide adjusted p-values for associations between NEFL and NEFM brain protein levels and each of the 9 measured pathologies adjusting for the remaining 8 measured pathologies. Associations were tested with regression and the provided p-values were adjusted for multiple testing using the Benjamini-Hochberg false discovery rate.
In summary, CA has proteomic effects on the human brain and contributes to AD. Furthermore, since CA makes an important and early contribution to AD development[3,4], surrogate brain biomarkers for CA might be promising and sensitive for early AD detection.
METHODS
Study design and participants in the discovery dataset (ROS/MAP cohorts)
A total of 391 participants were from two longitudinal clinical-pathologic cohort studies of aging and Alzheimer’s disease, the Religious Orders Study and the Rush Memory and Aging Project (ROS/MAP) cohorts (Supplementary Table 1)[20]. These are participants with proteomic data that passed quality control and were included in our proteomic analyses. ROS comprises older Catholic priests, nuns, and monks throughout the USA. MAP recruited older lay persons from the greater Chicago area. Both studies involve detailed annual cognitive and clinical evaluations and brain autopsy. Participants provided informed consent and signed an Anatomic Gift Act and a repository consent to allow their data and biospecimens to be repurposed. The studies were approved by an Institutional Review Board of Rush University Medical Center.
Cognitive diagnosis and brain pathologies in the discovery dataset (ROS/MAP cohorts)
Final clinical diagnosis of cognitive status, including no cognitive impairment, mild cognitive impairment (MCI), Alzheimer’s disease (AD) dementia, or dementia due to other causes, was determined at the time of death by a neurologist with expertise in dementia using all available clinical data but blinded to postmortem data. The diagnosis of dementia was based on the recommendation of the National Institute of Neurological and Communicative Disorders and Stroke and the AD and Related Disorders Association[21]. MCI was based on accepted criteria. Case conferences including one or more neurologists and a neuropsychologist were used for consensus as necessary.
Cerebral atherosclerosis was pathologically assessed by visual inspection of the vessels in the Circle of Willis including vertebral, basilar, posterior cerebral, middle cerebral, and anterior cerebral arteries and their proximal branches[9]. Severity of atherosclerosis was graded with a semiquantitative scale on the basis of the extent of involvement of each artery and the number of arteries involved. Scores ranged from 0 to 3 and were treated as a semiquantitative variable in our analyses. A score of zero means no significant atherosclerosis observed. A score of 1 (mild) indicates that small amounts of luminal narrowing were observed in up to several arteries without significant occlusion. A score of 2 (moderate) means that luminal narrowing occurred in up to half of all visualized major arteries with less than 50% occlusion of any single vessel.
Lastly, a score of 3 (severe) indicates that luminal narrowing occurred in more than half of all visualized arteries, and/or more than 75% occlusion of one or more vessels[9].
Presence of chronic gross infarcts was determined by neuropathologic evaluation blinded to clinical data and reviewed by a board-certified neuropathologist in nine regions (midfrontal, middle temporal, entorhinal, hippocampal, inferior parietal, anterior cingulate cortices, anterior basal ganglia, thalamus, and midbrain) and treated as a dichotomous variable of present or absent [22]. Presence of microinfarcts in nine brain regions was determined by neuropathological evaluation blinded to clinical data and reviewed by a board-certified neuropathologist and treated as a dichotomous variable [23].
Neurofibrillary tangles and β-amyloid were identified by molecularly specific immunohistochemistry and quantified by stereology and image analysis, respectively, in eight brain regions – hippocampus, entorhinal cortex, midfrontal cortex, inferior temporal, angular gyrus, calcarine cortex, anterior cingulate cortex, and superior frontal cortex [24]. Tangle density was determined using systematic sampling and was the average of the tangle densities in eight regions. β-amyloid score represents the percent area of the cortex occupied by β-amyloid and is calculated by taking the mean of β-amyloid scores in 8 brain regions. To create an approximately normal distribution of tangles and β-amyloid, we took the square root of tangle density and β-amyloid score, respectively, in our analyses.
Lewy body pathology was assessed using antibodies to α-synuclein in nine regions and was scored based on four stages of distribution of α-synuclein in the brain including 0 (not present), 1 (nigral-predominant), 2 (limbic-type) and 3 (neocortical-type), and treated as present or absent in our analyses [25].
Hippocampal sclerosis was identified as severe neuronal loss and gliosis in the hippocampus or subiculum using haematoxylin and eosin stain and treated as present or absent in our analyses [26]. TDP-43 cytoplasmic inclusions were assessed in six regions using antibodies to phosphorylated TDP-43 [27]. Inclusions in each region were rated on a six-point scale and the mean of the regional scores was created. TDP-43 scores were dichotomized into absent (i.e. mean score of 0) or present (mean score >0) in our analyses. Cerebral amyloid angiopathy (CAA) was assessed in the midfrontal, midtemporal, angular, and calcarine cortices using immunostain for β-amyloid [28]. Scores were averaged across these 4 regions and treated as a continuous measure.
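The derivation of the pathology covariates described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, and treating any Lewy body stage above 0 as "present" is an assumption based on stage 0 being labeled "not present".

```python
import math
from statistics import mean


def sqrt_of_regional_mean(regional_values):
    """Average a measure over brain regions, then take the square root.

    Mirrors the description for tangle density and beta-amyloid score.
    """
    return math.sqrt(mean(regional_values))


def tdp43_present(regional_scores):
    """Dichotomize TDP-43: absent if the mean regional score is 0."""
    return mean(regional_scores) > 0


def lewy_body_present(stage):
    """Dichotomize Lewy body stage (0-3); assumes stage > 0 is present."""
    return stage > 0


def caa_score(regional_scores):
    """Continuous CAA measure: mean over the four assessed cortices."""
    return mean(regional_scores)


# Hypothetical values for one decedent (illustration only).
tangles = sqrt_of_regional_mean([4.0] * 8)
tdp43 = tdp43_present([0, 0, 1, 0, 0, 0])
caa = caa_score([0, 1, 2, 1])
```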
Magnetic resonance imaging acquisition and voxelwise analysis in ROS/MAP cohorts
At autopsy, brains were hemisected in accordance with standard protocol and the cerebral hemisphere with more visible pathology was immersed in 4% paraformaldehyde solution and refrigerated at 4°C[29]. At approximately one month postmortem (mean=30 days; SD=19.2 days), the hemisphere from each decedent was imaged using a 3-Tesla MRI scanner after allowing the tissue to warm to room temperature. Four different scanners were used to acquire fast spin-echo T2-weighted MRI data with at least two different echo times (TE). Details of MRI acquisition have been previously reported[10,30-32]. The transverse relaxation time constant, T2, describes the decay rate of the transverse component of magnetization, which is sensitive to brain tissue’s free water content and paramagnetic materials, such as iron. As previously described[10,30,31], we quantified the reciprocal of T2, called R2, to reduce skewness and achieve a more normally distributed signal. Individual R2 volumes were spatially registered to a cerebral hemisphere template, using first linear and then nonlinear transformations[10]. In order to account for inter-scanner differences in R2, we carried out a voxel-by-voxel normalization of R2 values within each group of hemispheres imaged on a given scanner by subtracting the median and dividing by the interquartile range of R2 for that scanner. In addition, we carried out spatial smoothing with a full width at half maximum of 2 mm. Details of the postmortem MRI preprocessing procedure have been described previously[10,30,31]. Voxelwise linear regression analysis was applied to identify brain regions where postmortem R2 was associated with cerebral atherosclerosis after controlling for age at death, sex, and years of education. Thresholds were applied first on individual voxels (p<0.01) and then on individual clusters (p <0.05 corrected for multiple comparisons) based on Random Field theory[33,34].
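The scanner-wise normalization of R2 described above can be sketched for a single voxel as follows. This is a minimal illustration under stated assumptions: the function names and example values are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not specify how the interquartile range was computed.

```python
from statistics import median, quantiles


def t2_to_r2(t2):
    """R2 is the reciprocal of the transverse relaxation time T2."""
    return 1.0 / t2


def normalize_by_scanner(r2_values, scanner_ids):
    """Normalize one voxel's R2 across hemispheres, scanner by scanner.

    Each value becomes (value - median) / IQR, with the median and IQR
    computed only from hemispheres imaged on the same scanner.
    A scanner group with zero IQR would raise ZeroDivisionError.
    """
    groups = {}
    for value, scanner in zip(r2_values, scanner_ids):
        groups.setdefault(scanner, []).append(value)

    stats = {}
    for scanner, values in groups.items():
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        stats[scanner] = (median(values), q3 - q1)

    return [
        (value - stats[scanner][0]) / stats[scanner][1]
        for value, scanner in zip(r2_values, scanner_ids)
    ]


# Hypothetical R2 values at one voxel for ten hemispheres on two scanners.
r2 = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0, 28.0]
scanners = ["A"] * 5 + ["B"] * 5
normalized = normalize_by_scanner(r2, scanners)
```

After normalization the two scanner groups share the same center and spread, so a systematic offset between scanners no longer masquerades as a biological difference.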
Proteomic profiling for the discovery dataset (ROS/MAP cohorts)
The dorsolateral prefrontal cortex was chosen for examination because neuropathologic changes associated with AD are relatively late features of the disease in that region of the brain. We posit that this will enable detection of earlier changes in the brain proteome with disease. Post-mortem tissue from the dorsolateral prefrontal cortex (Brodmann area 9) was cortically dissected. Technicians were blinded to clinical outcomes of samples. Procedures for tissue homogenization were performed essentially as described previously [35]. Approximately 100 mg (wet tissue weight) of brain tissue was homogenized in 8 M urea lysis buffer (8 M urea, 10 mM Tris, 100 mM NaHPO4, pH 8.5) with HALT protease and phosphatase inhibitor cocktail (ThermoFisher) using a Bullet Blender (NextAdvance). Each Rino sample tube (NextAdvance) was supplemented with ~100 μL of stainless-steel beads (0.9 to 2.0 mm blend, NextAdvance) and 500 μL of lysis buffer. Tissues were added immediately after excision and samples were then placed into the bullet blender (in a 4 °C cold room). The samples were homogenized for 2 full 5 min cycles and the lysates were transferred to new Eppendorf Lobind tubes. Each sample was then sonicated for 3 cycles consisting of 5 s of active sonication at 30% amplitude followed by 15 s on ice. Samples were then centrifuged for 5 min at 15,000 g and the supernatant was transferred to a new tube. Protein concentration was determined by bicinchoninic acid (BCA) assay (Pierce). For protein digestion, 100 μg of each sample was aliquoted and volumes normalized with additional lysis buffer. An equal amount of protein from each sample was aliquoted and digested in parallel to serve as the global pooled internal standard (GIS) in each TMT batch described below. Samples were reduced with 1 mM dithiothreitol (DTT) at room temperature for 30 min, followed by 5 mM iodoacetamide (IAA) alkylation in the dark for another 30 min.
Lysyl endopeptidase (Wako) at 1:100 (w/w) was added and digestion continued overnight. Samples were then 7-fold diluted with 50 mM ammonium bicarbonate (AmBic). Trypsin (Promega) was then added at 1:50 (w/w) and digestion was carried out for another 16 h. The peptide solutions were acidified to a final concentration of 1% (vol/vol) formic acid (FA) and 0.1% (vol/vol) trifluoroacetic acid (TFA) and desalted with a 30 mg HLB column (Oasis). Each HLB column was rinsed with 1 mL of methanol, washed with 1 mL 50% (vol/vol) acetonitrile, and equilibrated with 2×1 mL 0.1% (vol/vol) TFA. The samples were then loaded and each column was washed with 2×1 mL 0.1% (vol/vol) TFA. Elution was performed with 2 rounds of 0.5 mL of 50% (vol/vol) acetonitrile.
Isobaric tandem mass tag (TMT) peptide labeling in the discovery dataset (ROS/MAP cohorts)
Prior to TMT labeling, samples were randomized by covariates (age, sex, post-mortem interval, cognitive diagnosis, and pathologies) into 50 total batches (8 cases per batch). Peptides from each individual case (N=400) and the GIS pooled standard (N=100) were labeled using the TMT 10-plex kit (ThermoFisher). In each batch, TMT channels 126 and 131 were used to label GIS standards, while the 8 middle TMT channels were reserved for individual samples following randomization. Labeling was performed as described previously [35,36]. Briefly, each sample (100 μg of peptides each) was re-suspended in 100 μL of 100 mM TEAB buffer. The TMT labeling reagents were equilibrated to room temperature, 256 μL of anhydrous acetonitrile was added to each reagent channel and gently vortexed for 5 min, and 41 μL of the corresponding TMT channels were transferred to the peptide suspensions. The samples were then incubated for 1 h at room temperature. The reaction was quenched with 8 μL of 5% (vol/vol) hydroxylamine. All 10 channels were then combined and dried by SpeedVac to approximately 150 μL and diluted with 1 mL of 0.1% (vol/vol) TFA and then acidified to a final concentration of 1% (vol/vol) FA and 0.1% (vol/vol) TFA. Sep-Pak desalting was performed with a 200 mg tC18 Sep-Pak column (Waters). Each Sep-Pak column was activated with 3 mL of methanol, washed with 3 mL of 50% (vol/vol) acetonitrile, and equilibrated with 2×3 mL of 0.1% TFA. The samples were then loaded and each column was washed with 2×3 mL 0.1% (vol/vol) TFA, followed by 2 mL of 1% (vol/vol) FA. Elution was performed with 2 rounds of 1.5 mL of 50% (vol/vol) acetonitrile. The elution was dried to completeness.
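The batch layout described above (50 batches, GIS in channels 126 and 131, eight cases in the middle channels) can be sketched as follows. This is only an illustration of the bookkeeping: the function name and case identifiers are hypothetical, and the simple seeded shuffle used here is an assumption standing in for the covariate-balanced randomization the study actually performed.

```python
import random

# Standard TMT 10-plex reporter channel names.
TMT10_CHANNELS = ["126", "127N", "127C", "128N", "128C",
                  "129N", "129C", "130N", "130C", "131"]
GIS_CHANNELS = {"126", "131"}  # first and last channels carry the pooled standard


def assign_batches(sample_ids, seed=0):
    """Place shuffled samples into TMT batches with GIS in channels 126/131.

    NOTE: a plain shuffle is an assumption; the study randomized by
    covariates (age, sex, post-mortem interval, diagnosis, pathologies).
    """
    sample_channels = [c for c in TMT10_CHANNELS if c not in GIS_CHANNELS]
    per_batch = len(sample_channels)
    if len(sample_ids) % per_batch != 0:
        raise ValueError("sample count must be a multiple of 8")

    shuffled = list(sample_ids)
    random.Random(seed).shuffle(shuffled)

    batches = []
    for start in range(0, len(shuffled), per_batch):
        layout = {channel: "GIS" for channel in GIS_CHANNELS}
        layout.update(zip(sample_channels, shuffled[start:start + per_batch]))
        batches.append(layout)
    return batches


# Hypothetical case identifiers for the 400 labeled cases.
batches = assign_batches([f"case_{i:03d}" for i in range(400)])
gis_count = sum(list(b.values()).count("GIS") for b in batches)
```

With 400 cases this yields 50 batches and 100 GIS channels, matching the counts given in the text.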
High-pH off-line fractionation in the discovery dataset (ROS/MAP cohorts)
High pH fractionation was performed as previously described[37] with slight modification. Dried samples were re-suspended in high pH loading buffer (0.07% vol/vol NH4OH; 0.045% vol/vol formic acid, 2% vol/vol acetonitrile) and loaded onto an Agilent ZORBAX 300Extend-C18 column (2.1 mm × 150 mm with 3.5 μm beads). An Agilent 1100 HPLC system was used to carry out the fractionation. Solvent A consisted of 0.0175% (vol/vol) NH4OH; 0.01125% (vol/vol) formic acid; 2% (vol/vol) acetonitrile and solvent B consisted of 0.0175% (vol/vol) NH4OH; 0.01125% (vol/vol) formic acid; 90% (vol/vol) acetonitrile. The sample elution was performed by a 58.6 min gradient with a flow rate of 0.4 mL/min. The gradient was held at 100% solvent A for 2 min, then went from 0% to 12% solvent B in 6 min, from 12% to 40% over 28 min, from 40% to 44% in 4 min, and from 44% to 60% in 5 min, and was then held at 60% solvent B for 13.6 min. A total of 96 individual fractions were collected across the gradient and subsequently pooled into 24 fractions and dried to completeness by SpeedVac.
For proteomic quantification, 90% of the TMT batches were analyzed using MS2 and only 10% using MS3. MS3 quantification greatly reduces the contribution of any interfering signals but has reduced sensitivity due to decreased dynamic range of reporter ion quantitation[38]. The use of MS2 acquisition can suffer from the presence of co-isolated and co-fragmented interfering ions that can obscure quantification; however, our use of high-pH offline fractionation with MS2 largely mitigated this issue[37,38].
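As a check on the gradient description above, the segments can be written out and summed; they add up to the stated 58.6 min. The sketch below also interpolates the solvent B percentage at a given time. The names are hypothetical, and linear ramps within each segment are an assumption (the text gives only the endpoints of each step).

```python
# Each segment: (duration in min, %B at start, %B at end), from the text.
HIGH_PH_GRADIENT = [
    (2.0, 0.0, 0.0),     # hold at 100% solvent A
    (6.0, 0.0, 12.0),
    (28.0, 12.0, 40.0),
    (4.0, 40.0, 44.0),
    (5.0, 44.0, 60.0),
    (13.6, 60.0, 60.0),  # hold at 60% solvent B
]


def total_minutes(segments):
    """Total run time of the gradient."""
    return sum(duration for duration, _, _ in segments)


def percent_b(segments, t):
    """Solvent B percentage at time t (min), assuming linear ramps."""
    if t < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in segments:
        if t <= elapsed + duration:
            return start + (end - start) * (t - elapsed) / duration
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")


run_time = total_minutes(HIGH_PH_GRADIENT)
b_at_8_min = percent_b(HIGH_PH_GRADIENT, 8.0)
b_at_22_min = percent_b(HIGH_PH_GRADIENT, 22.0)
```

Pooling 96 collected fractions into 24 implies four collected fractions per pooled fraction; which fractions were combined is not stated here, so no pooling scheme is sketched.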
Mass spectrometry analysis in the discovery dataset (ROS/MAP cohorts)
All fractions were resuspended in an equal volume of loading buffer (0.1% formic acid, 0.03% trifluoroacetic acid, 1% acetonitrile) and analyzed by liquid chromatography coupled to mass spectrometry essentially as described [39] with slight modifications. Peptide eluents were separated on a self-packed C18 (1.9 μm; Dr. Maisch, Germany) fused silica column (25 cm × 75 μm internal diameter; New Objective, Woburn, MA) by a Dionex UltiMate 3000 RSLCnano liquid chromatography system (ThermoFisher Scientific) and monitored on an Orbitrap Fusion mass spectrometer (ThermoFisher Scientific). Sample elution was performed over a 180-min gradient with a flow rate of 225 nL/min. The gradient went from 3% to 7% buffer B over 5 min, from 7% to 30% over 140 min, from 30% to 60% over 5 min, and from 60% to 99% over 2 min, was held at 99% for 8 min, and was returned to 1% for an additional 20 min to equilibrate the column. The mass spectrometer was set to acquire in data-dependent mode using the top speed workflow with a cycle time of 3 seconds. Each cycle consisted of 1 full scan followed by as many MS/MS (MS2) scans as could fit within the time window. The full scan (MS1) was performed with an m/z range of 350–1500 at 120,000 resolution (at 200 m/z) with AGC (automatic gain control) set at 4×10⁵ and a maximum injection time of 50 ms. The most intense ions were selected for higher energy collision-induced dissociation (HCD) at 38% collision energy with an isolation of 0.7 m/z, a resolution of 30,000, an AGC setting of 5×10⁴, and a maximum injection time of 100 ms. Five of the 50 TMT batches were run on the Orbitrap Fusion mass spectrometer using the SPS-MS3 method as previously described [35].
Database Searches and protein quantification in the discovery dataset (ROS/MAP cohorts)
All raw files were analyzed using the Proteome Discoverer suite (version 2.3 ThermoFisher Scientific). MS2 spectra were searched against the canonical UniProtKB Human proteome database (Downloaded February 2019 with 20,338 total sequences). The Sequest HT search engine was used and parameters were specified as: fully tryptic specificity, maximum of two missed cleavages, minimum peptide length of 6, fixed modifications for TMT tags on lysine residues and peptide N-termini (+229.162932 Da) and carbamidomethylation of cysteine residues (+57.02146 Da), variable modifications for oxidation of methionine residues (+15.99492 Da) and deamidation of asparagine and glutamine (+0.984 Da), precursor mass tolerance of 20 ppm, and a fragment mass tolerance of 0.05 Da for MS2 spectra collected in the Orbitrap (0.5 Da for the MS2 from the SPS-MS3 batches). Percolator was used to filter peptide spectral matches (PSM) and peptides to a false discovery rate (FDR) of less than 1%. Following spectral assignment, peptides were assembled into proteins and were further filtered based on the combined probabilities of their constituent peptides to a final FDR of 1%. In cases of redundancy, shared peptides were assigned to the protein sequence in adherence with the principles of parsimony. Reporter ions were quantified from MS2 or MS3 scans using an integration tolerance of 20 ppm with the most confident centroid setting.
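To make the two kinds of tolerance above concrete, the following sketch converts a relative tolerance in ppm into an absolute window and checks whether an observed value falls inside it. It is only an illustration of the arithmetic: the function names are hypothetical, the tolerance is applied to m/z values for simplicity, and nothing here reproduces how Sequest HT applies tolerances internally.

```python
def ppm_to_da(mz, ppm):
    """Convert a relative tolerance (ppm) at a given m/z into an absolute window."""
    return mz * ppm / 1e6


def within_ppm(observed_mz, theoretical_mz, ppm=20.0):
    """Return True if observed_mz lies within +/- ppm of theoretical_mz."""
    return abs(observed_mz - theoretical_mz) <= ppm_to_da(theoretical_mz, ppm)


if __name__ == "__main__":
    # Width of a 20 ppm window at several precursor m/z values.
    for mz in (350.0, 1000.0, 1500.0):
        print(mz, ppm_to_da(mz, 20.0))
```

Because a ppm tolerance scales with m/z, the 20 ppm precursor window is narrower at the low end of the 350–1500 scan range than at the high end, whereas the 0.05 Da fragment tolerance is a fixed width.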
Quality control of proteomic profiles in the discovery dataset (ROS/MAP cohorts)
For each batch, the global internal standards were used to identify protein measurements outside of the 95% confidence interval, which were set to missing. Proteins with missing values in more than 50% of the 400 subjects were excluded. Each protein abundance was then scaled by a sample-specific total protein abundance to remove effects of protein loading differences, and then log2 transformed. Outlier samples were identified and removed through iterative principal component analysis (PCA). In each iteration, samples more than four standard deviations from the mean of either the first or second principal component were removed, and principal components were recalculated for the next iteration. A total of 9 subjects were removed after three rounds of PCA. After quality control, 391 subjects with 8356 proteins were included in our analyses. The normalized abundance of each of these 8356 proteins was the residual of a linear regression removing the effects of protein batch, MS2 versus MS3 reporter quantitation mode, sex, age at death, postmortem interval, and study (ROS vs. MAP). The normalized abundance of these proteins was used in our analyses.
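The missingness filter and the loading normalization described above can be sketched as follows. This is a minimal illustration on a toy table, not the authors' pipeline: the data layout (a dictionary mapping each protein to a list of per-sample values, with None for missing), the function names, and the choice to compute sample totals after filtering are all assumptions.

```python
import math


def filter_by_missingness(abundance, max_missing_fraction=0.5):
    """Drop proteins whose fraction of missing (None) values exceeds the cutoff."""
    kept = {}
    for protein, values in abundance.items():
        missing = sum(1 for v in values if v is None)
        if missing / len(values) <= max_missing_fraction:
            kept[protein] = values
    return kept


def scale_and_log2(abundance):
    """Divide each value by its sample's total abundance, then log2-transform."""
    n_samples = len(next(iter(abundance.values())))
    # Sample-specific total over all non-missing proteins (assumed definition).
    totals = [
        sum(values[j] for values in abundance.values() if values[j] is not None)
        for j in range(n_samples)
    ]
    return {
        protein: [
            None if v is None else math.log2(v / totals[j])
            for j, v in enumerate(values)
        ]
        for protein, values in abundance.items()
    }


if __name__ == "__main__":
    toy = {
        "P1": [10.0, 20.0, 30.0],
        "P2": [30.0, None, 10.0],
        "P3": [None, None, 5.0],  # missing in 2 of 3 samples, so it is dropped
    }
    print(scale_and_log2(filter_by_missingness(toy)))
```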
Estimate of brain cell type composition in the discovery dataset (ROS/MAP cohorts)
To account for the heterogeneity of cell types in brain tissue in our analyses, we estimated the proportions of neurons, astrocytes, oligodendrocytes, and microglia using the CIBERSORT pipeline with Sharma’s cell type-specific proteins as the reference [40,41]. We modified the CIBERSORT pipeline by taking the absolute values of the negative regression coefficients instead of setting them to zero. This minor modification produced estimates of cell types similar to those obtained using the original CIBERSORT pipeline and RNA sequencing data. We included the proportion of each of these four cell types as covariates in our proteomic analyses.
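The difference between the two ways of handling negative coefficients can be illustrated with a small sketch. It assumes the deconvolution regression has already produced one raw coefficient per cell type and that proportions are obtained by normalizing the coefficients to sum to 1; the function name, the `negative_handling` parameter, and the toy coefficients are hypothetical.

```python
def to_proportions(coefficients, negative_handling="abs"):
    """Convert raw regression coefficients into proportions that sum to 1.

    negative_handling="zero": set negative coefficients to 0.
    negative_handling="abs":  take the absolute value of negative coefficients
                              (the modification described in the text).
    """
    if negative_handling == "zero":
        cleaned = {k: max(v, 0.0) for k, v in coefficients.items()}
    elif negative_handling == "abs":
        cleaned = {k: abs(v) for k, v in coefficients.items()}
    else:
        raise ValueError("negative_handling must be 'zero' or 'abs'")
    total = sum(cleaned.values())
    if total == 0:
        raise ValueError("all coefficients are zero; proportions undefined")
    return {k: v / total for k, v in cleaned.items()}


if __name__ == "__main__":
    # Toy coefficients for one sample; values are made up for illustration.
    raw = {"neuron": 0.6, "astrocyte": 0.3, "oligodendrocyte": 0.2, "microglia": -0.1}
    print(to_proportions(raw, "zero"))
    print(to_proportions(raw, "abs"))
```

With the absolute-value variant, a cell type with a small negative coefficient keeps a small non-zero proportion instead of being dropped to exactly zero.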
Weighted protein co-expression network analysis in the discovery dataset (ROS/MAP cohorts)
Weighted protein co-expression network analysis was performed using the Weighted Gene Co-expression Network Analysis (WGCNA) R package on normalized protein abundance from which the effects of batch, sex, age at death, PMI, and study had been removed [42]. We applied the following parameters: soft-thresholding power of 12, “bicor” correlation type, “signed” network type, “signed” Topological Overlap Matrix, a minimum module size of 30, a p-value ratio threshold for reassigning proteins between modules of 0.05, a low propensity to merge modules (“mergeCutHeight” of 0.07), and a high sensitivity to cluster splitting (deepSplit=4). Among the 8356 proteins, 62% (5199 proteins) were assigned to 31 modules, with the largest module having 518 proteins and the smallest module having 46 proteins.
We defined hub proteins as those in the top 20% with regard to connectivity. To determine the hub proteins for each module, the signed eigenprotein-based connectivity, which is the correlation of the protein with its corresponding module eigenprotein, was calculated using the signedKME function from the WGCNA package with the parameter corFnc set to “bicor” [43].
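Once the eigenprotein-based connectivity (kME) of each protein in a module is available, selecting hub proteins reduces to ranking and taking the top fraction. The sketch below shows only that step. It takes kME values as given (for example, exported from the signedKME output) rather than recomputing them, and the function name, the toy values, and the choice to round the cutoff up are assumptions.

```python
import math


def hub_proteins(kme_by_protein, top_fraction=0.20):
    """Return the proteins in the top `top_fraction` of one module, ranked by kME."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ranked = sorted(kme_by_protein, key=kme_by_protein.get, reverse=True)
    n_hubs = math.ceil(top_fraction * len(ranked))  # rounding up is an assumption
    return ranked[:n_hubs]


if __name__ == "__main__":
    # Toy module with 10 proteins and made-up kME values.
    module_kme = {f"protein_{i}": 0.95 - 0.05 * i for i in range(10)}
    print(hub_proteins(module_kme))
```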
Protein-protein interaction and protein domain composition analyses
Lists of pairwise protein interactions were downloaded from the BioGRID database (v3.5.179, October 29, 2019) then culled for interactions containing only human gene symbols, and culled further to only include physical protein-protein interactions among the 770 gene products representing module 3, module 9, and 23 proteins associated with both cerebral atherosclerosis and AD dementia from the proteome-wide association study at FDR < 0.05. Modules 3 and 9 were chosen because they were associated with both cerebral atherosclerosis and AD dementia in our analysis. Edges and nodes among proteins with pairwise interactions among the two modules and the 23 proteins were visualized in Cytoscape[44].
Enrichment analysis for pathways and gene ontology in ROS/MAP cohorts
Gene set enrichment analysis of protein co-expression modules was performed using GO-Elite with the background set to all 8356 proteins quantified in the ROS/MAP cohorts [45]. The protein list for each module was subjected to a Fisher exact test and Z-test in the Python command line version of GO-Elite (v1.2.5) against the current annotation database for Gene Ontology Biological Process, Molecular Function, Cellular Component, Wiki Pathways, KEGG terms, and Sharma cell types downloaded in January 2019 [41]. Similarly, gene set enrichment analysis of the cerebral atherosclerosis-associated proteins was performed using the GO-Elite software as described above.
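The Fisher exact test used for over-representation can be written directly from the hypergeometric distribution. The following is a minimal sketch of a one-sided enrichment p-value, not GO-Elite's own code; the function and argument names are hypothetical, and whether GO-Elite reports a one-sided or two-sided value is not stated in the text.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher exact (hypergeometric) p-value, P(X >= k).

    N: number of proteins in the background (e.g. all quantified proteins)
    K: background proteins annotated with the term
    n: number of proteins in the module
    k: module proteins annotated with the term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


if __name__ == "__main__":
    # Toy example: 10 background proteins, 4 annotated, module of 3 with 2 annotated.
    print(enrichment_p_value(k=2, n=3, K=4, N=10))
```

Exact integer arithmetic keeps the sum stable even for a background of 8356 proteins, at the cost of working with very large integers.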
Clinical and pathologic characterization of the Baltimore Longitudinal Study of Aging (BLSA) cohort (replication dataset)
The replication cohort was 47 subjects from the BLSA with brain proteomic data (Supplementary Table 2). BLSA is a prospective cohort study of aging in community-dwelling individuals [46]. It continuously recruits healthy volunteers aged 20 or older and follows them for life regardless of changes in health or functional status. The BLSA study was approved by the Institutional Review Board and the National Institute on Aging. Cerebral atherosclerosis was assessed by visual examination of the vessels in the Circle of Willis as described previously [47]. Areas of atherosclerosis were identified and then sectioned to identify the degree of stenosis by visual inspection and rated on a scale of minimal (<10%), mild (10% [...] [48]. Briefly, silver staining was performed to assess severity of neuritic plaques using CERAD (score ranges from 0 to 3) and of neurofibrillary tangles using Braak (score ranges from 1 to 6), with higher scores indicating higher level of severity. Additionally, cerebral amyloid angiopathy was assessed and rated as mild (rare staining), moderate (scattered, incomplete staining), or severe (circumferential, diffuse staining) and treated as a semi-quantitative variable. Gross infarcts were assessed and quantified as absent versus present anywhere in the brain. Likewise, microinfarcts were assessed and determined as absent versus present in any location of the brain. All BLSA participants provided written informed consent and the study was approved by the Institutional Review Board of the National Institute on Aging.
Proteomic profiling, quality control, and normalization in the BLSA replication cohort
Proteins were profiled from the dorsolateral prefrontal cortex using an analysis pipeline consisting of isobaric tandem mass tag (TMT) mass spectrometry and offline prefractionation as described in detail previously [36]. TMT labeling was performed following the manufacturer’s protocol. Technicians were blinded to clinical outcomes of samples. MS/MS spectra were searched against a UniProt human database using Proteome Discoverer 2.1 (ThermoFisher Scientific) as described in detail previously [36]. The global internal standard (GIS) mixture was checked for extreme outliers, and values outside of the range of log2(0.01) to log2(100) were excluded from analysis [36]. Proteins with more than four unquantifiable batches (out of a total of 8 batches) due to a 0 or NA value for the GIS channel were excluded from consideration. Proteins with missing values in more than 50% of the samples were excluded. After quality control, 6533 proteins were detected [36]. Effects of sex, age at death, batch, and postmortem interval were regressed out using non-parametric bootstrap regression [36]. The normalized proteomic abundance was used for our analyses.
Statistical analysis
A proteome-wide association study (PWAS) of clinical AD dementia (no cognitive impairment versus MCI/AD dementia), neurofibrillary tangles, β-amyloid, and cerebral atherosclerosis was performed, individually, using regression on the normalized protein abundance, adjusting for sex, age at death, proportions of neuron, astrocyte, oligodendrocyte, and microglia, and other covariates as specified. Continuous outcomes were visually inspected for normality of distribution, and neurofibrillary tangles and β-amyloid were square-root transformed to improve normality. The Breusch-Pagan test indicated that the equal variance assumption was met for linear regression. Of note, we adjusted for sex, age at death, and cell type composition in all the PWAS. Meta-analysis of the findings from the discovery (ROS/MAP) and replication (BLSA) datasets was performed with METAL using effect size estimates and standard errors [49].
Associations between protein co-expression modules and clinical AD dementia, neurofibrillary tangles, β-amyloid, and cerebral atherosclerosis, individually, were examined using linear regression, adjusting for sex, age at death, estimated proportions of neuron, astrocyte, oligodendrocyte, and microglia, and other covariates as specified. Again, we adjusted for sex, age at death, and cell type composition in all of the protein co-expression module analyses. For all analyses, multiple testing was addressed with the Benjamini-Hochberg (BH) false discovery rate (FDR) [50].
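Two of the statistical steps above have compact closed forms. The sketch below shows a fixed-effects inverse-variance weighted meta-analysis, which is what combining effect size estimates with standard errors amounts to, and the Benjamini-Hochberg adjustment of a list of p-values. It is an illustration under assumptions and not a reimplementation of METAL: the function names and the toy inputs are hypothetical.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effects meta-analysis: returns (pooled beta, pooled SE, two-sided p)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy discovery and replication estimates for one protein.
    print(inverse_variance_meta([0.2, 0.4], [0.1, 0.1]))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```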
Association between cerebral atherosclerosis and oligodendrocyte abundance
This figure summarizes the association between cerebral atherosclerosis (CA) and oligodendrocyte abundance estimated from single-cell nuclear RNA sequencing of human dorsolateral prefrontal cortex in an independent dataset. Higher cerebral atherosclerosis was associated with higher oligodendrocyte abundance in the dorsolateral prefrontal cortex after adjusting for sex, age, and 8 other measured pathologies using linear regression (p=0.003, β=0.077, N=48). For each box plot, the box reflects the first and third quartiles, the dark horizontal line reflects the median, the vertical lines extending from the boxes show 1.5× the interquartile range, and points beyond the lines are outliers.
Association between NEFL or NEFM brain protein and cerebral atherosclerosis, AD dementia.
a) Boxplot of dorsolateral prefrontal cortex NEFL protein level for each level of CA (i.e., none to severe). NEFL protein levels were identified as associated with CA using linear regression adjusted for relevant covariates and 8 other pathologies (N=375; p=0.0002; adjusted p=0.034; Figure 1). b) Boxplot of dorsolateral prefrontal cortex NEFL protein level by clinical diagnosis (i.e., cognitively normal control [control], mild cognitive impairment [MCI], and AD dementia). MCI is generally regarded as an intermediate stage between cognitively normal and AD. NEFL protein levels were identified as associated with AD dementia by logistic regression adjusted for relevant covariates (N=383; p=0.0025; adjusted p=0.0322; Figure 1). c) Boxplot of dorsolateral prefrontal cortex NEFM protein level for each level of CA (i.e., none to severe). NEFM protein levels were identified as associated with CA using linear regression adjusted for relevant covariates and 8 other pathologies (N=375; p=0.0006; adjusted p=0.045). d) Boxplot of dorsolateral prefrontal cortex NEFM protein level by clinical diagnosis (i.e., cognitively normal control [control], mild cognitive impairment [MCI], and AD dementia). NEFM protein levels were identified as associated with AD dementia by logistic regression adjusted for relevant covariates (N=383; p=0.0019; adjusted p=0.028). For each boxplot, the box reflects the first and third quartiles, the dark horizontal line reflects the median, the vertical lines extending from the boxes show 1.5× the interquartile range, and points beyond the lines are outliers. e) Proteome-wide adjusted p-values for associations between NEFL and NEFM brain protein levels and each of the 9 measured pathologies, adjusting for the remaining 8 measured pathologies. Associations were tested with regression and the provided p-values were adjusted for multiple testing using the Benjamini-Hochberg false discovery rate.