Literature DB >> 33067813

Assessing the Role of Long Noncoding RNA in Nucleus Accumbens in Subjects With Alcohol Dependence.

John Drake¹, Gowon O McMichael², Eric Sean Vornholt², Kellen Cresswell³, Vernell Williamson⁴, Chris Chatzinakos², Mohammed Mamdani², Siddharth Hariharan⁵, Kenneth S Kendler^2,6,7, Gursharan Kalsi⁸, Brien P Riley^2,6,7, Mikhail Dozmorov³, Michael F Miles^2,9, Silviu-Alin Bacanu^2,6, Vladimir I Vladimirov^2,6,10,11.

Abstract

BACKGROUND: Long noncoding RNA (lncRNA) have been implicated in the etiology of alcohol use. Since lncRNA provide another layer of complexity to the transcriptome, assessing their expression in the brain is the first critical step toward understanding lncRNA functions in alcohol use and addiction. Thus, we sought to profile lncRNA expression in the nucleus accumbens (NAc) in a large postmortem alcohol brain sample.
METHODS: LncRNA and protein-coding gene (PCG) expressions in the NAc from 41 subjects with alcohol dependence (AD) and 41 controls were assessed via a regression model. Weighted gene coexpression network analysis was used to identify lncRNA and PCG networks (i.e., modules) significantly correlated with AD. Within the significant modules, key network genes (i.e., hubs) were also identified. The lncRNA and PCG hubs were correlated via Pearson correlations to elucidate the potential biological functions of lncRNA. The lncRNA and PCG hubs were further integrated with GWAS data to identify expression quantitative trait loci (eQTL).
RESULTS: At Bonferroni adj. p-value ≤ 0.05, we identified 19 lncRNA and 5 PCG significant modules, which were enriched for neuronal and immune-related processes. In these modules, we further identified 86 and 315 PCG and lncRNA hubs, respectively. At false discovery rate (FDR) of 10%, the correlation analyses between the lncRNA and PCG hubs revealed 3,125 positive and 1,860 negative correlations. Integration of hubs with genotype data identified 243 eQTLs affecting the expression of 39 and 204 PCG and lncRNA hubs, respectively.
CONCLUSIONS: Our study identified lncRNA and gene networks significantly associated with AD in the NAc, coordinated lncRNA and mRNA coexpression changes, highlighting potentially regulatory functions for the lncRNA, and our genetic (cis-eQTL) analysis provides novel insights into the etiological mechanisms of AD.

Entities: Chemical Disease Gene Mutation Species

Keywords: eQTLs; gene expression; long noncoding RNA; nucleus accumbens; postmortem brain

Year: 2020 PMID： 33067813 PMCID： PMC7756309 DOI： 10.1111/acer.14479

Source DB: PubMed Journal: Alcohol Clin Exp Res ISSN： 0145-6008 Impact factor: 3.455

Alcohol use disorder (AUD) is a chronic and debilitating disease, with an estimated heritability of around 50% (Verhulst et al., 2015). Previous postmortem brain expression studies have shown that chronic alcohol consumption leads to broad transcriptional changes in different brain regions (Warden and Mayfield, 2017). Gene expression studies in the prefrontal cortex have identified genes encoding to GABAA receptor subunits or related to mitochondrial function (Buckley et al., 2000; Fan et al., 1999) as well as in functions related to myelination, cell cycling, oxidative stress, and transcription (Farris et al., 2015; Flatscher‐Bader et al., 2005, 2006; Iwamoto et al., 2004; Kapoor et al., 2019; Lewohl et al., 2000; Mayfield et al., 2002; Ponomarev et al., 2012; Rao et al., 2019; Sutherland et al., 2014; Zhang et al., 2014). Expression studies in nucleus accumbens (NAc) and ventral tegmental area have also revealed gene expression changes related to cell architecture, signaling, vesicle formation, and synaptic transmission (Flatscher‐Bader et al., 2010, 2006, 2008; Mamdani et al., 2015; Rao et al., 2019). These findings suggest that there are region‐specific susceptibilities and adaptations to chronic alcohol consumption in the brain that are likely to have a distinct effect on the behavioral phenotypes comprising AUD (Flatscher‐Bader et al., 2010). Evaluation of the regulatory mechanisms underlying genetic differentiation is necessary for a better understanding of the neurobiology of alcohol addiction (Ron and Messing, 2013). The recent emergence of Long noncoding RNA (lncRNA) has provided an additional layer of transcriptional and translational control, highlighting potentially important neurobiological mechanisms of AUD that could be missed if only the protein‐coding genome was studied. LncRNA are longer than 200 base pairs with limited or no protein‐coding potential (Jarroux et al., 2017), and while functionally not fully characterized yet, they have been shown to participate in chromatin remodeling (Han and Chang, 2015), transcriptional and posttranscriptional regulation (Clark and Blackshaw, 2014), and in sequestering miRNA (Du et al., 2016; Shan et al., 2018). LncRNA have also been implicated in the plasticity of neuronal circuity (Wang et al., 2017) and neurodegenerative and neuropsychiatric disorders, including substance abuse and alcohol dependence (AD; Bohnsack et al., 2019; Zuo et al., 2016). The potential role of lncRNA in the etiology of AUD is further supported by several recent alcohol‐related genomewide association scans (GWAS; Adkins et al., 2017; Xu et al., 2015). Due to the complex nature of lncRNA functions, the relationship between lncRNA and alcohol consumption is largely unknown; while miRNA roles in the control of gene expression and functions in the brains of subjects with AUD have been reported (Li et al., 2013; Mamdani et al., 2015; Mizuo et al., 2012; Pietrzykowski et al., 2008; Yadav et al., 2011), to our knowledge the interactions between lncRNA and protein‐coding gene (PCG) are still understudied. Gene expression alone cannot explain the complex etiology of AUD, and assessing lncRNA and PCG expression in the context of available genetic data is necessary to discern the genetic basis of AUD susceptibility. This approach models associations between genetic variants and gene expression as quantitative traits, that is, expression quantitative trait loci (eQTL; Farris et al., 2010; Li, 2013; Nica and Dermitzakis, 2013). EQTLs can help discover unknown AUD risk loci or offer specific, testable, hypotheses for the genetic impact of polymorphisms associated with AUD (Hindorff et al., 2009; Liu, 2011; Shastry, 2009). Linkage disequilibrium (LD) of eQTLs with genetic variants implicated in AUD can further establish a biological mechanism for disease‐associated variants with no apparent functions; there is empirical evidence suggesting that eQTLs are overrepresented among GWAS signals (Nica et al., 2010; Nicolae et al., 2010). We hypothesize that lncRNA are involved in the neuropathology of AUD and that, on a molecular level, this involvement is manifested by dysregulated patterns of expression between cases with AD and controls. To further elucidate the lncRNA disease functions in subjects with AD, we performed a weighted gene coexpression network analysis (WGCNA) that led to the identification of lncRNA and PCG modules significantly correlated with AD. Key genes from these lncRNA and PCG modules (i.e., “hubs”) were then integrated to identify a set of interacting lncRNA and PCG. Since lncRNA are not yet annotated, we used system approaches to assign biological functions for the PCG hubs interacting with the lncRNA hubs. By integrating our genomewide SNP and expression data, we also identified eQTL affecting the expression of lncRNA and PCG in NAc. Thus, the overall goals of this study were first to profile the lncRNA and PCG expressions in NAc, identify protein‐coding and lncRNA networks, and finally test whether these are under the control of specific genetic elements. To our knowledge, this is the only study to comprehensively assess the lncRNA expression in NAc between subjects with AD and controls.

MATERIALS AND METHODS

Postmortem Tissue

Brain tissue from 41 cases with AD and 41 controls was received from the Australian Brain Donor Program, New South Wales Tissue Resource Centre, which is supported by The University of Sydney, National Health and Medical Research Council of Australia, Schizophrenia Research Institute, National Institute of Alcohol Abuse and Alcoholism, and the New South Wales Department of Health (http://sydney.edu.au/medicine/pathology/trc/; Supplemental Methods; Table S1).

RNA Isolation and Sample Selection

Total RNA was isolated from 50 mg of frozen tissue from NAc using the mirVana‐PARIS kit (Thermo Fisher, Carlsbad, CA), following manufacturer's protocols. RNA concentration was measured using the Quant‐iT Broad Range RNA Assay kit (Life Technologies), and the RNA integrity number (RIN) was measured on the Agilent 2100 Bioanalyzer (Agilent Technologies, Inc., Santa Clara, CA).

Expression Arrays and Data Normalization

The RNA samples were assayed using the Arraystar Human lncRNA Array v3.0 (Rockville, MD) that is designed to profile both 30,586 lncRNA and 26,109 PCG. Linear Models for Microarray Data (limma) v3.42 (Ritchie et al., 2015) package in R environment (v3.6.1) was used to log2 transform, background correct (method = “normexp”), and normalize (method = “quantile”) each microarray. Control probes were removed prior to downstream analysis, leaving 61,464 probes remaining. As each probe recognizes a specific and unique transcript, probes were used in all subsequent analyses. The final number of probes that were used in the subsequent analyses was 61,464. The efficiency of batch effect removal and overall array quality were assessed using principal components analysis on the expression values in which each array is plotted along the first 3 principal components (PCs) to identify potential outliers. Of the 82 samples, 9 samples were excluded due to low RINs (i.e., RINs ≤ 4), leaving 73 samples for the microarray run. Of the 73 samples run on the array, 8 samples (5 cases and 3 controls) did not load on 2 of the first 3 PCs and were removed from the subsequent analysis. Finally, 2 subjects lacked corresponding genotypic data and were excluded leaving a final sample of 63 subjects for all subsequent analyses.

Statistical Analyses

Differential Expression Analysis

Differential expression analysis was performed by limma using ordinary least square. Number Cruncher Statistical Software (NCSS) v12 was used to extract the first 3 PCs for all technical (e.g., Batch, RIN, PMI, pH) and biological (e.g., sex, smoking, age, brain weight, hemisphere, neuropathology) confounds, which were then used as covariates in the subsequent analysis (Mamdani et al., 2015). Microarray reliability was validated by assessing the expression of 5 genes at the Arraystar facilities using quantitative PCR (qPCR). The assessed genes showed a high mean correlation (Pearson r = 0.88 [SD ± 0.049]) between the 2 platforms (Supplemental Methods and Table S2).

WGCNA

WGCNA has been heavily applied to cancer phenotypes, which are associated with dramatic changes to the transcriptome (Gibb et al., 2011; Peng et al., 2015; Uhlen et al., 2016). Given that the effect sizes for neuropsychiatric and addiction disorders are substantially lower, preselection of genes to be included in WGCNA is necessary for the identification of propitious targets without violating the assumption of scale‐free topology. Thus, a relaxed nominal significance for differentially expressed genes (i.e., p‐value ≤ 0.10) was chosen to: (i) include genes with smaller effect sizes, albeit true positive signals, (ii) exclude genes with low disease variance, and (iii) provide a sufficient number of genes for the network analysis, which was performed separately on the differentially expressed lncRNA and PCG using the WGCNA v1.68 package in R environment v3.6.1. While WGCNA allows for the inclusion of additional variables, it does not correct for their effects; therefore, the gene networks were built using the residuals from the regression model, that is, fitting a regression model with all potential confounds, except disease status. WGCNA relies on pairwise Pearson correlations to generate a signed similarity matrix, selecting for positive correlations only. The signed similarity matrix of the lncRNA and PCG expression data was raised to the lowest power (β = 5 and β = 10, respectively) that approximated a scale‐free network topology (R 2 > 0.85), to generate an adjacency matrix (Fig. S1A,B). A topological overlap measure (TOM) was calculated to assess transcript interconnectedness. A dissimilarity measure was derived from TOM and was subsequently used for average linkage hierarchical clustering. The clustered genes were defined into modules using the “tree” method with a minimum module size of 25 genes and a minimum module merge height of 0.8. Following module definition, the first principal component of each module—the module eigengene (ME)—was calculated as a synthetic gene representing the expression profile of all genes within a given module. With the exception of the gray module, which contains genes poorly defined by other modules, modules are arbitrary named after a conventional color scheme. After naming, modules are correlated to AD case status, matching demographics, and relevant covariates. Statistical significance was assessed at Bonferroni adj. p ≤ 0.05 (corrected for the number of tested modules). As previously outlined, for the identification of genes with a high module membership (MM) and strong phenotype association (i.e., the hub genes), we applied 2 selection criteria: (i) genes with the high intramodular connectivity (r 2 ≥ 0.7) and (ii) significantly correlated with AD (i.e., p‐value ≤ 0.05; Mamdani et al., 2015; Ponomarev et al., 2012).

Weighted Key Driver Analysis

As lncRNA genes can act locally, near their transcription site, and mechanically heterogenous, there was concern that some lncRNAs would build smaller modules, which may not be captured by WGCNA (Marchese et al., 2017). To overcome this limitation, we used the weighted key driver analysis (WKDA) from the mergeomics v1.14.0 R package (Shu et al., 2016). WKDA identifies “key drivers,” which are genes whose local connectivity to neighboring genes of prespecified modules (i.e., modules significantly correlated with AD) is greater than what would be expected by chance. This connectivity score takes into account the topology and edge weights of a gene to its neighbors, which was generated by WGCNA as a TOM matrix. As the TOM matrix topology is complete, the edgefactor parameter was set to 1 to maximize the influence of the edge weights. Key drivers, which had an FDR of 5%, were identified as hub genes and merged with those found by WGCNA.

Gene Set Enrichment Analysis

Prior to running the enrichment analysis, all transcript IDs from the PCG modules significantly correlated with AD case status were converted to HUGO Gene Nomenclature Committee (HGNC; Gray et al., 2013). Our gene enrichment analysis was based on 2 approaches: (i) a brain‐related enrichment and (ii) enrichment for specific gene ontology terms. In the brain‐related enrichment, we created a curated list derived from both of WGCNA’s brain‐related cell‐type lists, PsychENCODE’s brain cell‐type marker genes, and a neural gene set comprised of physiological and behavioral marker genes (Schrode et al., 2019; Wang et al., 2018). The significance of the brain‐related enrichment was estimated via a hypergeometric test using WGCNA’s “userListEnrichment” function with a significant enrichment declared at FDR of 10%. To detect a broader enrichment for known biological processes and pathways, we used the gene set enrichment analysis (GSEA) software (v4.0.1) from the Broad Institute as previously described (Hung et al., 2012; Subramanian et al., 2005). Due to the poor lncRNA annotation, in this analysis, the individual genes from each of the significant PCG modules only were generated by rank‐ordering all genes by their MM to each of the significant AD modules. We derived the a priori gene sets from the Molecular Signatures Database v7.0 (MSigDB; http://www.broadinstitute.org/gsea/msigdb) from the Broad Institute. A total of 5,501 gene sets from the Canonical Pathways subset of the C2: Curated Pathways collection of MSigDB were assessed. Default parameters were then applied to give a minimum, and maximum a priori gene set size between 15 and 500 genes, respectively. The identified enriched pathways of both approaches were further adjusted for multiple testing at FDR of 10%.

eQTL Detection

The genotype calls were generated as part of a larger meta‐GWAS study (Adkins et al., 2017) and were integrated with the lncRNA and PCG hubs expression values to identify eQTLs for the lncRNA and PCG. To reliably estimate the eQTL effects, we excluded SNPs with missingness of ≥20% and a minor allele frequency of ≤10%, thus retaining a final set of 3,494,304 SNPs for the eQTL analysis. As previously outlined and to maintain power, we focused on using only local, cis‐eQTLs (Mamdani et al., 2015). Specifically, SNPs located 500 kilobase pairs (kbp) upstream and downstream from the selected lncRNA and PCG hubs (i.e., cis‐eQTL) were extracted and filtered with Plink v1.90 to exclude variants in LD (R 2 ≥ 0.7, window size = 250 kbp, step size = 25 kbp; Purcell et al., 2007). EQTL impact on gene expression was detected by MatrixEQTL software package in R using a linear regression framework and adjusting for the potential effects of technical and biological covariates as summarized by the first 3 PCs (Shabalin, 2012). The significant eQTL results were adjusted for multiple testing at FDR of 10%. Visualization of eQTLs and subsequent integration analysis was conducted utilizing ggplot2 package in R environment v3.6.1 (Wickham, 2016).

Integration Analyses

To increase the robustness of our integration analyses, we followed a 2‐pronged approach. First, the MEs from the significant lncRNA and PCG modules were correlated using Pearson product–moment correlations, and only the significant module correlations at FDR of 10% were selected for consideration. In the second round, all individual hubs from the significant PCG and lncRNA module correlations were extracted and correlated to identify the significant individual mRNA/lncRNA pairs. All pairs surviving this second round of correction for multiple testing at an FDR of 10% were then selected for additional series of tests.

Results

Chronic Alcohol Consumption Leads to Generalized Gene Expression Changes Between Cases and Controls

At the single gene expression analysis (p ≤ 0.10), we identified a total of 6,469 probes differentially expressed between AD cases and controls, representing 3,645 lncRNA and 2,725 PCG that compromise 5.9% and 4.4% of the total gene pool assayed, respectively. Among the differentially expressed genes, we also identified 99 pseudogenes. The complete list of significant results from the univariate gene expression analyses is provided in the supplementary table (Table S3).

lncRNA and PCG Show a Disease Relevant Network Pattern

While the gene expression analysis identifies differentially expressed genes in NAc of subjects with AD, these do not provide an integrative, systemic view of the interaction between the lncRNA and mRNA genes. Therefore, to better understand the higher‐order (system biology impact) of the interactions between lncRNA and PCG, we performed a WGCNA (Langfelder and Horvath, 2008; Zhang and Horvath, 2005). WGCNA identified 36 lncRNA and 12 mRNA modules (Fig. 1 A,B). In the lncRNA network analysis, we identified 19 modules (M brown, M sienna3, M steelblue, M black, M darkmagenta, M tan, M violet, M plum1, M skyblue, M salmon, M lightyellow, M saddlebrown, M darkorange, M green, M darkgreen, M darkolivegreen, M cyan, M orange, and M gray) significantly correlated with AD at a Bonferroni adj. p ≤ 0.001(Fig. 2), with M black being the largest containing 200 genes. In the PCG network analysis, we identified 5 modules (M gray60, M green, M red, M black, M gray) significantly correlated with AD at a Bonferroni adj. p ≤ 0.004 (Fig. 3), with M black being the largest containing 407 genes. In addition to the primary dichotomous case/control status, we also tested additional quantitative phenotypes such as the age of initiation of alcohol drinking, the amount of daily alcohol consumption, and years of drinking. We identified only 2 PCG modules (M gray and M red) as significantly correlated with the amount of daily alcohol consumption at a Bonferroni adj. p‐value ≤ 0.004. Full tables with module size, correlations, and p‐values for all PCG and lncRNA modules correlated with AD are provided in supplementary tables (Tables S4A,B).

Fig. 1

Gene dendrogram of lncRNA (A) and mRNA (B) merged modules, defined by the "tree" method, minimum module size of 25 genes, and minimum module merge height of 0.8.

Fig. 2

Module–trait relationships for lncRNA. The residuals of the expression values used to generate the lncRNA module MEs are correlated (Pearson) to the dichotomous AD case/control status (Diagnosis) and to quantitative alcohol measures such as daily alcohol consumptions (Alc‐Cons), total amount of drinks (Tot_drinks), and initial age of drinking (Age_began). The lncRNA modules were also correlated to the first 3 PCs to assess for confounding. p‐values shown are unadjusted for multiple testing. After adjusting for number of modules tested, 19 modules (M brown, M sienna3, M steelblue, M black, M darkmagenta, M tan, M violet, M plum1, M skyblue, M salmon, M lightyellow, M saddlebrown, M darkorange, M green, M darkgreen, M darkolivegreen, M cyan, M orange, and M gray) were significantly correlated with AD case status.

Fig. 3

Module–trait relationships for PCG. The PCG module MEs are correlated (Pearson) to the dichotomous AD case/control status (Diagnosis) and to quantitative alcohol measures such as daily alcohol consumptions (Alc‐Cons), total amount of drinks (Tot_drinks), and initial age of drinking (Age_began). The lncRNA modules were also correlated to the first 3 PCs to assess for confounding. p‐values shown are unadjusted for multiple testing. After adjusting for number of modules tested, M gray60, M green, M red, M black, M gray are significantly correlated with AD case status (Class) and M gray and ME red also with daily alcohol consumption.

Gene dendrogram of lncRNA (A) and mRNA (B) merged modules, defined by the "tree" method, minimum module size of 25 genes, and minimum module merge height of 0.8. Module–trait relationships for lncRNA. The residuals of the expression values used to generate the lncRNA module MEs are correlated (Pearson) to the dichotomous AD case/control status (Diagnosis) and to quantitative alcohol measures such as daily alcohol consumptions (Alc‐Cons), total amount of drinks (Tot_drinks), and initial age of drinking (Age_began). The lncRNA modules were also correlated to the first 3 PCs to assess for confounding. p‐values shown are unadjusted for multiple testing. After adjusting for number of modules tested, 19 modules (M brown, M sienna3, M steelblue, M black, M darkmagenta, M tan, M violet, M plum1, M skyblue, M salmon, M lightyellow, M saddlebrown, M darkorange, M green, M darkgreen, M darkolivegreen, M cyan, M orange, and M gray) were significantly correlated with AD case status. Module–trait relationships for PCG. The PCG module MEs are correlated (Pearson) to the dichotomous AD case/control status (Diagnosis) and to quantitative alcohol measures such as daily alcohol consumptions (Alc‐Cons), total amount of drinks (Tot_drinks), and initial age of drinking (Age_began). The lncRNA modules were also correlated to the first 3 PCs to assess for confounding. p‐values shown are unadjusted for multiple testing. After adjusting for number of modules tested, M gray60, M green, M red, M black, M gray are significantly correlated with AD case status (Class) and M gray and ME red also with daily alcohol consumption. To ensure the robustness of our gene coexpression networks and reduce the potential influence of outlier samples on network structure, we used the robust “bootstrapped” version of WGCNA (rWGNCA). We performed 100 iterations, that is, generated 100 networks after randomly subsetting two‐third of the total samples, as previously suggested (Gandal et al., 2018) with the resulting networks then merged into 1 large, final consensus network. The individual subnetworks showed reasonably high consistency with the final lncRNA and mRNA networks (Fig. S2A,B). In scale‐free network topology, “hubs” are the most highly connected genes (of which there are relatively few among all the nodes within a network). Hubs were identified by examining modules that were significantly associated with disease status and extracting genes with a high MM (MM ≥ 0.70) and a significant correlation with disease status (p ≤ 0.05). Among the significant PCG and lncRNA modules, we identified a total of 86 and 20 hubs, respectively. The smaller number of detected lncRNA hubs could be due to the lower expression values that are typical for lncRNA genes, which may lead to the formation of weaker network structures. Thus, we applied the wKDA analysis to increase our power to identify a larger set of lncRNA hubs based on their connectivity to neighboring genes within a module. At FDR of 5%, we identified 295 lncRNA hubs significantly correlated with AD (p ≤ 0.05), which, in the wKDA analysis, are also designated as key drivers (Table S5). The 295 lncRNA hubs were then combined with the initially identified 20 hubs for a total of 315 lncRNA hubs.

AD Gene Modules Are Enriched in Alcohol‐Related Processes

Previous studies have shown that coexpressed genes are enriched for specific cell types and biologically relevant functions (Guttman and Rinn, 2012; Long et al., 2017; Mamdani et al., 2015; Marty and Spigelman, 2012). As lncRNA are poorly annotated for the enrichment analyses, we focused on the PCG modules significantly associated with AD. The brain‐related enrichment analysis was conducted in WGCNA using a curated list of brain‐related gene markers. At an FDR of 10%, 3 separate modules (M black, M red, and M green) showed significant brain‐related enrichment (Table S6). The M black module was enriched for microglia, while the M red module was enriched for gene markers specific to the Raphe nuclei. The M green module was enriched for cell attributes such as postsynaptic membrane and glutamatergic synapse. Using the default parameters in GSEA, at FDR of 10%, we identified 111 a priori gene sets significantly enriched in the M black module and 1 a priori gene set significantly enriched in the M green module (Table S7). Specifically, M black was enriched for genes up‐regulated in basal neuron processes (i.e., dendritic maturation) and immune‐related systems (i.e., innate immunity, interferon signaling, etc.).

lncRNA and PCG Show a Complex Pattern of Interactions

Since lncRNA can function as potential molecular “scaffolders” (Guttman et al., 2011; Guttman and Rinn, 2012), that is, by bringing different sets of genes near each other, we were interested in identifying PCG hubs significantly correlated with lncRNA hubs. Limited studies have shown that lncRNA can affect gene functions by interfering with the expression of their specific gene targets (Long et al., 2017). For maximum power, the MEs from the significant AD modules were correlated with each other (Fig. S3), and all hubs within the significantly correlated PCG and lncRNA modules were extracted. At FDR of 10%, we observed 4,985 correlations, of which the positive (N = 3,125) significantly exceeded the negative (N = 1,860) correlations (Mann–Whitney U‐test p = 2.2E−16). The strongest positive correlation (Pearson r = 0.95, q = 8.34E−27) was observed between RP11‐23J18.1 in the M steelblue lncRNA module and IFITM1 in the mRNA M black module, while the strongest negative correlation (Pearson r = 0.80, q = 6.52E−12) was observed between RP11‐356J5.12 in the M lightyellow lncRNA module and YBX3 in the mRNA M black module. The entire list of positive and negative lncRNA/PCG hub correlations is provided in the supplementary table (Table S8A,B).

Genetic Variants Affect lncRNA and PCG Expression in a Disease‐Specific Manner

To understand the underlying genetic mechanisms of AD risk polymorphisms, we further integrated the lncRNA and PCG hub expression data from our significant modules with previously GWAS data collected on the sample (Adkins et al., 2017). To maintain sufficient power, we focused on testing cis‐eQTLs only. At FDR of 10%, we identified a total of 150 eQTLs impacting hubs’ expression, with a higher number of eQTLs for the lncRNA (n = 135) versus PCG (n = 15) hubs (Table S9A,B). The most significant PCG eQTL was between Interferon regulatory factor 7 (IRF7) and rs11246386 (Fig. 4). The most significant lncRNA eQTL was between AK128400 and rs1669681 (Fig. 5). In a follow‐up analysis, to identify potential disease risk eQTLs, we further tested the significant lncRNA and PCG eQTLs for an interaction (SNP × AD) term between genotype and AD status. A significant genotype/disease interaction for a SNP/gene pair indicates that the effect of genotype on expression is significantly different in AD cases versus controls. At FDR of 10%, we identified 93 eQTL with significant (SNP × AD) interaction terms that mediate the expression of 24 PCG hubs and 69 lncRNA hubs (Table S10A,B). The most significantly interacting PCG eQTL was between FK506 binding protein 51 (FKBP5) and rs2766532 (Fig. 6 A,B), whereas the most significantly interacting lncRNA eQTL was between G006838 and rs12142153 (Fig. 7 A,B). We further tested for enrichment of our 243 significant eQTLs among the significant GWAS hits of addiction phenotypes via the Cauchy and Simes test. These are powerful, mutually complementing, tests that compute the p‐values to control for type I error due to background enrichment, and are appropriate for assessing eQTL enrichment in GWAS significant signals with either numerous and smaller (i.e., Cauchy) or fewer and larger (i.e., Simes) effect sizes (Suppl. Methods). We observed our eQTLs to be significantly enriched for cigarettes per day [Cauchy; p‐value = 0.036, Simes; p‐value = 0.047] and smoking initiation [Simes; p‐value = 0.031]. We further observed a marginal significance for smoking cessation [Cauchy; p‐value = 0.072, Simes; p‐value = 0.073] and drinks per week [Cauchy; p‐value = 0.099]. Finally, as lncRNA and mRNA are capable of interacting with each other, we tested whether any of the significant eQTLs are also capable of modulating the lncRNA and PCG interaction. We applied this test to the negatively interacting lncRNA/PCG pairs, and while we observed nominally significant genotype effects modulating lncRNA and PCG interactions (Table S11), none of these achieved statistical significance at FDR of 10%, most likely due to the limited sample size.

Fig. 4

Plot showing the most significant mRNA eQTL between IRF7 and rs11246386. The alternative allele of rs11246386 confers a lower IRF7 expression.

Fig. 5

Plot showing the most significant lncRNA eQTL between AK128400 and rs1669681. The alternative allele of rs1669681 confers a lower AK128400 expression.

Fig. 6

Plot showing the most significant mRNA eQTL between FKBP5 and rs2766532 with (A) and without (B) the group interaction term. The alternative allele of rs2766532 confers a higher expression of FKBP5 in controls and a lower expression in cases.

Fig. 7

Plot showing the most significant lncRNA eQTL between G006838 and rs12142153 with (A) and without (B) the group interaction term. The alternative allele rs12142153 confers a higher expression of G006838 in cases and a lower expression in controls.

Plot showing the most significant mRNA eQTL between IRF7 and rs11246386. The alternative allele of rs11246386 confers a lower IRF7 expression. Plot showing the most significant lncRNA eQTL between AK128400 and rs1669681. The alternative allele of rs1669681 confers a lower AK128400 expression. Plot showing the most significant mRNA eQTL between FKBP5 and rs2766532 with (A) and without (B) the group interaction term. The alternative allele of rs2766532 confers a higher expression of FKBP5 in controls and a lower expression in cases. Plot showing the most significant lncRNA eQTL between G006838 and rs12142153 with (A) and without (B) the group interaction term. The alternative allele rs12142153 confers a higher expression of G006838 in cases and a lower expression in controls.

Discussion

The main goal of this study is to profile the expression patterns of the noncoding and coding transcriptome of NAc in a large sample of subjects with AD and controls. The NAc is a central component of the mesocorticolimbic system and is known to be involved in addictive behaviors (Marty and Spigelman, 2012; Russo et al., 2010; Tabakoff and Hoffman, 2013). By integrating GWAS data on the sample (Adkins et al., 2017) with our expression data, we further attempted to reveal hitherto unknown relationships between lncRNA and PCG and to assess how genetic variants modulate this relationship in the brain. In agreement with previous postmortem brain expression studies, we identified differentially expressed PCG, supporting the notion that chronic alcohol consumption is associated with widespread transcriptomic changes in the brain. We further identified numerous lncRNA differentially expressed between AD cases and controls, supporting their role in the neuroadaptation processes associated with chronic alcohol consumption. Among the most significant differentially expressed PCG were serpin peptidase inhibitor clade A (alpha‐1 antiproteinase, antitrypsin) member 3 (SERPINA3), interferon‐induced transmembrane protein 2 and 3 (IFITM2 and IFITM3), solute carrier family 14 (urea transporter), member 1 (Kidd blood group; SLC14A1) that were also identified in previous alcohol postmortem brain gene expression reports (Lewohl et al., 2000; McClintick et al., 2013). While not the main topic of the study, interestingly, we also detected a limited number of differentially expressed pseudogenes, such as the annexin A2 pseudogenes 2 and 3 (ANXA2P2 and ANXA2P3) and interferon‐induced transmembrane protein 4 pseudogene (IFITM4P) on chromosomes 9, 10, and 6, respectively. Previous postmortem brain expression studies have shown ANXA2P2 to be involved in stress response pathways in chronic alcoholics (McClintick et al., 2013); however, little is known about the functions of ANXA2P3 and IFITM4P; it appears that ANXA2P3 is involved in reducing lipid levels in response to statins (Barber et al., 2010) and chronic obstructive pulmonary disease (Wilk et al., 2007), while no known functions were reported for IFITM4P. A major limitation of the single gene expression analysis to identify and prioritize a set of genes associated with a given phenotype is its inability to consider the complex molecular interactions between such identified genes (Wu et al., 2013); this is especially relevant in studying the noncoding transcriptome, which is poorly annotated, with many lncRNA still of unknown function. To address this limitation, we performed a WGCNA, which allowed us to identify a set of coregulated PCG and lncRNA. We identified 19 lncRNA modules and 5 PCG modules that were significantly correlated with AD. Among the significantly correlated with AD status lncRNA and PCG modules was also M gray. Genes in the gray module are frequently considered as “noise,” although there are suggestions that this interpretation may not always be accurate (Saelens et al., 2018). In general, observations have indicated that the genes in the gray module are usually ubiquitously expressed, exhibit oscillating, or highly variable patterns of expression, or in some cases could be simply assigned to a wrong module (Greenfest‐Allen et al., 2017). Regardless, a potential explanation for M gray being significant in our analyses could be that certain lncRNA and mRNA loci may impact the neuropathology of AD in isolation and not as a part of a gene network. Indeed, other studies have also reported similar observations, where, despite their unassigned status, genes in M gray have shown significant disease associations (Li et al., 2018; Reinhold et al., 2017). Using the available quantitative measures of alcohol consumption, we further identified 2 PCG modules significantly correlated with the amount of daily alcohol consumption. Interestingly, while we observed the same PCG modules to be significantly correlated with both AD status and quantitative alcohol phenotype, no such relationship was revealed for the lncRNA module. This suggests that different sets of lncRNA genes may impact these related, yet distinct, measures of alcohol‐related phenotypes. To further understand the potential regulatory functions of lncRNA, we performed a series of correlation analyses between the hub genes identified from the significant lncRNA and PCG modules. In these analyses, we observed highly significant positive and negative correlations, in which the positive correlations significantly outnumbered the negative correlations. From our GSEA, we observed that similar to other studies (Mamdani et al., 2015; Osterndorff‐Kahanek et al., 2015; Ponomarev et al., 2012), some of the PCG modules are enriched for immune‐related or neurodegenerative processes. Activation of immune‐related processes in the brain as a result of chronic alcohol consumption has been well documented (Crews et al., 2013; Kelley and Dantzer, 2011). Likewise, the neurodegenerative processes observed for some of our modules may reflect the nature of our postmortem sample composed exclusively from chronic alcoholics, that is, subjects that have been drinking for over 2 decades. Both animal and human genetic models have demonstrated that prolonged exposure to alcohol leads to activation of the microglial population with the concurrent activation of immune processes in the brain (Cui et al., 2014; Robinson et al., 2014). It has been well known, for over several decades now, that prolonged and excessive alcohol drinking leads to loss of a brain matter (Harper and Kril, 1985; Kubota et al., 2001) and, consequently, activation of genes involved in neurodegenerative disorders. Integrating our expression and genetic data led to the identification of eQTLs that affect lncRNA and PCG expression. Having an integrated view on the expression of lncRNA and PCG, we were also interested in identifying eQTL that can impact lncRNA and PCG expressions. We focused our attention on eQTL impacting negatively correlated lncRNA/mRNA hub pairs, due to the likely regulatory nature of the negative interactions as well as to increase our statistical power. While we observed some interesting genotype effects on lncRNA/PCG interactions, none of these, however, survived correction for multiple testing at FDR of 10%. For instance, the lncRNA G081251 locus was negatively correlated with 2 genes, leukocyte elastase inhibitor (SERPINB1) and S100 calcium‐binding protein A11 (S100A11; Fig. S4A,B). SERPINB1 has been linked to neuroinflammation (Hou et al., 2019), whereas S100A11 exhibits one of the most significant coexpression differences between schizophrenia and bipolar disorder (Baumont et al., 2015). Nevertheless, our results demonstrate that the complex interactions between the coding and noncoding transcriptome observed in the neuropathology of AD are, at least partially, modulated by the genetic factors (Mamdani et al., 2015; Sartor et al., 2012). Among the eQTLs affecting PCG expression, we also detected strong genetic interactions between PCG expression and disease status, with the strongest interaction effect observed between FKBP5 and rs2766532. Several studies have implicated FKBP5 in the severity of alcohol withdrawal (Huang et al., 2014), alcohol drinking patterns in rodents (Qiu et al., 2016), problematic drinking (Dragan et al., 2018), as well as other psychiatric ailments, such as posttraumatic stress disorder (PTDS; Wilker et al., 2014). We further identified eQTLs impacting the expression of PCG with potential functions linked to neuroinflammation (SERPINB1), alcohol‐induced neuroinflammation (interferon regulatory factor 7, IRF7 [Coleman et al., 2017; Erickson et al., 2019]), neurodegenerative diseases (neural precursor cell expressed, developmentally down‐regulated 9, or NEDD9 [Li et al., 2008]), nicotine addiction (DNA methyltransferase 3 beta or DNMT3B [Hancock et al., 2015]), and ethanol exposure (solute carrier organic anion transporter family member 4A1 or SLCO4A1 [Osterndorff‐Kahanek et al., 2015]). Among the eQTLs associated with lncRNA disease expression differences, many appear to be novel discoveries, likely due to the limited number of reports studying lncRNA role in alcohol addiction. For instance, within our results, we found only 1 lncRNA locus (i.e., AK128400), whose expression was previously reported to be associated with autism (Tang et al., 2017). The importance of our postmortem brain expression study to assess lncRNA expression in NAc is further supported by 3 recent GWAS of AUD that has reported genomewide signals near lncRNA genes: (i) a study conducted by Gelernter and colleagues (2014) that implicated the ADH gene cluster on chromosome 4 and the LOC100507053 locus, (ii) a study conducted by the COGA group (Wetherill et al., 2015) that identified a polymorphism near LOC151121 on chromosome 2, and (iii) a study conducted by our group that has reported several polymorphisms in LOC339975 (Adkins et al., 2017). However, in agreement with a previous report from our group, none of these 3 loci showed evidence for differential expression, although the expression of LOC339975 was nominally affected by the most significantly associated with AD polymorphism (rs11726136) in NAc (Adkins et al., 2017). In summary, the main goal of this study was to test the hypothesis that lncRNA, as a novel class of none coding RNAs, contribute to the neuropathology of AD. While our study is novel in its approach to address the potential role of lncRNA in the neuropathology of AD by integrating both genetic and molecular data generated in postmortem brains from subjects with AD, it is not without limitations. First, postmortem brain studies are observational as manipulation of living human subjects is not possible. Although the cross‐sectional nature of these studies limits the causal inference we can make, we believe the eQTL analysis is a major step toward clarifying the directionality of these observations. Secondly, although our sample size (N = 63) is small, we believe with our careful experimental design and implementation of integrative multivariate approaches, we can circumvent some of these limitations and further broaden our understanding of alcohol addiction processes. And finally, we acknowledged that examining AD through a lens of lncRNA’s interaction with mRNA overlooks epigenetic mechanisms (Bohnsack et al., 2019; Marchese et al., 2017). We hope that our study will provide additional molecular targets that will help translate these advances into effective therapeutic strategies for patients suffering from alcohol addiction. In conclusion, we believe our study makes an important contribution to the field of alcohol research. To the best of our knowledge, ours is among the first studies to date to pave the road for future lncRNA research in the neuropathology of alcohol addiction.

Conflict of interest

The authors declare no conflict of interest. Fig. S1A. Scale‐free topology fit graphs of (A) lncRNA and (B) mRNA. Showing the scale free topology fitting index (R2) or mean connectivity against various power (β) settings. Click here for additional data file. Fig. S1B. Scale‐free topology fit graphs of (A) lncRNA and (B) mRNA. Showing the scale free topology fitting index (R2) or mean connectivity against various power (β) settings. Click here for additional data file. Fig. S2A. Robust, bootstrapped version of WGCNA (rWGCNA). Click here for additional data file. Fig. S2B. Robust, bootstrapped version of WGCNA (rWGCNA). Click here for additional data file. Fig. S3. Heatmap of the Pearson correlations between modules, which were significantly correlated with disease status. Click here for additional data file. Fig. S4A. Evidence of genotype (A:rs2446448 B:rs1496042) modulating lncRNA (G081251) and PCG (A: SERPINB1 B:S100A11) hubgenes, which were from significantly correlated AD modules and significantly negatively correlated with each other. Click here for additional data file. Fig. S4B. Evidence of genotype (A:rs2446448 B:rs1496042) modulating lncRNA (G081251) and PCG (A: SERPINB1 B:S100A11) hubgenes, which were from significantly correlated AD modules and significantly negatively correlated with each other. Click here for additional data file. Fig. S5. Pearson Correlation of t‐values between limma’s Empirical Bayes methodology with robust model and ordinary least square. Click here for additional data file. Table S1. Description of sample demographics per diagnostic group. Click here for additional data file. Table S2. Microarray expression data validation using quantitative PCR. Click here for additional data file. Table S3. Results from the univariate gene expression analysis at p≤0.10. Click here for additional data file. Table S4. Description of PCG(A) and lncRNA(B) with module membership, correlation with AD, significance of correlation, and module associations. Click here for additional data file. Table S5. Results from wKDA analysis of lncRNA within modules significantly correlated with disease status. Click here for additional data file. Table S6. Gene set enrichment analysis of PCG utilizing WGCNA's userListEnrichment function and curated list. Click here for additional data file. Table S7. Gene set enrichment analysis of PCG utilizing GSEA. Click here for additional data file. Table S8. Positive(A) and negative(B) significant correlations of hubgenes from modules significantly associated with AD. Click here for additional data file. Table S9. Results from the PCG(A) and lncRNA(B) eQTL analysis at FDR p≤0.10. Click here for additional data file. Table S10. Results from the PCG(A) and lncRNA(B) eQTL with group interaction analysis at FDR p≤0.10. Click here for additional data file. Table S11. Genotype modulation of negatively correlated lncRNA and PCG hubs. Click here for additional data file. Table S12. Primer set for lncRNA validation. Click here for additional data file. Methods S1. Supplementary methods Click here for additional data file. Click here for additional data file.

91 in total

Review 1. SNPs: impact on gene function and phenotype.

Authors: Barkur S Shastry
Journal: Methods Mol Biol Date: 2009

Review 2. Gene expression profiling in the human alcoholic brain.

Authors: Anna S Warden; R Dayne Mayfield
Journal: Neuropharmacology Date: 2017-03-15 Impact factor: 5.250

3. Innate immune response is differentially dysregulated between bipolar disease and schizophrenia.

Authors: Angelica de Baumont; Mariana Maschietto; Leandro Lima; Dirce Maria Carraro; Eloisa Helena Olivieri; Alex Fiorini; Luiz André Nardin Barreta; Joana Almeida Palha; Paulo Belmonte-de-Abreu; Carlos Alberto Moreira Filho; Helena Brentani
Journal: Schizophr Res Date: 2014-12-06 Impact factor: 4.939

4. SerpinB1 controls encephalitogenic T helper cells in neuroinflammation.

Authors: Lifei Hou; Deepak A Rao; Koichi Yuki; Jessica Cooley; Lauren A Henderson; A Helena Jonsson; Dion Kaiserman; Mark P Gorman; Peter A Nigrovic; Phillip I Bird; Burkhard Becher; Eileen Remold-O'Donnell
Journal: Proc Natl Acad Sci U S A Date: 2019-09-23 Impact factor: 11.205

5. High mobility group box 1/Toll-like receptor danger signaling increases brain neuroimmune activation in alcohol dependence.

Authors: Fulton T Crews; Liya Qin; Donna Sheedy; Ryan P Vetreno; Jian Zou
Journal: Biol Psychiatry Date: 2012-12-01 Impact factor: 13.382

6. Brain atrophy in chronic alcoholic patients: a quantitative pathological study.

Authors: C Harper; J Kril
Journal: J Neurol Neurosurg Psychiatry Date: 1985-03 Impact factor: 10.154

7. Evidence that common variation in NEDD9 is associated with susceptibility to late-onset Alzheimer's and Parkinson's disease.

Authors: Yonghong Li; Andrew Grupe; Charles Rowland; Peter Holmans; Ricardo Segurado; Richard Abraham; Lesley Jones; Joseph Catanese; David Ross; Kevin Mayo; Maribel Martinez; Paul Hollingworth; Alison Goate; Nigel J Cairns; Brad A Racette; Joel S Perlmutter; Michael C O'Donovan; John C Morris; Carol Brayne; David C Rubinsztein; Simon Lovestone; Leon J Thal; Michael J Owen; Julie Williams
Journal: Hum Mol Genet Date: 2007-12-06 Impact factor: 6.150

8. Allele-specific expression and high-throughput reporter assay reveal functional genetic variants associated with alcohol use disorders.

Authors: Xi Rao; Kriti S Thapa; Andy B Chen; Hai Lin; Hongyu Gao; Jill L Reiter; Katherine A Hargreaves; Joseph Ipe; Dongbing Lai; Xiaoling Xuei; Yue Wang; Hongmei Gu; Manav Kapoor; Sean P Farris; Jay Tischfield; Tatiana Foroud; Alison M Goate; Todd C Skaar; R Dayne Mayfield; Howard J Edenberg; Yunlong Liu
Journal: Mol Psychiatry Date: 2019-09-02 Impact factor: 15.992

9. Genome-wide meta-analysis reveals common splice site acceptor variant in CHRNA4 associated with nicotine dependence.

Authors: D B Hancock; G W Reginsson; N C Gaddis; X Chen; N L Saccone; S M Lutz; B Qaiser; R Sherva; S Steinberg; F Zink; S N Stacey; C Glasheen; J Chen; F Gu; B N Frederiksen; A Loukola; D F Gudbjartsson; I Brüske; M T Landi; H Bickeböller; P Madden; L Farrer; J Kaprio; H R Kranzler; J Gelernter; T B Baker; P Kraft; C I Amos; N E Caporaso; J E Hokanson; L J Bierut; T E Thorgeirsson; E O Johnson; K Stefansson
Journal: Transl Psychiatry Date: 2015-10-06 Impact factor: 6.222

10. Application of Weighted Gene Co-expression Network Analysis for Data from Paired Design.

Authors: Jianqiang Li; Doudou Zhou; Weiliang Qiu; Yuliang Shi; Ji-Jiang Yang; Shi Chen; Qing Wang; Hui Pan
Journal: Sci Rep Date: 2018-01-12 Impact factor: 4.379

6 in total

1. RNA m6A Modification Changes in Postmortem Nucleus Accumbens of Subjects with Alcohol Use Disorder: A Pilot Study.

Authors: Ying Liu; Huiping Zhang
Journal: Genes (Basel) Date: 2022-05-27 Impact factor: 4.141

2. Network preservation reveals shared and unique biological processes associated with chronic alcohol abuse in NAc and PFC.

Authors: Eric Vornholt; John Drake; Mohammed Mamdani; Gowon McMichael; Zachary N Taylor; Silviu-Alin Bacanu; Michael F Miles; Vladimir I Vladimirov
Journal: PLoS One Date: 2020-12-17 Impact factor: 3.240

3. Effects of Ethanol on Expression of Coding and Noncoding RNAs in Murine Neuroblastoma Neuro2a Cells.

Authors: Mi Ran Choi; Sinyoung Cho; Dai-Jin Kim; Jung-Seok Choi; Yeung-Bae Jin; Miran Kim; Hye Jin Chang; Seong Ho Jeon; Young Duk Yang; Sang-Rae Lee
Journal: Int J Mol Sci Date: 2022-06-30 Impact factor: 6.208

Review 4. Long Non-Coding RNAs: The New Frontier into Understanding the Etiology of Alcohol Use Disorder.

Authors: Allie N Denham; John Drake; Matthew Gavrilov; Zachary N Taylor; Silviu-Alin Bacanu; Vladimir I Vladimirov
Journal: Noncoding RNA Date: 2022-08-04

5. Mergeomics 2.0: a web server for multi-omics data integration to elucidate disease networks and predict therapeutics.

Authors: Jessica Ding; Montgomery Blencowe; Thien Nghiem; Sung-Min Ha; Yen-Wei Chen; Gaoyan Li; Xia Yang
Journal: Nucleic Acids Res Date: 2021-07-02 Impact factor: 16.971

6. Cocaine self-administration induces sex-dependent protein expression in the nucleus accumbens.

Authors: Alberto J López; Amy R Johnson; Tanner J Euston; Rashaun Wilson; Suzanne O Nolan; Lillian J Brady; Kimberly C Thibeault; Shannon J Kelly; Veronika Kondev; Patrick Melugin; M Gunes Kutlu; Emily Chuang; TuKiet T Lam; Drew D Kiraly; Erin S Calipari
Journal: Commun Biol Date: 2021-07-16

6 in total