Literature DB >> 35224162

Integrated analysis on transcriptome and behaviors defines HTT repeat-dependent network modules in Huntington's disease.

Lulin Huang1,2, Li Fang2, Qian Liu2, Abolfazl Doostparast Torshizi2, Kai Wang2.   

Abstract

Huntington's disease (HD) is caused by a CAG repeat expansion in the huntingtin (HTT) gene. Knock-in mice carrying a CAG repeat-expanded Htt will develop HD phenotypes. Previous studies suggested dysregulated molecular networks in a CAG length genotype- and the age-dependent manner in brain tissues from knock-in mice carrying expanded Htt CAG repeats. Furthermore, a large-scale phenome analysis defined a behavioral signature for HD genotype in knock-in mice carrying expanded Htt CAG repeats. However, an integrated analysis correlating phenotype features with genotypes (CAG repeat expansions) was not conducted previously. In this study, we revealed the landscape of the behavioral features and gene expression correlations based on 445 mRNA samples and 445 microRNA samples, together with behavioral features (396 PhenoCube behaviors and 111 NeuroCube behaviors) in Htt CAG-knock-in mice. We identified 37 behavioral features that were significantly associated with CAG repeat length including the number of steps and hind limb stand duration. The behavioral features were associated with several gene coexpression groups involved in neuronal dysfunctions, which were also supported by the single-cell RNA sequencing data in the striatum and the spatial gene expression in the brain. We also identified 15 chemicals with significant responses for genes with enriched behavioral features, most of them are agonist or antagonist for dopamine receptors and serotonin receptors used for neurology/psychiatry. Our study provides further evidence that abnormal neuronal signal transduction in the striatum plays an important role in causing HD-related phenotypic behaviors and provided rich information for the further pharmacotherapeutic intervention possibility for HD.
© 2021 Chongqing Medical University. Production and hosting by Elsevier B.V.

Entities:  

Keywords:  Behaviors; CAG repeat; Huntington's disease; Mice; Single-cell RNA sequencing; Small chemicals; Striatum; Transcriptome

Year:  2021        PMID: 35224162      PMCID: PMC8843892          DOI: 10.1016/j.gendis.2021.05.004

Source DB:  PubMed          Journal:  Genes Dis        ISSN: 2352-3042


Introduction

Huntington's disease (HD) was first described by George Huntington in 1872. It is an autosomal dominant condition that typically presents in middle age with a combination of movement, cognitive, and psychiatric problems, and it leads to death 20–25 years later. Currently, HD has no effective treatment. The disease-causing gene mutation consists of an expanded CAG tract (>35 repeats) within the first exon of the huntingtin (HTT) gene, which is translated into a corresponding poly-glutamine stretch (polyQ). Proteins with the expanded repeats make the medium spiny neurons in the striatum particularly vulnerable to cell death and cause the dysfunction and death of neurons in other brain regions. HTT protein is essential for embryogenesis and is ubiquitously expressed in moderate amounts in and outside the nervous system; the mutant HTT protein is toxic and initiates the disease. Although multiple pathologic mechanisms have been proposed, the exact mechanism by which mutant HTT causes neuronal dysfunction is still unknown. Recent studies demonstrated the mRNA transcriptional dysregulation as a central mechanism. Knock-in mice carrying a CAG repeat-expanded Htt have been shown to develop neurological and neurodegenerative phenotypes similar to HD. Using this model, a previous study provided a large-scale, comprehensive transcriptomic characterization of the molecular pathogenic effects of CAG repeat expansion, identifying a consistent set of genes and networks that were dysregulated in HD mouse brains. Patients with early HD do not have impairments in decision making, but those with later HD have decreased attention, perseveration, construction, conceptualization, and memory in comparison with normal people. Another released behavioral feature dataset used a machine learning approach and analyzed the behavioral effects of CAG repeat length in the Htt CAG-knock-in mouse model, and it revealed sufficient discriminatory power to accurately predict the genotype-required combined analysis of more than 200 behavioral features. These previous works raised an open question about the gene network associated with the behaviors in HD conditions. Moreover, the shared data with the research community provides support for a focused understanding of the mechanism that links behavioral consequences to a genetic mutation in HD. Understanding the behavioral features of HD patients requires clinical knowledge within the larger context of a person's life. The interaction between disease mechanisms and behaviors may be complex. In the CAG repeat knock-in mouse model, previous studies only examined the gene expression profiles with the CAG repeat length genotype or the behavioral feathers with the CAG repeat length genotype, respectively, thus providing a limited view of understanding the disease mechanism. In this study, we present a combinatorial analysis of brain transcriptomes and behavioral features by using the Htt CAG-knock-in mouse model to establish the network of HD behavioral features and gene expressions linked to the CAG repeat length genotype to find the biomarkers and behavior markers for HD.

Methods

mRNA sequencing datasets of brain tissues from knock-in mice

As the main affected tissue of HD is the brain, we used the transcriptomic study (i.e., mRNA and microRNA) of the four brain regions (i.e., striatum, cortex, hippocampus, and cerebellum) with an allelic series of CAG repeat lengths HD knock-in mice (Q7, Q20, Q80, Q92, Q111, Q140, and Q175) at three ages (2, 6, and 10 months), from the previous research. In total, the transcriptomes of 445 mRNA samples and 455 microRNA samples were combined and analyzed in this study. These datasets included GSE65770, GSE65774, GSE65775, GSE73468, GSE73503, GSE65769, GSE65773, GSE73505, and GSE73507 from the Gene Expression Omnibus repository (https://www.ncbi.nlm.nih.gov/gds). In these datasets, mRNA was extracted and prepared using the Illumina TruSeq RNA sample prep kit and sequenced on an Illumina HiSeq2000 sequencer using strand-specific, paired-end, and 50-mer sequencing protocols to a minimum read depth of 40 million reads per sample. The sequencing was performed in two separate batches (6-month samples in batch 1 and 2, and 10-month samples in batch 2). Raw data were downloaded and aligned to the mouse genome mm9 using the STAR aligner, and read counts for the individual genes were obtained using HTSeq.

PGI high-throughput behavioral datasets

PGI's comprehensive high-throughput systems (i.e., PhenoCube, NeuroCube, and SmartCube systems) capture the different domains of behavior, namely, cognitive, motor, circadian, social, anxiety-like, and gait, among others, using custom-built computer vision software and machine learning algorithms., PhenoCube is a high-throughput platform that assesses the circadian, cognitive, social, and motor behaviors exhibited by group-housed mice. NeuroCube uses computer vision to detect the stance characteristics, gait geometry, and dynamics in rodent models of disorders. These two platforms detected the behaviors of HD knock-in mice (Q7, Q20, Q80, Q92, Q111, Q140, and Q175) at three ages (2, 6, and 10 months) were used in this study. The behavior datasets were downloaded from the Open Source Science for the HD Research Community (HDinHD: https://www.hdinhd.org/). The behavioral features used in this study are listed in Table S1.

Data preprocessing

We first conducted a quick check of the samples in each tissue using a sample correlation heat map to determine the sample cluster and delete the outgroup sample. This step resulted in 143 samples in the striatum, 104 samples in the cortex, 104 samples in the hippocampus, and 103 samples in the cerebellum for further analysis. The low count was filtered out from the data to keep only the genes that were informative during the differential expression analysis and network construction. We then performed a series of differential expression contrasts using the samples in each tissue to remove the non-differentially expressed genes using the limma R package, which uses raw counts and using the results to further filter out the genes that did not have a significant amount of variance. The properly normalized and preprocessed data were necessary for the downstream analyses of WGCNA. Thus, 17,548 genes in the striatum, 9,291 genes in the cortex, 755 genes in the hippocampus, and 5,802 genes in the cerebellum were obtained for further analysis. The average values of a group of each mouse line of PhenoCube 396 behaviors and 111 NeuroCube behaviors were used.

Gene coexpression module detection

The WGCNA starts by constructing a matrix of pairwise correlations between all pairs of genes across the measured samples in a dataset. Constructing a weighted gene network entails the choice of the soft thresholding power β to which coexpression similarity is raised to calculate adjacency. We chose the candidate powers for which the scale-free topology index reaches at least 0.80 to perform the analysis of network topology. We calculated the adjacencies using the soft thresholding power of each tissue and time point. To minimize the effects of noise and spurious associations, we transformed the adjacency into a topological overlap matrix and calculated the corresponding dissimilarity. To quantify the coexpression similarity of all the modules, we calculated their eigengenes and clustered them according to their correlation. As we already had a summary profile (eigengenes) for each module, we simply correlated the eigengenes with behavioral features and looked for the most significant associations. We quantified the associations of individual genes with our behavior of interest by defining the gene significance (GS) as the correlation between the gene and the behavior. For each module, we also defined a quantitative measure of module membership (MM) as the correlation of the module eigengene and the gene expression profile. This step enabled us to quantify the similarity of all genes on the array to every module. Using the GS and MM measures, we can identify the genes with a high significance for the significant behavior and the high module membership in the modules of interest. The main code used in the WGCNA analysis was shown in supplementary file 1.

Gene functional annotation clustering and visualization

The DAVID functional annotation clustering uses a similar fuzzy clustering concept as the functional classification by measuring the relationships among the annotation terms based on the degree of their co-association with the genes within the users' list to cluster somewhat heterogeneous, yet highly similar, annotation into functional annotations. This process reduces the burden of associating different terms related to similar biological processes, thus enabling the biological interpretation to be more focused on the “biological module” level. This type of grouping of functional annotation can give a more insightful view of the relationships between annotation categories and terms than the traditional linear list of enriched terms. The enriched functional annotation terms associated with the users’ gene list were identified and listed according to their enrichment P value by DAVID. The DAVID-enriched functional annotation tables (FDR < 0.05) served as the input data for the network visualization by the Enrichment Map app of Cytoscape 3.3.

Gene functional group network analysis and visualization

GeneMANIA searches many large, publicly available biological datasets to find related genes, including protein–protein, protein–DNA and genetic interactions, pathways, reactions, gene and protein expression data, protein domains, and phenotypic screening profiles.17, 18, 19 The gene lists in the category of the DAVID functional annotation table were used for network analysis by GeneMANIA. The results of the visualization were performed by the GeneMANIA app in Cytoscape 3.3.

Single-cell RNA sequence data cluster analysis

GSE82187 data was downloaded. The genes were extracted from the significant GO terms in the GO network of Fig. 4B, 6C, 7B and 8B. The gene expression values of each cell type were calculated by the mean values in each cell's populations. The gene expression cluster was down by cluster 3.0 by using hierarchical methods.
Figure 4

Networks shared by the CAG repeat length and behaviors. (A) A scatter plot of the gene significance for CAG repeat length vs. module membership in the brown module. (B) Gene ontology networks for the CAG repeat length. (C–G) Gene networks of the significant ontologies for the CAG repeat length. Blue lines, physical interactions. Brown lines, co-localization. Green lines, pathway. Dark pink lines, coexpression. Light green lines, shared protein domains. (C) Postsynaptic density (GO: 0014069), (D) Calcium, (E) Signal transduction inhibitor function, (F) Amphetamine ontology, and (G) Neuronal cell body ontology (GO: 0043025).

Figure 6

Highlighted networks in the grouping behaviors in the PhenoCube platform. (A) Visualization of the eigengene network representing the relationships between the modules and the number of groupings. The upper panel shows a hierarchical clustering dendrogram of the eigengenes in which the dissimilarity of eigengenes EI and EJ is given by 1 - cor (EI; EJ). The heat map in the lower panel shows the eigengene adjacency AIJ = (1 + cor (EI; EJ))/2. (B) A scatter plot of the gene significance for number of groupings vs. module membership in the brown module. (C) Gene ontology networks for number of groupings. (D–H) Gene networks of the significant ontologies for the number of grouping. Blue lines, physical interactions. Brown lines, co-localization. Green lines, pathway. Dark pink lines, coexpression. Light green lines, shared protein domains. (D) Dendritic spine, (E) Synapse, (F) S_TKc domain, (G) Cell junction, and (H) Oxytocin signaling pathway.

Figure 7

Highlighted networks in gait characteristics and number of steps in the NeuroCube platform. (A) Visualization of the eigengene network representing the relationships between the modules and the number of steps. The upper panel shows a hierarchical clustering dendrogram of the eigengenes in which the dissimilarity of eigengenes EI and EJ is given by 1 – cor (EI; EJ). The heat map in the lower panel shows the eigengene adjacency AIJ = (1 + cor (EI; EJ))/2. (B) A scatter plot of the gene significance for number of steps vs. module membership in the brown module. (C) Gene ontology networks for number of steps. (D–F) Gene networks of the significant ontologies for number of steps. Blue lines, physical interactions. Brown lines, co-localization. Green lines, pathway. Dark pink lines, coexpression. Light green lines, shared protein domains. (D) Postsynaptic membrane, (E) Locomotor, and (F) Perikaryon.

Figure 8

Highlighted networks in the gait characteristics of the hind limb angle (hind limb mean angle of orientation of the maximal diameter to the direction of the run [X]) in the NeuroCube platform. (A) A scatter plot of the gene significance for number of hind limb angles vs. module membership in the brown module. (B) Gene ontology networks for the hind limb angle. (C, D) Gene networks of the significant ontologies for the hind limb angle. Blue lines, physical interactions. Brown lines, co-localization. Green lines, pathway. Dark pink lines, coexpression. Light green lines, shared protein domains. (C) Cytoskeleton and (D) Proton acceptor.

Gene small chemical targets

We aimed to identify the potential small chemical targets by searching for behavior-enriched genes for HD in Connectivity Map (CMap dataset, https://clue.io/). The CMap dataset of cellular signatures catalogs the transcriptional responses of human cells to chemical and genetic perturbations (1.3 M L1000 profiles and the tools for their analysis). A total of 27,927 perturbations have been profiled to produce 476,251 expression signatures. We searched the chemicals by the gene for interaction with the existing CMap database. Fisher's test was then used to identify the significant small chemicals. The P values of the Fisher's test were adjusted by Bonferroni correction. An FDR of less than 0.05 was considered significant.

Results

Study design

As the main affected tissue in HD is the brain, we used the published transcriptomic data of CAG repeat knock-in HD mice (mRNA and microRNA) of the four brain regions, namely, striatum, cortex, hippocampus, and cerebellum, for data analysis. For each tissue, the data contain an allelic series of CAG repeat lengths (Q7, Q20, Q80, Q92, Q111, Q140 and Q175) at three ages (2, 6 and 10 months). For the behavior data, we used the PhenoCube and NeuroCube platform-generated data from mouse lines matched to the transcriptomic data. PhenoCube is a high-throughput platform that assesses circadian, cognitive, and motor behaviors, whereas NeuroCube® is a platform that uses computer vision to detect stance characteristics, gait geometry, and dynamics in the HD mouse model. The transcriptome data of the 445 mRNA samples, 445 microRNA samples, 396 PhenoCube behaviors, and 111 NeuroCube behaviors were combined and analyzed in this study (Fig. 1). These samples are expected to provide crucial systems-level insight into CAG length-dependent networks linked to behaviors in HD. We used a consensus weighted gene coexpression network analysis (WGCNA) with the behavioral features to define the modules of the genes related to the CAG length genotype (Fig. 1). Afterward, a gene ontology network analysis was performed to address the disease mechanisms (Fig. 1). Besides, single-cell-based expression patterns of ontology network genes in the most affected tissue striatum were analyzed to reveal the most affected cell type in the striatum. Finally, we predicted small molecules targeted to behavior-enriched genes.
Figure 1

Overview of the experimental design and data analysis strategy. An allelic series of CAG repeat lengths of the transcriptome data of 445 mRNA samples, 445 microRNA samples, 396 PhenoCube behaviors, 111 NeuroCube behaviors were analyzed by consensus weighted gene coexpression network analysis (WGCNA), followed by gene ontology network analysis, to address the disease mechanisms in the WGCNA detected modules.

Overview of the experimental design and data analysis strategy. An allelic series of CAG repeat lengths of the transcriptome data of 445 mRNA samples, 445 microRNA samples, 396 PhenoCube behaviors, 111 NeuroCube behaviors were analyzed by consensus weighted gene coexpression network analysis (WGCNA), followed by gene ontology network analysis, to address the disease mechanisms in the WGCNA detected modules.

The landscape of the behavioral features and gene expression correlations

We first used WGCNA to analyze all the datasets to give an overview of our data. The results demonstrated that the batch and tissue effects played the main role and that some behavior sets have a similar pattern (Fig. S1). We then separated the batch (tissues) and combined the similar behavior phenotypes (see methods) for the module detection. The results suggested that the batch and age effect played a larger effect than the CAG repeat length (Fig. S2–S5). To detect the real network linked to the CAG repeat length and behavior, we separated the dataset in each tissue and each time point for behavioral analysis. The separate analysis provided a reasonable pattern for the CAG repeat length-associated behaviors. In general, the striatum, which is HD's most affected tissue, showed the strongest correlation of gene expression changes and behavioral changes in both PhenoCube and NeuroCube. For the time course, the correlation at 10 months was the most significant time point. The WGCNA module analyses provided the strength and significance of the associations between the modules and the behavioral features according to the CAG repeat length. Among the four brain tissues, namely, the cortex, hippocampus, cerebellum, and striatum, we detected the strongest association of gene expression profile and behavioral features in the striatum, consistent with the previous studies. At 10 months, two modules were detected in the striatum tissue associated with the CAG repeat length and behaviors, with a correlation score of more than 0.6 (Fig. S6). We also detected a weak association between the CAG repeat length and behavior phenotype in the cerebellum (Fig. S7). No significant behaviors were found in each of the four tissues during the seven CAG repeat lengths at the 2-month and 6-month time points. However, at 6 months, the significant behaviors driven by gender difference were detected in the cortex, striatum, and cerebellum. Especially in the cortex, most of the PhenoCube and NeuroCube behaviors were significantly correlated with gender differences (Fig. S8, S9).

Behavioral features and CAG repeat length-dependent gene coexpression network in the striatum

Our results suggested several special behavioral changes in HD condition linked to the CAG repeat length controlling gene expression profiles in the striatum at 10 months. The most significant module is the brown module, which was shared by the CAG repeat length and behavioral features (Fig. 2). In the striatum at the 10-month time point, we detected 29 significant (P < 7.25 × 10−4, 0.05/69 behavioral feathers for Bonferroni correction) PhenoCube behaviors associated with CAG length (Fig. 3A, Table S1). The grouping and clustering of the behaviors were negatively correlated with the CAG repeat length. This result suggests that these behavioral features decreased according to the CAG repeat length increase in the HD condition (Fig. 3A). On the contrary, most of the visit activities showed the same direction of correlation with the CAG repeat length, thus indicating that these behaviors increased according to the CAG repeat length expansion in the HD condition (Fig. 3A). We also found eight significant (P < 4.63 × 10−4, 0.05/108 behavioral features) NeuroCube behaviors associated with the CAG repeat length in one or more modules. These behaviors presented the stance characteristics, gait geometry, and dynamics (Fig. 3B, Table S1). The behavioral features, including the number of steps and hind limb stand duration, among others, were negatively related to the CAG repeat length, whereas the other behavioral features, including the hind limb, mean angle, among others, were positively related to the CAG repeat length (Fig. 3B).
Figure 2

Integrated effect of the gene expression and behavioral features in the striatum at 10 months. (A) The gene co-expression network and (B) bar plot of the module significance defined as the mean gene significance across all genes in the module. MEbrown and MEturquoise are the two modules with a correlation of more than 0.6 detected in the striatum at 10 months.

Figure 3

Module–behavioral feature associations in the striatum at 10 months of the PhenoCube behavior (A) and NeuroCube behavior (B). Each row corresponds to a module eigengene and each column to a behavioral feature. Each cell contains the corresponding correlation and P-value. The full names of the behaviors are listed in Table S1.

Integrated effect of the gene expression and behavioral features in the striatum at 10 months. (A) The gene co-expression network and (B) bar plot of the module significance defined as the mean gene significance across all genes in the module. MEbrown and MEturquoise are the two modules with a correlation of more than 0.6 detected in the striatum at 10 months. Module–behavioral feature associations in the striatum at 10 months of the PhenoCube behavior (A) and NeuroCube behavior (B). Each row corresponds to a module eigengene and each column to a behavioral feature. Each cell contains the corresponding correlation and P-value. The full names of the behaviors are listed in Table S1.

Highlighted networks shared by the CAG repeat length and behaviors

As most of the significant genes were shared by the CAG repeat length and significant behavioral features, we initially focused on the CAG repeat length-modified genes. In total, 1111 genes were significantly altered according to the CAG repeat length (false discovery rate, FDR < 0.05). A highly significant correlation was found between the gene significance for the CAG repeat length versus the brown module membership detected by WGCNA (Fig. 4A). This gene set highlighted 12 significant functional groups by the DAVID analysis (FDR < 0.05) (Fig. 4B). The basic striatal functional proteins involved in phosphoprotein, membrane, and alternative splicing were the largest gene groups enriched according to the CAG repeat lengths and significant behaviors (Fig. 4B). Postsynaptic density (GO: 0014069) was the most significant ontology enriched (FDR = 8.33 × 10−7) by the DAVID annotation. This result is consistent with those of previous studies in which postsynaptic density was reduced in the HD condition (Fig. 4C).24, 25, 26, 27, 28 The next significant ontology is the calcium term (FDR = 4.08 × 10−6). This result is also consistent with the previous finding that neuronal calcium signaling is abnormal in HD., A set of 43 genes was enriched in this ontology (Fig. 4D). This ontology was followed by the signal transduction inhibitor function (FDR = 4.80 × 10−5) with 10 genes enriched (Fig. 4E). As expected, the response to amphetamine ontology (GO: 0001975) was also enriched (FDR = 9.12 × 10−3) (Fig. 4F). The neuronal cell body ontology (GO: 0043025) gene set enriched 28 genes (FDR = 1.62 × 10−2), and most of them were co-expressed as markers of the neuronal cell body (Fig. 4G). Networks shared by the CAG repeat length and behaviors. (A) A scatter plot of the gene significance for CAG repeat length vs. module membership in the brown module. (B) Gene ontology networks for the CAG repeat length. (C–G) Gene networks of the significant ontologies for the CAG repeat length. Blue lines, physical interactions. Brown lines, co-localization. Green lines, pathway. Dark pink lines, coexpression. Light green lines, shared protein domains. (C) Postsynaptic density (GO: 0014069), (D) Calcium, (E) Signal transduction inhibitor function, (F) Amphetamine ontology, and (G) Neuronal cell body ontology (GO: 0043025).

Highlighted networks in the motor and visit behaviors in the PhenoCube platform

The brown module showed that the PhenoCube-detected motor behavioral features, namely, grouping, clustering, and locomotion, were positively correlated with gene expression. The most significant behavior was the number of grouping (number of short bouts together, P = 1 × 10−7). A total of 725 genes were significantly altered in the PhenoCube feature (FDR < 0.05). The summary of the DAVID FDR of the enriched functional terms for gene modules that are associated with representative PhenoCube behaviors was shown in Fig. 5 and Table S2. Fig. 6A illustrates the visualization of the eigengene network representing the relationships between the modules and the number of grouping. Fig. 6B presents the scatter plot of the gene significance for the number of grouping versus the module membership in the brown module. This gene set highlighted 20 significant functional groups through the DAVID analysis (FDR < 0.05) (Fig. 6C). Besides the functional ontologies detected in the CAG repeat previously, the other five functional groups were enriched in this behavior. The dendritic spine functional group (FDR = 4.03 × 10−4) contained 13 genes, as shown in Fig. 6D. This result is consistent with those of previous reports in which the dendritic spine was found to be unstable and reduced in the HD condition.32, 33, 34 The synapse functional group (FDR = 6.31 × 10−3), which included 21 genes, was also enriched for motor behavior (Fig. 6E). This result is consistent with those of previous reports in which HD was found to be partially caused by abnormal synaptic transmission., The S_TKc domain functional group with 20 genes was also enriched (FDR = 0.017) (Fig. 6F). The cell junction function group (FDR = 0.02), which contained 22 genes, was also enrich (Fig. 6G). This result is consistent with those of previous reports in which the signal transmission was reduced in the HD condition. Eleven oxytocin signaling pathway genes were enriched in this study (FDR = 0.03) (Fig. 6H). This result is consistent with those of previous reports in which oxytocin was found to affect HD patients.,
Figure 5

The graphical representation of the DAVID FDR of the enriched functional terms for gene modules that are associated with representative PhenoCube and NeuroCubebe behaviors.

The graphical representation of the DAVID FDR of the enriched functional terms for gene modules that are associated with representative PhenoCube and NeuroCubebe behaviors. Highlighted networks in the grouping behaviors in the PhenoCube platform. (A) Visualization of the eigengene network representing the relationships between the modules and the number of groupings. The upper panel shows a hierarchical clustering dendrogram of the eigengenes in which the dissimilarity of eigengenes EI and EJ is given by 1 - cor (EI; EJ). The heat map in the lower panel shows the eigengene adjacency AIJ = (1 + cor (EI; EJ))/2. (B) A scatter plot of the gene significance for number of groupings vs. module membership in the brown module. (C) Gene ontology networks for number of groupings. (D–H) Gene networks of the significant ontologies for the number of grouping. Blue lines, physical interactions. Brown lines, co-localization. Green lines, pathway. Dark pink lines, coexpression. Light green lines, shared protein domains. (D) Dendritic spine, (E) Synapse, (F) S_TKc domain, (G) Cell junction, and (H) Oxytocin signaling pathway. A set of visit behaviors, such as the NumExplVisitsWithCorrInitialNP behavior (number of visits to non-reinforced corners with a first pseudocorrect nose poke), was also associated with gene expression (P = 9 × 10−6) (Fig. 3A). A total of 263 genes contributing to 13 functional terms were significantly regulated in the NumExplVisitsWithCorrInitialNP behavior. The significant gene functional annotation terms were similar to the number of grouping behavior, but the correlation pattern was the opposite. The significant ontologies were postsynaptic density (FDR = 1.49 × 10−8), calcium (FDR = 4.05 × 10−5), and dendritic spine (FDR = 3.08 × 10−4), among others.

Highlighted networks in the gait characteristics in the NeuroCube platform

The gait characteristics are known to be abnormal in HD., In the NeuroCube platform, we detected eight gait characteristics that were significantly changed in the brown module (Fig. 3B). The summary of the DAVID FDR of the enriched functional terms for gene modules that are associated with representative NeuroCubebe behaviors was shown in Fig. 5. The expression of the gene set was positively correlated with the number of steps feature (FDR = 7 × 10−6); 371 genes were significantly altered in this behavior (FDR < 0.05). Along with the CAG repeat length, the number of steps should be lower in the HD condition than in the normal one. Fig. 7A shows the visualization of the eigengene network, which represents the relationships between the modules and the number of steps. Fig. 7B illustrates the scatter plot of the gene significance for the number of steps versus the module membership in the brown module. Twelve functional ontologies were significantly enriched in the number of steps behavior (Fig. 7C). Except for the ontologies previously mentioned, namely, the response to amphetamine (FDR = 3.97 × 10−5), postsynaptic density (FDR = 5.52 × 10−5), phosphoprotein (FDR = 7.26 × 10−4), synapse (FDR = 2.55 × 10−3), plasma membrane (FDR = 2.79 × 10−3), neuronal cell body (FDR = 4.69 × 10−3), cell junction (FDR = 1.41 × 10−2), and dendritic spine (FDR = 1.45 × 10−2), the other three ontologies were specifically highlighted in this feature. The postsynaptic membrane was enriched in the number of steps feature (FDR = 2.82 × 10−3) (Fig. 7D). This result is consistent with the previous results that the postsynaptic membrane is a dysfunction in the HD condition.41, 42, 43 A set of genes with the locomotor behavioral feature was also enriched (FDR = 3.03 × 10−3) (Fig. 7E). Most of these genes are connected by physical interaction. The remaining gene ontology is perikaryon, with eight genes assigned in this ontology (FDR = 0.016). These genes tightly interacted mainly by co-localization (Fig. 7F). Highlighted networks in gait characteristics and number of steps in the NeuroCube platform. (A) Visualization of the eigengene network representing the relationships between the modules and the number of steps. The upper panel shows a hierarchical clustering dendrogram of the eigengenes in which the dissimilarity of eigengenes EI and EJ is given by 1 – cor (EI; EJ). The heat map in the lower panel shows the eigengene adjacency AIJ = (1 + cor (EI; EJ))/2. (B) A scatter plot of the gene significance for number of steps vs. module membership in the brown module. (C) Gene ontology networks for number of steps. (D–F) Gene networks of the significant ontologies for number of steps. Blue lines, physical interactions. Brown lines, co-localization. Green lines, pathway. Dark pink lines, coexpression. Light green lines, shared protein domains. (D) Postsynaptic membrane, (E) Locomotor, and (F) Perikaryon. The other behavioral characteristic gene sets, including the imaging features, body motion, and rhythmicity, were also significantly associated with the HD behavior state. Another significant behavioral feature is the HindLimbMeanAngle (hind limb mean angle of orientation of the maximal diameter to the direction of the run [X]) (FDR = 6 × 10−7, Fig. 3B). Fig. 8A shows the scatter plot of the gene signature of the number of hind limb angles versus the module membership in the brown module. A total of 1087 genes, consisting of 19 gene ontologies (Fig. 8B), were significantly altered in this behavioral feature (FDR < 0.05). This behavior shared the most ontologies with other behaviors, such as phosphoprotein (FDR = 8.47 × 10−8), postsynaptic density (FDR = 1.83 × 10−6), and calcium (FDR = 2.80 × 10−6), among others (Fig. 8B). Two specific ontologies were enriched in this behavior. The first one is cytoskeleton ontology (FDR = 3.12 × 10−3) (Fig. 8C). This result is consistent with those of previous reports in which cytoskeleton abnormalities were found in the HD condition., The second one is the proton acceptor ontology (FDR = 0.03), in which 32 genes were detected (Fig. 8D). Highlighted networks in the gait characteristics of the hind limb angle (hind limb mean angle of orientation of the maximal diameter to the direction of the run [X]) in the NeuroCube platform. (A) A scatter plot of the gene significance for number of hind limb angles vs. module membership in the brown module. (B) Gene ontology networks for the hind limb angle. (C, D) Gene networks of the significant ontologies for the hind limb angle. Blue lines, physical interactions. Brown lines, co-localization. Green lines, pathway. Dark pink lines, coexpression. Light green lines, shared protein domains. (C) Cytoskeleton and (D) Proton acceptor.

Single cell-based RNA expression patterns and spatial expression patterns of the HTT repeat–dependent behavior genes

To investigate the expression patterns of the HTT repeat–dependent behavior genes in the striatum cell populations, we downloaded the mouse striatum single-Cell RNA-Seq data. We searched the cell population expression patterns of genes in the GO network of CAG repeat length (Fig. 9A), number of grouping (Fig. 9B), number of steps (Fig. 9C), and hind limb angle (Fig. 9D). All the ten cell populations in the striatum can detect some gene expression, with the strongest expression in neuron cells (Fig. 9). These results suggested that HD patients’ neurons are highly impacted in the striatum. Other cell types were also impacted to a different degree. Cbx4, Tcf7, Cap1, and Tmc3 were highly expressed in all the ten cell populations in GO network of CAG repeat length genes (Fig. 9A). Cbx4, Cap1, and Tmc3 were also involved in the behavior number of grouping (Fig. 9B). Cap1 and Tmc3 were also involved in behavior hind limb angle (Fig. 9D). In the behavior number of steps, only Cnr1 was highly expressed in all the ten cell populations (Fig. 9C).
Figure 9

Single-cell-based RNA expression patterns of the HTT repeat-dependent behavior genes in the Striatum cell population. The cell population expression patterns of genes in the network of CAG repeat length (A), number of grouping (B), number of steps (C), and hind limb angle (D). Ependy-C: Ependy cilia; NSCs: neuronal stem cells; OPCs: oligodendrocyte precursor cells.

Single-cell-based RNA expression patterns of the HTT repeat-dependent behavior genes in the Striatum cell population. The cell population expression patterns of genes in the network of CAG repeat length (A), number of grouping (B), number of steps (C), and hind limb angle (D). Ependy-C: Ependy cilia; NSCs: neuronal stem cells; OPCs: oligodendrocyte precursor cells. To validate the gene expression patterns in brain space, we selected some of the enriched functional terms for gene modules that are associated with representative PhenoCube and NeuroCubebe behaviors. We looked up the genes' spatial expression patterns involved in behavior, locomotor behavior, learning or memory, dendritic spine, postsynaptic density in C57BL/6 mice and App mice (a model for Alzheimer's diseases, AD) (Fig. S10) (https://alzmap.org/). These genes showed dynamic expression in corresponding brain regions with age (3 months vs. 18 months). Comparing to the wild type, these genes generally down-regulated their expression in the other typical neurodegenerative disease–AD, which is consistent with the expression patterns observed in the HD in our previous analysis.

Enriched small molecule response to behaviors defines the HTT repeat-dependent genes

We aimed to identify the interactions between genes and small molecules (using the Connectivity Map; https://clue.io/repurposing-app) in HD based on the behavior-enriched genes to provide the potential targets for pharmacotherapeutic intervention. The 15 significant small molecules are listed in Table 1, most of them are dopamine receptor agonist or antagonist, serotonin receptor agonist or antagonist. Staurosporine, an inhibitor for CDK, CHK, and PKC, was reported to be an inducer of apoptosis for HD cells. At the present stage, Chlorprothixene is used as a dopamine receptor antagonist for schizophrenia and bipolar disorder. Naltrexone is used as an opioid receptor antagonist for abstinence from alcohol. Levomepromazine is used as a dopamine receptor antagonist for psychosis, schizophrenia, bipolar disorder, nausea, and insomnia. Pramipexole is a clinically effective non-ergot dopamine agonist and interacts with dopamine D2 subfamily receptors, namely, the D2, D3, and D4 receptor subtypes. It has a positive effect on HD. Aripiprazole was well tolerated and remarkably improved some of the motor and behavioral symptoms in patients affected by HD. Ziprasidone, as a dopamine receptor antagonist and serotonin receptor antagonist, was improved several categories of the motor function of HD. Nortriptyline is a tricyclic antidepressant, is used for depression. Dantrolene was neuroprotective in the HD transgenic mouse model. The interaction analysis of the small molecules and behavior-enriched genes of HD provided information for the further pharmacotherapeutic intervention possibility for HD.
Table 1

Fifteen significant chemicals for HD based on genes with enriched behavior features.

NamePFDRATC codeTargetMechanism of actionaDisease areaa
Staurosporine1.46 × 10−83.44 × 10−6CDK2, GSK3B, CAMK2B, CDK1, CDK5, CHEK1, CHRM1, CHRM2, CHRM4, CSK, DAPK1, GPR 35, IKBKB, ITK, LCK, LRRK2, MAP2K4, MAP2K6, MAPKAPK2, PAK2, PDPK1, PHKG2, PIK3CG, PIM1, PKN1, PRKACB, PRKCI, PRKCQ, RPS6KA1, STK3, SYK, TNIK, ZAP70CDK inhibitor, CHK inhibitor, PKC inhibitorundefined
Quinpirole4.58 × 10−54.95 × 10−3DRD2, DRD3, DRD4, DRD1, HTR1A, HTR2A, HTR2B, HTR2Cdopamine receptor agonistundefined
Chlorprothixene7.92 × 10−55.33 × 10−3N05AF03DRD2, CHRM1, CHRM2, CHRM3, CHRM4, CHRM5, DRD1, DRD3, HRH1, HTR2A, HTR2B, HTR2Cdopamine receptor antagonistschizophrenia (neurology/psychiatry)bipolar disorder (neurology/psychiatry)
Boldine6.3 × 10−54.95 × 10−3CHRNA4, CHRNB2, DRD1, DRD2
Naltrexone6.3 × 10−54.95 × 10−3N07BB04OPRK1, OPRM1, OPRD1, SIGMAR1opioid receptor antagonistabstinence from alcohol (neurology/psychiatry)
Levomepromazine2.44 × 10−46.16 × 10−3N05AA02ADRA1A, ADRA1B, ADRA1D, ADRA2A, ADRA2B, ADRA2C, CHRM1, CHRM2, CHRM3, CHR M4, CHRM5, DRD1, DRD2, DRD3, DRD4, DRD5, HRH1, HTR2A, HTR2Cdopamine receptor antagonistpsychosis (neurology/psychiatry) schizophrenia (neurology/psychiatry)bipolar disorder (neurology/psychiatry)nausea (gastroenterology) insomnia (neurology/psychiatry)
Minaprine2.59 × 10−46.16 × 10−3N06AX07HTR2B, SLC6A4, ACHE, CHRM1, DRD1, DRD2, HTR2A, HTR2C, MAOAserotonin reuptake inhibitorundefined
Pramipexole3 × 10−46.16 × 10−3N04BC05DRD3, DRD2, ADRA2A, ADRA2B, ADRA2C, DRD1, DRD4, DRD5, HTR1A, HTR1B, HTR1D, HTR2A, HTR2B, HTR2Cdopamine receptor agonistParkinson's Disease (neurology/psychiatry)
Ropinirole3 × 10−46.16 × 10−3N04BC04DRD2, DRD3, ADRA2A, ADRA2B, ADRA2C, DRD1, DRD4, DRD5, HTR1A, HTR1B, HTR1D, HTR2A, HTR2B, HTR2Cdopamine receptor agonistParkinson's Disease (neurology/psychiatry)restless leg syndrome (neurology/psychiatry)
SCH- 233902.61 × 10−46.16 × 10−3DRD1, DRD5, HTR2C, KCNJ4, KCNJ6dopamine receptor antagonistundefined
Aripiprazole3.79 × 10−47.14 × 10−3N05AX12, other antipsychotics, antipsychoticsDRD2, HTR1A, HTR2A, HRH1, HTR1B, HTR1D, HTR2C, ADRA1A, ADRA1B, ADRA2A, ADRA 2B, ADRA2C, CHRM1, CHRM2, CHRM3, CHRM4, CHRM5, DRD1, DRD3, DRD4, DRD5, HTR1E, HTR3A, HTR6, HTR7serotonin receptor agonistserotonin receptor antagonistdepression (neurology/psychiatry) schizophrenia (neurology/psychiatry)bipolar disorder (neurology/psychiatry)
Ziprasidone3.79 × 10−47.14 × 10−3N05AE04, indole derivatives antipsychoticsDRD2, HTR2A, HTR1A, HTR1D, HRH1, HTR1B, HTR1E, HTR2C, HTR7, ADRA1A, ADRA1B, ADRA2A, ADRA2B, ADRA2C, CHRM1, CHRM2, CHRM3, CHRM4, CHRM5, DRD1, DRD3, DRD4, DRD5, HTR3A, HTR6, SLC6A4dopamine receptor antagonistserotonin receptor antagonistschizophrenia (neurology/psychiatry)bipolar disorder (neurology/psychiatry)
Romazine5.29 × 10−49.51 × 10−3N05AA03CHRM5, DRD2, ADRA1A, ADRA1B, ADRA1D, CHRM1, CHRM2, CHRM3, CHRM4, DRD1, DRD3, DRD4, HRH1, HTR2A, HTR2CNAundefined
Nortriptyline5.45 × 10−49.51 × 10−3N06AA10, non-selective monoamine reuptake inhibitors, antidepressantsKCNJ10, SLC6A2, SLC6A4, ADRA1A, ADRA1B, ADRA1D, ADRA2A, ADRA2B, ADRA2C, ADR B1, ADRB2, ADRB3, CHRM1, CHRM2, CHRM3, CHRM4, CHRM5, CYP2C19, DRD2, HRH1, HTR1A, HTR2A, HTR2C, HTR6, PGRMC1, PIK3CD, SIGMAR1tricyclic antidepressantdepression (neurology/psychiatry)
Dantrolene3.01 × 10−46.16 × 10−3M03CA01, dantrolene and derivatives, muscle relaxants, directly acting agentsRYR1, RYR3calcium channel blockerspasms (neurology/psychiatry)malignant hyperthermia (MH) (endocrinology)

https://clue.io/repurposing-app.

Fifteen significant chemicals for HD based on genes with enriched behavior features. https://clue.io/repurposing-app.

Discussion

To the best of our knowledge, this analysis is the first systematic and integrated one on a gene coexpression network corresponding to HD behaviors in mice models. In this study, we applied an extensive association analysis of genome-wide transcriptomic characterization with behaviors using an allelic series of the Htt CAG-knock-in mouse model to elucidate the CAG length-dependent molecular networks in disease-relevant behaviors. In this study, we used a consensus WGCNA on 445 mRNA samples and 445 microRNA samples along with behavioral features (396 PhenoCube behaviors and 111 NeuroCube behaviors) to define the gene modules that correlate with CAG repeat-dependent behaviors in Htt CAG-knock-in mice. We identified 37 behavioral features that were significantly associated with the CAG repeat length, and these behavioral features were associated with several gene coexpression groups involved in neuronal dysfunctions. We also identified the significant responses of 15 chemicals in behavior-enriched HD genes. Our study provides further evidence that the neuronal signal transduction abnormal in the striatum plays an important role in causing HD-related phenotypic behaviors. The neuronal dysfunction module is considered the molecular system most strongly associated with the pathophysiology of HD. The gene expression in the postsynaptic density, calmodulin-binding, neuronal cell body, synapse, perikaryon, and dendrite functional groups was modulated and linked to behaviors detected by the PhenoCube and NeuroCube platforms. These results dominated the dysfunction of neurons in the HD condition, consistent with the results of previous reports that neuronal death is the hallmark of HD., These results were also supported by the single-cell RNA sequencing data in the striatum which suggested many genes in the network were highly expressed in the neurons. Moreover, although most of the functional ontologies were reported in HD, about half of the genes in our networks were newly identified to be associated with HD. The prominent behavioral changes in HD patients are learning and memory, perception, executive functions, apathy, organization, impulsivity, frustration, irritability, anger, strategies, denial and unawareness, perseveration, depression, anxiety, psychosis, and sleep disturbances, among others., In the mouse HD model, we found that grouping, clustering, and locomotion behaviors were significantly reduced. Many functional ontology classes were found to correspond to these behaviors. The postural and gait patterns in HD patients were abnormal, and the walking speed was reduced in HD patients in comparison with normal individuals.56, 57, 58 Previous findings showed compensation production in the lower limbs in HD patients.40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59 We identified eight HD mouse behaviors similar to the postural and gait patterns of the lower limb compensation production of human HD patients. Our results provided the dominant behaviors to test drug compounds and narrowed down the behavioral features from thousands to about 40 by using the CAG-expansion HD mouse model for drug screening. In this study, we also analyzed the miRNA data, but no miRNA markers were found to be significantly associated with the CAG repeat length and behaviors. We also examined the association of transcriptome data and 62 SmartCube platform behaviors of 2 months (no 6- and 10-month behavior data were available) and found no associated behaviors in this dataset mainly because of the early age. In summary, our study provided a large-scale, comprehensive transcriptomic and behavioral characterization of the molecular pathogenic effects of the Htt CAG-knock-in mouse model. We provided integrative genomic evidence to show that converging molecular networks linked to behavioral features are perturbed in HD mice. Taken together, our integrative findings capturing the social and activity behaviors with dysregulated genes and ontologies provide rich information for the understanding of HD.

Author contributions

K. W and L.H. designed the study. L.H., L.F., Q.L., and A.D. performed the statistical analysis. L.H. and K.W wrote the manuscript. All authors critically revised and provided the final approval for this manuscript.

Conflict of interests

The authors declare no competing financial interests related to this paper.

Funding

This research was supported by the (No. 81970839 and 81670895 to L.H.) and the , China (No. 2021YFS0033 to L.H). We would like to thank the Wang lab members for their assistance in the R package codes and their helpful comments.
  60 in total

1.  Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

Authors:  Da Wei Huang; Brad T Sherman; Richard A Lempicki
Journal:  Nat Protoc       Date:  2009       Impact factor: 13.491

2.  Increased apoptosis of Huntington disease lymphoblasts associated with repeat length-dependent mitochondrial depolarization.

Authors:  A Sawa; G W Wiegand; J Cooper; R L Margolis; A H Sharp; J F Lawler; J T Greenamyre; S H Snyder; C A Ross
Journal:  Nat Med       Date:  1999-10       Impact factor: 53.440

3.  Oxytocin selectively modulates brain processing of disgust in Huntington's disease gene carriers.

Authors:  Izelle Labuschagne; Govinda Poudel; Catarina Kordsachia; Qizhu Wu; Hannah Thomson; Nellie Georgiou-Karistianis; Julie C Stout
Journal:  Prog Neuropsychopharmacol Biol Psychiatry       Date:  2017-09-23       Impact factor: 5.067

Review 4.  Striatal specificity of gene expression dysregulation in Huntington's disease.

Authors:  Elizabeth A Thomas
Journal:  J Neurosci Res       Date:  2006-11-01       Impact factor: 4.164

5.  limma powers differential expression analyses for RNA-sequencing and microarray studies.

Authors:  Matthew E Ritchie; Belinda Phipson; Di Wu; Yifang Hu; Charity W Law; Wei Shi; Gordon K Smyth
Journal:  Nucleic Acids Res       Date:  2015-01-20       Impact factor: 16.971

6.  Depressed Synaptic Transmission and Reduced Vesicle Release Sites in Huntington's Disease Neuromuscular Junctions.

Authors:  Ahmad Khedraki; Eric J Reed; Shannon H Romer; Qingbo Wang; William Romine; Mark M Rich; Robert J Talmadge; Andrew A Voss
Journal:  J Neurosci       Date:  2017-07-19       Impact factor: 6.167

Review 7.  Faulty splicing and cytoskeleton abnormalities in Huntington's disease.

Authors:  Marta Fernández-Nogales; María Santos-Galindo; Ivó H Hernández; Jorge R Cabrera; José J Lucas
Journal:  Brain Pathol       Date:  2016-11       Impact factor: 6.508

8.  Evidence for a disorder of locomotor timing in Huntington's disease.

Authors:  Belinda Bilney; Meg E Morris; Andrew Churchyard; Edmond Chiu; Nellie Georgiou-Karistianis
Journal:  Mov Disord       Date:  2005-01       Impact factor: 10.338

Review 9.  Normal huntingtin function: an alternative approach to Huntington's disease.

Authors:  Elena Cattaneo; Chiara Zuccato; Marzia Tartari
Journal:  Nat Rev Neurosci       Date:  2005-12       Impact factor: 34.870

10.  Differential Changes in Postsynaptic Density Proteins in Postmortem Huntington's Disease and Parkinson's Disease Human Brains.

Authors:  C Fourie; E Kim; H Waldvogel; J M Wong; A McGregor; R L M Faull; J M Montgomery
Journal:  J Neurodegener Dis       Date:  2014-01-16
View more
  1 in total

1.  Deep Sc-RNA sequencing decoding the molecular dynamic architecture of the human retina.

Authors:  Lulin Huang; Runze Li; Lin Ye; Shanshan Zhang; Huaping Tian; Mingyan Du; Chao Qu; Shujin Li; Jie Li; Mu Yang; Biao Wu; Ran Chen; Guo Huang; Ling Zhong; Hongjie Yang; Man Yu; Yi Shi; Changguan Wang; Houbin Zhang; Wei Chen; Zhenglin Yang
Journal:  Sci China Life Sci       Date:  2022-09-15       Impact factor: 10.372

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.