Literature DB >> 27553423

Characterisation of the global transcriptional response to heat shock and the impact of individual genetic variation.

Peter Humburg1, Narelle Maugeri1,2, Wanseon Lee1, Bert Mohr1,3, Julian C Knight4.   

Abstract

BACKGROUND: The heat shock transcriptional response is essential to effective cellular function under stress. This is a highly heritable trait but the nature and extent of inter-individual variation in heat shock response remains unresolved.
METHODS: We determined global transcription profiles of the heat shock response for a panel of lymphoblastoid cell lines established from 60 founder individuals in the Yoruba HapMap population. We explore the observed differentially expressed gene sets following heat shock, establishing functional annotations, underlying networks and nodal genes involving heat shock factor 1 recruitment. We define a multivariate phenotype for the global transcriptional response to heat shock using partial least squares regression and map this quantitative trait to associated genetic variation in search of the major genomic modulators.
RESULTS: A comprehensive dataset of differentially expressed genes following heat shock in humans is presented. We identify nodal genes downstream of heat shock factor 1 in this gene set, notably involving ubiquitin C and small ubiquitin-like modifiers together with transcription factors. We dissect a multivariate phenotype for the global heat shock response which reveals distinct clustering of individuals in terms of variance of the heat shock response and involves differential expression of genes involved in DNA replication and cell division in some individuals. We find evidence of genetic associations for this multivariate response phenotype that involves trans effects modulating expression of genes following heat shock, including HSF1 and UBQLN1.
CONCLUSION: This study defines gene expression following heat shock for a cohort of individuals, establishing insights into the biology of the heat shock response and hypotheses for how variation in this may be modulated by underlying genetic diversity.

Entities:  

Mesh:

Substances:

Year:  2016        PMID: 27553423      PMCID: PMC4995779          DOI: 10.1186/s13073-016-0345-5

Source DB:  PubMed          Journal:  Genome Med        ISSN: 1756-994X            Impact factor:   11.117


Background

The heat shock response is a highly conserved mechanism found across organisms that ensures effective maintenance of cellular function under stress. Transcriptional activation involving heat shock proteins (HSPs) was found to underpin the seminal observation of expanded chromosomal puffs in Drosophila salivary glands following exposure to heat [1], with subsequent studies in different species highlighting not only changes in expression of genes encoding these essential molecular chaperones but also their regulators, proteins involved in proteolysis, transcription factors and kinases, membrane transport, maintenance of cellular structures, metabolism and nucleic acid repair [2-9]. As well as significant upregulation of gene expression, involving rapid induction of HSP gene transcription by activated heat shock factors (HSF) binding to promoter heat shock elements (HSEs), the coordinated stress response is also recognised to involve downregulation of a greater number of genes. However, to date inter-individual variation in the heat shock response at the level of transcription in humans remains largely unknown, with studies defining the global transcriptome based on specific cell lines or cells/tissue from particular individuals [8, 9]. Further delineation of the nature and variability in this response is important given the role of HSPs in ensuring effective intracellular protein folding during stress, protecting cells from denaturation, aggregation and apoptosis [4]. This is underlined by evidence linking HSPs with ageing and cancer, as well as the response to infection and immunity [10-13]. Genetic modulators of gene expression are important determinants of inter-individual variation in diverse phenotypes and may only operate in specific cell types or after particular environmental exposures [14, 15]. Mapping gene expression as a quantitative trait to identify regulatory genetic variants has informed recent genome-wide association studies (GWAS) of disease as well as pathophysiology including the immune response to endotoxin [16], sepsis [17], T-cell activation [18] or viral infection [19, 20]. Expression of heat shock proteins is highly heritable and has been mapped as a quantitative trait in diverse organisms including Drosophila melanogaster [21-23], Caenorhabditis elegans [24] and the Artic charr [25]. In resting (non-heat shocked) human Epstein-Barr virus (EBV)-immortalised lymphoblastoid cell lines (LCLs), expression of heat shock protein and molecular chaperone genes shows high heritability on eQTL mapping, with response to unfolded proteins having the highest heritability of any biological process on gene ontology (GO) analysis (H2 0.38) [26]. A previous QTL analysis of heat shock phenotypes in human cells was restricted to the Hsp70 genes in the MHC class II region and demonstrated a local eQTL for HSPA1B [27]. Here we report the genome-wide changes to gene expression induced by heat shock in HapMap cell lines from Yoruba (YRI) individuals and perform analysis to identify genes and pathways involved in the human heat shock response. To further elucidate underlying mechanisms, we present an analysis of genetic variants modulating the global heat shock transcriptional response.

Methods

Cell culture and heat shock

The 60 founder YRI HapMap cell lines (Coriell) [28] were cultured. These anonymised cell lines were established by the International HapMap Project and made available for use by the scientific research community [29]. LCLs were maintained in RPMI 1640 medium supplemented with 10 % fetal calf serum and 2 mM L-glutamine at 37 °C in 5 % humidified CO2. Growth rates were determined after 72 h in culture for each cell line to ensure the cells were at comparable densities and total numbers when harvested. Trypan blue staining was used to define cell viability. Cells were subject to heat shock at 42 °C for 1 h and then allowed to recover for 6 h in a 37 °C, 5 % CO2 incubator. 2 × 107 cells were harvested for each of the two paired experimental conditions (i.e. heat shock stimulated and basal un-stimulated culture conditions) per individual cell line and stored in RLT buffer with β-mercaptoethanol at −80 °C. Total RNA was purified using QIAGEN RNeasy Mini purification kit following manufacturer’s instructions, including on-column DNase digestion.

Gene expression pre-processing and quality control

Genome-wide gene expression analysis was carried out using the Illumina Human-HT-12 v3 Expression BeadChip gene expression platform comprising 48,804 probes. Probe intensities for resting and stimulated cells were imported into R for further processing together with associated metadata. Annotations for all probes were obtained via the illuminaHumanv3.db Bioconductor package [30]. Only probes considered to be of perfect or good quality according to these annotations were taken forward for analysis. Additionally, all probes mapping to more than one genomic location or to a location that contains a known single nucleotide polymorphism (SNP) were excluded. Probes were required to exhibit significant signal (detection p value <0.01) in at least ten samples and samples with less than 30 % of the remaining probes providing significant signal were excluded (together with the paired sample from the same cell line). Samples showing exceptionally low variation in probe intensities (standard deviation of the log intensities of all retained probes below 0.8) were also removed. After filtering 12,416 of 48,803 probes (25.4 %) remained.

Normalising gene expression estimates

Probe intensities were normalised with VSN [31] and outlier samples removed. The remaining 43 samples were normalised separately for each BeadChip and differences between groups corrected with ComBat [32], preserving differences due to heat shock stimulation (Additional file 1: Figure S1).

Differential expression analysis

Following quality control (QC), samples were analysed for differences in gene expression levels between the basal and stimulated states, i.e. pairing samples from the same individual, using the limma Bioconductor package [33]. Individual probes were associated with corresponding genes by comparing probe positions as provided by the illuminaHumanv3.db Bioconductor package [30] with transcript coordinates obtained via the TxDb.Hsapiens.UCSC.hg19.knownGene Bioconductor package [34]. One of the genes (N4BP2L2) had two probes with opposite effects in terms of differential expression and these probes were excluded from further analysis. For all other genes with multiple differentially expressed probes, the direction of the effect was consistent between probes.

GO enrichment and pathway analysis

GO enrichment analysis was carried out using the Bioconductor package topGO [35]. Fisher’s exact test was used to determine enrichment separately for significantly upregulated and downregulated genes (false discovery rate (FDR) <0.01 and >1.2 fold change (FC)). Biological pathways, function enrichment and prediction of upstream regulators were generated for these genes using Qiagen’s Ingenuity Pathway Analysis (IPA) (www.qiagen.com/ingenuity, QIAGEN Redwood City). For the shortest path analysis, we used the path explorer tool. Here, if two molecules do not have specific direct connections in the Ingenuity Knowledge Base, this tool will define how many and which molecules can be added to the pathway to create the shortest path between them.

Gene functional annotations with heat shock

We investigated which differentially expressed genes we identified had been previously associated with the heat shock or, more generally, stress response. We used the set of genes previously linked directly to heat shock [4] and from this created an extended set based on GO terms and PubMed articles linking differentially expressed genes to heat shock response and closely related processes. As a first step in highlighting genes not previously known to play a role in this context, we identified all significantly upregulated genes that lack GO annotations of obvious relevance to heat shock response. In addition to terms related to stress response and protein folding, we also explored an extended set that included terms related to cell death and proliferation. To account for the presence of EBV in these cell lines, we excluded all genes annotated with terms related to viral infections. Finally, any remaining genes related to regulation of gene expression were considered to be likely to be explained by the large-scale changes in gene expression that are taking place in response to heat shock and also included in the extended set. All genes not annotated with obvious GO terms were subjected to a PubMed search to find publications that link the gene to heat shock or stress response.

Heat shock factor binding

Using binding sites derived from ChIP-seq data obtained from the K562 immortalised leukaemic cell line [36], we annotated our list of differentially expressed genes by cross-referencing it with the list of HSF-binding genes. Groups of genes corresponding to upregulated or downregulated genes as well as those with existing heat shock-related annotations and those without were tested for enrichment of HSF-binding genes using Fisher’s exact test. In addition to the direct evidence from the ChIP-seq data, we carried out a scan for the presence of HSF-binding motifs in the promoter region (1200 bp upstream–300 bp downstream of the transcriptional start site (TSS)) of differentially expressed genes. The scan was based on the position weight matrices (PWM) defined by SwissRegulon [37] and carried out with the Bioconductor package PWMEnrich [38].

Multivariate global heat shock response phenotype

The global heat shock response was summarised using partial least squares (PLS) regression (generated as detailed in ‘Results’). Using the first two PLS components with respect to the treatment, i.e. the two components of the gene expression space that maximise the variation between basal and stimulated samples, we defined the response for each individual as the combination of the vector between the basal and stimulated sample for this individual in the space spanned by the first two PLS components and the location of the basal sample in the same space. Hierarchical cluster analysis was used to investigate grouping of individuals following heat shock and differential gene expression between clusters analysed.

Genotype QC

Genotype data provided by the HapMap project [39] were processed with Plink [40] to restrict the data to autosomes and remove SNPs with low genotyping rate and those with a minor allele frequency of less than 10 % in our sample set. This resulted in the exclusion of 794,511 of 2,582,999 SNPs (30.76 %). Estimation of the proportion of identity by descent for all sample pairs demonstrated three pairs showing evidence of higher than expected relatedness (Additional file 2: Figure S2) which was supported by IBS nearest neighbour calculation. As a result, samples NA18913, NA19192, NA18862 and NA19092 were excluded.

Genotypic association with gene expression

The multivariate global heat shock response phenotype was tested for association with SNPs within a 10 kb window either side of the probe location using the MultiPhen R package [41], 10 kb selected as informative for including functional elements interacting with a gene [42, 43]. All differentially expressed probes and all probes involving predicted upstream regulator genes were analysed but only genotyped SNPs that passed QC were considered. The GRCh37 coordinates for SNPs were obtained via the SNPlocs.Hsapiens.dbSNP142.GRCh37 Bioconductor package [44] and gene coordinates via the TxDb.Hsapiens.UCSC.hg19.knownGene package [34]. The significance of the observed associations was assessed through a permutation test to account for the structure inherent to the data. To this end the observed global response phenotype for each individual and covariates used in the model were randomly assigned to one of the observed set of genotypes 1000 times and p values for the joint model were computed for each permutation. From these we computed FDRs by comparing observed p values to the empirical distribution of minimum p values from each permutation. We tested for associations between genotype and heat shock response (log2 FC) for individual genes using a linear model as implemented in Matrix-eQTL [45], correcting for sex as well as the first two principal components of the treatment response to capture confounding variation, an approach which enhances eQTL mapping [46-48].

Results

Transcriptomic response to heat shock

We aimed to establish the nature and extent of inter-individual variation in the genome-wide transcriptomic response to heat shock for a panel of LCLs established from unrelated individuals of African ancestry for whom high-resolution genotyping data are available (International HapMap Project, YRI population) [28]. We cultured the LCLs and exposed cells to heat shock at 42 °C for 1 h and harvested after recovery at 37 °C for 6 h. We then quantified genome-wide gene expression using Human-HT-12 v3 Expression BeadChips (Illumina). Following QC and processing, paired expression data (baseline and following heat shock) were available for 12,416 probes on 43 individual cell lines. We found that 500 probes (4 % of all analysed probes, corresponding to 465 genes) were differentially expressed (FDR <0.01 and >1.2 FC) with 249 probes (226 genes) upregulated and 251 probes (238 genes) downregulated (Fig. 1, Table 1, Additional file 3: Table S1). The majority of the most significantly differentially expressed probes were upregulated, including 18 of the top 20 genes, of which nine encoded known heat shock proteins. The most significant difference in expression was seen for HSPA1B (22.2 FC, FDR 1.4 × 10−48).
Fig. 1

Heat shock response in LCLs. a Volcano plot showing differentially expressed genes following heat shock (42 °C for 1 h with 6 h recovery) in LCLs. Probes with an adjusted p value below 0.01 and a log FC of at least 0.5 are shown as yellow and red dots. Probes showing particularly strong evidence of changes in gene expression through a combination of p value and FC are labelled with the corresponding gene symbol. b Heatmap comparing gene expression for differentially expressed genes between basal and stimulated samples. Samples were clustered by gene with heat shocked (red) and basal (blue) samples forming two distinct groups. Expression estimates for each gene were scaled and centred across samples. Blue cells correspond to lower than average expression and red cells correspond to higher than average expression

Table 1

Top 20 differentially expressed genes following heat shock

GeneEntrezIDlogFCFCAverage expressiont p valueAdjusted p valueB
HSPA1B 33044.522.211.953.91.1E-521.4E-48105.5
DNAJB1 33373.712.710.942.21.8E-461.1E-4293.4
CLK1 11951.83.510.841.18.4E-463.5E-4292.0
HSPA6 33104.218.39.429.97.8E-382.4E-3475.2
HSPH1 108081.22.312.723.38.2E-322.0E-2861.9
ZFAND2A 906371.93.89.523.29.8E-322.0E-2861.8
TNFSF14 87401.42.79.423.01.4E-312.5E-2861.4
HSPA6 33102.86.87.821.92.3E-303.6E-2758.7
FXR1 80870.91.99.920.83.3E-294.5E-2656.1
SERPINH1 8711.93.78.720.48.6E-291.1E-2555.2
TDG 6996−0.90.510.7−20.41.1E-281.2E-2555.0
CCL3 6348−1.30.411.4−19.85.1E-285.2E-2553.4
KIAA0907 228890.91.99.519.51.0E-279.9E-2552.7
HSPA4L 228240.91.99.319.22.2E-272.0E-2452.0
JUN 37251.22.39.219.03.5E-272.9E-2451.5
CACYBP 271010.91.911.418.95.7E-274.2E-2451.1
DNAJB4 110801.32.47.218.95.8E-274.2E-2451.0
IER5 512781.42.610.518.14.3E-262.8E-2349.1
LMAN2L 815620.81.78.418.14.9E-263.1E-2348.9
BANP 549710.81.810.818.05.9E-263.47E-2348.8

The most significant differentially expressed genes for a panel of LCLs exposed to heat shock (42 °C for 1 h, 6 h recovery) and assayed by microarray are shown following limma analysis

Heat shock response in LCLs. a Volcano plot showing differentially expressed genes following heat shock (42 °C for 1 h with 6 h recovery) in LCLs. Probes with an adjusted p value below 0.01 and a log FC of at least 0.5 are shown as yellow and red dots. Probes showing particularly strong evidence of changes in gene expression through a combination of p value and FC are labelled with the corresponding gene symbol. b Heatmap comparing gene expression for differentially expressed genes between basal and stimulated samples. Samples were clustered by gene with heat shocked (red) and basal (blue) samples forming two distinct groups. Expression estimates for each gene were scaled and centred across samples. Blue cells correspond to lower than average expression and red cells correspond to higher than average expression Top 20 differentially expressed genes following heat shock The most significant differentially expressed genes for a panel of LCLs exposed to heat shock (42 °C for 1 h, 6 h recovery) and assayed by microarray are shown following limma analysis To further investigate the patterns of transcriptional response, we carried out a GO enrichment analysis for differentially expressed genes (>1.2 FC, FDR <0.01). This demonstrated significant enrichment among upregulated genes (seven categories with an FDR <0.05 on Fisher’s exact test) but no significant enrichment for downregulated genes (Table 2, Additional file 3: Tables S2 and S3). Considering the top categories, we found that genes upregulated following heat shock were predominantly related to the response to heat (including GO:0009408) and to unfolded protein (GO:0006986), together with negative regulation of inclusion body assembly (GO:0090084), endoplasmic reticulum stress (GO:1903573) and cell death (GO:0060548).
Table 2

GO categories enriched for upregulated and downregulated genes

GO IDTermAnnotated genesSignificantExpectedRank in downregulated genes p value (FDR) for upregulated genes p value (FDR) for downregulated genes
(A) Top GO categories enriched for upregulated genes
 GO:0009408Response to heat70131.5124835.7 × 10−8 (2.3 × 10−4)0.87 (1)
 GO:0006986Response to unfolded protein97152.0927577.1 × 10−8 (2.7 × 10−4)1 (1)
 GO:0006457Protein folding133172.8716001.7 × 10−7 (6.4 × 10−4)0.52 (1)
 GO:0035966Response to topologically incorrect protein106152.2923682.4 × 10−7 (9.4 × 10−4)0.81 (1)
 GO:0009266Response to temperature stimulus96132.0722572.5 × 10−6 (9.8 × 10−3)0.76 (1)
 GO:0042026Protein refolding1150.2427586.7 × 10−6 (0.028)1 (1)
 GO:0034605Cellular response to heat5091.0827598.6 × 10−6 (0.035)1 (1)
 GO:0043618Regulation of transcription from RNA polymerase II promoter in response to stress3570.7619074.3 × 10−5 (0.18)0.63 (1)
 GO:1900034Regulation of cellular response to heat2660.5627606.6 × 10−5 (0.27)1 (1)
 GO:0043620Regulation of DNA-templated transcription in response to stress3970.8420089 × 10−5 (0.37)0.67 (1)
(B) Top GO categories enriched for down regulated genes
 GO:0051225Spindle assembly3760.8211 (1)5.4 × 10−4 (1)
 GO:0043207Response to external biotic stimulus342207.5920.63 (1)1.6 × 10−3 (1)
 GO:0051707Response to other organism342207.5930.63 (1)1.6 × 10−3 (1)
 GO:0045931Positive regulation of mitotic cell cycle6471.4241 (1)2.1 × 10−3 (1)
 GO:0007049Cell cycle10374523.0250.69 (1)2.2 × 10−3 (1)
 GO:0007143Female meiotic division1030.2261 (1)2.3 × 10−3 (1)
 GO:0009607Response to biotic stimulus355207.8870.54 (1)2.6 × 10−3 (1)
 GO:0032496Response to lipopolysaccharide128102.8480.49 (1)3.3 × 10−3 (1)
 GO:1903047Mitotic cell cycle process5522712.2690.78 (1)3.7 × 10−3 (1)
 GO:0002237Response to molecule of bacterial origin134102.98100.52 (1)4.6 × 10−3 (1)
 GO:0008219Cell death10224322.69114.3 × 10−3 (1)4.9 × 10−3 (1)

The most significant GO categories for differentially expressed genes following heat shock in LCLs are shown. Numbers of significant and expected genes shown, together with p values (Fisher’s exact test)

GO categories enriched for upregulated and downregulated genes The most significant GO categories for differentially expressed genes following heat shock in LCLs are shown. Numbers of significant and expected genes shown, together with p values (Fisher’s exact test) We then performed pathway analysis of differentially expressed genes. Using IPA we found that the most significantly enriched canonical pathway among upregulated and downregulated genes (>1.2 FC, FDR <0.01) was the unfolded protein response (p value 6.8 × 10−8). We also found that heat shock factor 1 (HSF1) was the most significant upstream regulator (p value 2.5 × 10−13). Further investigation established that 81 % of observed differentially expressed genes were linked to HSF1 directly or through one additional molecule based on shortest path analysis using the Ingenuity Knowledge Base (Additional file 4: Figure S3). In addition to networks involving heat shock protein genes, this analysis highlighted the role of ubiquitination (UBC) and sumoylation (SUMO2, SUMO3) as well as transcription factors (including NFkB, JUN, ATF2, CEBP) and cytokines (IL6 and TNF) in the observed heat shock response at the transcriptional level (Additional file 4: Figure S3). In terms of biological functions, we resolved using IPA that cell death (p value 2.2 × 10−8), cell proliferation (p value 3.6 × 10−8), apoptosis (p value 8.2 × 10−8), cell cycle (p value 2.6 × 10−7) and gene expression (p value 6.6 × 10−7) were most significantly enriched. Upregulated and downregulated genes were found to cluster in a number of highly enriched networks constructed from the Ingenuity Knowledge Base (Additional file 3: Table S4).

Heat shock factor recruitment

Of the 226 significantly upregulated genes following heat shock, 24 genes have been previously directly linked to the heat shock response. We found that there was significant enrichment for genes associated with GO terms that clearly relate to heat shock response with 98 genes annotated with such terms (p value 2.3 × 10−10, Fisher’s exact test) and 21 otherwise linked to the heat shock response as revealed by a text mining strategy (detailed in ‘Methods’). Additionally, 30 genes were annotated with other relevant processes. This leaves 53 genes with no obvious previous association to heat shock. To further establish links between differentially expressed genes and heat shock response, we considered the evidence for binding of HSF1 and HSF2 in the promoter regions of upregulated genes using ChIP-seq data obtained for K562 cells following heat shock [36]. Overall there was significant enrichment of HSF1 (51 genes, p 4.7 × 10−10 on Fisher’s exact test, odds ratio (OR) 3.0), HSF2 (55 genes, p 9.4 × 10−9, OR 2.6) and binding of both HSF1 and HSF2 (46 genes, p 9.1 × 10−15, OR 4.5) among upregulated genes following heat shock. Of the nine upregulated genes following heat shock without an established role where we find evidence of HSF binding on ChIP-seq (Additional file 3: Table S5), four have HSF-binding motifs in the promoter region (Additional file 3: Table S6).

Variation in the global heat shock response

To assess the global difference in gene expression induced by heat shock, we carried out PLS, using the treatment state (basal or following heat shock) as a binary response variable and all gene expression probes that passed QC as explanatory variables (12,416 probes targeting 10,214 genes). PLS has been previously used to identify differentially expressed genes [49] and coordinated expression profiles [50] including global response phenotypes [51]. The supervised PLS approach identifies variance components that differentiate treatment groups. This contrasts with principal component analysis (PCA), which considers overall variance irrespective of any known groupings. The PLS analysis demonstrated that there is a considerable change in overall gene expression in response to heat shock with the first two PLS components together accounting for 96.1 % of the variation observed and providing clear separation of the two treatment groups (Fig. 2).
Fig. 2

Variance in the global heat shock response. a Modelling of the genome-wide transcriptional response to heat-shock (component plot) based on PLS to identify latent structures in the data for cohort of 43 LCLs. The x-axis represents the first PLS component which segregates basal samples (left) and heat shocked samples (right). The y-axis represents the second PLS component which involves variation between cell lines in basal and heat shock response states. Each cell line’s basal and heat shock samples are similarly coloured and paired samples are connected with an arrow, which represents the vector used as quantitative trait in the genetic association test for genetic modulators of the global heat shock response. The average response is indicated by a black arrow. Overall, samples separate clearly by treatment, showing a consistent global effect on gene expression from heat shock. Heat shock stimulated samples show evidence of three distinct clusters (indicated by shaded ovals). b Unsupervised hierarchical cluster analysis with heat shock stimulated samples showing evidence of three distinct clusters (indicated on panel A by shaded ovals). Below the cluster dendrogram is a heatmap showing differential gene expression. Expression estimates for each gene were scaled and centred across samples. Blue cells correspond to lower than average expression and red cells correspond to higher than average expression. c Volcano plot of differential expression results between clusters 1 and 2. Probes with an adjusted p value below 0.01 and a log FC of at least 0.5 are shown as yellow and red dots

Variance in the global heat shock response. a Modelling of the genome-wide transcriptional response to heat-shock (component plot) based on PLS to identify latent structures in the data for cohort of 43 LCLs. The x-axis represents the first PLS component which segregates basal samples (left) and heat shocked samples (right). The y-axis represents the second PLS component which involves variation between cell lines in basal and heat shock response states. Each cell line’s basal and heat shock samples are similarly coloured and paired samples are connected with an arrow, which represents the vector used as quantitative trait in the genetic association test for genetic modulators of the global heat shock response. The average response is indicated by a black arrow. Overall, samples separate clearly by treatment, showing a consistent global effect on gene expression from heat shock. Heat shock stimulated samples show evidence of three distinct clusters (indicated by shaded ovals). b Unsupervised hierarchical cluster analysis with heat shock stimulated samples showing evidence of three distinct clusters (indicated on panel A by shaded ovals). Below the cluster dendrogram is a heatmap showing differential gene expression. Expression estimates for each gene were scaled and centred across samples. Blue cells correspond to lower than average expression and red cells correspond to higher than average expression. c Volcano plot of differential expression results between clusters 1 and 2. Probes with an adjusted p value below 0.01 and a log FC of at least 0.5 are shown as yellow and red dots In addition to the pronounced shared response to heat shock that is largely accounted for by the first component, a further effect related to differences in the individual response is noticeable in the second component. This manifests in a visually striking grouping of samples into three clusters post treatment (Fig. 2). To further characterise the difference between these clusters we carried out a differential expression analysis between the two clusters that differ most with respect to the second PLS component. Using an FDR threshold of 0.01 and requiring a FC of at least 1.2, this identified 1094 differentially expressed probes (Additional file 3: Table S7). Of these 681 are upregulated and 415 are downregulated in cluster 2 compared to cluster 1 (Fig. 2). To further investigate which biological processes underlie the observed differences, we carried out a GO analysis of genes exhibiting significantly increased expression in either cluster. GO categories enriched in the set of genes upregulated in cluster 2 are largely similar to those identified in the analysis of genes that show increased expression in response to heat shock, including response to unfolded protein (GO:0006986) and response to topologically incorrect protein (GO:0035966) (Additional file 3: Table S8). In contrast, genes with higher expression in cluster 1 are enriched for GO annotations relating to DNA replication and cell division including DNA recombination (GO:0006310) and DNA replication (GO:0006260) (Additional file 3: Table S9). To explore to what extent this response is modulated by genetic variation, we used the length and direction of the response vector, i.e. the vector between the basal and stimulated sample for each individual in the space spanned by the first two PLS components, together with the location of the basal sample in the same space, as a multivariate phenotype. This was then tested for association with genotypes for SNPs within a 10-kb window of differentially expressed genes following heat shock or genes encoding predicted upstream regulators of differentially expressed genes identified by IPA analysis. This revealed two significant associations (Fig. 3). The first involved rs10509407 (FDR 0.021), a promoter variant of MINPP1 (encoding endoplasmic reticulum luminal enzyme multiple inositol polyphosphate phosphatase), which was in complete linkage disequilibrium with three further SNPs. The other association we identified involved rs12207548 (FDR 0.064), a regulatory variant located in a CTCF binding site 1.14 kb downstream of CDKN1A. CDKN1A is an important regulator of cell cycle progression. The SNP rs12207548 shows significant variation in allele frequency between human populations (Fig. 3) with an estimated FST of 0.142 (the FST providing a summary of the genetic differentiation between these populations).
Fig. 3

Genotypic association with global heat shock response. a Standardized coefficients and adjusted p values for the top associated SNPs. b, c The distribution of p values after permutation of the global response phenotype is shown for rs10509407 (b) and rs12207548 (c). d, e Global response to heat shock showing individual LCLs by genotype for rs10509407 (d) and rs12207548 (e). Each individual is represented by two points corresponding to basal and stimulated state with arrows connecting paired samples. Genotypes are indicated by colour with blue corresponding to homozygous carriers of the major allele and red indicating the presence of at least one copy of the minor allele. Coloured arrows show the average response for each group. The overall average is indicated in black. f Ancestral Allele Frequencies for rs12207548 from Human Genome Diversity Project in 53 populations. g Circos plot showing trans associations for rs12207548. h Box plots for expression of UBQLN1, HSF1, TNFRSF8, EPHB1, SHC1, ZC3HAV1 and ABCD3 by allele for SNPs as indicated. i Pathway analysis using IPA showing links between trans associated genes for rs12207548 and CDKN1A

Genotypic association with global heat shock response. a Standardized coefficients and adjusted p values for the top associated SNPs. b, c The distribution of p values after permutation of the global response phenotype is shown for rs10509407 (b) and rs12207548 (c). d, e Global response to heat shock showing individual LCLs by genotype for rs10509407 (d) and rs12207548 (e). Each individual is represented by two points corresponding to basal and stimulated state with arrows connecting paired samples. Genotypes are indicated by colour with blue corresponding to homozygous carriers of the major allele and red indicating the presence of at least one copy of the minor allele. Coloured arrows show the average response for each group. The overall average is indicated in black. f Ancestral Allele Frequencies for rs12207548 from Human Genome Diversity Project in 53 populations. g Circos plot showing trans associations for rs12207548. h Box plots for expression of UBQLN1, HSF1, TNFRSF8, EPHB1, SHC1, ZC3HAV1 and ABCD3 by allele for SNPs as indicated. i Pathway analysis using IPA showing links between trans associated genes for rs12207548 and CDKN1A To explore the observed association between heat shock response and genotypes at these two loci, we proceeded to test for association with differential expression (FC) following heat shock for individual genes with the two identified variants. We found evidence that both SNPs show trans association with differential induction of UBQLN1 after heat shock (rs10509407 FDR 0.011, beta 0.232; rs12207548 FDR 0.010, beta –0.238) (Fig. 3). UBQLN1 encodes ubiquilin, which is involved in protein degradation by linking the ubiquitination machinery to the proteasome. We found that rs12207548 was also associated with a trans network involving differential expression of six further genes: HSF1 (FDR 0.00075, beta –0.643); TNFRSF8 (FDR 0.00075, beta –0.477); EPHB1 (FDR 0.00075, beta –0.532); SHC1 (FDR 0.0031, beta –0.456); ZC3HAV1 (FDR 0.0036, beta –0.399) and ABCD3 (FDR 0.010, beta –0.279) (Fig. 3). Network analysis using IPA highlights the relationship of these trans genes, either directly or involving additional molecules, with CDKN1A (Fig. 3).

Discussion

We have generated a comprehensive catalogue of differential gene transcription following heat shock for human LCLs, significantly expanding the number of genes recognised to be upregulated and downregulated by exposure of cells to heat shock [4, 8, 9]. We have shown how this relates to HSF1 and HSF2 recruitment and determined several key nodal molecules in the observed pattern of differential expression using a network approach. This includes a role for ubiquitin C and small ubiquitin-like modifiers SUMO2/3 as well as heat shock proteins, transcription factors (NFkB, CEBP, JUN) and cytokines (TNF, IL6). Given that transcriptomic differences may not be reflected at a protein level [52], complementary proteomic analysis such as used to define stress-independent HSF1 activation in a ligand-mediated cell line model system would be informative [53]. We have investigated variation in the global heat shock response across individual LCLs, defining a multivariate phenotype using PLS which revealed evidence of clustering with relative predominance of differential expression of genes involved in DNA replication and cell division in some individuals. We further investigated specific genotypic associations with the observed variation which revealed associations with putative regulatory variants, tagged by rs10509407 and rs12207548 located in/near the genes MINPP1 and CDKN1A, key genes involved in cell growth and survival. These SNPs show trans association with differential expression following heat shock of UBQLN1 (ubiquilin), an important mediator of degradation of proteins in the stress response [54] implicated in Alzheimer’s disease [55], and a network of six further genes including HSF1. However, we did not observe cis-associations with expression of MINPP1 and CDKN1A which leaves unresolved the cis-drivers of the observed trans associations. This may require additional time points of sampling to capture such cis-effects, as illustrated by our recent studies of trans-eQTL following endotoxin induction [16]. Our results are necessarily exploratory given the modest sample size of this study requiring further validation and functional characterisation to establish mechanism. If functionally validated, the geographic distribution of the major and minor alleles of rs12207548 suggests selection may be operating on such variants. We recognise that there may be cell type-specific differences in heat shock response not captured by our analysis in LCLs, including differences in HSF binding from the K562 cell line, and that there may also be population specific differences in terms of regulatory variants with the data presented here generated in cells from individuals of African ancestry. We elected to follow a focused high-level approach in this paper as we are not adequately powered for a systematic QTL analysis of all individual genes. Our approach to analysing the global transcriptional response to stimuli or treatment as a multivariate phenotype provides a single global phenotype for analysis, rather than several thousands of gene-level phenotypes, which is more robust to probe-level technical artefacts and reduces the number of multiple comparisons as well as computational cost of eQTL analysis, especially for omics-scale data. We suggest it is broadly applicable and relevant to other phenotypes in which modulation by genetic variation may be sought. These are highlighted by recent work that has demonstrated the context-specificity of regulatory variants including different disease contexts through QTL approaches in patient samples [15]. For the inflammatory response, these can be complemented by analysis ex vivo of specific phenotypes such as heat shock.

Conclusions

We have defined the global transcriptional response to heat shock for a panel of human B lymphocyte cell lines, establishing a comprehensive catalogue of differentially expressed genes, pathways and networks of broad utility to understand this highly conserved and pathophysiologically significant response. We have also explored the genetic basis for inter-individual variation in the global response, highlighting putative regulatory variants modulating ubiquilin and a further trans gene network.
  48 in total

1.  Diverse and specific gene expression responses to stresses in cultured human cells.

Authors:  John Isaac Murray; Michael L Whitfield; Nathan D Trinklein; Richard M Myers; Patrick O Brown; David Botstein
Journal:  Mol Biol Cell       Date:  2004-03-05       Impact factor: 4.138

Review 2.  The heat shock response: life on the verge of death.

Authors:  Klaus Richter; Martin Haslbeck; Johannes Buchner
Journal:  Mol Cell       Date:  2010-10-22       Impact factor: 17.970

Review 3.  Targeting heat shock proteins in cancer.

Authors:  Gaëtan Jego; Adonis Hazoumé; Renaud Seigneuric; Carmen Garrido
Journal:  Cancer Lett       Date:  2010-11-13       Impact factor: 8.679

4.  A genome-wide association study of global gene expression.

Authors:  Anna L Dixon; Liming Liang; Miriam F Moffatt; Wei Chen; Simon Heath; Kenny C C Wong; Jenny Taylor; Edward Burnett; Ivo Gut; Martin Farrall; G Mark Lathrop; Gonçalo R Abecasis; William O C Cookson
Journal:  Nat Genet       Date:  2007-09-16       Impact factor: 38.330

5.  Combined expression patterns of QTL-linked candidate genes best predict thermotolerance in Drosophila melanogaster.

Authors:  Fabian M Norry; Peter F Larsen; Yongjie Liu; Volker Loeschcke
Journal:  J Insect Physiol       Date:  2009-08-12       Impact factor: 2.354

6.  Identification of genes associated with heat tolerance in Arctic charr exposed to acute thermal stress.

Authors:  Nicole L Quinn; Colin R McGowan; Glenn A Cooper; Ben F Koop; William S Davidson
Journal:  Physiol Genomics       Date:  2011-04-05       Impact factor: 3.107

7.  Family-based association between Alzheimer's disease and variants in UBQLN1.

Authors:  Lars Bertram; Mikko Hiltunen; Michele Parkinson; Martin Ingelsson; Christoph Lange; Karunya Ramasamy; Kristina Mullin; Rashmi Menon; Andrew J Sampson; Monica Y Hsiao; Kathryn J Elliott; Gonül Velicelebi; Thomas Moscarillo; Bradley T Hyman; Steven L Wagner; K David Becker; Deborah Blacker; Rudolph E Tanzi
Journal:  N Engl J Med       Date:  2005-03-03       Impact factor: 91.245

8.  Heat shock response of Archaeoglobus fulgidus.

Authors:  Lars Rohlin; Jonathan D Trent; Kirsty Salmon; Unmi Kim; Robert P Gunsalus; James C Liao
Journal:  J Bacteriol       Date:  2005-09       Impact factor: 3.490

9.  The functional consequences of variation in transcription factor binding.

Authors:  Darren A Cusanovich; Bryan Pavlovic; Jonathan K Pritchard; Yoav Gilad
Journal:  PLoS Genet       Date:  2014-03-06       Impact factor: 5.917

10.  MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS.

Authors:  Paul F O'Reilly; Clive J Hoggart; Yotsawat Pomyen; Federico C F Calboli; Paul Elliott; Marjo-Riitta Jarvelin; Lachlan J M Coin
Journal:  PLoS One       Date:  2012-05-02       Impact factor: 3.240

View more
  2 in total

1.  Transcriptional response to stress is pre-wired by promoter and enhancer architecture.

Authors:  Anniina Vihervaara; Dig Bijay Mahat; Michael J Guertin; Tinyi Chu; Charles G Danko; John T Lis; Lea Sistonen
Journal:  Nat Commun       Date:  2017-08-15       Impact factor: 14.919

2.  rDNA Clusters Make Contact with Genes that Are Involved in Differentiation and Cancer and Change Contacts after Heat Shock Treatment.

Authors:  Nickolai A Tchurikov; Daria M Fedoseeva; Elena S Klushevskaya; Ivan Y Slovohotov; Vladimir R Chechetkin; Yuri V Kravatsky; Olga V Kretova
Journal:  Cells       Date:  2019-11-05       Impact factor: 6.600

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.