
A Comprehensive Profile of ChIP-Seq-Based STAT1 Target Genes Suggests the Complexity of STAT1-Mediated Gene Regulatory Mechanisms.

Jun-Ichi Satoh, Hiroko Tabunoki.

Abstract

Interferon-gamma (IFNγ) plays a key role in macrophage activation, T helper and regulatory cell differentiation, defense against intracellular pathogens, tissue remodeling, and tumor surveillance. The diverse biological functions of IFNγ are mediated by direct activation of signal transducer and activator of transcription 1 (STAT1) as well as numerous downstream effector genes. Because perturbation of STAT1 target gene networks is closely associated with the development of autoimmune diseases and cancers, it is important to characterize the global picture of these networks. Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) provides a highly efficient method for genome-wide profiling of DNA-binding proteins. We analyzed the STAT1 ChIP-Seq dataset of IFNγ-stimulated HeLa S3 cells derived from the ENCODE project, along with transcriptome analysis on microarray. We identified 1,441 stringent ChIP-Seq peaks of protein-coding genes. They were located in the promoter (21.5%) and more often in intronic regions (72.2%), where IFNγ-activated site (GAS) elements were present. Among the 1,441 STAT1 target genes, 212 genes are known IFN-regulated genes (IRGs) and 194 genes (13.5%) are actually upregulated in response to IFNγ by transcriptome analysis. The panel of upregulated genes constituted IFN-signaling molecular networks pivotal for host defense against infections, where interferon-regulatory factor (IRF) and STAT transcription factors serve as hubs on which biologically important molecular connections concentrate. The genes with the peak location in intronic regions showed significantly lower expression levels in response to IFNγ. These results indicate that the binding of STAT1 to GAS is not sufficient to fully activate target genes, suggesting the high complexity of STAT1-mediated gene regulatory mechanisms.

Keywords:  ChIP-seq; GenomeJack; STAT1; binding sites; interferon-gamma

Year:  2013        PMID: 23645984      PMCID: PMC3623615          DOI: 10.4137/GRSB.S11433

Source DB:  PubMed          Journal:  Gene Regul Syst Bio        ISSN: 1177-6250


Introduction

Interferons (IFNs) constitute a group of cytokines with antiviral, antiproliferative, and immunomodulatory effects on diverse cell types.1 The IFN family proteins are classified into two major groups: type I IFNs, composed of various IFNα subtypes, IFNβ, IFNδ, IFNɛ, IFNκ, IFNτ, and IFNω, and type II IFN, composed solely of IFNγ. Type I IFNs interact with the IFNα/β receptor (IFNAR), composed of the IFNAR1 and IFNAR2 subunits associated with tyrosine kinase 2 (TYK2) and Janus kinase 1 (JAK1), while IFNγ binds to the IFNγ receptor (IFNGR), composed of the IFNGR1 and IFNGR2 subunits associated with JAK1 and JAK2. The ligand-dependent dimerization of the receptor subunits rapidly activates the associated JAKs by autophosphorylation, which provides docking sites for signal transducer and activator of transcription (STAT) proteins. Type I IFNs phosphorylate the C-terminal tyrosine residues Y701 in STAT1 and Y690 in STAT2 via TYK2 and JAK1, leading to the formation of the IFN-stimulated gene factor 3 (ISGF3) complex, composed of STAT1, STAT2, and interferon regulatory factor 9 (IRF9). After nuclear translocation, ISGF3 binds to IFN-stimulated response elements (ISREs) on target genes. Type II IFN, along with type I IFNs, induces the formation and nuclear translocation of a STAT1-STAT1 homodimer that binds to IFNγ-activated site (GAS) elements on target genes.
Thus, IFNs induce the expression of hundreds of IFN-regulated genes (IRGs) via the JAK-STAT pathway.2 Some IRGs are regulated by both types of IFNs, whereas others are selectively induced by distinct IFNs through drastic changes in genomic binding locations in a manner dependent on the combinatorial involvement of STAT1 and STAT2.3 IFNγ plays a key role in a wide range of immune responses, such as macrophage activation, T helper and regulatory cell differentiation, defense against intracellular pathogens, tissue remodeling, and tumor surveillance.4 The diverse biological functions of IFNγ are mediated by direct activation of STAT1 and downstream effector genes that encode cytokines, chemokines, phagocytic receptors, antiviral proteins, antigen-presenting molecules, and microbicidal molecules. STAT1 knockout mice exhibit severe defects in biological responses to both types of IFNs.5 In the human STAT1 gene, loss-of-function mutations enhance susceptibility to mycobacterial and viral infections, while gain-of-function mutations cause chronic mucocutaneous candidiasis attributable to impaired development and function of Th17 cells.6 An increasing number of genome-wide association studies (GWAS) have shown that common disease-associated variants are enriched in the recognition sequences of transcription factors, and deregulated activation of STAT1, by perturbing the regulatory network shared by core transcription factors, is closely associated with the development of autoimmune diseases and cancers.7 Therefore, it is highly important to characterize the global picture of STAT1 target gene networks. Recently, rapid progress in next-generation sequencing (NGS) technology has revolutionized the field of genome research.
As an NGS application, chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) provides a highly efficient method for genome-wide profiling of DNA-binding proteins, histone modifications, and nucleosomes.8 ChIP-Seq has the advantages of higher resolution, less noise, and greater coverage of the genome, compared with the microarray-based ChIP-Chip method, and serves as an innovative tool for studying comprehensive gene regulatory networks.9 Since NGS analysis produces extremely high-throughput experimental data, it is often difficult to extract meaningful biological implications. Recent advances in systems biology enable us to illustrate the cell-wide map of complex molecular interactions by using the literature-based knowledgebase of molecular pathways.10 The logically arranged molecular networks make up a whole system characterized by robustness, which maintains the proper function of the system in the face of genetic and environmental perturbations. Therefore, the integration of high-dimensional NGS data with underlying molecular networks offers a rational approach to characterizing the network-based molecular mechanisms of gene regulation on a whole-genome scale. To study the global picture of the STAT1 target gene network, we analyzed the STAT1 ChIP-Seq dataset of the Encyclopedia of DNA Elements (ENCODE) project,11 derived from IFNγ-stimulated HeLa S3 cells, along with our original transcriptome study on microarray. Overall, we identified 1,441 stringent ChIP-Seq peaks of protein-coding genes. Surprisingly, only a small set of ChIP-Seq-based STAT1 target genes are actually upregulated in response to IFNγ, suggesting the complexity of STAT1-mediated gene regulatory mechanisms.

Methods

ChIP-seq dataset of STAT1-binding sites

To extract a comprehensive set of STAT1 target genes, we investigated a ChIP-Seq dataset retrieved from the DDBJ Sequence Read Archive (DRA) under accession number SRP000703. We utilized the dataset of the ENCODE project (encodeproject.org/ENCODE) derived from experiments in which HeLa S3 cells were exposed for 30 minutes to 50 ng/mL recombinant human IFNγ (R&D Systems). The cells were processed for ChIP with a rabbit anti-STAT1 alpha p91 antibody (sc-345; Santa Cruz Biotechnology). NGS libraries constructed from ChIP DNA fragments and from input DNA samples were processed for deep sequencing on Genome Analyzer II (Illumina). We evaluated the quality of short reads with the FastQC program (www.bioinformatics.babraham.ac.uk/projects/fastqc). We considered a per base sequence quality score greater than 30 to be sufficient. We mapped the reads on the human genome reference sequence hg19 by using Bowtie 0.12.7 (bowtie-bio.sourceforge.net). Then we detected statistically significant peaks of mapped reads by using the MACS program (liulab.dfci.harvard.edu/MACS) under the highly stringent condition of fold enrichment ≥20 and false discovery rate (FDR) ≤1%, according to the methods described previously.12 Next, we identified genomic locations of MACS peaks by importing the processed data into GenomeJack v1.3, a novel genome viewer for NGS platforms developed by Mitsubishi Space Software (www.mss.co.jp/businessfield/bioinformatics). Based on RefSeq ID, MACS peaks were categorized into the following: the peaks located on protein-coding genes with NM-heading numbers, the peaks located on non-coding genes with NR-heading numbers, and the peaks located in intergenic regions with no relevant neighboring genes.
The genomic locations of the peaks were further classified into the following: the promoter region, defined as the location within 5 kb upstream of the 5′ end of genes, the 5′ untranslated region (5′UTR), the exon, the intron, and the 3′UTR. The locations outside these were defined as intergenic regions. The consensus motif sequences were identified by importing a 400 bp-length sequence surrounding the summit of MACS peaks into the MEME-ChIP program (meme.sdsc.edu/meme/cgi-bin/meme-chip.cgi).13 The information on IFN-regulated genes (IRGs) was extracted from Interferome (www.interferome.org/index.php), the most comprehensive database that collects type I, II and III IRGs manually curated from more than 28 publicly available microarray datasets.14
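
The peak-selection thresholds and the promoter definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Peak` record, its field names, and the strand-aware `in_promoter` helper are hypothetical, and real MACS output would be parsed from its tab-delimited peak file rather than constructed by hand.

```python
from dataclasses import dataclass

FE_MIN = 20.0        # fold enrichment threshold stated in the text
FDR_MAX = 1.0        # FDR threshold in percent, as stated in the text
PROMOTER_BP = 5000   # promoter = within 5 kb upstream of the 5' end


@dataclass
class Peak:
    """Hypothetical minimal record for one MACS peak."""
    chrom: str
    summit: int
    fold_enrichment: float
    fdr_percent: float


def is_stringent(peak: Peak) -> bool:
    """Apply the stringent filter: FE >= 20 and FDR <= 1%."""
    return peak.fold_enrichment >= FE_MIN and peak.fdr_percent <= FDR_MAX


def in_promoter(summit: int, gene_5prime_end: int, strand: str) -> bool:
    """True if the summit lies within 5 kb upstream of the gene's 5' end.

    'Upstream' means lower coordinates on the + strand and higher
    coordinates on the - strand (an assumption of this sketch).
    """
    if strand == "+":
        return gene_5prime_end - PROMOTER_BP <= summit < gene_5prime_end
    return gene_5prime_end < summit <= gene_5prime_end + PROMOTER_BP
```

A peak that passes `is_stringent` would then be assigned to the promoter, 5′UTR, exon, intron, or 3′UTR of its RefSeq gene, or to an intergenic region if none applies.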

Microarray analysis

HeLa cells were maintained in Dulbecco’s Modified Eagle’s medium (DMEM; Invitrogen) supplemented with 10% fetal bovine serum (FBS), 100 U/mL penicillin, and 100 μg/mL streptomycin (feeding medium). They were incubated for 6 hours with or without inclusion of 50 ng/mL human recombinant IFNγ (PeproTech) in the medium. Total cellular RNA was then isolated by using the TRIZOL Plus RNA Purification kit (Invitrogen). The quality of total RNA was evaluated on Agilent 2100 Bioanalyzer (Agilent Technologies). Three hundred ng of total RNA was processed for cRNA synthesis, fragmentation, and terminal labeling with the GeneChip Whole Transcript Sense Target Labeling and Control Reagents (Affymetrix). The labeled cRNA was then processed for hybridization at 45 °C for 17 hours with Human Gene 1.0 ST Array (28,869 genes; Affymetrix). The arrays were washed in the GeneChip Fluidic Station 450 (Affymetrix), and scanned by the GeneChip Scanner 3000 7G (Affymetrix). The raw data were expressed as CEL files and normalized by the robust multiarray average (RMA) method with the Expression Console software (Affymetrix). To investigate possible differences in gene expression profiles among different sources and concentrations of IFNγ on distinct microarray platforms, we also retrieved the transcriptome data of HeLa cells treated for 6 hours with 100 U/mL recombinant human IFNγ (Roche) from Gene Expression Omnibus (GEO) under accession number GSE21760 for comparison. In those experiments, the data analyzed on Human Genome U133 Plus 2.0 Array (38,500 genes; Affymetrix) were normalized by the GCRMA method. We considered the genes exhibiting ≥2-fold change as upregulated and those exhibiting ≤0.5-fold change as downregulated when compared with the signal intensities of untreated cells.
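
The fold-change criteria above amount to a simple three-way classification. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the normalized signal intensities are on a log2 scale, so the fold change is recovered by exponentiating the difference.

```python
UP_THRESHOLD = 2.0     # >= 2-fold change counts as upregulation
DOWN_THRESHOLD = 0.5   # <= 0.5-fold change counts as downregulation


def fold_change_from_log2(treated_log2: float, untreated_log2: float) -> float:
    """Convert log2 intensities (assumed scale) into a linear fold change."""
    return 2.0 ** (treated_log2 - untreated_log2)


def classify_fold_change(fc: float) -> str:
    """Label a gene by the thresholds stated in the text."""
    if fc >= UP_THRESHOLD:
        return "up"
    if fc <= DOWN_THRESHOLD:
        return "down"
    return "unchanged"
```

For example, a gene whose log2 intensity rises from 8.0 to 9.0 has a fold change of exactly 2 and is classified as upregulated.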

Molecular network analysis

To identify biologically relevant molecular networks and pathways, we imported Entrez Gene IDs of STAT1 target genes into the Functional Annotation tool of Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 (david.abcc.ncifcrf.gov).15 DAVID identifies the most relevant pathway constructed by Kyoto Encyclopedia of Genes and Genomes (KEGG), composed of the genes enriched in the given set, with an output of statistical significance evaluated by the modified Fisher’s exact test. KEGG (www.kegg.jp) is a publicly accessible knowledgebase containing manually curated reference pathways that cover a wide range of metabolic, genetic, environmental, and cellular processes as well as human diseases. It is currently composed of 224,601 pathways generated from 436 reference pathways. We also imported Entrez Gene IDs into Ingenuity Pathways Analysis (IPA) (Ingenuity Systems, Redwood City, CA, USA; www.ingenuity.com) and KeyMolnet (Institute of Medicinal Molecular Design, Tokyo, Japan; www.immd.co.jp), both of which are commercial tools for molecular network analysis. IPA is a knowledgebase that contains approximately 2,500,000 biological and chemical interactions and functional annotations with definite scientific evidence. By uploading the list of Gene IDs and expression values, the network-generation algorithm identifies focused genes integrated in a global molecular network. IPA calculates the score P-value that reflects the statistical significance of association between the genes and the networks by Fisher’s exact test. KeyMolnet contains knowledge-based contents on 150,500 relationships among human genes and proteins, small molecules, diseases, pathways, and drugs.16 They are categorized into the core contents collected from selected review articles with the highest reliability or the secondary contents extracted from abstracts of PubMed and the Human Protein Reference Database (HPRD).
By importing the list of Gene IDs and expression values, KeyMolnet automatically provides corresponding molecules as nodes on networks. The neighboring network-search algorithm selected one or more molecules as starting points to generate a network of all kinds of molecular interactions around starting molecules, including direct activation/inactivation, transcriptional activation/repression, and complex formation within the designated number of paths from starting points. The generated network was compared side by side with 484 human canonical pathways of the KeyMolnet library. The algorithm counting the number of overlapping molecular relations between the extracted network and the canonical pathway makes it possible to identify the canonical pathway showing the most significant contribution to the extracted network.
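
The enrichment statistics reported by these tools are based on Fisher's exact test. As a rough illustration of the underlying calculation, the sketch below computes a plain one-sided Fisher's exact P-value as the upper tail of the hypergeometric distribution. The function and parameter names are hypothetical, and the code does not reproduce DAVID's modified test or the proprietary scoring of IPA and KeyMolnet.

```python
from math import comb


def fisher_enrichment_p(hits: int, list_size: int,
                        pathway_size: int, background_size: int) -> float:
    """One-sided P(X >= hits) for drawing `list_size` genes from a
    background of `background_size` genes, `pathway_size` of which
    belong to the pathway of interest (hypergeometric upper tail)."""
    max_hits = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total
```

A smaller P-value means that the overlap between the gene list and the pathway is less likely to arise by chance. The exact integer arithmetic of `math.comb` avoids rounding problems for very small probabilities.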

Results

Identification of 1,441 ChIP-Seq-based STAT1 target genes

We first evaluated the quality of short read NGS data of STAT1-ChIP-treated DNA and input DNA. The quality scores across all bases exceeded 30 on FastQC, indicating that these data are acceptable for downstream analysis (Fig. 1, Panels A and B). After mapping them on hg19, we identified a total of 3,744 stringent ChIP-Seq peaks that meet the criteria of fold enrichment ≥20 and FDR ≤1%. The genomic locations of the peaks were determined by using GenomeJack (Fig. 2, Panels A and B). We omitted the peaks located in non-coding genes (n = 157), those in intergenic regions (n = 1,917), and redundant genes. Finally, we identified 1,441 ChIP-Seq peaks of protein-coding genes. The summits of the peaks were located in the promoter (n = 310; 21.5%), 5′UTR (n = 48; 3.3%), exon (n = 22; 1.5%), intron (n = 1,041; 72.2%), or 3′UTR (n = 20; 1.4%). The comprehensive list of 1,441 genes is shown in Supplementary Table 1. The top 20 significant genes based on fold enrichment are shown in Table 1.
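
The peak counts above can be checked with a short calculation. The snippet below uses only numbers quoted in the text; the variable names are illustrative, and the figure for coding-gene peaks before removal of redundant genes is an inference that assumes the three peak categories are exhaustive.

```python
# Summit locations of the 1,441 peaks on protein-coding genes (from the text)
peak_locations = {
    "promoter": 310,
    "5'UTR": 48,
    "exon": 22,
    "intron": 1041,
    "3'UTR": 20,
}

total_coding = sum(peak_locations.values())
percentages = {
    location: round(100 * count / total_coding, 1)
    for location, count in peak_locations.items()
}

# Peaks left after omitting non-coding (157) and intergenic (1,917) peaks,
# before redundant genes are removed (assumes the categories are exhaustive)
coding_peaks_before_dedup = 3744 - 157 - 1917

print(total_coding, percentages, coding_peaks_before_dedup)
```

The location counts sum to 1,441 and reproduce the percentages given in the text.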
Figure 1

FastQC analysis of ChIP-Seq data. FASTQ format files are derived from short read NGS data of STAT1-ChIP-treated DNA (Panel A) and input DNA (Panel B).

Notes: They were imported into the FastQC program. The per base sequence quality score is shown with the median (red line), the mean (blue line), and the interquartile range (yellow box).

Figure 2

Identification of genomic locations of ChIP-Seq peaks by GenomeJack. By analyzing the ChIP-Seq dataset of STAT1-binding sites, we identified a total of 3,744 stringent peaks showing fold enrichment ≥20 and FDR ≤1%. The genomic locations of the peaks were determined by importing the processed data into GenomeJack. An example of interferon-regulatory factor 1 (IRF1) (yellow line) listed in Table 2 is shown, where a MACS peak in the stat1_sorted.bam Coverage lane is located in the promoter region of IRF1 (Panel A) with a GAS element highlighted by an orange square (Panel B).

Table 1

Top 20 significant genes based on fold enrichment in ChIP-Seq data.

Chromosome | Start | End | FE | FDR (%) | Location | Entrez gene ID | Gene symbol | IRG | Gene 1.0 ST Array FC | U133 Plus 2.0 Array FC | Gene name
chr1 | 159046093 | 159048290 | 349.81 | 0.39 | Promoter | 9447 | AIM2 | Yes | 1.49 | 4.26 | Absent in melanoma 2
chr18 | 42304771 | 42306267 | 218.62 | 0.39 | Intron | 26040 | SETBP1 | - | 1.23 | 0.67 | SET binding protein 1
chr1 | 89738814 | 89742202 | 216 | 0.39 | Promoter | 115362 | GBP5 | Yes | 19.14 | 8.33 | Guanylate binding protein 5
chr14 | 103893373 | 103894934 | 207.63 | 0.39 | Intron | 4140 | MARK3 | - | 0.98 | 1.27 | MAP/microtubule affinity-regulating kinase 3
chr22 | 36653881 | 36655602 | 201.52 | 0.39 | Intron | 8542 | APOL1 | Yes | 5.51 | 2.54 | Apolipoprotein L, 1
chr15 | 101136222 | 101138145 | 200.76 | 0.39 | Intron | 55180 | LINS | - | 1.6 | 1.37 | Lines homolog (Drosophila)
chr4 | 170486989 | 170488616 | 197.65 | 0.39 | Intron | 4750 | NEK1 | - | 0.96 | 1.36 | NIMA (never in mitosis gene a)-related kinase 1
chr14 | 24981772 | 24983259 | 186.08 | 0.39 | Promoter | 1215 | CMA1 | - | 1 | 1.08 | Chymase 1, mast cell
chr1 | 243602656 | 243604716 | 181.84 | 0.39 | Intron | 10806 | SDCCAG8 | - | 1.02 | 1.58 | Serologically defined colon cancer antigen 8
chr11 | 76621502 | 76622964 | 179.28 | 0.39 | Intron | 55331 | ACER3 | - | 1.07 | 1.27 | Alkaline ceramidase 3
chr4 | 113217720 | 113220103 | 178.46 | 0.39 | Promoter | 80216 | ALPK1 | Yes | 2.99 | 2.47 | Alpha-kinase 1
chr7 | 143411541 | 143413217 | 172.08 | 0.39 | Intron | 285966 | FAM115C | - | 1.63 | 1.3 | Family with sequence similarity 115, member C
chr16 | 48264820 | 48266548 | 171.26 | 0.39 | 5′UTR | 85320 | ABCC11 | - | 0.97 | 1.42 | ATP-binding cassette, sub-family C (CFTR/MRP), member 11
chr15 | 57027345 | 57031166 | 170.16 | 0.39 | Promoter | 54816 | ZNF280D | - | 1.15 | 1.24 | Zinc finger protein 280D
chrX | 104941773 | 104943192 | 168.58 | 0.39 | Intron | 26280 | IL1RAPL2 | - | 0.9 | 1.01 | Interleukin 1 receptor accessory protein-like 2
chrX | 11527367 | 11528830 | 160.49 | 0.39 | Intron | 395 | ARHGAP6 | - | 1.22 | 1.22 | Rho GTPase activating protein 6
chr2 | 134083039 | 134085251 | 160 | 0.39 | Intron | 344148 | NCKAP5 | - | 1.38 | 0.34 | Nck-associated protein 5
chr6 | 31949161 | 31950466 | 158.61 | 0.39 | Promoter | 720 | C4A | - | 4.59 | 7.27 | Complement component 4A (Rodgers blood group)
chr11 | 86152542 | 86154846 | 158.29 | 0.39 | Intron | 10873 | ME3 | - | 1.04 | 2.12 | Malic enzyme 3, NADP(+)-dependent, mitochondrial
chr1 | 196407167 | 196408295 | 155.99 | 0.39 | Intron | 343450 | KCNT2 | - | 1.43 | 1.14 | Potassium channel, subfamily T, member 2

Notes: By analyzing the dataset SRP000703, we identified 1,441 stringent peaks of protein-coding genes exhibiting fold enrichment (FE) ≥20 and the false discovery rate (FDR) ≤1%. Top 20 significant genes based on FE are listed with the chromosome, the position (start, end), FE, FDR, the location (promoter, 5′UTR, exon, intron, 3′UTR), Entrez Gene ID, Gene Symbol, IFN-regulated genes (IRGs) on Interferome, transcriptome data presenting with fold change (FC) on Human Gene 1.0 ST Array (our experiments), FC on Human Genome U133 Plus 2.0 Array (GSE21760), and gene name.

Among the 1,441 STAT1 target genes, 212 genes (14.7%) were categorized as IFN-regulated genes (IRGs) on Interferome. By motif analysis with MEME-ChIP, the genes with the top 20 fold enrichment scores contained the GAS element comprising TTCCNGGAA (Fig. 3, Panels A–C), irrespective of the location of the peaks in the promoter or the intron, and even in intergenic regions (Fig. 4, Panels A and B; Fig. 5, Panels A and B). These results validated the specific mapping of ChIP-Seq short reads to the genomic regions of the GAS consensus sequence motif.
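
As an illustration of what the GAS consensus TTCCNGGAA means in practice, the sketch below scans a DNA sequence for exact matches with a regular expression, where N stands for any of the four bases. This is not how MEME-ChIP discovers motifs (it infers them de novo from the peak sequences); the function name and the example sequence are hypothetical.

```python
import re

# GAS consensus from the text: TTCC, any single base, then GGAA
GAS_PATTERN = re.compile(r"TTCC[ACGT]GGAA")


def find_gas_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched sequence) for each exact GAS match."""
    return [
        (match.start(), match.group())
        for match in GAS_PATTERN.finditer(sequence.upper())
    ]


# Hypothetical sequence containing one GAS element
print(find_gas_sites("ACGTTTCCCGGAAGT"))  # [(4, 'TTCCCGGAA')]
```

Because the reverse complement of TTCCNGGAA again has the form TTCCNGGAA, scanning one strand is enough to find sites on both strands.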
Figure 3

Identification of GAS consensus sequences in the promoter, intron, and intergenic regions. The consensus motif sequences were identified by importing a 400 bp-length sequence surrounding the summit of MACS peaks of the genes with top 20 fold enrichment scores into the MEME-ChIP program. The GAS elements located in the promoter (A), intron (B), and intergenic regions (C) are highlighted by a blue square.

Figure 4

Identification of ChIP-Seq peaks in intronic regions. The genomic locations of the ChIP-Seq peaks were determined by importing the processed data into GenomeJack. An example of SET binding protein 1 (SETBP1) (yellow line) listed in Table 1 is shown, where a MACS peak in the stat1_sorted.bam Coverage lane is located in the intronic region of SETBP1 (Panel A) with a GAS element highlighted by an orange square (Panel B).

Figure 5

Identification of ChIP-Seq peaks in intergenic regions. The genomic locations of the ChIP-Seq peaks were determined by importing the processed data into GenomeJack. A MACS peak in the stat1_sorted.bam Coverage lane with fold enrichment of 333 and FDR of 0.39% is located in the intergenic region of chromosome 21 (Panel A) with a GAS element highlighted by an orange square (Panel B).

A small set of STAT1 target genes were transcriptionally activated by IFNγ

In general, the STAT1 homodimer serves as a transcriptional activator of numerous IRGs.1 To determine whether ChIP-Seq-based STAT1 target genes are actually upregulated by IFNγ, we studied the genome-wide gene expression profile of HeLa cells exposed for 6 hours to IFNγ on Human Gene 1.0 ST Array. Among the top 20 upregulated genes based on fold change, 16 genes (80%) were categorized as IRGs on Interferome (Table 2), supporting the validity of the experimental protocol. We also compared our results with publicly available transcriptome data of IFNγ-treated HeLa cells on Human Genome U133 Plus 2.0 Array (GSE21760). Overall, the two distinct microarray datasets showed a trend toward concordant regulation of individual STAT1 target genes (Supplementary Table 1). Therefore, we defined upregulated or downregulated genes as those meeting the criteria in at least one of these studies.
Table 2

Top 20 upregulated genes based on fold change in transcriptome data.

Chromosome | Start | End | FE | FDR (%) | Location | Entrez gene ID | Gene symbol | IRG | Gene 1.0 ST Array FC | U133 Plus 2.0 Array FC | Gene name
chr8 | 39767141 | 39768199 | 38.82 | 0.39 | Promoter | 3620 | IDO1 | Yes | 149.57 | 43.92 | Indoleamine 2,3-dioxygenase 1
chr4 | 76949148 | 76950321 | 27.42 | 0.54 | Promoter | 3627 | CXCL10 | Yes | 117.22 | 1.1 | Chemokine (C-X-C motif) ligand 10
chr5 | 131818750 | 131828691 | 53.98 | 0.39 | Promoter | 3659 | IRF1 | Yes | 19.81 | 21.09 | Interferon regulatory factor 1
chr1 | 89738814 | 89742202 | 216 | 0.39 | Promoter | 115362 | GBP5 | Yes | 19.14 | 8.33 | Guanylate binding protein 5
chr5 | 156649035 | 156650353 | 53.67 | 0.44 | Intron | 3702 | ITK | Yes | 16.37 | 93.96 | IL2-inducible T-cell kinase
chr19 | 10379656 | 10384589 | 66.15 | 0.39 | 5′UTR | 3383 | ICAM1 | Yes | 13.83 | 11.42 | Intercellular adhesion molecule 1
chr22 | 36042373 | 36045743 | 59.61 | 0.39 | 5′UTR | 80830 | APOL6 | - | 10.82 | 35.4 | Apolipoprotein L, 6
chr7 | 134832148 | 134833386 | 67.12 | 0.3 | 5′UTR | 55281 | TMEM140 | Yes | 9.8 | 2.86 | Transmembrane protein 140
chr3 | 122281432 | 122284479 | 82.92 | 0.39 | Intron | 83666 | PARP9 | - | 8.96 | 14.73 | Poly (ADP-ribose) polymerase family, member 9
chr1 | 89594075 | 89595527 | 24.35 | 0.62 | Promoter | 2634 | GBP2 | Yes | 8.2 | 7.18 | Guanylate binding protein 2, interferon-inducible
chr1 | 150736054 | 150738936 | 40.22 | 0.36 | Intron | 1520 | CTSS | Yes | 7.07 | 13.79 | Cathepsin S
chr4 | 76928268 | 76929257 | 29.17 | 0.48 | Promoter | 4283 | CXCL9 | Yes | 6.68 | 1.27 | Chemokine (C-X-C motif) ligand 9
chr11 | 4413853 | 4415591 | 39.26 | 0.39 | 5′UTR | 6737 | TRIM21 | Yes | 6.31 | 5.24 | Tripartite motif-containing 21
chr1 | 89535324 | 89536653 | 40.25 | 0.5 | Promoter | 2633 | GBP1 | Yes | 6.03 | 12.65 | Guanylate binding protein 1, interferon-inducible, 67 kDa
chr17 | 32581480 | 32582732 | 45.48 | 0.47 | Promoter | 6347 | CCL2 | Yes | 5.84 | 0.28 | Chemokine (C-C motif) ligand 2
chr3 | 122281432 | 122284479 | 82.92 | 0.39 | Promoter | 151636 | DTX3L | Yes | 5.77 | 7.12 | Deltex 3-like (Drosophila)
chr6 | 32819471 | 32822798 | 46.79 | 0.39 | Promoter | 5698 | PSMB9 | - | 5.77 | 31.71 | Proteasome (prosome, macropain) subunit, beta type, 9 (large multifunctional peptidase 2)
chr18 | 52612923 | 52614118 | 41.03 | 0.4 | Intron | 80323 | CCDC68 | - | 5.61 | 5.28 | Coiled-coil domain containing 68
chr22 | 36653881 | 36655602 | 201.52 | 0.39 | Intron | 8542 | APOL1 | Yes | 5.51 | 2.54 | Apolipoprotein L, 1
chr9 | 5509442 | 5510324 | 28.26 | 0.47 | Intron | 80380 | PDCD1LG2 | - | 5.28 | 1.67 | Programmed cell death 1 ligand 2

Notes: By analyzing the dataset SRP000703, we identified 1,441 stringent peaks of protein-coding genes exhibiting fold enrichment (FE) ≥ 20 and the false discovery rate (FDR) ≤1%. Top 20 upregulated genes based on fold change (FC) in transcriptome data on Human Gene 1.0 ST array (our experiments) are listed with the position (start, end), FE, FDR, the location (promoter, 5′UTR, exon, intron, 3′UTR), Entrez Gene ID, Gene Symbol, IFN-regulated genes (IRGs) on Interferome, FC on Human Gene 1.0 ST Array, FC on Human Genome U133 Plus 2.0 Array (GSE21760), and Gene name.

Among the 1,441 STAT1 target genes, a set of 194 genes (13.5%) that contained 70 IRGs were upregulated by IFNγ, while 42 genes (2.9%) were downregulated, suggesting that STAT1 binding detected by ChIP-Seq is not always followed by transcriptional activation by IFNγ. Thus, approximately 85% of ChIP-Seq-based STAT1 targets are poorly responsive to IFNγ in terms of expression levels on microarray. Among the 1,441 genes, those with ChIP-Seq peaks in intronic regions showed significantly lower expression levels in response to IFNγ than those with peaks in the promoter or in the 5′UTR, despite the great variation in expression levels (Fig. 6, Panels A and B). These results suggest that the binding of STAT1 to the regions corresponding to intronic ChIP-Seq peaks activates target gene expression less effectively.
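
The proportions quoted above follow directly from the gene counts. The snippet below recomputes them from the numbers in the text; the variable names are illustrative, and the "neither" figure assumes that the upregulated and downregulated sets do not overlap, which the text does not state.

```python
total_targets = 1441   # ChIP-Seq-based STAT1 target genes
upregulated = 194      # >= 2-fold in at least one microarray study
downregulated = 42     # <= 0.5-fold in at least one microarray study


def percent(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


pct_up = percent(upregulated, total_targets)
pct_down = percent(downregulated, total_targets)
pct_not_up = percent(total_targets - upregulated, total_targets)
# Assumes the up and down sets are disjoint (not stated in the text)
pct_neither = percent(total_targets - upregulated - downregulated, total_targets)

print(pct_up, pct_down, pct_not_up, pct_neither)
```

Under these assumptions, the share of targets that are not upregulated (86.5%) and the share that change in neither direction (83.6%) bracket the "approximately 85%" given in the text.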
Figure 6

The expression levels of 1,441 STAT1 target genes with distinct genomic locations of ChIP-Seq peaks. To determine whether ChIP-Seq-based STAT1 target genes are actually upregulated by IFNγ, we studied the gene expression profile of HeLa cells exposed for 6 hours to IFNγ on Human Gene 1.0 ST Array (Panel A), compared with publicly available transcriptome data GSE21760 of HeLa cells exposed for 6 hours to IFNγ on Human Genome U133 Plus 2.0 Array (Panel B). The location of ChIP-Seq peaks on 1,441 STAT1 target genes was classified into the promoter, 5′UTR, exon, intron, and 3′UTR. The fold change in expression levels is shown with the average, standard deviation, and statistical significance evaluated by one-way analysis of variance (ANOVA) followed by post-hoc Tukey’s test.

Molecular networks of ChIP-Seq-based STAT1 target genes

Finally, we studied the molecular network of the set of 194 upregulated genes by using bioinformatics tools for pathway analysis. By using DAVID, we identified functionally associated gene ontology (GO) terms (Table 3). They include “immune response” (GO:0006955; P = 1.09E-07), “positive regulation of immune system process” (GO:0002684; P = 7.54E-07), “response to wounding” (GO:0009611; P = 3.64E-06), and “response to virus” (GO:0009615; P = 4.06E-05), all of which represent key biological functions of IFNγ. The genes showed the closest association with the chemokine signaling pathway (hsa04062; P = 0.0059, FDR = 6.29) on KEGG.
Table 3

Top 10 gene ontology terms associated with 194 upregulated STAT1 target genes.

Rank | GO terms | Focused genes | P-value | FDR
1 | GO:0006955~immune response | AIM2, APOL1, C1S, C3, C4A, CCL2, CIITA, CTSS, CXCL10, CXCL9, GBP1, GBP2, GBP5, GCH1, HLA-E, ICAM1, IFI35, IL7, IL4R, LYN, ORAI1, PDCD1LG2, PSMB8, PSMB9, RNF19B, TAP1, TAP2 | 1.09E-07 | 0.0002
2 | GO:0002684~positive regulation of immune system process | BCL6, C1S, C3, C4A, F2RL1, FYN, ICAM1, IDO1, IL4R, IL7, LYN, PDCD1LG2, PVR, TAP2, TGFB2 | 7.54E-07 | 0.0013
3 | GO:0009611~response to wounding | A2M, APOL3, C1S, C3, C4A, CCL2, CIITA, CXCL10, CXCL9, F2RL1, IDO1, IRF7, KLF6, LYN, NMI, PLSCR1, PLSCR4, SCARB1, SLC1A3, SOD2, TGFB2 | 3.64E-06 | 0.0061
4 | GO:0009615~response to virus | IFI16, IFI35, IRF7, IRF9, MX1, PLSCR1, STAT1, STAT2, ZC3HAV1 | 4.06E-05 | 0.0683
5 | GO:0048584~positive regulation of response to stimulus | C1S, C3, C4A, F2RL1, FYN, IDO1, IRF7, LYN, PVR, TAP2, TGFB2, TGM2 | 1.02E-04 | 0.1717
6 | GO:0000267~cell fraction | ABCC4, ANK3, BCL2L11, CALD1, CASP7, CYP1B1, DMD, DTNA, GCH1, IDO1, LYN, MCTP1, NRP2, PML, PSD3, RDH10, SCARB1, SH3KBP1, SLC16A1, SLC1A3, SLC7A2, SOD2, TAP1, TAP2, TRIM27, WARS | 1.18E-04 | 0.1503
7 | GO:0051272~positive regulation of cell motion | BCL6, CSF1, CXCL10, F2RL1, ICAM1, LYN, SCARB1 | 1.47E-04 | 0.2479
8 | GO:0048534~hemopoietic or lymphoid organ development | BAK1, BCL2L11, BCL6, CSF1, IFI16, IL7, IRF1, KLF6, LYN, PML, SOD2, TGFB2 | 2.38E-04 | 0.4005
9 | GO:0050778~positive regulation of immune response | C1S, C3, C4A, FYN, IDO1, LYN, PVR, TAP2, TGFB2 | 2.97E-04 | 0.4979
10 | GO:0006952~defense response | A2M, APOL1, APOL3, C1S, C3, C4A, CCL2, CIITA, CXCL10, CXCL9, GCH1, IDO1, IRF7, ITK, LYN, MX1, NMI, TAP1, TAP2 | 3.02E-04 | 0.5075

Notes: Gene ontology (GO) terms were studied by importing Entrez Gene IDs of 194 upregulated STAT1 target genes into DAVID. They are listed with GO terms, focused genes, P-value of the modified Fisher’s exact test, and false discovery rate (FDR).

By using the core analysis tool of IPA, we identified “interferon signaling” (P = 9.99E-11) and “antigen presentation pathway” (P = 2.80E-06) as the most significant canonical pathways associated with the set of genes. Furthermore, the functional networks of IPA defined by “Infectious Disease, Dermatological Diseases and Conditions, Organismal Development” (P = 1.00E-36) and “Infectious Disease, Respiratory Disease, Gastrointestinal Disease” (P = 1.00E-34) served as the networks with the most significant relationship (Supplementary Table 2), supporting a key role of STAT1 target genes in host defense against infections. Next, with respect to the conventional location of transcription factor-binding sites, we extracted a set of 69 STAT1 target genes with peaks located either in the promoter or the 5′UTR and upregulated at ≥2-fold in at least one of the microarray studies described above. They constituted the functional network defined by “Infectious Disease, Antimicrobial Response, Inflammatory Response” (P = 1.00E-47), verifying a key role of the core STAT1 target genes in immune response to infections. By using KeyMolnet, the neighboring network-search algorithm operating on the core contents extracted a highly complex molecular network composed of 1,077 molecules and 1,298 molecular relations. These exhibited the most significant relationships with the canonical pathways termed “transcriptional regulation by estrogen-related receptor (ERR)” (P = 1.99E-132), “transcriptional regulation by interferon-regulatory factor (IRF)” (P = 3.08E-130), “transglutaminase 2 (TG2) signaling pathway” (P = 2.03E-100), “complement pathway” (P = 1.58E-069), and “transcriptional regulation by STAT” (P = 4.08E-069), validating a key role of IRF and STAT transcription factors in the molecular network of 194 IFNγ-upregulated STAT1 target genes (Fig. 7, blue circle).
When the set of 69 upregulated STAT1 target genes with location of the peaks in the promoter or the 5′UTR were imported into KeyMolnet, it extracted the complex network composed of 337 molecules and 439 molecular relations. The network again showed the most significant relationship with the canonical pathways termed “transcriptional regulation by IRF” (P = 4.46E-174) and “transcriptional regulation by STAT” (P = 2.37E-094).
Figure 7

Molecular networks of ChIP-Seq-based STAT1 target genes.

Notes: Entrez Gene IDs of 194 upregulated STAT1 target genes were imported into KeyMolnet. The neighboring network-search algorithm extracted a highly complex molecular network composed of 1,077 molecules and 1,298 molecular relations. The cluster of IRF and STAT transcription factors is highlighted by a blue circle. Red nodes represent STAT1 target genes, whereas white nodes represent additional nodes extracted automatically from the core contents of KeyMolnet to establish molecular connections. The molecular relation is indicated by a solid line with arrow (direct binding or activation), solid line with arrow and stop (direct inactivation), solid line without arrow (complex formation), dashed line with arrow (transcriptional activation), and dashed line with arrow and stop (transcriptional repression).

Discussion

To study the global picture of the STAT1 target gene network, we identified 1,441 stringent STAT1 ChIP-Seq peaks of protein-coding genes from the dataset SRP000703. They were located in the promoter (21.5%) and more often in intronic regions (72.2%), where IFNγ-activated site (GAS) elements were present. Among the 1,441 ChIP-Seq-based STAT1 target genes, 212 genes (14.7%) are known IRGs in Interferome, and only 194 genes (13.5%) are actually upregulated in response to IFNγ by transcriptome analysis. The panel of upregulated genes constituted IFN-signaling molecular networks pivotal for host defense against infections, where IRF and STAT transcription factors serve as a hub on which the biologically important molecular connections concentrate. The genes with the peak location in intronic regions showed significantly lower expression levels in response to IFNγ, compared to those with the peak location in the promoter or in the 5′UTR. These results indicate that the binding of the STAT1 homodimer to GAS is not sufficient to fully activate target genes, suggesting the complexity of regulatory mechanisms involving STAT1-mediated gene activation. This view is supported by the most recent study of the ENCODE project, performed on genomic binding sites of 119 transcription-related factors in over 450 experiments, which reveals that human transcription factors often show different co-association patterns in proximal and distal binding sites, and that the binding of one transcription factor affects the preferred binding partners of others.9

The STAT family of transcription factors is composed of seven highly conserved members. Their common structure is divided into seven structural domains: the amino terminal domain, the coiled-coil domain, the DNA binding domain that mediates direct binding to GAS elements, the linker domain, the SH2 domain that mediates specific recruitment to receptor subunits and the formation of active STAT dimers, the tyrosine activation motif, and the transcriptional activation domain (TAD) with conserved serine phosphorylation sites in the carboxyl terminus.17 STAT1 and STAT3 are affected by alternative splicing to produce α and β species, which differ at their C-terminal segments.

Increasing evidence shows that efficient transcriptional activation of STAT1 target genes requires posttranslational modification of STAT1 and the recruitment of coactivators and histone and chromatin modifying complexes.1,4,17 Notably, nuclear translocation of STAT1 triggered by Y701 phosphorylation is pivotal for stable association with chromatin during IFNγ-driven transcriptional activation.18 Phosphorylated STAT1 in the nucleus directly interacts with the CREB-binding protein (CBP)/p300 family of transcriptional coactivators.19 STAT1β, which lacks the TAD and is incapable of recruiting p300 to chromatin sites, is defective in transcriptional activation from a chromatin template.20 Acetylation of STAT1 lysine residues 410 and 413 mediated by CBP in the nucleus plays a negative role in signaling via mechanisms involving enhanced interaction with T-cell protein tyrosine phosphatase (TCP45; PTPN2) and increased dephosphorylation of STAT1, while histone deacetylase 3 (HDAC3) catalyzes STAT1 deacetylation.21 BRG1 (SMARCA4), an ATP-dependent helicase of the SWI/SNF chromatin remodeling complex, plays a pivotal role in IFNγ-induced expression of CIITA, the master regulator of the major histocompatibility complex (MHC) class II.22 Both type I and type II IFNs induce phosphorylation of the C-terminal serine residue S727 located in the STAT1 TAD, which promotes recruitment of minichromosome maintenance deficient 5 (MCM5).23 STAT1 S727 phosphorylation is not required for nuclear translocation of STAT1 or for its DNA binding capacity, but is indispensable for maximum transcriptional activation of target genes and thus for an optimum IFNγ-dependent immune response.24 Intriguingly, recent evidence indicated that a substantial part of STAT1 is present in the nuclei independently of tyrosine phosphorylation in a cell type-specific manner.25 Unphosphorylated STAT1 (U-STAT1) prolongs and increases the expression of a subset of genes induced initially by phosphorylated STAT1, suggesting that persistent transcriptional activation of target genes via DNA binding of STAT1 is not strictly dependent on the phosphorylation status of STAT1.
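
The proportions quoted in this Discussion follow directly from the gene counts reported in the text. As a minimal sketch for rechecking that arithmetic, the Python snippet below recomputes them; the variable and function names are illustrative assumptions and are not part of the authors' analysis pipeline.

```python
# Counts taken from the Discussion text; the names are hypothetical labels.
TOTAL_TARGETS = 1441   # ChIP-Seq-based STAT1 target genes
KNOWN_IRGS = 212       # targets listed as IFN-regulated genes in Interferome
UPREGULATED = 194      # targets upregulated by IFN-gamma on microarray


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


known_irg_pct = percent(KNOWN_IRGS, TOTAL_TARGETS)
upregulated_pct = percent(UPREGULATED, TOTAL_TARGETS)
# Targets bound by STAT1 but not detected as upregulated.
not_upregulated = TOTAL_TARGETS - UPREGULATED
not_upregulated_pct = percent(not_upregulated, TOTAL_TARGETS)

print(f"Known IRGs:      {KNOWN_IRGS} / {TOTAL_TARGETS} = {known_irg_pct}%")
print(f"Upregulated:     {UPREGULATED} / {TOTAL_TARGETS} = {upregulated_pct}%")
print(f"Not upregulated: {not_upregulated} / {TOTAL_TARGETS} = {not_upregulated_pct}%")
```

The last line states the complementary figure: by these counts, 1,247 of the 1,441 STAT1-bound genes were not detected as upregulated, which is the observation behind the conclusion that GAS binding alone is not sufficient for full activation.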

Conclusions

We identified 1,441 stringent ChIP-Seq peaks of protein-coding genes. Among them, only a small subset of 194 genes is actually upregulated in response to IFNγ. These results indicate that the binding of STAT1 to GAS is not sufficient to fully activate target genes, suggesting the complexity of STAT1-mediated gene regulatory mechanisms.

Supplementary Table 1. The list of 1,441 ChIP-Seq-based STAT1 target genes.

Supplementary Table 2. Top 10 significant functional networks of IPA associated with 194 upregulated STAT1 target genes.
  24 in total

1.  Distinct transcriptional activation functions of STAT1alpha and STAT1beta on DNA and chromatin templates.

Authors:  Natalia Zakharova; Elena S Lymar; Edward Yang; Sohail Malik; J Jillian Zhang; Robert G Roeder; James E Darnell
Journal:  J Biol Chem       Date:  2003-08-25       Impact factor: 5.157

Review 2.  Mechanisms of type-I- and type-II-interferon-mediated signalling.

Authors:  Leonidas C Platanias
Journal:  Nat Rev Immunol       Date:  2005-05       Impact factor: 53.106

3.  Identification of genes differentially regulated by interferon alpha, beta, or gamma using oligonucleotide arrays.

Authors:  S D Der; A Zhou; B R Williams; R H Silverman
Journal:  Proc Natl Acad Sci U S A       Date:  1998-12-22       Impact factor: 11.205

4.  A phosphorylation-acetylation switch regulates STAT1 signaling.

Authors:  Oliver H Krämer; Shirley K Knauer; Georg Greiner; Enrico Jandt; Sigrid Reichardt; Karl-Heinz Gührs; Roland H Stauber; Frank D Böhmer; Thorsten Heinzel
Journal:  Genes Dev       Date:  2009-01-15       Impact factor: 11.361

Review 5.  Cross-regulation of signaling pathways by interferon-gamma: implications for immune responses and autoimmune diseases.

Authors:  Xiaoyu Hu; Lionel B Ivashkiv
Journal:  Immunity       Date:  2009-10-16       Impact factor: 31.745

6.  Recruitment of Stat1 to chromatin is required for interferon-induced serine phosphorylation of Stat1 transactivation domain.

Authors:  Iwona Sadzak; Melanie Schiff; Irene Gattermeier; Reingard Glinitzer; Ines Sauer; Armin Saalmüller; Edward Yang; Barbara Schaljo; Pavel Kovarik
Journal:  Proc Natl Acad Sci U S A       Date:  2008-06-23       Impact factor: 11.205

7.  Architecture of the human regulatory network derived from ENCODE data.

Authors:  Mark B Gerstein; Anshul Kundaje; Manoj Hariharan; Stephen G Landt; Koon-Kiu Yan; Chao Cheng; Xinmeng Jasmine Mu; Ekta Khurana; Joel Rozowsky; Roger Alexander; Renqiang Min; Pedro Alves; Alexej Abyzov; Nick Addleman; Nitin Bhardwaj; Alan P Boyle; Philip Cayting; Alexandra Charos; David Z Chen; Yong Cheng; Declan Clarke; Catharine Eastman; Ghia Euskirchen; Seth Frietze; Yao Fu; Jason Gertz; Fabian Grubert; Arif Harmanci; Preti Jain; Maya Kasowski; Phil Lacroute; Jing Jane Leng; Jin Lian; Hannah Monahan; Henriette O'Geen; Zhengqing Ouyang; E Christopher Partridge; Dorrelyn Patacsil; Florencia Pauli; Debasish Raha; Lucia Ramirez; Timothy E Reddy; Brian Reed; Minyi Shi; Teri Slifer; Jing Wang; Linfeng Wu; Xinqiong Yang; Kevin Y Yip; Gili Zilberman-Schapira; Serafim Batzoglou; Arend Sidow; Peggy J Farnham; Richard M Myers; Sherman M Weissman; Michael Snyder
Journal:  Nature       Date:  2012-09-06       Impact factor: 49.962

8.  MEME-ChIP: motif analysis of large DNA datasets.

Authors:  Philip Machanick; Timothy L Bailey
Journal:  Bioinformatics       Date:  2011-04-12       Impact factor: 6.937

9.  Comprehensive analysis of human microRNA target networks.

Authors:  Jun-Ichi Satoh; Hiroko Tabunoki
Journal:  BioData Min       Date:  2011-06-17       Impact factor: 2.522

10.  INTERFEROME: the database of interferon regulated genes.

Authors:  Shamith A Samarajiwa; Sam Forster; Katie Auchettl; Paul J Hertzog
Journal:  Nucleic Acids Res       Date:  2008-11-07       Impact factor: 16.971
