Jun-Ichi Satoh, Hiroko Tabunoki.
Abstract
Interferon-gamma (IFNγ) plays a key role in macrophage activation, T helper and regulatory cell differentiation, defense against intracellular pathogens, tissue remodeling, and tumor surveillance. The diverse biological functions of IFNγ are mediated by direct activation of signal transducer and activator of transcription 1 (STAT1) as well as numerous downstream effector genes. Because a perturbation in STAT1 target gene networks is closely associated with the development of autoimmune diseases and cancers, it is important to characterize the global picture of these networks. Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) provides a highly efficient method for genome-wide profiling of DNA-binding proteins. We analyzed the STAT1 ChIP-Seq dataset of IFNγ-stimulated HeLa S3 cells derived from the ENCODE project, along with transcriptome analysis on microarray. We identified 1,441 stringent ChIP-Seq peaks of protein-coding genes. They were located in the promoter (21.5%) and more often in intronic regions (72.2%), where IFNγ-activated site (GAS) elements were present. Among the 1,441 STAT1 target genes, 212 genes are known IFN-regulated genes (IRGs) and 194 genes (13.5%) were upregulated in response to IFNγ according to transcriptome analysis. The panel of upregulated genes constituted IFN-signaling molecular networks pivotal for host defense against infections, where interferon-regulatory factor (IRF) and STAT transcription factors serve as a hub on which biologically important molecular connections concentrate. The genes with the peak located in intronic regions showed significantly lower expression levels in response to IFNγ. These results indicate that the binding of STAT1 to GAS is not sufficient to fully activate target genes, suggesting the high complexity of STAT1-mediated gene regulatory mechanisms.
Keywords: ChIP-seq; GenomeJack; STAT1; binding sites; interferon-gamma
Year: 2013 PMID: 23645984 PMCID: PMC3623615 DOI: 10.4137/GRSB.S11433
Source DB: PubMed Journal: Gene Regul Syst Bio ISSN: 1177-6250
Figure 1. FastQC analysis of ChIP-Seq data. FASTQ format files are derived from short read NGS data of STAT1-ChIP-treated DNA (Panel A) and input DNA (Panel B).
Notes: They were imported into the FastQC program. The per base sequence quality score is shown with the median (red line), the mean (blue line), and the interquartile range (yellow box).
Figure 2. Identification of genomic locations of ChIP-Seq peaks by GenomeJack. By analyzing the ChIP-Seq dataset of STAT1-binding sites, we identified a total of 3,744 stringent peaks showing fold enrichment ≥20 and FDR ≤1%. The genomic locations of the peaks were determined by importing the processed data into GenomeJack. An example of interferon-regulatory factor 1 (IRF1) (yellow line) listed in Table 2 is shown, where a MACS peak in the stat1_sorted.bam Coverage lane is located in the promoter region of IRF1 (Panel A) with a GAS element highlighted by an orange square (Panel B).
Table 1. Top 20 significant genes based on fold enrichment in ChIP-Seq data.
| Chromosome | Start | End | FE | FDR (%) | Location | Entrez gene ID | Gene symbol | IRG | Gene 1.0 ST Array FC | U133 Plus 2.0 Array FC | Gene name |
|---|---|---|---|---|---|---|---|---|---|---|---|
| chr1 | 159046093 | 159048290 | 349.81 | 0.39 | Promoter | 9447 | AIM2 | Yes | 1.49 | 4.26 | Absent in melanoma 2 |
| chr18 | 42304771 | 42306267 | 218.62 | 0.39 | Intron | 26040 | SETBP1 | | 1.23 | 0.67 | SET binding protein 1 |
| chr1 | 89738814 | 89742202 | 216 | 0.39 | Promoter | 115362 | GBP5 | Yes | 19.14 | 8.33 | Guanylate binding protein 5 |
| chr14 | 103893373 | 103894934 | 207.63 | 0.39 | Intron | 4140 | MARK3 | | 0.98 | 1.27 | MAP/microtubule affinity-regulating kinase 3 |
| chr22 | 36653881 | 36655602 | 201.52 | 0.39 | Intron | 8542 | APOL1 | Yes | 5.51 | 2.54 | Apolipoprotein L, 1 |
| chr15 | 101136222 | 101138145 | 200.76 | 0.39 | Intron | 55180 | LINS | | 1.6 | 1.37 | Lines homolog (Drosophila) |
| chr4 | 170486989 | 170488616 | 197.65 | 0.39 | Intron | 4750 | NEK1 | | 0.96 | 1.36 | NIMA (never in mitosis gene a)-related kinase 1 |
| chr14 | 24981772 | 24983259 | 186.08 | 0.39 | Promoter | 1215 | CMA1 | | 1 | 1.08 | Chymase 1, mast cell |
| chr1 | 243602656 | 243604716 | 181.84 | 0.39 | Intron | 10806 | SDCCAG8 | | 1.02 | 1.58 | Serologically defined colon cancer antigen 8 |
| chr11 | 76621502 | 76622964 | 179.28 | 0.39 | Intron | 55331 | ACER3 | | 1.07 | 1.27 | Alkaline ceramidase 3 |
| chr4 | 113217720 | 113220103 | 178.46 | 0.39 | Promoter | 80216 | ALPK1 | Yes | 2.99 | 2.47 | Alpha-kinase 1 |
| chr7 | 143411541 | 143413217 | 172.08 | 0.39 | Intron | 285966 | FAM115C | | 1.63 | 1.3 | Family with sequence similarity 115, member C |
| chr16 | 48264820 | 48266548 | 171.26 | 0.39 | 5′UTR | 85320 | ABCC11 | | 0.97 | 1.42 | ATP-binding cassette, sub-family C (CFTR/MRP), member 11 |
| chr15 | 57027345 | 57031166 | 170.16 | 0.39 | Promoter | 54816 | ZNF280D | | 1.15 | 1.24 | Zinc finger protein 280D |
| chrX | 104941773 | 104943192 | 168.58 | 0.39 | Intron | 26280 | IL1RAPL2 | | 0.9 | 1.01 | Interleukin 1 receptor accessory protein-like 2 |
| chrX | 11527367 | 11528830 | 160.49 | 0.39 | Intron | 395 | ARHGAP6 | | 1.22 | 1.22 | Rho GTPase activating protein 6 |
| chr2 | 134083039 | 134085251 | 160 | 0.39 | Intron | 344148 | NCKAP5 | | 1.38 | 0.34 | Nck-associated protein 5 |
| chr6 | 31949161 | 31950466 | 158.61 | 0.39 | Promoter | 720 | C4A | | 4.59 | 7.27 | Complement component 4A (Rodgers blood group) |
| chr11 | 86152542 | 86154846 | 158.29 | 0.39 | Intron | 10873 | ME3 | | 1.04 | 2.12 | Malic enzyme 3, NADP(+)-dependent, mitochondrial |
| chr1 | 196407167 | 196408295 | 155.99 | 0.39 | Intron | 343450 | KCNT2 | | 1.43 | 1.14 | Potassium channel, subfamily T, member 2 |
Notes: By analyzing the dataset SRP000703, we identified 1,441 stringent peaks of protein-coding genes exhibiting fold enrichment (FE) ≥20 and the false discovery rate (FDR) ≤1%. Top 20 significant genes based on FE are listed with the chromosome, the position (start, end), FE, FDR, the location (promoter, 5′UTR, exon, intron, 3′UTR), Entrez Gene ID, Gene Symbol, IFN-regulated genes (IRGs) on Interferome, transcriptome data presenting with fold change (FC) on Human Gene 1.0 ST Array (our experiments), FC on Human Genome U133 Plus 2.0 Array (GSE21760), and gene name.
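The stringency criteria stated in these notes (FE ≥20 and FDR ≤1%) can be illustrated with a short filter. The following is a minimal sketch and not the authors' pipeline: the `Peak` type and the `stringent_peaks` function are hypothetical names, the first three records are copied from the table, and the last two records are invented solely to show a peak failing each threshold.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    gene: str
    fold_enrichment: float
    fdr_percent: float
    location: str


def stringent_peaks(peaks: List[Peak],
                    min_fe: float = 20.0,
                    max_fdr_percent: float = 1.0) -> List[Peak]:
    """Keep peaks with FE >= min_fe and FDR (%) <= max_fdr_percent,
    sorted by fold enrichment in descending order."""
    kept = [p for p in peaks
            if p.fold_enrichment >= min_fe and p.fdr_percent <= max_fdr_percent]
    return sorted(kept, key=lambda p: p.fold_enrichment, reverse=True)


peaks = [
    Peak("GBP5", 216.0, 0.39, "Promoter"),           # from the table
    Peak("AIM2", 349.81, 0.39, "Promoter"),          # from the table
    Peak("SETBP1", 218.62, 0.39, "Intron"),          # from the table
    Peak("HYPOTHETICAL_A", 12.5, 0.39, "Intron"),    # invented: fails the FE cutoff
    Peak("HYPOTHETICAL_B", 45.0, 2.10, "Promoter"),  # invented: fails the FDR cutoff
]

top = stringent_peaks(peaks)
print([p.gene for p in top])
```

Sorting after filtering reproduces the ranking by FE used in the table; the thresholds are exposed as parameters so that less stringent cutoffs can be tried.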
Figure 3. Identification of GAS consensus sequences in the promoter, intron, and intergenic regions. The consensus motif sequences were identified by importing a 400 bp-length sequence surrounding the summit of MACS peaks of the genes with top 20 fold enrichment scores into the MEME-ChIP program. The GAS elements located in the promoter (A), intron (B), and intergenic regions (C) are highlighted by a blue square.
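The 400 bp window surrounding a peak summit can be derived from the summit coordinate alone. The sketch below is an illustration under stated assumptions: `summit_window` is a hypothetical helper, coordinates are treated as 0-based and half-open, the window is assumed to be centered on the summit, and the summit values in the usage lines are arbitrary numbers rather than coordinates from the dataset.

```python
from typing import Optional, Tuple


def summit_window(summit: int,
                  width: int = 400,
                  chrom_length: Optional[int] = None) -> Tuple[int, int]:
    """Return (start, end) of a window of `width` bp centered on `summit`.

    Coordinates are 0-based, half-open. The window is clipped at the
    chromosome start and, if `chrom_length` is given, at the chromosome end,
    so a window near an edge can be shorter than `width`.
    """
    half = width // 2
    start = max(0, summit - half)
    end = summit + half
    if chrom_length is not None:
        end = min(end, chrom_length)
    return start, end


# Arbitrary example coordinates (not taken from the dataset).
print(summit_window(1000))                    # window fully inside the chromosome
print(summit_window(50))                      # clipped at the chromosome start
print(summit_window(950, chrom_length=1000))  # clipped at the chromosome end
```

The resulting intervals would then be used to extract sequences for motif discovery; that extraction step depends on the genome files and tools in use and is not shown.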
Figure 4. Identification of ChIP-Seq peaks in intronic regions. The genomic locations of the ChIP-Seq peaks were determined by importing the processed data into GenomeJack. An example of SET binding protein 1 (SETBP1) (yellow line) listed in Table 1 is shown, where a MACS peak in the stat1_sorted.bam Coverage lane is located in the intronic region of SETBP1 (Panel A) with a GAS element highlighted by an orange square (Panel B).
Figure 5. Identification of ChIP-Seq peaks in intergenic regions. The genomic locations of the ChIP-Seq peaks were determined by importing the processed data into GenomeJack. A MACS peak in the stat1_sorted.bam Coverage lane with fold enrichment of 333 and FDR of 0.39% is located in the intergenic region of chromosome 21 (Panel A) with a GAS element highlighted by an orange square (Panel B).
Table 2. Top 20 upregulated genes based on fold change in transcriptome data.
| Chromosome | Start | End | FE | FDR (%) | Location | Entrez gene ID | Gene symbol | IRG | Gene 1.0 ST Array FC | U133 Plus 2.0 Array FC | Gene name |
|---|---|---|---|---|---|---|---|---|---|---|---|
| chr8 | 39767141 | 39768199 | 38.82 | 0.39 | Promoter | 3620 | IDO1 | Yes | 149.57 | 43.92 | Indoleamine 2,3-dioxygenase 1 |
| chr4 | 76949148 | 76950321 | 27.42 | 0.54 | Promoter | 3627 | CXCL10 | Yes | 117.22 | 1.1 | Chemokine (C-X-C motif) ligand 10 |
| chr5 | 131818750 | 131828691 | 53.98 | 0.39 | Promoter | 3659 | IRF1 | Yes | 19.81 | 21.09 | Interferon regulatory factor 1 |
| chr1 | 89738814 | 89742202 | 216 | 0.39 | Promoter | 115362 | GBP5 | Yes | 19.14 | 8.33 | Guanylate binding protein 5 |
| chr5 | 156649035 | 156650353 | 53.67 | 0.44 | Intron | 3702 | ITK | Yes | 16.37 | 93.96 | IL2-inducible T-cell kinase |
| chr19 | 10379656 | 10384589 | 66.15 | 0.39 | 5′UTR | 3383 | ICAM1 | Yes | 13.83 | 11.42 | Intercellular adhesion molecule 1 |
| chr22 | 36042373 | 36045743 | 59.61 | 0.39 | 5′UTR | 80830 | APOL6 | | 10.82 | 35.4 | Apolipoprotein L, 6 |
| chr7 | 134832148 | 134833386 | 67.12 | 0.3 | 5′UTR | 55281 | TMEM140 | Yes | 9.8 | 2.86 | Transmembrane protein 140 |
| chr3 | 122281432 | 122284479 | 82.92 | 0.39 | Intron | 83666 | PARP9 | | 8.96 | 14.73 | Poly (ADP-ribose) polymerase family, member 9 |
| chr1 | 89594075 | 89595527 | 24.35 | 0.62 | Promoter | 2634 | GBP2 | Yes | 8.2 | 7.18 | Guanylate binding protein 2, interferon-inducible |
| chr1 | 150736054 | 150738936 | 40.22 | 0.36 | Intron | 1520 | CTSS | Yes | 7.07 | 13.79 | Cathepsin S |
| chr4 | 76928268 | 76929257 | 29.17 | 0.48 | Promoter | 4283 | CXCL9 | Yes | 6.68 | 1.27 | Chemokine (C-X-C motif) ligand 9 |
| chr11 | 4413853 | 4415591 | 39.26 | 0.39 | 5′UTR | 6737 | TRIM21 | Yes | 6.31 | 5.24 | Tripartite motif-containing 21 |
| chr1 | 89535324 | 89536653 | 40.25 | 0.5 | Promoter | 2633 | GBP1 | Yes | 6.03 | 12.65 | Guanylate binding protein 1, interferon-inducible, 67 kDa |
| chr17 | 32581480 | 32582732 | 45.48 | 0.47 | Promoter | 6347 | CCL2 | Yes | 5.84 | 0.28 | Chemokine (C-C motif) ligand 2 |
| chr3 | 122281432 | 122284479 | 82.92 | 0.39 | Promoter | 151636 | DTX3L | Yes | 5.77 | 7.12 | Deltex 3-like (Drosophila) |
| chr6 | 32819471 | 32822798 | 46.79 | 0.39 | Promoter | 5698 | PSMB9 | | 5.77 | 31.71 | Proteasome (prosome, macropain) subunit, beta type, 9 (large multifunctional peptidase 2) |
| chr18 | 52612923 | 52614118 | 41.03 | 0.4 | Intron | 80323 | CCDC68 | | 5.61 | 5.28 | Coiled-coil domain containing 68 |
| chr22 | 36653881 | 36655602 | 201.52 | 0.39 | Intron | 8542 | APOL1 | Yes | 5.51 | 2.54 | Apolipoprotein L, 1 |
| chr9 | 5509442 | 5510324 | 28.26 | 0.47 | Intron | 80380 | PDCD1LG2 | | 5.28 | 1.67 | Programmed cell death 1 ligand 2 |
Notes: By analyzing the dataset SRP000703, we identified 1,441 stringent peaks of protein-coding genes exhibiting fold enrichment (FE) ≥20 and the false discovery rate (FDR) ≤1%. Top 20 upregulated genes based on fold change (FC) in transcriptome data on Human Gene 1.0 ST Array (our experiments) are listed with the chromosome, the position (start, end), FE, FDR, the location (promoter, 5′UTR, exon, intron, 3′UTR), Entrez Gene ID, Gene Symbol, IFN-regulated genes (IRGs) on Interferome, FC on Human Gene 1.0 ST Array, FC on Human Genome U133 Plus 2.0 Array (GSE21760), and gene name.
Figure 6. The expression levels of 1,441 STAT1 target genes with distinct genomic locations of ChIP-Seq peaks. To determine whether ChIP-Seq-based STAT1 target genes are actually upregulated by IFNγ, we studied the gene expression profile of HeLa cells exposed for 6 hours to IFNγ on Human Gene 1.0 ST Array (Panel A), compared with publicly available transcriptome data GSE21760 of HeLa cells exposed for 6 hours to IFNγ on Human Genome U133 Plus 2.0 Array (Panel B). The location of ChIP-Seq peaks on 1,441 STAT1 target genes was classified into the promoter, 5′UTR, exon, intron, and 3′UTR. The fold change in expression levels is shown with the average, standard deviation, and statistical significance evaluated by one-way analysis of variance (ANOVA) followed by post-hoc Tukey’s test.
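The one-way ANOVA named in this caption compares the variance between location groups with the variance within them. The sketch below computes only the F statistic and its degrees of freedom in plain Python; it does not implement the post-hoc Tukey's test or derive a P-value, `one_way_anova_f` is a hypothetical helper name, and the three groups are toy numbers chosen for easy hand-checking, not fold changes from the study.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: List[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA.

    F = (SS_between / df_between) / (SS_within / df_within)
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy groups standing in for fold changes by peak location (invented values).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(round(f_stat, 6), df_b, df_w)
```

In practice a statistics package would be used to obtain the P-value and the pairwise Tukey comparisons; the hand-rolled version only makes the definition of F explicit.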
Table 3. Top 10 gene ontology terms associated with 194 upregulated STAT1 target genes.
| Rank | GO terms | Focused genes | P-value | FDR |
|---|---|---|---|---|
| 1 | GO:0006955~immune response | AIM2, APOL1, C1S, C3, C4A, CCL2, CIITA, CTSS, CXCL10, CXCL9, GBP1, GBP2, GBP5, GCH1, HLA-E, ICAM1, IFI35, IL7, IL4R, LYN, ORAI1, PDCD1LG2, PSMB8, PSMB9, RNF19B, TAP1, TAP2 | 1.09E-07 | 0.0002 |
| 2 | GO:0002684~positive regulation of immune system process | BCL6, C1S, C3, C4A, F2RL1, FYN, ICAM1, IDO1, IL4R, IL7, LYN, PDCD1LG2, PVR, TAP2, TGFB2 | 7.54E-07 | 0.0013 |
| 3 | GO:0009611~response to wounding | A2M, APOL3, C1S, C3, C4A, CCL2, CIITA, CXCL10, CXCL9, F2RL1, IDO1, IRF7, KLF6, LYN, NMI, PLSCR1, PLSCR4, SCARB1, SLC1A3, SOD2, TGFB2 | 3.64E-06 | 0.0061 |
| 4 | GO:0009615~response to virus | IFI16, IFI35, IRF7, IRF9, MX1, PLSCR1, STAT1, STAT2, ZC3HAV1 | 4.06E-05 | 0.0683 |
| 5 | GO:0048584~positive regulation of response to stimulus | C1S, C3, C4A, F2RL1, FYN, IDO1, IRF7, LYN, PVR, TAP2, TGFB2, TGM2 | 1.02E-04 | 0.1717 |
| 6 | GO:0000267~cell fraction | ABCC4, ANK3, BCL2L11, CALD1, CASP7, CYP1B1, DMD, DTNA, GCH1, IDO1, LYN, MCTP1, NRP2, PML, PSD3, RDH10, SCARB1, SH3KBP1, SLC16A1, SLC1A3, SLC7A2, SOD2, TAP1, TAP2, TRIM27, WARS | 1.18E-04 | 0.1503 |
| 7 | GO:0051272~positive regulation of cell motion | BCL6, CSF1, CXCL10, F2RL1, ICAM1, LYN, SCARB1 | 1.47E-04 | 0.2479 |
| 8 | GO:0048534~hemopoietic or lymphoid organ development | BAK1, BCL2L11, BCL6, CSF1, IFI16, IL7, IRF1, KLF6, LYN, PML, SOD2, TGFB2 | 2.38E-04 | 0.4005 |
| 9 | GO:0050778~positive regulation of immune response | C1S, C3, C4A, FYN, IDO1, LYN, PVR, TAP2, TGFB2 | 2.97E-04 | 0.4979 |
| 10 | GO:0006952~defense response | A2M, APOL1, APOL3, C1S, C3, C4A, CCL2, CIITA, CXCL10, CXCL9, GCH1, IDO1, IRF7, ITK, LYN, MX1, NMI, TAP1, TAP2 | 3.02E-04 | 0.5075 |
Notes: Gene ontology (GO) terms were studied by importing Entrez Gene IDs of 194 upregulated STAT1 target genes into DAVID. They are listed with GO terms, focused genes, P-value of the modified Fisher’s exact test, and false discovery rate (FDR).
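The modified Fisher's exact test mentioned in these notes is, as far as we understand DAVID's documentation, the EASE score: a one-tailed Fisher's exact P-value computed after subtracting one gene from the number of list genes annotated with the term, which makes the test more conservative. The sketch below illustrates that idea with the hypergeometric distribution. The function names are hypothetical, and the counts are toy numbers, since the background counts used in the study are not given here.

```python
from math import comb


def hypergeom_upper_tail(k: int, list_size: int, pop_hits: int, pop_size: int) -> float:
    """P(X >= k), where X is the number of annotated genes in a list of
    `list_size` genes drawn without replacement from a population of
    `pop_size` genes, `pop_hits` of which carry the annotation."""
    upper = min(list_size, pop_hits)
    total = comb(pop_size, list_size)
    favorable = sum(comb(pop_hits, i) * comb(pop_size - pop_hits, list_size - i)
                    for i in range(max(k, 0), upper + 1))
    return favorable / total


def ease_score(list_hits: int, list_size: int, pop_hits: int, pop_size: int) -> float:
    """One-tailed Fisher's exact P-value with one hit removed (assumed to
    correspond to DAVID's EASE score)."""
    return hypergeom_upper_tail(list_hits - 1, list_size, pop_hits, pop_size)


# Toy counts (invented): 2 of 3 list genes annotated, 4 of 10 background genes annotated.
fisher_p = hypergeom_upper_tail(2, 3, 4, 10)
ease_p = ease_score(2, 3, 4, 10)
print(fisher_p, ease_p)
```

Because one hit is removed, the EASE score is never smaller than the unmodified one-tailed P-value, which penalizes terms supported by only a few genes.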
Figure 7. Molecular networks of ChIP-Seq-based STAT1 target genes.
Notes: Entrez Gene IDs of 194 upregulated STAT1 target genes were imported into KeyMolnet. The neighboring network-search algorithm extracted a highly complex molecular network composed of 1,077 molecules and 1,298 molecular relations. The cluster of IRF and STAT transcription factors is highlighted by a blue circle. Red nodes represent STAT1 target genes, whereas white nodes represent additional nodes extracted automatically from the core contents of KeyMolnet to establish molecular connections. The molecular relation is indicated by a solid line with arrow (direct binding or activation), solid line with arrow and stop (direct inactivation), solid line without arrow (complex formation), dashed line with arrow (transcriptional activation), and dashed line with arrow and stop (transcriptional repression).