
Analysis of changes to mRNA levels and CTCF occupancy upon TFII-I knockdown.

Maud Marques, Rodrigo Peña Hernández, Michael Witcher.

Abstract

CTCF is a key regulator of nuclear chromatin structure, chromatin organization and gene regulation. The impact of CTCF on transcriptional output varies, ranging from repression to transcriptional pausing and transactivation. The multifunctional nature of CTCF is mediated, in part, through differential association with protein partners having unique properties. We identified the general transcription factor TFII-I as an interacting partner of CTCF. To understand the function of TFII-I in regulating gene expression and CTCF binding genome-wide, we conducted microarray experiments following TFII-I knockdown, together with chromatin immunoprecipitation of CTCF followed by next-generation sequencing (ChIP-seq) from the same TFII-I-depleted cells. Here, we describe the experimental design and the quality control and analysis performed on the dataset. The data are publicly available through the GEO database under accession number GSE60918. The interpretation and description of these data are included in a manuscript in revision (1).

Keywords:  CTCF; ChIP-seq; Microarray; TFII-I

Year:  2014        PMID: 26484167      PMCID: PMC4535928          DOI: 10.1016/j.gdata.2014.09.012

Source DB:  PubMed          Journal:  Genom Data        ISSN: 2213-5960


Direct link to deposited data

Deposited data are available here: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE60918.

Experimental design, materials and methods

Cell line

The mouse B lymphocyte cell line Wehi-231, expressing either a control shRNA construct (Wehi-CT) or an shRNA construct directed against the transcription factor TFII-I (Wehi-TFII-I-KD), was used to investigate the effect of TFII-I depletion on global gene expression and CTCF binding.

Microarray and quality control

To identify genes regulated by TFII-I, we extracted total RNA from three independent samples of Wehi-CT and Wehi-TFII-I-KD cells. The quantity and quality of the RNA samples were assessed with a Nanodrop spectrophotometer and an Agilent Bioanalyzer. Illumina MouseWG-6 BeadChips were used for expression analysis. Data preprocessing was carried out with the Bioconductor package “lumi”, using log2 transformation followed by quantile normalization [2], [3]. Quality controls were performed before (Fig. 1A) and after (Fig. 1B) microarray data preprocessing. Reproducibility between biological replicates was evaluated by calculating the correlation coefficient R² (see the example scatter plots in Fig. 1C and D). Clustering of the microarrays was performed to ensure correct segregation between Control and TFII-I knockdown samples (Fig. 1E). Differentially expressed genes between Wehi-CT and Wehi-TFII-I-KD were identified using the Bioconductor package “limma”, as shown in the volcano plot in Fig. 1F [4]. We identified 117 differentially regulated genes with a fold change ≥ 2 and p-value ≤ 0.05, listed in Table 1. As confirmation of the knockdown efficiency, Gtf2i, the gene coding for TFII-I, was the most down-regulated gene in our data.
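The preprocessing step (log2 transformation followed by quantile normalization) can be illustrated with a minimal standard-library sketch. This is not the lumi implementation; the function name, the input layout (one list of probe intensities per array) and the example intensities are assumptions made for illustration only.

```python
import math

def log2_quantile_normalize(arrays):
    """Log2-transform each array, then force all arrays onto one shared distribution.

    arrays: list of arrays, each a list of positive probe intensities
    (same probes, same order, in every array).
    """
    logged = [[math.log2(value) for value in array] for array in arrays]
    n_probes = len(logged[0])
    sorted_arrays = [sorted(array) for array in logged]
    # Reference distribution: mean across arrays of the k-th smallest value.
    reference = [
        sum(array[k] for array in sorted_arrays) / len(sorted_arrays)
        for k in range(n_probes)
    ]
    normalized = []
    for array in logged:
        # Probe indices ordered from lowest to highest signal (stable for ties).
        order = sorted(range(n_probes), key=lambda i: array[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]  # each probe takes the reference value at its rank
        normalized.append(out)
    return normalized

# Hypothetical intensities for four probes on three arrays (not from the dataset).
arrays = [
    [120.0, 800.0, 60.0, 300.0],
    [200.0, 950.0, 75.0, 410.0],
    [700.0, 90.0, 55.0, 260.0],
]
normalized = log2_quantile_normalize(arrays)
```

After normalization every array contains exactly the same set of values, and only the assignment of values to probes differs, which is why the per-array intensity distributions coincide after preprocessing.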
Fig. 1

Effect of normalization on microarray signal intensity. Distribution of signal intensity by array before (A) and after (B) normalization. (C) and (D) Scatter plots comparing the log2 expression values of two biological replicates (R² = 0.95 and R² = 0.94, respectively). (E) Cluster dendrogram of the arrays as a function of change in gene expression. (F) Volcano plot contrasting significance, as the negative logarithm of the p-value, against log fold change between control cells and TFII-I knockdown cells.
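The replicate comparison in panels C and D can be sketched as follows, assuming that R² denotes the squared Pearson correlation between the log2 expression values of two replicates (the source does not spell out the formula). The function name and the replicate values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long lists of values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical log2 expression values for five probes in two replicates.
replicate_1 = [7.1, 8.4, 10.2, 12.9, 9.3]
replicate_2 = [7.3, 8.1, 10.6, 12.5, 9.0]
r2 = r_squared(replicate_1, replicate_2)
print(f"R2 = {r2:.2f}")
```

A value close to 1 indicates that the two replicates rank and scale probes almost identically.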

Table 1

List of genes differentially regulated.

List of differentially expressed genes (p < 0.05) with a fold change > 2 identified by microarray.

Up-regulated genes (55): ALDH3B1, WDR6, ATP6AP2, STARD13, CNR2, LMCD1, SFRS11, RAB8B, CNR2, LSM14A, DEK, POLR3G, LMCD1, ZFYVE26, MSH6, AATF, LRRC33, ANKRD49, HPRT1, NPM3, BLK, AGPS, PLSCR1, POLE3, AURKA, AF067061, RNF145, FAM178A, DDX24, TCIRG1, HAAO, VEGFB, CREG1, BLVRB, GNAS, YBX3, POLR2A, RBBP7, VPREB3, C730026J16, ARPP19, MLLT4, CHFR, PLEKHA2, PREI4, PANK4, GPR107, UBE2G1, CEP120, DCPS, MT1, CKM, TWSG1, PDZD11, CDC5L

Down-regulated genes (62): GTF2I, ZFP219, MIB1, LIAS, EGFL7, CYTIP, ZBTB17, RILPL2, CYTH4, TBC1D10C, SHC1, STC2, SLMO2, GSTT1, PFN1, FTL1, IL12A, NANS, D10ERTD610E, 2310033F14RIK, 3300001G02RIK, 2310008H09RIK, 1600002K03RIK, GSTO1, KHK, 6330442E10RIK, TRUB2, 1810026J23RIK, CLEC2D, EBP, LACTB, BST2, 1600012P17RIK, EIF2S2, RPN2, LOC629364, CALM3, PICK1, TMEM11, GUSB, SERPINF1, MARCKS, HIST1H2BJ, AP3D1, PSMD8, CBR3, SEC63, RBM47, CDR2, SYNCRIP, VARS, LOC100044172, LCE1M, FCRL5, GPHN, DYNC1LI1, KEAP1, JAGN1, FCGR2B, RRM2, WDR68, EHD1
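The selection behind this list (fold change ≥ 2 and p-value ≤ 0.05, split by direction) can be sketched as below. The input layout, a list of (gene, log2 fold change, p-value) tuples with fold change expressed as knockdown relative to control, is an assumption; it loosely mirrors a limma results table but does not call limma. The gene names and values are placeholders, not results from the dataset.

```python
import math

def split_differential(results, min_fold=2.0, max_p=0.05):
    """Return (up, down) gene lists passing the fold-change and p-value cut-offs."""
    min_log2_fold = math.log2(min_fold)  # a 2-fold change is 1 on the log2 scale
    up, down = [], []
    for gene, log2_fold_change, p_value in results:
        if p_value > max_p or abs(log2_fold_change) < min_log2_fold:
            continue  # not significant, or change too small
        if log2_fold_change > 0:
            up.append(gene)
        else:
            down.append(gene)
    return up, down

# Placeholder rows: (gene, log2 fold change KD vs control, p-value).
results = [
    ("geneA", -2.3, 0.001),
    ("geneB", 1.4, 0.02),
    ("geneC", 0.4, 0.001),   # significant but below the fold-change cut-off
    ("geneD", -1.8, 0.2),    # large change but not significant
    ("geneE", 1.0, 0.05),    # exactly on both thresholds, so retained
]
up, down = split_differential(results)
```

Both thresholds are inclusive here, matching the "≥ 2" and "≤ 0.05" wording of the text.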

ChIP-seq

To identify the CTCF binding sites affected by TFII-I depletion, we carried out two independent CTCF ChIP-seq assays in Wehi-CT and Wehi-TFII-I-KD cells using a CTCF antibody. Briefly, cells were collected and crosslinked with 1% formaldehyde in PBS for 10 min at room temperature. The crosslinking reaction was stopped with 125 mM glycine, and cells were washed with PBS and stored at −80 °C until the assay was carried out. Cells were lysed and DNA was sheared by sonication (15 s, 15 times) in cell lysis/ChIP buffer (0.25% NP-40, 0.25% Triton-X, 0.25% sodium deoxycholate, 0.1% SDS, 50 mM Tris pH 8.0, 50 mM NaCl, 5 mM EDTA). Lysed cells were centrifuged for 15 min at 14,000 rpm at 4 °C and the supernatant was collected. 1 mg of protein was precleared for 2 h with Protein G agarose beads (50% slurry blocked with salmon sperm) at 4 °C. Immunoprecipitation was carried out by adding 2 μg of antibody and 30 μl of Protein G agarose beads, with nutation overnight at 4 °C. After immunoprecipitation, beads were pelleted by centrifugation and washed 4 times with buffers of varying salt concentration to remove nonspecific binding. Buffers 1 to 3 contained 0.1% SDS, 1% Triton-X, 2 mM EDTA, 20 mM Tris pH 8.0 and 150 mM, 300 mM or 500 mM NaCl, respectively. Buffer 4 contained 0.25 M LiCl, 1% NP-40, 1% sodium deoxycholate, 1 mM EDTA and 10 mM Tris pH 8.0. Two additional washes with TE were done to remove any residual buffer from the beads. Complexes bound to the beads were eluted with 500 μl of elution buffer (1% SDS, 1 mM EDTA, 50 mM Tris pH 8.0) at 65 °C for 25 min with occasional vortexing. Beads were pelleted by centrifugation and the supernatant was collected. Crosslink reversal was achieved by adding 0.2 mM NaCl at 65 °C overnight. Next, proteins (including DNA-bound factors and antibodies) were degraded by treatment with Proteinase K, carried out at 45 °C for 1 h, followed by a second incubation of 15 min at 65 °C.
A PCR purification kit (Qiagen) was used to recover the DNA following the manufacturer's instructions, and the DNA was stored at −20 °C. DNA was sent to the IRIC (Institut de Recherche en Immunologie et Cancérologie, Montreal, Canada) sequencing facility, where both library construction and sequencing (100 bases, paired-end, HiSeq 2000, Illumina) were carried out (Table 2).
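Table 2

Read counts and numbers of peaks. The Raw, No duplicate and MAPQ ≥ 20 columns give the number of reads in millions.

| Sample name | Antibody | Cell line | Raw | No duplicate | MAPQ ≥ 20 | Peak number |
| Ctl1 | CTCF | Wehi-CT | 43.58 | 36.1 | 28.6 | 24467 |
| Ctl2 | CTCF | Wehi-CT | 36.1 | 32.9 | 26.1 | 23873 |
| KD1 | CTCF | Wehi-TFII-I-KD | 36.2 | 32.3 | 25 | 19076 |
| KD2 | CTCF | Wehi-TFII-I-KD | 36.46 | 23.7 | 16.9 | 15309 |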
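The read counts in Table 2 can be turned into retention fractions to see how much of each library survives duplicate removal and mapping-quality filtering. The sketch below uses only the values in the table; the dictionary layout and key names are illustrative choices.

```python
# Read counts (millions) and peak numbers copied from Table 2.
table2 = {
    "Ctl1": {"raw": 43.58, "no_duplicate": 36.1, "mapq": 28.6, "peaks": 24467},
    "Ctl2": {"raw": 36.1, "no_duplicate": 32.9, "mapq": 26.1, "peaks": 23873},
    "KD1": {"raw": 36.2, "no_duplicate": 32.3, "mapq": 25.0, "peaks": 19076},
    "KD2": {"raw": 36.46, "no_duplicate": 23.7, "mapq": 16.9, "peaks": 15309},
}

def retention(counts):
    """Fraction of raw reads remaining after each filtering step."""
    return {
        "after_dedup": counts["no_duplicate"] / counts["raw"],
        "after_mapq": counts["mapq"] / counts["raw"],
    }

fractions = {name: retention(counts) for name, counts in table2.items()}
for name, kept in fractions.items():
    print(f"{name}: {kept['after_dedup']:.1%} after duplicate removal, "
          f"{kept['after_mapq']:.1%} after MAPQ filtering")
```

By this measure KD2 retains the smallest share of its raw reads at both steps, which is worth keeping in mind when comparing its peak number with the other samples.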

ChIP-seq quality control and analysis

Quality of the sequencing was assessed using FastQC software (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/); an example is presented in Fig. 2A. Using the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/), the DNA sequences obtained were trimmed to 45 bases and filtered for high quality scores (> 30), and duplicates were removed before alignment to the mouse genome (U.S. National Center for Biotechnology Information (NCBI) Build 37, July 2007, mm9) using the BWA algorithm [5]. Quality of the alignment was assessed using SAMStat, and only sequences with a MAPQ score ≥ 30 were kept for further analysis (Fig. 2B and C) [6]. The Model-based Analysis of ChIP-Seq (MACS) peak-finding algorithm was used with default settings to identify peaks in the Wehi-CT and Wehi-TFII-I-KD conditions; an example of a peak model obtained with MACS is presented in Fig. 2D [7]. Overlap of CTCF binding sites between biological replicates was assessed using the intersect function of bedtools [8], and the results are shown as a Venn diagram (Fig. 2E). HOMER (homer.salk.edu/) was used to annotate CTCF peaks, determine their genomic distribution and generate the bedGraph files used to visualize the results in the UCSC Genome Browser. We used previously published CTCF ChIP-seq data available in the UCSC Genome Browser as controls for our dataset (Fig. 3).
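The read preprocessing described above (trimming to 45 bases, quality filtering, duplicate removal) can be sketched in plain Python. This is not the FASTX-Toolkit itself, and several details are assumptions the source does not state: Phred+33 quality encoding, a read passing only if every retained base scores above 30, and duplicates defined as reads with identical trimmed sequence. Reads are represented as hypothetical (name, sequence, quality string) tuples, and pairing of the paired-end reads is ignored.

```python
def preprocess_reads(reads, length=45, min_quality=30, offset=33):
    """Trim reads, drop low-quality reads, then drop duplicate sequences."""
    seen = set()
    kept = []
    for name, sequence, quality in reads:
        # Trim sequence and quality string to the first `length` bases.
        sequence, quality = sequence[:length], quality[:length]
        # Assumption: every retained base must have a Phred score above the threshold.
        if not all(ord(symbol) - offset > min_quality for symbol in quality):
            continue
        # Assumption: a duplicate is a read whose trimmed sequence was already seen.
        if sequence in seen:
            continue
        seen.add(sequence)
        kept.append((name, sequence, quality))
    return kept

# Hypothetical reads: "I" encodes Phred 40 and "5" encodes Phred 20 in Phred+33.
reads = [
    ("r1", "ACGT" * 15, "I" * 60),
    ("r2", "ACGT" * 15, "I" * 60),                    # duplicate of r1
    ("r3", "TTGCA" * 12, "I" * 10 + "5" + "I" * 49),  # low-quality base inside the kept region
    ("r4", "GGATC" * 12, "I" * 50 + "5" + "I" * 9),   # low-quality base removed by trimming
]
kept = preprocess_reads(reads)
```

Trimming before filtering matters: read r4 only passes because its low-quality base lies beyond position 45.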
Fig. 2

Quality control for ChIP-seq raw data and alignment files. (A) Graph representing the per-base quality using the Phred score. (B) and (C) Pie charts obtained with SAMstat describing the distribution of the sequence alignment quality score before (B) and after (C) filtering. (D) Peak model produced by MACS. (E) Venn diagram representing the overlap of CTCF binding sites between biological replicates.
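The replicate overlap summarized in panel E can be illustrated with a small interval-overlap sketch. It mimics the idea of reporting peaks from one replicate that overlap any peak in the other, but it is not bedtools; the function name, the (chromosome, start, end) tuples with BED-style half-open coordinates, and the example peaks are assumptions.

```python
from collections import defaultdict

def overlapping_peaks(peaks_a, peaks_b):
    """Return peaks from peaks_a that overlap at least one peak in peaks_b."""
    by_chromosome = defaultdict(list)
    for chromosome, start, end in peaks_b:
        by_chromosome[chromosome].append((start, end))
    shared = []
    for chromosome, start, end in peaks_a:
        # Half-open intervals [start, end) overlap when each starts before the other ends.
        if any(start < b_end and b_start < end
               for b_start, b_end in by_chromosome[chromosome]):
            shared.append((chromosome, start, end))
    return shared

# Hypothetical peaks for two replicates.
replicate_1 = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 150)]
replicate_2 = [("chr1", 250, 400), ("chr2", 150, 250)]

shared = overlapping_peaks(replicate_1, replicate_2)
only_replicate_1 = len(replicate_1) - len(shared)
```

The pairwise comparison is quadratic per chromosome, which is fine for a sketch; a sorted sweep or a dedicated tool is preferable for tens of thousands of peaks.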

Fig. 3

Visualization of CTCF ChIP-seq data in the UCSC genome browser. Screenshot of UCSC genome browser showing CTCF ChIP-seq results in the Control and TFII-I knockdown samples. Previously published dataset for CTCF ChIP-seq in another hematopoietic cell line is also shown.

Discussion

Here, we describe a dataset comprising gene expression profiling using Illumina BeadChips (microarray) and ChIP-seq analysis of CTCF binding in a mouse B lymphocyte cell line expressing an shRNA construct against TFII-I, a general transcription factor. These data were generated to analyze the influence of TFII-I on the genomic targeting of the epigenetic regulatory protein CTCF, and to understand how these two factors co-regulate gene transcription. With this dataset, we were able to show that TFII-I is important for targeting CTCF to a cohort of promoter regions where the two factors cooperate to activate transcription. This finding sheds new light on how CTCF targeting to specific genomic regions can occur.

Conflict of interest

The authors have no conflicts of interest.

Specifications

Organism/cell line/tissue: Mus musculus, Wehi-231, immature B lymphocyte
Strain: (BALB/c x NZB) F1
Sequencer or array type: Illumina HiSeq 2000 and Illumina BeadChips MouseWG-6
Data format: ChIP-seq: raw (FASTQ) and processed (BED and bedGraph files). Microarray: Excel spreadsheet before and after normalization.
Experimental factors: Wehi231-CT vs Wehi231-TFII-I knockdown
Experimental features: Microarray gene expression profiling to identify genes that are regulated by TFII-I. The purpose of the ChIP-seq was to map CTCF binding sites affected by TFII-I depletion.
Consent: NA
Sample source location: NA

References

1.  BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis.

Authors:  Steffen Durinck; Yves Moreau; Arek Kasprzyk; Sean Davis; Bart De Moor; Alvis Brazma; Wolfgang Huber
Journal:  Bioinformatics       Date:  2005-08-15       Impact factor: 6.937

2.  Genome-wide targeting of the epigenetic regulatory protein CTCF to gene promoters by the transcription factor TFII-I.

Authors:  Rodrigo Peña-Hernández; Maud Marques; Khalid Hilmi; Teijun Zhao; Amine Saad; Moulay A Alaoui-Jamali; Sonia V del Rincon; Todd Ashworth; Ananda L Roy; Beverly M Emerson; Michael Witcher
Journal:  Proc Natl Acad Sci U S A       Date:  2015-02-02       Impact factor: 11.205

3.  lumi: a pipeline for processing Illumina microarray.

Authors:  Pan Du; Warren A Kibbe; Simon M Lin
Journal:  Bioinformatics       Date:  2008-05-08       Impact factor: 6.937

4.  BEDTools: a flexible suite of utilities for comparing genomic features.

Authors:  Aaron R Quinlan; Ira M Hall
Journal:  Bioinformatics       Date:  2010-01-28       Impact factor: 6.937

5.  SAMStat: monitoring biases in next generation sequencing data.

Authors:  Timo Lassmann; Yoshihide Hayashizaki; Carsten O Daub
Journal:  Bioinformatics       Date:  2010-11-18       Impact factor: 6.937

6.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

7.  Model-based analysis of ChIP-Seq (MACS).

Authors:  Yong Zhang; Tao Liu; Clifford A Meyer; Jérôme Eeckhoute; David S Johnson; Bradley E Bernstein; Chad Nusbaum; Richard M Myers; Myles Brown; Wei Li; X Shirley Liu
Journal:  Genome Biol       Date:  2008-09-17       Impact factor: 13.583

