A ChIP-on-chip tiling array approach detects functional histone-free regions associated with boundaries at vertebrate HOX genes.

Surabhi Srivastava, Divya Tej Sowpati, Hita Sony Garapati, Deepika Puri, Jyotsna Dhawan, Rakesh K Mishra.

Abstract

Hox genes impart segment identity to body structures along the anterior-posterior axis and are crucial for proper development. A unique feature of the Hox loci is the collinearity between the gene position within the cluster and its spatial expression pattern along the body axis. However, the mechanisms that regulate collinear patterns of Hox gene expression remain unclear, especially in higher vertebrates. We recently identified novel histone-free regions (HFRs) that can act as chromatin boundary elements demarcating successive murine Hox genes and help regulate their precise expression domains (Srivastava et al., 2013). In this report, we describe in detail the ChIP-chip analysis strategy associated with the identification of these HFRs. We also provide the Perl scripts for HFR extraction and quality control analysis for this custom designed tiling array dataset.

Keywords:  ChIP, chromatin immunoprecipitation; ChIP-on-chip; Chromatin domain boundary; DTT, dithiothreitol; Enhancer blocking; GAF, GAGA binding factor; GAGA factor; HFR, histone-free region; HS, hypersensitive; Histone H3; Hox; NLR, normalized log ratio; PMSF, phenylmethanesulfonylfluoride

Year:  2014        PMID: 26484075      PMCID: PMC4536032          DOI: 10.1016/j.gdata.2014.05.001

Source DB:  PubMed          Journal:  Genom Data        ISSN: 2213-5960


Specifications

Direct link to deposited data

Deposited data can be found here: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42941

Experimental design, materials and methods

Array design and data normalization

A custom 1 million probe tiling array (manufactured by Agilent Technologies) was designed to represent 52 Mb of the mouse genome (NCBI Build 37), of which 1.1 Mb was tiled from the four Hox clusters. The entire genomic sequence of each Hox cluster was tiled, including the flanking regions up to the neighbouring genes on either side. Approximately 887,000 repeat-masked probes were designed for the array, of which 16,200 were specific for the Hox clusters. Each probe was 60 bp long, and successive probes were designed with a 10 bp overlap to achieve high resolution. The tiling arrays were subjected to dual color hybridization with Input DNA and ChIP samples obtained by chromatin immunoprecipitation using standard protocols as described in [1], with an antibody directed against the invariant portion of the core histone H3 (Abcam, #ab1791). This antibody recognizes all forms and modifications of histone H3 and is therefore a useful reporter of nucleosome presence. Following background subtraction and data extraction with Agilent's Feature Extraction software, normalized enrichment values for the probes were obtained using DNA Analytics (Agilent) software. Probe data were normalized using default blank subtraction and intra-array dye-bias median normalization against the Input to obtain normalized log ratio (NLR) IP/Input values. Fig. 1 shows the distribution of the log ratio values for all the probes on the array. The data were deposited in the Gene Expression Omnibus (GEO; [2]) database. Probe enrichment was visualized on the Mouse NCBI37/mm9 Assembly in the UCSC genome browser [3].
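The probe geometry described above (60 bp probes with a 10 bp overlap, so successive probes start 50 bp apart) can be illustrated with a minimal Python sketch. The function name `tile_probes` and the half-open coordinate convention are assumptions made for illustration. The real array design also applied repeat masking, which leaves gaps that this sketch does not model.

```python
def tile_probes(region_start, region_end, probe_len=60, overlap=10):
    """Return half-open (start, end) intervals for probes tiled across a region.

    Hypothetical helper: it only illustrates the stated geometry and ignores
    repeat masking and any vendor-specific probe selection.
    """
    step = probe_len - overlap  # 50 bp between successive probe starts
    probes = []
    start = region_start
    while start + probe_len <= region_end:
        probes.append((start, start + probe_len))
        start += step
    return probes


# A 260 bp toy region holds five probes, each sharing 10 bp with the next.
toy_probes = tile_probes(0, 260)
```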
Fig. 1

Probe intensity and distribution of log ratio values. A) Histogram depicting the frequency of log ratios in the histone H3 ChIP-chip dataset. B) MA plot showing the relationship between log ratios (y axis) and their respective average intensities (x axis). Log ratios are calculated as log2 (IP/Input), i.e. log2 (IP) − log2 (Input). Average intensity is given by the formula (log2 (IP) + log2 (Input)) / 2.
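The two quantities plotted in the MA plot can be computed per probe as in the sketch below. It takes the log ratio to be log2(IP/Input), the usual MA-plot convention, which is an assumption of this sketch. The `median_center` helper is a generic stand-in for median normalization and is not a reproduction of the DNA Analytics procedure, whose internals are not described here.

```python
import math
from statistics import median


def ma_values(ip, inp):
    """Return (M, A) for one probe from its IP and Input intensities."""
    m = math.log2(ip) - math.log2(inp)        # log ratio, log2(IP/Input)
    a = (math.log2(ip) + math.log2(inp)) / 2  # average log intensity
    return m, a


def median_center(log_ratios):
    """Shift log ratios so that their median is zero (generic illustration)."""
    med = median(log_ratios)
    return [m - med for m in log_ratios]


# A probe with IP = 800 and Input = 200 is four-fold enriched, so M is 2.
m, a = ma_values(800, 200)
```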

Identification of histone-free regions

To detect histone-free regions (HFRs), the dataset was mined to extract contiguous probes from the Hox clusters that showed no positive enrichment with the histone H3 antibody. A custom Perl script (Supplementary file 1) was used to identify blocks of probes with very low or negative NLR values, corresponding to HFRs. The HFR script accepts three parameters: (i) an input file containing information on all probes on the array, (ii) an output file to which the coordinates of the HFRs are written, and (iii) the Hox cluster to be analyzed. The script first extracts from the array all the probe information for the Hox cluster specified by the user. The probe coordinates and their corresponding NLR values are then stored in arrays. The next part of the script takes the array of normalized log ratios and returns blocks of H3-unenriched regions. A block is considered unenriched if at least 3 of the 5 probes contributing to it have an NLR < 0, and is classified as an ‘n’ block. If more than 2 probes have an NLR > 0, it is classified as a ‘p’ block. The enrichment information for all consecutive blocks (n and p) is stored in a separate array. Using the three arrays of coordinates, normalized log ratios and enrichment information, the final part of the script returns a list of coordinates corresponding to regions unenriched for histone H3, i.e. histone-free regions (HFRs). To do this, each negative region starts at a negative (‘n’) block and is extended until two consecutive ‘p’ blocks are encountered. The script also terminates negative regions where the genomic distance between two consecutive probes is greater than 200 bp. The minimum size cut-off for the negative regions is set to 500 bp. The final list of HFRs is written to a user-specified file in BED format. The total number of HFRs identified in each of the four Hox clusters is provided in Table 1.
Table 1

Total number of histone-free regions.

Cluster   No. of HFRs   Unenriched in H3K4me3 array   Unenriched in H3K27me3 array
HOXA      74            74                            68
HOXB      22            22                            18
HOXC      67            65                            64
HOXD      52            52                            47

The total number of HFRs identified in each Hox cluster. The number of HFRs that were also found unenriched in H3K4me3 and H3K27me3 arrays is indicated.

Example usage: perl hox_final.pl -in C2C12_G0_H3.txt -out hoxa_HFRs.bed -cluster hoxa
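The block-calling logic described for the HFR script can be sketched in Python as follows. This is not the authors' Perl script (Supplementary file 1) but an illustration under stated assumptions: blocks are taken as non-overlapping runs of 5 probes, any block that is not ‘n’ is treated as ‘p’, probe-to-probe distance is measured from the end of one probe to the start of the next, the probe list is split at gaps larger than 200 bp before blocks are formed, and a region is trimmed back to its last ‘n’ block. All function names are hypothetical.

```python
def classify_blocks(nlr, block_size=5, min_negative=3):
    """Label consecutive blocks 'n' (>= min_negative probes with NLR < 0) or 'p'."""
    labels = []
    for i in range(0, len(nlr), block_size):
        block = nlr[i:i + block_size]
        negatives = sum(1 for v in block if v < 0)
        labels.append('n' if negatives >= min_negative else 'p')
    return labels


def split_on_gaps(probes, max_gap=200):
    """Split sorted (start, end, nlr) probes wherever the gap exceeds max_gap bp."""
    segments, current = [], []
    for probe in probes:
        if current and probe[0] - current[-1][1] > max_gap:
            segments.append(current)
            current = []
        current.append(probe)
    if current:
        segments.append(current)
    return segments


def find_hfrs(probes, block_size=5, min_negative=3, max_gap=200, min_size=500):
    """Return (start, end) regions that stay unenriched for histone H3."""
    hfrs = []
    for segment in split_on_gaps(sorted(probes), max_gap):
        labels = classify_blocks([p[2] for p in segment], block_size, min_negative)
        b = 0
        while b < len(labels):
            if labels[b] != 'n':
                b += 1
                continue
            first = last_n = b  # a region starts at an 'n' block
            b += 1
            while b < len(labels):
                if labels[b] == 'n':
                    last_n = b
                elif labels[b - 1] == 'p':  # second consecutive 'p' ends the region
                    break
                b += 1
            start = segment[first * block_size][0]
            end = segment[min((last_n + 1) * block_size, len(segment)) - 1][1]
            if end - start >= min_size:  # minimum size cut-off
                hfrs.append((start, end))
    return hfrs


def to_bed(chrom, regions):
    """Format regions as tab-separated BED lines (chrom, start, end)."""
    return [f"{chrom}\t{start}\t{end}" for start, end in regions]


# Toy data: 40 probes at 50 bp spacing; probes 10 to 24 are depleted (NLR < 0).
example_probes = [(i * 50, i * 50 + 60, 1.0 if i < 10 or i >= 25 else -1.0)
                  for i in range(40)]
example_hfrs = find_hfrs(example_probes)  # one region: (500, 1260)
```

In the toy data the three depleted blocks form a single 760 bp region, which passes the 500 bp cut-off. A run of only five depleted probes (260 bp) would be discarded.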

Quality control analysis

To confirm that the HFRs returned by the custom script were consistent, we analyzed probe binding data at these regions from two additional custom ChIP-chip arrays hybridized using modified histone H3 antibodies specific for H3K4me3 and H3K27me3. In most cases, more than 50% of the probes contributing to each of these histone-free regions also had NLR < 0 in both the H3K4me3 and H3K27me3 arrays. The number of HFRs from each cluster found unenriched in each of these arrays is listed in Table 1. To further rule out the possibility of false detection of positively enriched regions as HFRs, we analyzed a test subset of probes from these tiling array datasets using our custom script. The test dataset consisted of probes corresponding to genomic regions that were found positively enriched for either H3K4me3 or H3K27me3 in ChIP-qPCR experiments, and hence should not be negative for histone H3. As expected, no HFR was returned when this test dataset was submitted to the HFR script.
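The cross-array consistency check can be expressed as a small sketch: for each HFR, compute the fraction of probes in a second array that fall inside the region and have NLR < 0, and call the region unenriched when that fraction exceeds 50%. The function names and the rule that a probe must lie entirely within the HFR are assumptions made for illustration.

```python
def fraction_negative(hfr, probes):
    """Fraction of probes lying within the HFR that have NLR < 0, or None if none do."""
    start, end = hfr
    inside = [nlr for (p_start, p_end, nlr) in probes
              if p_start >= start and p_end <= end]
    if not inside:
        return None
    return sum(1 for v in inside if v < 0) / len(inside)


def is_unenriched(hfr, probes, threshold=0.5):
    """True when more than `threshold` of the probes in the HFR have NLR < 0."""
    frac = fraction_negative(hfr, probes)
    return frac is not None and frac > threshold


# Toy probes from a second (e.g. H3K4me3) array: three of the four inside are negative.
second_array = [(100, 160, -0.5), (150, 210, -0.2), (200, 260, 0.3),
                (250, 310, -0.1), (500, 560, 1.0)]
consistent = is_unenriched((100, 400), second_array)
```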

Real time qPCR validations

Real-time PCR primers were designed against sequences specific for ten intergenic histone-free regions from the four Hox clusters, as well as ten control H3-positive regions from within the Hox gene bodies (Supplementary file 2: Table S1). Chromatin immunoprecipitation was performed with the histone H3 antibody and a non-specific IgG control antibody (Diagenode #kch-803-015). Real-time quantitative PCR assays were performed to calculate the relative abundance of histone H3 at the target HFRs and at the control regions using Power SYBR Green qPCR Master mix (Applied Biosystems) on an ABI7900HT Fast Real-Time PCR System (2 min at 50 °C; 10 min at 95 °C; 40 cycles of 15 s at 94 °C, 30 s at 60 °C and 30 s at 68 °C; followed by dissociation curve analysis). Enrichment in the ChIP DNA was determined as percentage Input, where Input DNA represents an aliquot of the same crosslinked and sonicated chromatin used for ChIP and processed in parallel. The enrichment for H3 was then normalized to that observed for IgG at each locus (Fig. 2) and was found to be much lower at HFRs than at control regions. The difference between the two groups was highly statistically significant (P < 0.0001, paired t-test).
Fig. 2

HFRs show poor enrichment for histone H3. Presence of histone H3 at ten intergenic HFRs was assessed by ChIP with histone H3 antibody followed by real time quantitative PCR. Enrichment for histone H3 was found to be significantly lower at HFRs than at control regions using primers designed within Hox gene bodies.
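The percentage-Input calculation and the normalization to IgG can be sketched as below. The formula is the commonly used one and assumes 100% PCR efficiency (a doubling per cycle). The Input fraction is a hypothetical parameter, since the fraction of chromatin set aside as Input is not stated here, and the function names are illustrative.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Percentage Input for one ChIP sample.

    input_fraction is the share of chromatin kept as Input (e.g. 0.01 for 1%);
    the Input Ct is first adjusted to what 100% of the chromatin would give.
    """
    adjusted_input_ct = ct_input - math.log2(1 / input_fraction)
    return 100 * 2 ** (adjusted_input_ct - ct_ip)


def relative_to_igg(ct_h3, ct_igg, ct_input, input_fraction):
    """H3 enrichment at a locus expressed relative to the IgG control."""
    h3 = percent_input(ct_h3, ct_input, input_fraction)
    igg = percent_input(ct_igg, ct_input, input_fraction)
    return h3 / igg


# With a 1% Input, an IP that amplifies at the same Ct as the Input is 1% of Input.
same_ct = percent_input(ct_ip=25.0, ct_input=25.0, input_fraction=0.01)
```

Because both values share the same Input, the H3/IgG ratio reduces to 2 raised to the Ct difference between the IgG and H3 samples.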

Sequence analysis

The sequences corresponding to the genomic coordinates of the predicted HFRs in each Hox cluster (including 10 kb upstream and downstream genomic regions of each cluster) were extracted from the mouse reference assembly (MGSCv37-C57BL/6J; NCBI build 37.2) and analyzed for common motifs using the MEME online tool [4]. The number of expected motifs was set to 5 and the minimum and maximum motif widths were set to 4 and 8 respectively, while allowing any number of repetitions of a motif per sequence. The GAGA binding factor (GAF) motif was identified as the top hit with high significance (p < 0.0005) in HFRs at all the clusters except HoxB. The binding of Th-POK (vertebrate homolog of Drosophila GAF, [5]) at these regions was subsequently tested by ChIP-qPCR as described in [1]. To check whether the predicted HFRs mapped to sites of DNaseI hypersensitivity, and could therefore be considered reliable markers for sites of histone disruption, the coordinates of DNaseI HS sites in skeletal muscle (tissue of origin), mesoderm and embryonic stem cells from the mouse ENCODE project (DNaseI Hypersensitivity by Digital DNaseI from ENCODE/University of Washington) were obtained from the UCSC genome browser. These were compared with the start and stop coordinates of each of the HFRs associated with the Hox clusters, including 10 kb upstream and downstream of each cluster (93 HFRs in total). DNaseI HS peaks called in any of the three cell types that either overlapped with an HFR or lay within close proximity (500 bp) of one were considered. The HFRs and DNaseI HS sites from the different cell types were mapped in the context of the Hox genes using an R script. Table 2 lists all the HFRs that were found to be associated with a DNaseI HS peak in at least one cell type, supporting the hypothesis that these HFRs indeed represent histone-free regions likely to harbor sites of regulatory activity.
Table 2

HFRs associated with DNase hypersensitive sites.

S. no   HFR           SM   MD   ESC
HoxA
1       A_DOWN-1.1
2       A_1-2.1
3       A_1-2.2
4       A_1-2.3
5       A_2-3.1
6       A_3-4.1
7       A_3-4.2
8       A_4-5.1
9       A_4-5.2
10      A_5.1
11      A_6-7.1
12      A_6-7.2
13      A_7-9.1
14      A_7-9.2
15      A_7-9.3
16      A_9-10.1
17      A_10-11.1
18      A_10-11.2
19      A_11.1
20      A_11-13.1
21      A_11-13.2
22      A_UP.1
23      A_UP.2
24      A_UP.3

HoxB
1       B_13-9.1
2       B_13-9.2
3       B_9-8.1
4       B_7-6.1
5       B_5-4.1
6       B_4-3.1
7       B_3.1
8       B_3.2
9       B_2-1.1
10      B_2-1.4

HoxC
1       C_UP.26
2       C_UP.27
3       C_UP.29
4       C_13.2
5       C_12-11.1
6       C_12-11.2
7       C_12-11.3
8       C_11-10.1
9       C_11-10.2
10      C_11-10.3
11      C_11-10.4
12      C_11-10.5
13      C_10.1
14      C_9.1
15      C_8-6.2
16      C_8-6.3
17      C_8-6.4
18      C_6-5.1
19      C_5-4.1
20      C_5-4.2
21      C_5-4.3
22      C_5-4.4
23      C_4-DOWN.1
24      C_DOWN.1

HoxD
1       D_UP.13
2       D_13.1
3       D_11-10.1
4       D_11-10.2
5       D_10-9.1
6       D_9-8.1
7       D_8-4.1
8       D_8-4.2
9       D_8-4.3
10      D_8-4.4
11      D_4-3.1
12      D_4-3.2
13      D_3-1.1
14      D_3-1.2
15      D_3-1.5
16      D_1-DOWN.1

The HFRs overlapping with DNaseI hypersensitive peaks in different tissues as obtained from ENCODE data are tabulated. HFRs that are consistently associated with DNaseI HS sites in all three tissues are highlighted in bold. SM = skeletal muscle, MD = mesoderm, ESC = embryonic stem cells.
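The association rule used for the DNaseI HS comparison (a peak either overlaps an HFR or lies within 500 bp of it) can be sketched as follows. The coordinates, HFR names and function names in the example are hypothetical, and the cut-off is treated as inclusive, which is an assumption.

```python
def near_or_overlapping(hfr, peak, max_distance=500):
    """True if the peak overlaps the HFR or lies within max_distance bp of it."""
    hfr_start, hfr_end = hfr
    peak_start, peak_end = peak
    # Negative or zero gap means the intervals overlap or touch.
    gap = max(hfr_start, peak_start) - min(hfr_end, peak_end)
    return gap <= max_distance


def hfrs_with_hs_support(hfrs, peaks_by_cell_type, max_distance=500):
    """Map each supported HFR name to the cell types with a nearby DNaseI HS peak."""
    support = {}
    for name, interval in hfrs.items():
        cell_types = [cell_type for cell_type, peaks in peaks_by_cell_type.items()
                      if any(near_or_overlapping(interval, p, max_distance)
                             for p in peaks)]
        if cell_types:
            support[name] = cell_types
    return support


# Hypothetical coordinates for two HFRs and peaks in the three cell types.
toy_hfrs = {"hfr_a": (1000, 1600), "hfr_b": (9000, 9700)}
toy_peaks = {
    "SM": [(2000, 2150)],                 # 400 bp from hfr_a
    "MD": [(1500, 1700), (2200, 2300)],   # first peak overlaps hfr_a
    "ESC": [(5000, 5200)],                # far from both HFRs
}
toy_support = hfrs_with_hs_support(toy_hfrs, toy_peaks)
```

An HFR supported in every cell type would correspond to the entries described as highlighted in the table caption.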

The coordinates of the non-coding transcripts within the Hox clusters were obtained from the TROMER transcriptome database. Where transcripts in the same orientation overlapped, only the longer one was mapped for clarity. The coordinates of the 93 HFRs located within the Hox clusters (including 10 kb upstream and downstream of the clusters) were similarly compared with those of all the non-coding transcripts, CpG islands, gene bodies, and 5′ and 3′ ends of the genic regions (coordinates obtained from the UCSC mm9 database). The majority of the HFRs showed no overlap with these known features of the Hox clusters.

Conclusions

We describe here the detailed methods and Perl script used to analyze the dataset obtained from a custom-designed histone H3 ChIP-chip tiling array. Owing to the array design, the probe data offer very high resolution, enabling the identification of novel histone-free regions at the Hox clusters. This dataset has helped delineate novel cis elements that are likely involved in organizing higher order chromatin and governing the tightly regulated expression domains of vertebrate homeotic genes. The following are the supplementary data related to this article.

Supplementary material.

Perl script.

Supplementary file 2

Table S1: List of primers for ChIP-qPCR assays for HFRs.
Organism/cell line/tissue: Mus musculus/C2C12 myoblasts
Sex: Female
Sequencer or array type: Agilent ChIP-chip custom tiling array (AMADID-0245671)
Data format: Raw data: gunzipped txt and analyzed data: bed format
Experimental features: Suspension culture to induce synchronized reversible G0 arrest; ChIP for histone H3

1.  Vertebrate GAGA factor associated insulator elements demarcate homeotic genes in the HOX clusters.

Authors:  Surabhi Srivastava; Deepika Puri; Hita Sony Garapati; Jyotsna Dhawan; Rakesh K Mishra
Journal:  Epigenetics Chromatin       Date:  2013-04-22

2.  Gene Expression Omnibus: NCBI gene expression and hybridization array data repository.

Authors:  Ron Edgar; Michael Domrachev; Alex E Lash
Journal:  Nucleic Acids Res       Date:  2002-01-01

3.  The human genome browser at UCSC.

Authors:  W James Kent; Charles W Sugnet; Terrence S Furey; Krishna M Roskin; Tom H Pringle; Alan M Zahler; David Haussler
Journal:  Genome Res       Date:  2002-06

4.  MEME SUITE: tools for motif discovery and searching.

Authors:  Timothy L Bailey; Mikael Boden; Fabian A Buske; Martin Frith; Charles E Grant; Luca Clementi; Jingyuan Ren; Wilfred W Li; William S Noble
Journal:  Nucleic Acids Res       Date:  2009-05-20

5.  Vertebrate homologue of Drosophila GAGA factor.

Authors:  Navneet Kaur Matharu; Tanweer Hussain; Rajan Sankaranarayanan; Rakesh K Mishra
Journal:  J Mol Biol       Date:  2010-05-21
