
Open chromatin structures regulate the efficiencies of pre-RC formation and replication initiation in Epstein-Barr virus.

Peer Papior, José M Arteaga-Salas, Thomas Günther, Adam Grundhoff, Aloys Schepers.

Abstract

Whether metazoan replication initiates at random or at specific but flexible sites is an unsolved question. The lack of sequence specificity in origin recognition complex (ORC) DNA binding complicates genome-scale chromatin immunoprecipitation (ChIP)-based studies. Epstein-Barr virus (EBV) persists as chromatinized minichromosomes that are replicated by the host replication machinery. We used EBV to investigate the link between zones of pre-replication complex (pre-RC) assembly, replication initiation, and micrococcal nuclease (MNase) sensitivity at different cell cycle stages in a genome-wide fashion. The dyad symmetry element (DS) of EBV's latent origin, a well-established and very efficient pre-RC assembly region, served as an internal control. We identified 64 pre-RC zones that correlate spatially with 57 short nascent strand (SNS) zones. MNase experiments revealed that pre-RC and SNS zones were linked to regions of increased MNase sensitivity, which is a marker of origin strength. Interestingly, although spatially correlated, pre-RC and SNS zones were characterized by different features. We propose that pre-RCs are formed at flexible but distinct sites, of which only a few are activated per single genome and cell cycle.

Year: 2012 PMID: 22891264 PMCID: PMC3514025 DOI: 10.1083/jcb.201109105

Source DB: PubMed Journal: J Cell Biol ISSN: 0021-9525 Impact factor: 10.539

Introduction

Eukaryotic cells initiate their genome duplication from hundreds to several tens of thousands of sites, called replication origins. The genomic organization of replication origins is very different across the eukaryotic kingdom (Aladjem et al., 2006). In yeast, origins are mainly defined by DNA sequence. Saccharomyces cerevisiae replication origins are located in ∼150-bp-long autonomously replicating sequences characterized by an 11-bp AT-rich consensus motif. Schizosaccharomyces pombe origins are 500–1,000-bp-long AT-rich sequences, which lack a consensus sequence but support autonomous replication (Aladjem, 2007). Both yeast species feature an excess of origin sites, and the local chromatin structure limits the number of active origins to ∼400 (Breier et al., 2004). In multicellular eukaryotes, origins are defined independently of sequence, and various approaches to identify essential features of origins have led to ambiguous results (Schepers and Papior, 2010). In humans, replication starts from an estimated 30,000 origins. The mode of origin recognition and activation is characterized by its flexibility and plasticity, allowing an adequate response to environmental constraints and diverse demands during differentiation (Aladjem, 2007).

Despite differences in origin definition, the principles of origin recognition are highly conserved from yeast to human. The first step is always the binding of the origin recognition complex (ORC) that acts as an interactive platform for the subsequent assembly of pre-replication complexes (pre-RCs) during the G1 phase of the cell cycle. Pre-RC formation is characterized by the reiterative loading of the minichromosome maintenance complex (Mcm2-7) that requires the help of two auxiliary proteins, Cdc6 and Cdt1 (Sivaprasad et al., 2006). The DNA binding features of ORC reflect the plasticity of origin recognition. Although S. cerevisiae ORC (ScORC) recognizes origin-specific sequences, S. pombe ORC (SpORC) targets AT-rich DNA regions via an AT-hook extension of the SpOrc4 subunit (Aladjem et al., 2006; Masai et al., 2010). Drosophila melanogaster ORC (DmORC) has some bias for polyA tracts, whereas human ORC binds to DNA without any marked preference for distinct sequences (Vashee et al., 2003; Schaarschmidt et al., 2004; Balasov et al., 2007). ORC localizes to MNase-sensitive regions (MSRs), which are flanked by positioned nucleosomes (Berbenetz et al., 2010; Eaton et al., 2010; MacAlpine et al., 2010). In higher eukaryotic systems, additional features such as DNA topology, histone modifications, and chromatin structures might contribute to pre-RC binding and origin activation (Thomae et al., 2008; Méchali, 2010). For example, it has been postulated that pre-RCs assemble in zones of increased MNase sensitivity at the dihydrofolate reductase (DHFR) initiation region (Lubelsky et al., 2011). Genome-scale studies in human and mouse cells using short nascent strand (SNS) DNA as readout suggest that strong origins are often located in promoter regions, particularly transcription start sites (TSSs), and map to CpG islands (Cadoret et al., 2008; Sequeira-Mendes et al., 2009; Cayrou et al., 2011). However, the high plasticity of ORC-DNA binding in human and other metazoan cells still hampers our understanding of origin formation and selection (Gilbert, 2010; Schepers and Papior, 2010).

In this study, we used Epstein-Barr virus (EBV) as a model to study the relationship between sites of pre-RC formation, origin activation, and nucleosome dynamics at origins in the background of human cells. EBV infects human B cells and establishes a persistent latent infection. The viral genome is maintained autonomously in proliferating cells and replicates once per cell cycle during S phase in synchrony with the host’s chromosomal DNA (Adams, 1987; Yates and Guan, 1991).
The latent origin, oriP, is the only cis-acting element required to sustain the autonomous state of the EBV genome (Yates et al., 1984). OriP is bound by the viral transactivator EBNA1. OriP was discovered due to its ability to support replication of plasmids, and it was believed that EBV’s latent DNA replication initiates only at oriP. However, 2D gel analyses suggested that DNA synthesis frequently initiates outside oriP (Little and Schildkraut, 1995). Single molecule analyses demonstrated that initiation of DNA replication occurs at many sites across the viral genome, although only one or very few initiation events per genome occur in any given S phase (Norio and Schildkraut, 2001, 2004). OriP consists of two EBNA1 binding arrays: the family of repeats (FR) and the dyad symmetry element (DS). FR tethers the EBV genome to human chromosomes, thus ensuring stable retention (Marechal et al., 1999; Wu et al., 2000, 2002; Sears et al., 2003, 2004). DS is the origin element. The replication function of DS is based on EBNA1’s ability to interact directly with ORC (Schepers et al., 2001). This interaction allows a highly efficient assembly of pre-RCs at or near DS (Chaudhuri et al., 2001; Dhar et al., 2001; Schepers et al., 2001; Ritzi et al., 2003).

As outlined above, even with current high-throughput approaches, the complexity of mammalian genomes and the intrinsic flexibility in origin selection have precluded such genome-wide studies. Using EBV as a model system, we circumvented these problems by investigating the autonomous viral genome that, in many aspects, mimics a cellular chromosome. The advantage of this system is that the EBV genome is small enough to capture the entire molecule at high resolution on microarrays, yet large enough to allow formation of complex chromatin patterns and multiple replication origins. EBV offers several advantages for studying the replication initiation process in human cells. (a) Like all γ-herpesviruses, EBV persists as a fully chromatinized genome that is exclusively replicated by the host replication machinery. This makes EBV an ideal reductionist model system to study replication across the entire genome. (b) DS can be used as an internal control site. DS has the unique advantage of being a well-characterized, highly specific, and very efficient pre-RC site. (c) The high copy number of EBV genomes in combination with the small genome size facilitates genome-wide experiments.

We performed a comparative analysis of different pre-RC components, SNS mapping, and the pattern of mononucleosomes isolated at different stages of the cell cycle. Microarray analyses revealed highly similar DNA binding profiles for Orc2 and Mcm3, allowing the identification of 64 pre-RC zones in the EBV genome. We asked to what extent SNS and pre-RC zones coincide and whether these processes are characterized by MNase sensitivity patterns. Finally, we investigated a potential correlation of pre-RC assembly and replication initiation with nucleotide composition or proximity to TSSs. Our data demonstrate that pre-RC and SNS zones correlate spatially and are generally linked with regions of increased MNase sensitivity, though with distinct differences.

Results

To study the parameters determining origin selection and activation in human cells, a comprehensive survey was performed using the EBV genome of the Burkitt’s lymphoma cell line Raji (Fig. 1 A). Using a custom-made 6-bp resolution tiling array of the EBV genome, the relationship between zones of pre-RC formation, replication initiation, and nucleosome dynamics at origins was analyzed at high resolution. We chromatin-immunoprecipitated (ChIP) Orc2 and Mcm3 as members of the pre-RC from G1 cells and compared the array data with zones of actual initiation by measuring SNS DNA; we also compared them to mononucleosomal DNA isolated from cell cycle–fractionized chromatin, determining MNase-sensitive and -resistant regions (Fig. 1 B).

Figure 1.

Scheme of the EBV genome and experimental design for the analyses of pre-RC and SNS zones as well as mapping of MR profiles. (A) Scheme of the circular EBV genome. In addition to the latent origins oriP (blue box) and the 14-kbp “Raji origin” (blue line), the lytic origin (oriLyt) is shown. The genes for the latent EBV nuclear antigens 1, 2, 3A–C, and EBNA-LP (turquoise) and for LMP1 and -2 (purple) are depicted, including their transcripts and promoters. The EBER1 and -2 and the miRNA regions (BART and BARF) are indicated (green lines). The Raji genome harbors two deletions (red Δ, nt 86,000–89,000 and 163,978–166,635). These regions do not produce array signals in comparison to the reference strain type I used for the design of the EBV microarray. (B) Chart of the experimental setup to map pre-RC zones (left), MNase profiles (central), and SNS DNA (right). (C) Cell cycle phases of logarithmically growing Raji cells were separated by centrifugal elutriation. The DNA content of the different fractions was determined by FACS analysis (top, I–VI). The FACS profiles of one out of three experiments are shown. The quality of coprecipitated DNA was determined by quantitative PCR. The histograms show the mean values of three independent Orc2 (bottom left) and Mcm3 (bottom right) immunoprecipitations. The enrichments of Orc2 (red bars) and Mcm3 (blue bars) at the DS region are shown. The black bars indicate the enrichments of Orc2 and Mcm3 at a reference site. Error bars indicate mean ± SEM.

Genome-wide localization of Orc2 and Mcm3

To identify pre-RC zones, we cell cycle–fractionized cells using centrifugal elutriation (Ritzi et al., 2003) and performed ChIP with Orc2- and Mcm3-specific antibodies (Fig. 1 C). Orc2 binding to DS is cell cycle independent, whereas Mcm3 binding is clearly cell cycle regulated (Ritzi et al., 2003). The reference near oriLyt shows reduced amounts of Orc2 and Mcm3. Three biological replicates of Orc2- and Mcm3-specific precipitations and IgG controls of G1 chromatin (fraction II in Fig. 1 C) were hybridized against input DNA to the tiling array and analyzed (see Fig. S1 A for experiments designed to control potential biases introduced by linear amplification of ChIP material). The mean values of three independent Orc2/Mcm3 (Cy5) and input (Cy3) log2 ratios were normalized against the IgG/input log2 ratios. A sliding window of 150 bp was used to smooth the signal, and we then identified ChIP-enriched sites using a hidden Markov model (HMM; see Materials and methods). As expected, both Orc2 and Mcm3 show the most prominent enrichment at DS (Fig. S1 B). However, in addition to DS, many reproducible albeit less pronounced signals were observed across the EBV genome. To determine the best possible resolution and to differentiate between background and true signals, we used several criteria. First, we considered the influence of the fragment length of the input DNA in the resolution of microarrays. Fig. S1 D simulates the resolution of an isolated binding site (top left) or of two neighboring binding sites (top right) with a uniform fragment population of 700 bp (see Materials and methods for the deduction of the formula for signal calculation). The simulated profile of a single signal has the shape of a triangle centered at the binding site with a width of twice the fragment length. Thus the fragment length has no influence on the resolution of a single signal per se, but may affect the separation of neighboring signals. 
When two binding sites are separated by less than the fragment length, their peaks will not be resolved and will appear as a trapezoid. The fragmentation process of a ChIP experiment, however, generates a population of fragments with varying lengths. Fig. S1 C shows the length distribution for one of our ChIP experiments. Fragments of ∼700 bp are the most abundant. Fig. S1 D also shows signal simulations assuming the fragment length distribution shown in Fig. S1 C. Although the presence of large fragments broadens the total width of the peaks, the contribution of smaller fragments increases the overall resolution (Fig. S1 D, bottom left). As a consequence, it is possible to resolve individual binding sites within a distance of less than the mean (or median) fragment length (Fig. S1 D, bottom right). How well these peaks are resolved will ultimately depend on baseline fluorescence and noise levels. In summary, these simulations suggest that the resolution could in fact be higher than the mean fragment length. Second, to minimize false positives within the obtained Orc2 and Mcm3 zones, we included for further analyses all peaks with a width of ≥400 bp (Fig. 2 A, see Materials and methods). Because peaks ≥400 bp are not representative of a specific site, we use the term “zone” for a region of adjacent probes with elevated signals. Note that this definition is different from a “replication initiation zone” describing a large region with delocalized initiation (Dijkwel et al., 1991). The Orc2 and Mcm3 profiles are highly similar, and Mcm3 log2 ratios at Orc2 enriched zones have a significantly higher mean than at Orc2 nonenriched zones (Fig. 2 B; P < 2.2 × 10⁻¹⁶, one-sided Student’s t test). A linear regression of Orc2 and Mcm3 log2 ratios at pre-RC zones confirmed a significant fit (Fig. 2 C; P < 2.2 × 10⁻¹⁶, regression F-test) and a high correlation (Pearson correlation = 0.92) between the enrichments.
These results suggest that it is appropriate to combine Orc2 and Mcm3 log2 ratios to define pre-RC enrichments. However, because Mcm3 but not Orc2 is essential for initiation once pre-RCs are formed, to define pre-RC zones we included zones with probes enriched not only with Mcm3 and Orc2 but also with Mcm3 only. Of the 64 identified pre-RC zones, 55 are enriched in at least 5% of their width with both Mcm3 and Orc2, and 9 are Mcm3-only zones (Fig. 3 A). Detailed information about the location and composition of the zones is given in Table S1.
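The fragment-length argument above can be illustrated with a minimal coverage-counting sketch. This is not the formula derived in Materials and methods (which is not reproduced in this section); it simply assumes that every fragment containing a binding site is precipitated and that all fragment start positions are equally likely. The function name `simulate_chip_signal` and its arguments are hypothetical.

```python
def simulate_chip_signal(binding_sites, fragment_lengths, positions):
    """Expected ChIP signal at each probe position (illustrative model only).

    binding_sites    -- coordinates (bp) of bound sites
    fragment_lengths -- dict mapping fragment length (bp) to its relative weight
    positions        -- probe coordinates (bp) at which the signal is evaluated

    A fragment of length L starting at s covers [s, s + L). The number of
    integer start positions for which a fragment covers both the binding
    site and a probe at x is max(0, L - |x - site|).
    """
    signal = []
    for x in positions:
        total = 0.0
        for site in binding_sites:
            for length, weight in fragment_lengths.items():
                shared_starts = length - abs(x - site)
                if shared_starts > 0:
                    total += weight * shared_starts
        signal.append(total)
    return signal


probes = list(range(1000, 3001, 100))

# One site, uniform 700-bp fragments: a triangle with a base of 2 x 700 bp.
single = simulate_chip_signal([2000], {700: 1.0}, probes)

# Two sites 400 bp apart (less than 700 bp): the top is flat, i.e., a trapezoid.
double = simulate_chip_signal([2000, 2400], {700: 1.0}, probes)
```

Passing a dictionary with several lengths (for example, weights read off a measured length distribution) shows how shorter fragments sharpen the peak while longer ones widen its base.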

Figure 2.

Orc2 and Mcm3 localize at highly similar locations and correlate with MSRs. (A) Orc2 and Mcm3 ChIP experiments were performed with fraction II of the elutriated Raji cells. Orc2 (red) and Mcm3 (blue) enriched zones (width ≥ 400 bp) are plotted as a function of the EBV genome. Solid lines indicate the log2 enrichments at the identified zones and the rectangles below indicate the width of each zone. EBV genes encoded in the upper strand (yellow boxes) and the lower strand (blue) are shown; repetitive elements are shown in gray. OriP and the reference site as well as the two Raji deletions (del.) are indicated. (B) A box plot analysis of Mcm3 log2 enrichment in Orc2 enriched and nonenriched zones indicates a significant difference between the mean signals. (C) A linear regression of Mcm3- and Orc2-enriched probes in pre-RC zones confirms a significant relationship between the enrichments.

Figure 3.

Relationship between pre-RC zones and MSRs. (A) 64 overlapping Orc2 and Mcm3 zones and Mcm3-only zones ≥ 400 bp were defined as pre-RC zones (black boxes). (B) MSRs (green), defined as regions ≥ 150 bp with negative log2 ratios of MR versus genomic input (i.e., ratios <1), overlap with pre-RC zones (black).

Relationship between pre-RC zones and MSRs

Increasing evidence suggests that defined chromatin structures contribute to the definition of origins and that increased MNase sensitivity is one conserved feature of eukaryotic origins (Berbenetz et al., 2010; Eaton et al., 2010; Gilbert, 2010; Lubelsky et al., 2011; Givens et al., 2012; Xu et al., 2012). To test whether the positions of origins correlate with increased MNase accessibility, we generated MNase profiles of the EBV genome. Because we speculated that the MNase profile at origins might change dynamically during the cell cycle, we first isolated mononucleosomal DNA from MNase-digested G1 chromatin (Fig. S2 A). A common misconception is to interpret MNase sensitivity as equivalent to nucleosome depletion. Yet, MNase sensitivity can also be produced by other factors (e.g., nonhistone proteins). Also, regions of extended high MNase protection are not digested to mononucleosomes and appear similar to MSRs. In this study, we define an MSR as a region of at least 150 bp in which all probes have a negative input/MNase ratio, which is indicative of increased MNase sensitivity (Fig. 3 B; see Materials and methods). A box plot shows that the MNase resistance (MR) was significantly lower in pre-RC enriched zones than in nonenriched zones (Fig. S2 B; P < 2.2 × 10⁻¹⁶, one-sided Student’s t test). 81.3% of the pre-RCs are located in MSRs (mean probability of MSR being located in a random pre-RC = 0.0173). To evaluate whether the locations of probes in pre-RC zones and MSRs are independent, we used a two-way contingency table. A χ² test rejected the null hypothesis (P < 2.2 × 10⁻¹⁶; Fig. S2 C). These results confirm the relationship between pre-RC zones and MSRs. Furthermore, the anti-correlation of the ChIP and MNase profiles clearly suggests that the ChIP signals are not systematic random noise.
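The MSR definition used here (a stretch of at least 150 bp in which every probe has a negative ratio) amounts to finding runs of consecutive negative values. The sketch below is one way to express that rule; the function name `find_sensitive_regions` is hypothetical, and measuring the width as the distance between the first and last probe of a run is an assumption, because the exact convention is given only in Materials and methods.

```python
def find_sensitive_regions(positions, log2_ratios, min_width=150):
    """Return (start, end) pairs for runs of probes with negative log2 ratios.

    positions   -- probe coordinates in ascending order (bp)
    log2_ratios -- log2 ratio for each probe
    min_width   -- minimal distance (bp) between first and last probe of a run
                   (assumed width convention)
    """
    regions = []
    run_start = None
    run_end = None
    for pos, ratio in zip(positions, log2_ratios):
        if ratio < 0:
            if run_start is None:
                run_start = pos          # a new run begins at this probe
            run_end = pos                # extend the current run
        else:
            if run_start is not None and run_end - run_start >= min_width:
                regions.append((run_start, run_end))
            run_start = None             # a non-negative probe ends the run
    # Close a run that extends to the last probe.
    if run_start is not None and run_end - run_start >= min_width:
        regions.append((run_start, run_end))
    return regions


# Toy data (not from the study): probes every 50 bp.
toy_positions = list(range(0, 500, 50))
toy_ratios = [0.5, -0.2, -0.4, -0.1, -0.3, 0.2, -0.5, -0.1, 0.3, 0.1]
toy_regions = find_sensitive_regions(toy_positions, toy_ratios)
```

In the toy data, the run from 50 to 200 bp is kept, whereas the run from 300 to 350 bp is discarded because it is shorter than 150 bp.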

Pre-RC assembly and replication initiation zones correlate

The observation of 64 potential pre-RCs raised the question of how many of them can function as replication initiation sites. To identify active initiation sites, we isolated SNS DNA from asynchronous cells using alkaline gel electrophoresis (Kamath and Leffak, 2001). On average, we obtained <10 ng SNS DNA from 10⁸ cells, which is in the range of the expected amount (Cadoret et al., 2008). Nonproliferating cells did not yield any SNS DNA (not depicted). We prepared two independent samples, which were quality controlled by quantitative PCR at the hypoxanthine-guanine phosphoribosyltransferase (HPRT) origin and at reference regions (Fig. S3 A; Cohen et al., 2002). SNS preparations were amplified and hybridized against genomic input DNA. As for ChIP DNA, we used Southern blot analysis with EBV-specific probes to monitor the length of viral DNA fragments (Fig. S1 C). To identify SNS-enriched zones we used the criteria described in Materials and methods. Most importantly, to avoid an overlap with Okazaki fragments, we omitted all potential SNS zones with a width of <400 bp. Thus we identified 57 distinct potential SNS-enriched zones (Fig. 4 A). It was immediately obvious that replication initiates at many regions of the EBV genome and that DS is not the most prominent initiation zone. The region between the W repeats and nt 65,000 is relatively lacking in initiation zones, which is in line with studies from the Schildkraut laboratory (Fig. S3 B; Norio and Schildkraut, 2001, 2004).

Figure 4.

Relationship between pre-RC and SNS zones. (A) Enriched SNS zones (width ≥ 400 bp). SNS zones overlapping with at least 5% of their width with a pre-RC zone are indicated by the brown boxes. The nine nonoverlapping SNS zones are depicted as open boxes. The red line indicates the mean log2 enrichment of the weakest SNS zone. The EBV map is the same as in Fig. 2. (B) A box plot of SNS log2 enrichment at pre-RC enriched and nonenriched zones confirms a significant difference between the mean signals.

One paradigm of the replication initiation model is that initiation occurs at or near pre-RC sites. Our observation that SNS log2 ratios at pre-RC enriched zones have a significantly higher mean than at pre-RC nonenriched zones supports this hypothesis (Fig. 4 B; P < 2.2 × 10⁻¹⁶, one-sided Student’s t test). 46 out of 57 SNS zones (81%) overlap with at least 5% of their width with a pre-RC zone (Table 1). Several arguments might explain why 19% of the identified SNS zones do not overlap. First, our stringent criteria in defining pre-RC zones might exclude some true positive zones. For example, reducing the cut-off size for pre-RCs to 300 bp increases the number of potential pre-RC zones from 64 to 79, and the overlap between pre-RC and SNS zones increases from 81% to 89% (not depicted). Second, SNS zones not overlapping with a pre-RC zone are located in extended MSRs, and the majority have pre-RC signals within a short distance (Fig. S3 C). This suggests that SNS zones are spatially linked with pre-RC zones, although they are not located at identical sites. A list of all 57 SNS zones, as well as their mean and maximum peak intensities, is given in Table S5. Tables S2 and S6 contain detailed information about SNSs not overlapping with pre-RCs and vice versa.
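The overlap criterion (at least 5% of a zone's width shared with a zone of the other type) can be sketched as follows. The function names are hypothetical, zones are assumed to be (start, end) pairs with end > start, and the sketch assumes that the 5% must be reached with a single partner zone rather than summed over several, which the text does not specify.

```python
def overlap_fraction(zone, other):
    """Fraction of `zone` (start, end) that is covered by `other` (start, end)."""
    start, end = zone
    other_start, other_end = other
    shared = min(end, other_end) - max(start, other_start)
    return max(0, shared) / (end - start)


def count_overlapping(zones, partners, min_fraction=0.05):
    """Count zones sharing at least `min_fraction` of their width with one partner."""
    return sum(
        any(overlap_fraction(zone, partner) >= min_fraction for partner in partners)
        for zone in zones
    )


# Toy coordinates (not from the study).
sns_zones = [(100, 600), (1000, 1500), (2000, 2400)]
pre_rc_zones = [(550, 900), (1490, 1800)]
n_overlapping = count_overlapping(sns_zones, pre_rc_zones)

# Percentages as reported in Table 1: 46 of 57 SNS zones, 43 of 64 pre-RC zones.
sns_percent = round(100 * 46 / 57, 1)
pre_rc_percent = round(100 * 43 / 64, 1)
```

In the toy example only the first SNS zone qualifies (10% shared width); the second shares just 2% and the third does not overlap at all. Because the fraction is taken relative to each zone's own width, the count is not symmetric, which is consistent with Table 1 listing different numbers of overlapping SNS and pre-RC zones.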

Table 1.

The majority of SNS zones overlap with at least 5% of their width with a pre-RC zone, a relationship that is also found in topSNSs

Zone type	SNS	pre-RC
Total	57	64
Overlapping	46 (80.7%)	43 (67.2%)
Nonoverlapping	11 (19.3%)	21 (32.8%)
Top 30% enriched zones	17	19
Overlapping	14 (82.4%)	12 (63.2%)
Nonoverlapping	3 (17.6%)	7 (36.8%)

These data suggest that many pre-RCs might also function as initiation sites, although not all potential origins are necessarily used in every EBV genome and cell cycle. We next examined a potential link between the mean efficiencies of pre-RC assembly and origin activation. To verify this, we compared SNS and pre-RC log2 enrichments at SNS zones with a linear regression (Fig. S3 D). The regression provides a significant fit (P < 2.2 × 10⁻¹⁶, regression F-test), but the overall correlation of 0.27 is low. Because strong origins need to be efficient in both pre-RC assembly and initiation, we examined a potential correlation between both activities. Table 1 shows that 82.4% of the 30% strongest SNS zones (topSNS, n = 17) overlap by at least 5% of their width with a pre-RC zone, the majority overlapping with one of the 30% strongest pre-RC zones (top-pre-RC, n = 19). When analyzing top-pre-RCs and topSNSs in more detail, a relationship between these became obvious. Using a two-way contingency table, we tested the null hypothesis that the locations of probes in topSNSs and top-pre-RCs are independent. A χ² test rejected the null hypothesis (P = 5.6 × 10⁻¹⁶; Fig. S3 E). We conclude that a significant relationship between top-pre-RCs and topSNSs exists. However, this association does not extend to all pre-RC and SNS zones. At present it is unclear which parameters determine the relationship. A list of all top-pre-RCs and topSNSs, including their mean and maximum peak intensities, is given in Tables S3 and S7.

Our previous data demonstrated that DS is flanked by positioned nucleosomes (Zhou et al., 2005). To analyze the relationship between pre-RC assembly, MNase sensitivity, and initiation efficiency, we aligned these features using heat maps (Fig. 5). Fig. 5 A shows oriP, a multifunctional region, in which transcriptional activity, pre-RC assembly, replication initiation activity, and MNase sensitivity are spatially and functionally linked. Both oriP elements, FR and DS, are constantly bound by the EBV-transactivator EBNA1 and represent MSRs flanked by MNase-resistant regions (MRRs). Interestingly, both SNS and pre-RC signals peak not at DS but in the neighboring regions, confirming that EBNA1 targets ORC to a broad area (Schepers et al., 2001; Ritzi et al., 2003). The pre-RC zone at oriP is flanked on one side by FR and on the other side by the C promoter (nt 7,690–11,800). This zone contains three SNS zones, which suggests that multiple ORC molecules might bind to this region. The region between nt 5,100 and 7,250 represents one extended SNS and includes the noncoding EBER transcripts (Minarovits et al., 1992). In Raji cells, only EBER1 is transcribed at a high level (Pratt et al., 2009). The promoter regions of both EBER genes are MNase sensitive and are characterized by the presence of +1 nucleosomes. Two pre-RC enrichments localized at the EBER promoters did not qualify as enriched zones because of our stringent scoring conditions. This example demonstrates that the criteria chosen to eliminate false positive signals and to efficiently reduce background noise come at the expense of sensitivity and might also eliminate true positive signals. Fig. 5 (B and C) shows two additional selected regions. The region between nt 57,000 and 67,000 displays three weak pre-RCs, which indicates that not every potential pre-RC zone is used as an initiation site (Fig. 5 B). The region between nt 76,000 and 86,000 has multiple pre-RC zones overlapping with SNS zones, which are preferentially located in MSRs; this suggests that replication initiation and increased MNase sensitivity are linked (Fig. 5 C).
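The independence tests on two-way contingency tables mentioned above can be sketched for the 2 × 2 case as follows. This is a generic Pearson χ² test without continuity correction, which is an assumption because the text does not state which variant was used; the function name `chi_square_2x2` and the probe counts are hypothetical and not taken from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the 2 x 2 table [[a, b], [c, d]].

    For example: a = probes in both zone types, b = probes in the first type
    only, c = probes in the second type only, d = probes in neither.
    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical probe counts, chosen only to illustrate the calculation.
statistic, p_value = chi_square_2x2(30, 10, 10, 30)
```

A small P value rejects the null hypothesis that membership of a probe in one zone type is independent of its membership in the other.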

Figure 5.

Heat map at three different EBV regions. (A–C) SNS heat map with pre-RC (black line) and G1-MNase profiles (green line). The SNS log2 enrichment efficiency at an enlargement of the oriP region (A) and two exemplary regions (B and C) is shown. The SNS values are presented as heat maps, with red and yellow indicating high and low initiation activity, respectively. Dashed black lines indicate all pre-RC log2 enrichments, whereas solid black lines represent only those pre-RC signals passing our filter for pre-RC zones. True SNS zones are marked by brown rectangles above the graph. The positions of the oriP elements FR, DS, and Rep* (red boxes), the C promoter, and the RNA Pol III transcribed EBER1 and 2 (arrows) are indicated. Latent (blue arrows) and silent lytic genes (white arrows) are depicted.

The MNase sensitivity at pre-RC zones is dynamic over the cell cycle

Different studies demonstrate that origins are located in MSRs (Berbenetz et al., 2010; Eaton et al., 2010; Gilbert, 2010; Lubelsky et al., 2011). To explore a potential MNase sensitivity at origins, we aligned and plotted the mean mononucleosome log2 enrichments of G1 cells in a ±1,000-bp window surrounding the maximum peak of the 64 pre-RCs (Fig. 6 A, panel 1; see Materials and methods). The alignment of all pre-RCs indicates only a moderate MNase sensitivity during G1. The standard deviation of the mean profiles confirms this analysis (Fig. S4 A). As a control, we also aligned the ±1,000-bp neighborhood of 250 randomly selected positions across the EBV genome (Fig. 6 A, panel 2). Next, we examined whether the extent of MNase sensitivity is linked to the efficiency of pre-RC formation. The alignments of the top-pre-RCs and of the 30% least prominent pre-RCs (bot-pre-RC, n = 19; Table S4) indicate only small differences in MNase sensitivity at pre-RCs in G1 phase chromatin (Fig. 6 A, panels 3 and 4; and Fig. S4 A).
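The alignment of mean profiles around peak maxima can be sketched as a simple aggregate over fixed-size windows. The function name `mean_aligned_profile` is hypothetical; the sketch works in probe-index units rather than base pairs and ignores window positions that run past the end of the array, both of which are simplifying assumptions (the actual procedure is described in Materials and methods).

```python
def mean_aligned_profile(signal, centers, half_window):
    """Average `signal` in windows of +/- `half_window` probes around each center.

    signal      -- list of log2 enrichments, one value per probe
    centers     -- probe indices of the peak maxima to align on
    half_window -- number of probes on each side of a center
    Returns a list of length 2 * half_window + 1; offsets covered by no
    window are reported as NaN.
    """
    size = 2 * half_window + 1
    sums = [0.0] * size
    counts = [0] * size
    for center in centers:
        for offset in range(-half_window, half_window + 1):
            index = center + offset
            if 0 <= index < len(signal):      # skip positions outside the array
                sums[offset + half_window] += signal[index]
                counts[offset + half_window] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Toy signal (not from the study) aligned on two centers.
toy_signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_profile = mean_aligned_profile(toy_signal, centers=[2, 7], half_window=2)
```

Running the same function on randomly chosen centers gives a control profile of the kind shown in panel 2.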

Figure 6.

Mean MR profiles at pre-RCs are dynamic over the cell cycle. (A) Mean profile of pre-RC (black) and G1 phase MR log2 enrichments (green) in a ±1,000-bp window centered at: (1) the maximum peak of all pre-RCs (top left), (2) 250 randomly chosen locations (top right), (3) the maximum peak of the top-pre-RCs (bottom left), and (4) the maximum peak of the bot-pre-RCs (bottom right). (B) Mean profile of pre-RC (black), G1 phase MR (green), S phase MR (red), and G2/M phase MR (blue) log2 enrichments. The log2 enrichments in a ±1,000-bp window are centered at the maximum peaks of pre-RCs as in A.

Pre-RC formation is limited to the G1 phase of the cell cycle, and pre-RCs are disassembled after origin firing. Therefore, we determined whether the MNase sensitivity at pre-RCs changes over the cell cycle. Fig. 6 B shows mean pre-RC and MR profiles, now also including the S- and G2/M-MR (S phase, fraction IV; G2/M, fraction VI of Fig. 1 C). In contrast to G2/M and G1 cells, we observed a significant increase in MNase accessibility at pre-RC zones during S phase, whereas on average the MR at pre-RC flanking regions does not change over the cell cycle (Fig. 6 B, left). Top-pre-RCs display pronounced MNase sensitivity during S phase, whereas this link is not obvious in bot-pre-RCs (Fig. 6 B, center and right; see Fig. S4 B for standard deviations). It is possible that pre-RCs protect DNA against MNase digestion, an effect that is lost when pre-RCs and ORC are disassembled in human cells after origin activation. The increased MNase sensitivity is S phase specific, whereas the average profile of the G2/M fraction is similar to the G1 fraction. It is important to note that the increased MNase sensitivity does not necessarily mean that nucleosomes are evicted, but that structural changes might occur that expose DNA, thus increasing the accessibility.

Efficiency of replication initiation correlates with MNase sensitivity

Pre-RC formation and replication initiation are independent processes that occur in different cell cycle phases. Although most SNS and pre-RC zones overlap, they are linked to different cell cycle phases, which might result in different chromatin states (Tables 1, S1, and S5). Therefore, we next analyzed the mean MR profiles of SNS zones and their standard deviations (Fig. 7 A and Fig. S4 B). In G1 cells, SNSs are characterized by increased MNase sensitivity, and the topSNSs by a pronounced MNase sensitivity. A decreased sensitivity is observed within the 30% least prominent SNS zones (botSNSs, n = 17; Table S8). This finding is in line with a report by Lantermann et al. (2010), who suggested a correlation between origin strength and MSRs for S. pombe origins. In contrast to the situation at pre-RC zones, no cell cycle dependence was evident in the MNase sensitivity profiles at SNS zones. A pronounced MNase sensitivity was particularly evident in all phases of the cell cycle for the topSNSs (Fig. 7 B, panel 3). This relationship is missing in botSNSs (Fig. 7 B, panel 4). Randomly selected control sites show no regular pattern (Fig. 7 B, panel 2). We conclude from these analyses that pre-RC and SNS zones differ with respect to MNase sensitivity. Although pre-RCs are characterized by dynamic profiles, the efficiency of origin activation is clearly linked with the degree of sensitivity, reflecting an open chromatin state.

Figure 7.

Origin activity is linked to increased MNase sensitivity. (A) Mean profile of SNS (brown) and G1-MSR log2 enrichments (green) in a ±1,000-bp window centered at the maximum peak of all SNSs (left), the topSNSs (center), and the botSNSs (right). (B) Mean profile of SNS (brown), pre-RC (black), G1-MR (green), S-MR (red), and G2/M-MR (blue) log2 enrichments in a ±1,000-bp window centered at the maximum peak of the SNS zones as described in A.

Replication initiation at EBV promoter regions

Recent genome-wide studies in different systems show a link between TSSs and replication origins (Cadoret et al., 2008; Sequeira-Mendes et al., 2009; Eaton et al., 2010; Karnani et al., 2010; Cayrou et al., 2011). In comparison to the human genome, the EBV genome is very gene dense and comprises ∼100 genes within 170 kbp. Most of these genes are efficiently silenced in latently infected cells. A recent study using the Raji cell line indicated RNA polymerase II binding only at the EBER regions, the DS/Cp domain, the BART miRNA region, and the LMP promoters (Holdorf et al., 2011). To study the relationship between SNS zones and TSSs, we generated mean enrichment profiles of SNSs and cell cycle MR profiles aligned at the TSSs (Fig. 8 A and Materials and methods). We omitted from the analysis those genes with a distance of <500 bp between their TSSs, which resulted in 72 TSSs used for analyses (Table S9). On average, promoters were found to exhibit MR just upstream of the TSSs. The profiles were cell cycle independent and showed positioned +1 and +2 nucleosomes within the gene body, with the +1 nucleosome peaking at TSS + 20 bp and the +2 nucleosome peaking at TSS + 220 bp.

Figure 8.

Nucleosome occupancy at TSSs, and local nucleotide composition at pre-RC and SNS zones. (A) Profiles of SNS (brown), G1-MR (green), S-MR (red), and G2/M-MR (blue) log2 enrichments in a ±1,000-bp window centered at 72 TSS. (B) Heat map of G1-MR log2 enrichments in a ±1,000-bp window centered at the 72 TSSs. The red color indicates high nucleosome resistance, whereas green indicates low. The IDs and names of the 72 genes are shown on the right axis. The dendrogram was obtained with a hierarchical cluster analysis based on the neighborhood [TSS − 250 bp, TSS + 250 bp]. Based on the dendrogram, we define four clusters of TSS: R1 (resistance, n = 22), S1 (sensitive, n = 9), S2 (n = 3), and R2 (n = 21). SNSs located in [TSS, TSS + 500 bp] are indicated in blue. Light blue highlights those TSSs with two SNSs. (C) Mean profile of the nucleotide base content in a ±250-bp window centered at the maximum peak of: all SNSs (left), topSNSs (center), and botSNSs (right). The nucleotide content is depicted in red for G or C and black for A or T, respectively. The gray dashed horizontal lines in each of the three panels represent the mean GC (top) or mean AT (bottom) content in the EBV genome. The red dashed line represents the mean AT content at the SNS zones.

Fig. 8 A shows that the mean replication initiation activity is high in the region [TSS, TSS + 500 bp], peaking in an MSR in the gene body. The mean nucleosome phasing is similar to promoter regions with an elongating or stalled RNA Pol II (Schones et al., 2008). In total, 37 SNS zones are located in the region [TSS, TSS + 500 bp] of the 72 analyzed TSSs (see Table S9). 33 regions [TSS, TSS + 500 bp] have one SNS and two [TSS, TSS + 500 bp] have two SNSs. In comparison to the genome mean, these regions show an ∼2.5-fold higher density of SNSs. Most of the analyzed TSSs represent silent promoters, which are reactivated in the productive cycle.
These lytic genes are expressed in a sequential order and are accordingly classified as early or late genes. We hypothesized that a correlation exists between the MNase profile of these classes and replication initiation. To investigate this, we performed a cluster analysis of the 72 promoters according to their MNase sensitivity in the 500-bp region [TSS − 250 bp, TSS + 250 bp] (Fig. 8 B; see Materials and methods, “Cluster analysis and heat map generation”). Generally, two major groups can be defined. The majority of late lytic genes (28 out of 43; subgroups R1 and R2) represent genes with high MR. In contrast, the latent genes, the miRNA regions, and genes preferentially expressed in the early lytic phase (28 TSSs: 5 latent and miRNA, 13 early lytic) are characterized by increased MNase sensitivity (subgroups S1 and S2). The cluster analysis revealed that 71.4% of the TSSs in the S groups contain SNSs, whereas only 38.6% of TSSs in the R groups have an SNS (Table S10). None of the five origins within R1 belong to the topSNSs, whereas five of the 10 S1-SNSs are topSNSs. These results suggest that TSSs with an open chromatin structure are more frequently associated with SNSs, especially with topSNSs, than are TSSs with a more closed chromatin state. Active transcription is not a prerequisite for this association. Our finding of two different “gene expression classes” is in accordance with studies of epigenetic modifications in the Kaposi’s sarcoma-associated herpesvirus (Günther and Grundhoff, 2010; Toth et al., 2010). These studies revealed that early genes tend to be more enriched with chromatin marks that usually correlate with active transcription, whereas late genes are more enriched with repressive histone modifications. We conclude that herpesvirus genes destined for rapid expression upon reactivation preserve an open chromatin state during latency. 
Our data strongly suggest that the prime determinant of pre-RC formation and initiation is not transcriptional activity as such, but rather an open and dynamic local chromatin structure.

Nucleotide preferences at pre-RC and SNS zones

Previous in vitro ORC binding and origin mapping experiments show that metazoan ORC does not display any sequence preference. Recent meta-analysis of replication origins in Drosophila melanogaster corroborated that the primary sequence together with active chromatin features contributes to ORC binding, although to a low degree (MacAlpine et al., 2010; Eaton et al., 2011). Cayrou et al. (2011) reported that D. melanogaster and mouse origins are characterized by GC-rich motifs. We investigated the nucleotide composition and the occurrence of dinucleotide motifs in a ±250-bp window surrounding the highest peaks of pre-RC and SNS zones. Table 2 shows that pre-RCs assemble without any nucleotide preference relative to the genome-wide mean; we observed only very minor differences between top- and bot-pre-RC zones. These differences were slight but significant for the A/G/T content, with top-pre-RCs showing a minor preference for CG and G stretches (Tables 3 and 4).

Table 2.

Base composition in SNS and pre-RC zones

Zone type	As	Ts	Gs	Cs	CG	GC	CC	GG	GC or CG	TA	AT	AA	TT	AT or TA
All pre-RC	0.206	0.232	0.284	0.277	0.050	0.073	0.083	0.086	0.124	0.034	0.043	0.044	0.053	0.077
Top-pre-RC	0.204	0.218	0.294	0.285	0.055	0.078	0.089	0.090	0.134	0.032	0.040	0.042	0.046	0.073
Bot-pre-RC	0.220	0.235	0.270	0.275	0.043	0.070	0.081	0.078	0.113	0.035	0.046	0.051	0.055	0.080
All SNS	0.221	0.235	0.283	0.262	0.043	0.069	0.080	0.088	0.112	0.039	0.048	0.051	0.057	0.087
TopSNS	0.223	0.243	0.276	0.257	0.042	0.066	0.082	0.081	0.108	0.041	0.051	0.050	0.066	0.092
BotSNS	0.218	0.224	0.296	0.262	0.046	0.073	0.076	0.094	0.119	0.032	0.045	0.051	0.050	0.077
Genome mean	0.203	0.214	0.301	0.282	0.054	0.079	0.090	0.101	0.133	0.032	0.041	0.043	0.048	0.074

Summary of nucleotide features within enriched pre-RC and SNS zones. The numbers indicate the percentages of single nucleotides, dinucleotide pairs (AT, TA, GC, and CG), and of single dinucleotide pairs. The top and bottom values give the percentages of the strongest and weakest enrichment zones (n = 19 for pre-RC; n = 17 for SNS).

Table 3.

Deviations versus genome mean (p-value, one-sided Student’s t-test)

Zone type	As	Ts	Gs	Cs	CG	GC	CC	GG	GC or CG	TA	AT	AA	TT	AT or TA
All pre-RC	0.23	3.5 × 10⁻⁹	3.6 × 10⁻⁶	0.07	0.01	1.1 × 10⁻⁴	1.2 × 10⁻³	2.2 × 10⁻⁹	1.9 × 10⁻⁴	0.13	0.06	0.22	2.7 × 10⁻³	0.05
Top-pre-RC	0.46	0.21	0.16	0.40	0.26	0.38	0.33	0.01	0.44	0.47	0.34	0.44	0.26	0.39
Bot-pre-RC	2.6 × 10⁻³	2.9 × 10⁻⁴	2.9 × 10⁻⁷	0.18	2.5 × 10⁻⁵	1.6 × 10⁻³	0.02	4.0 × 10⁻⁸	5.0 × 10⁻⁵	0.15	0.04	0.01	0.02	0.05
All SNS	1.8 × 10⁻⁷	4.5 × 10⁻⁸	3.4 × 10⁻⁶	8.2 × 10⁻⁸	2 × 10⁻¹²	1.2 × 10⁻⁹	1.4 × 10⁻⁵	2.3 × 10⁻⁷	5 × 10⁻¹⁴	1.7 × 10⁻⁶	3.8 × 10⁻⁶	2.4 × 10⁻⁵	1.4 × 10⁻⁵	5.8 × 10⁻⁸
TopSNS	6.0 × 10⁻⁴	3.5 × 10⁻⁵	4.0 × 10⁻⁴	4.4 × 10⁻⁴	3.0 × 10⁻⁷	4.2 × 10⁻⁶	0.03	1.8 × 10⁻⁵	2.8 × 10⁻⁸	1.8 × 10⁻⁴	1.1 × 10⁻⁴	0.01	1.8 × 10⁻⁵	1.1 × 10⁻⁵
BotSNS	0.02	0.07	0.27	1.7 × 10⁻³	0.01	0.04	5.1 × 10⁻⁴	0.08	0.01	0.38	0.09	0.02	0.28	0.24

Nucleotide feature deviations between pre-RC or SNS zones in relation to the genome mean. The p-values shown indicate the significance of the deviation between each nucleotide feature in Table 2 and the genome mean.

Table 4.

Deviations, top versus bottom zones (p-value, one-sided Student’s t-test)

Zone type	As	Ts	Gs	Cs	CG	GC	CC	GG	GC or CG	TA	AT	AA	TT	AT or TA
pre-RC	0.02	0.02	1.9 × 10⁻³	0.2	2.7 × 10⁻⁴	0.03	0.13	0.02	1.1 × 10⁻³	0.21	0.05	0.03	0.02	0.07
SNS	0.24	0.02	0.02	0.26	0.12	0.04	0.23	0.02	0.04	1.8 × 10⁻³	0.04	0.47	1.4 × 10⁻³	3.3 × 10⁻³

Nucleotide feature deviations between top and bottom pre-RC or SNS zones. The p-values shown indicate the significance of the deviation between top and bottom pre-RC or SNS zones.

Table 2 also indicates that origin activation is moderately affected by the nucleotide composition. We observed an elevated A/T content at SNS zones in relation to the genome mean, which is more pronounced at topSNSs than at botSNSs. The EBV genome has an A/T content of 41.7%, whereas the SNS zones display a mean A/T content of 45.6%, with topSNSs having a mean of 46.6%. Fig. 8 C visualizes the preference for A/T-rich sequences at SNS zones by plotting the mean nucleotide content in a ±250-bp window centered at their maximum peak, which confirms the increased A/T frequency at topSNSs. The analysis of AT dinucleotide pairs indicates a slight overrepresentation of any A/T pair at topSNSs in relation to the genome mean. Conversely, we observed a slight bias against C/G pairs. In summary, the initiation process is moderately favored by A/T-rich stretches, independent of specific primary sequence motifs, whereas no correlation between the efficiency of pre-RC assembly and the underlying sequence can be detected. 
It is important to note that this relationship does not have any predictive power to explain why origins are placed where they are.
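
As an illustration of this composition analysis, the following Python sketch counts single nucleotides and overlapping dinucleotides in a ±250-bp window around a peak, and checks that the A/T contents quoted in the text follow from the A and T columns of Table 2. The function names, the toy sequence, and the choice to count overlapping dinucleotides on one strand are assumptions of this sketch; the counting rules actually used are not stated in the text.

```python
from collections import Counter


def window_composition(seq, peak, half_width=250):
    """Return (mono, di) frequency dicts for the window around `peak`.

    `peak` is a 0-based index into `seq`; the window is clipped at the
    sequence ends. Dinucleotides are counted in overlapping steps of 1 bp.
    """
    start = max(0, peak - half_width)
    end = min(len(seq), peak + half_width + 1)
    window = seq[start:end].upper()
    if len(window) < 2:
        raise ValueError("window too short for dinucleotide counting")
    mono_counts = Counter(window)
    di_counts = Counter(window[i:i + 2] for i in range(len(window) - 1))
    mono = {base: mono_counts.get(base, 0) / len(window) for base in "ACGT"}
    di = {pair: n / (len(window) - 1) for pair, n in di_counts.items()}
    return mono, di


def at_content(mono):
    """A/T content as the sum of the A and T fractions."""
    return mono["A"] + mono["T"]


# Toy sequence (hypothetical, not EBV): a 501-bp window around position 400.
toy = "ATATGCGC" * 100
mono, di = window_composition(toy, peak=400)
print(round(at_content(mono), 3))

# A and T fractions copied from Table 2; their sums give the quoted A/T contents.
table2_a_t = {
    "Genome mean": (0.203, 0.214),
    "All SNS": (0.221, 0.235),
    "TopSNS": (0.223, 0.243),
}
for zone, (a, t) in table2_a_t.items():
    print(zone, round(a + t, 3))
```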

Discussion

Significant progress has been made in understanding the features controlling DNA replication in the context of chromatin in mammals. However, mechanisms regulating the efficiencies of pre-RC formation and origin firing are still a conundrum. By analyzing pre-RC and SNS zones, as well as mononucleosome profiles from different cell cycle stages, we show that pre-RCs are characterized by an S phase–specific MNase sensitivity, and that the efficiency of origin activation correlates with increased MNase sensitivity. Given that latent EBV replication is akin to that of host cell DNA in nearly every aspect studied to date, there is every reason to believe that the findings of our study are extendable to mammalian chromatin. The replicon paradigm that guided the search for replication origins for many years does not reflect origin selection and activation in metazoan cells (Rhind, 2006; Hamlin et al., 2008). In contrast to S. cerevisiae, which nearly follows the replicon model, metazoan pre-RCs are established at flexible sites in each genome. In frog embryos, the plasticity is extreme and suggests a random origin pattern (Harland and Laskey, 1980; Hyrien and Méchali, 1993). The flexibility in pre-RC formation has implications on ChIP experiments and makes the identification of binding sites very difficult: signals are diluted, and reliable parameters to allow for a clear distinction between enriched binding sites and background signals are missing (Gilbert, 2010; Hamlin et al., 2010; Schepers and Papior, 2010). In addition to the advantages described previously, our study of the parameters supporting pre-RC formation in the latent EBV replication system has the unique advantage of using a well-characterized, highly specific, and efficient pre-RC site at DS that serves as internal positive control. We detect many Orc2- and Mcm3-enriched sites throughout the EBV genome, which exhibit a very high correlation between binding sites and efficiencies. 
To reduce background noise, we performed three independent experiments, which were normalized against IgG controls. The resulting Orc2 and Mcm3 profiles were highly similar, which allowed us to combine both profiles to one pre-RC profile. To eliminate false positive signals, we chose a cut-off width of 400 bp for the identified enriched zones, although the fragment distribution might have allowed a higher resolution. The resulting 64 pre-RC zones correlate with increased MNase sensitivity, providing further evidence that these signals are true positive pre-RC zones and not random noise caused by antibody or hybridization artifacts. Pre-RCs are distributed over the entire EBV genome. Some regions contain clusters of assembly sites, whereas other regions are relatively sparse in pre-RC zones. We conclude that pre-RC formation occurs at multiple places of the EBV genome, with DS being the dominant assembly site. Furthermore, not the full contingent but rather only a small subset of these sites are used per individual genome and cell cycle. Nucleosomes limit the accessibility of DNA for binding partners, and increasing evidence suggests that nucleosome organization might be one defining parameter of replication origins (Berbenetz et al., 2010; Eaton et al., 2010; MacAlpine et al., 2010; Lubelsky et al., 2011; Givens et al., 2012; Xu et al., 2012). Open chromatin structures are often found at transcriptionally active regions. Also, chromatin remodeling complexes mobilize nucleosomes to allow origin formation (Collins et al., 2002; MacAlpine et al., 2004; Zhou et al., 2005; Cadoret et al., 2008; Sugimoto et al., 2008; Sequeira-Mendes et al., 2009). Here, we performed the first comparative genome-wide analysis between pre-RC and SNS zones and MR profiles generated at different stages of the cell cycle. We found that pre-RCs are characterized by a dynamic MNase pattern, which exhibits an increased sensitivity during S phase (Fig. 6). 
In an analogy to the extended pre-RC–specific DNaseI footprint in S. cerevisiae, it is conceivable that pre-RCs also protect mammalian origin DNA in G1 (Diffley et al., 1994). The increased MNase sensitivity during S phase is in line with previous findings that human ORC dissociates after origin firing, which is likely to result in increased enzymatic accessibility (Gerhardt et al., 2006; Siddiqui and Stillman, 2007). In G2/M phase the MNase profile at pre-RCs is similar to the G1 profile. This observation might be explained by a rebinding of ORC. However, the reassembly of pre-RCs is not completed in the G2/M fraction. Alternatively, structural changes exposing origin DNA might explain the cell cycle–dependent MNase sensitivity of origin DNA. Comparing the most and least efficient pre-RCs, we found a more pronounced sensitivity at top-pre-RCs than at bot-pre-RCs. In S. cerevisiae, pre-RCs are characterized by positioned nucleosomes. As these origins have an orientation, the mean size of a MSR is dependent on the alignment to the T-rich strand (Field et al., 2008; Berbenetz et al., 2010; Eaton et al., 2010). However, a limitation of our system is the small number of origins detected in the EBV genome. This results in a relatively low sample size for any statistical analyses, and thus in high variance, limiting any conclusions regarding the mean flanking nucleosome positions and the existence of an orientation in these origins. Pre-RC assembly and origin activation are temporally separated but functionally linked events. To detect initiation sites, we isolated SNS DNA by an enzyme-free method and found that >80% of SNS and pre-RC zones overlap. When taking into account that the majority of the nonoverlapping SNS zones are located in the direct neighborhood of pre-RC zones, the spatial correlation increases to >90%. 
We do not observe a 100% overlap because: (a) Experimentally, we do not have a single-nucleotide resolution in our ChIP and SNS experiments; and (b) the definition of pre-RC and SNS zones for our analyses is most likely not perfect, and has some intrinsic fuzziness. Also, we might exclude true positive zones as well as include false positive signals. Lubelsky et al. (2011) have also observed the spatial separation of origin recognition and replication initiation, where pre-RCs and SNSs do not align perfectly. Origin recognition at pre-RCs and replication initiation at SNSs are reflected in different features. First, pre-RC zones are characterized by a cell cycle–dependent MNase profile, whereas SNS zones appear as cell cycle–independent MSRs. The efficiency of origin activation clearly correlates with the degree of MNase sensitivity. Second, our findings indicate that the initiation efficiency is moderately influenced by the underlying sequence. Our comparative analysis indicates that A/T-rich tracks are preferentially found at topSNSs. An increased A/T content thermodynamically destabilizes the DNA duplex, thus facilitating base unpairing, an event that is part of the initiation process, but not of pre-RC assembly. Furthermore, A/T-rich elements, particularly homopolymeric poly(dA:dT), are less favorable for nucleosome formation (Segal et al., 2006; Segal and Widom, 2009), which might explain the relationship between A/T content and SNS. Currently, no experimental data exist that describe how the EBV sequence influences nucleosome positioning. In contrast to our findings, Cayrou et al. (2011) found that SNSs correlate with GC richness and CpG islands, whereas we observe a bias toward AT-rich elements. This could either be explained by the different model organisms analyzed or by the different experimental methods used to isolate SNS DNA. 
A general feature of SNS DNA is the very low copy number, which makes all methods sensitive to contamination or biases introduced during the experimental process. For example, λ-exonuclease might induce a bias toward GC-rich DNA. However, Karnani et al. (2010) compared this enzymatic method with the enzyme-independent immunoprecipitation of newly BrdU-labeled DNA without any apparent differences in terms of AT content. Further experiments are essential to clarify the strengths and limitations of the individual methods. Our observations suggest a two-step model to explain the plasticity of origin formation and selection in human cells. In the first step, a limited number of pre-RCs are assembled independent of sequence. At present it is unclear which mechanisms exist to limit this number; however, we propose that the efficiency is linked to the local chromatin structure and its ability to mobilize nucleosomes. It is very unlikely that each potential pre-RC is used in each cell cycle for complex formation because the copy number of initiation proteins is too low (Wong et al., 2011). The excess of pre-RCs in relation to SNSs and the relative ratios between the efficiencies of pre-RC assembly at DS and other sites corroborate these data. Assuming that a pre-RC is formed at the DS region in every cell cycle, a non-DS pre-RC in the EBV genome is on average 5.98 times less efficient than DS (peakmax at DS, 23.34; peakaverage, 20.76). This means that only 15–20% of potential pre-RC sites are used per genome and cell cycle for pre-RC formation. In a second step, a subset of pre-RCs is activated to initiate replication. SMARD data show that only 1–3 origins are activated per EBV genome, which suggests that the origin activation efficiency is in the range of 10–20%. This model explains the discrepancy between the observed plasticity of initiation sites, the limited number of pre-RCs present in each cell, and the even lower number of initiation events. 
With this, the Jesuit model (“many are called, but few are chosen”) functions at two temporally separated levels (DePamphilis, 1993, 1996). The genome-wide mapping of pre-RC proteins and its correlation with replication initiation sites and MSRs provides new insights into our understanding of how replication origins are organized in mammalian cells. Our study demonstrates that a ChIP analysis of pre-RC components is technically possible; however, it requires very careful controls (i.e., sufficient replicates, various pre-RC proteins, IgG controls) and careful consideration in the selection of threshold levels for enriched zone width. The high copy number of the EBV genome might have facilitated our analyses. Strong origins are characterized by efficient pre-RC assembly and replication initiation processes. However, to be a weak origin, only one of these processes needs to be inefficient. DS is a perfect example of a strong pre-RC site which may function as an internal control site, but which at the same time represents only a weak initiation site. DNA accessibility and nucleosome mobility are likely to contribute to efficient pre-RC formation, whereas initiation efficiency is influenced by additional parameters such as the A/T content. Our study may help to unravel the conflict between the strict replicon model and an entirely stochastic origin pattern (Rhind, 2006).
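
The arithmetic behind the two-step model can be made explicit with a small Python sketch. The input numbers (64 pre-RC zones, a 5.98-fold lower efficiency at non-DS sites, 1–3 activated origins per genome) are taken from the text. Treating the per-site usage as 1/5.98 and multiplying it by the number of zones to obtain an expected number of pre-RCs per genome is a simplifying assumption of this sketch, not a calculation stated by the authors.

```python
# Values quoted in the text.
N_PRE_RC_ZONES = 64          # pre-RC zones identified on the EBV genome
FOLD_WEAKER_THAN_DS = 5.98   # mean non-DS efficiency relative to DS
ORIGINS_FIRED = (1, 3)       # origins activated per genome (SMARD)

# Step 1: fraction of potential pre-RC sites used per genome and cell cycle,
# assuming DS is occupied in every cycle (efficiency 1.0).
site_usage = 1.0 / FOLD_WEAKER_THAN_DS

# Assumption of this sketch: expected number of assembled pre-RCs per genome.
expected_pre_rcs = N_PRE_RC_ZONES * site_usage

# Step 2: fraction of assembled pre-RCs that fire, for 1 and for 3 origins.
activation = tuple(n / expected_pre_rcs for n in ORIGINS_FIRED)

print(f"site usage: {site_usage:.1%}")
print(f"expected pre-RCs per genome: {expected_pre_rcs:.1f}")
print(f"activation efficiency: {activation[0]:.1%} to {activation[1]:.1%}")
```

Under these assumptions the site usage falls within the 15–20% stated above, and the resulting activation range brackets the quoted 10–20%.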

Materials and methods

Cell culture

Raji and DG75 cells were grown in suspension culture with RPMI medium, supplemented with 10% FCS at 37°C in 5% CO2.

Centrifugal elutriation and flow cytometry

Centrifugal elutriation (J6-MC centrifuge; Beckman Coulter) was used to separate the different cell cycle phases. For ChIP experiments, 5 × 10⁹ logarithmically growing Raji cells were washed with PBS and resuspended in 50 ml RPMI supplemented with 1% FCS, 1 mM EDTA, and 0.25 U/ml DNase I (Roche). Cells were injected into a JE-5.0 rotor (Beckman Coulter) with a large separation chamber at 1,500 rpm and a flow rate of 30 ml/min controlled with a Masterflex pump (Cole-Parmer). The rotor speed was kept constant and 400-ml fractions were collected at increasing flow rates (35–100 ml/min). Individual fractions were counted and processed for the ChIP assay as described in the next section. For flow cytometry, 10⁶ cells were washed once with PBS, resuspended in 1 ml 80% ethanol/20% PBS, and incubated for 1 h on ice. Fixed cells were washed twice with PBS, and 900 µl PBS supplemented with 200 U RNase was added. After 15 min of incubation on ice, 100 µl of propidium iodide stain was added (5 µg/ml PI and 50 mM EDTA in PBS). Samples were kept on ice until the DNA content was determined using a FACS Calibur (BD).

Chromatin preparation and ChIP experiments

10⁸ cells of the corresponding cell cycle fractions were centrifuged (1,200 rpm, 10 min) and washed twice with ice-cold PBS. The pellet was resuspended in 20 ml of PBS (room temperature). Cells were fixed for 10 min at room temperature by adding 20 ml of a freshly prepared 2% (vol/vol) formaldehyde solution. Adding 1.25 M glycine to a final concentration of 125 mM stopped the reaction. Cells were immediately transferred to ice and incubated for 5 min. After centrifugation, the cells were washed twice with ice-cold PBS and subsequently lysed in 10 ml ice-cold ChIP lysis buffer 1 (50 mM Hepes-KOH, pH 7.4, 140 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 10% [vol/vol] glycerol, 0.5% [vol/vol] NP-40, 0.25% Triton X-100, and a freshly added 1× protease inhibitor cocktail [Roche]). After incubation for 10 min at 4°C, nuclei were pelleted by centrifugation (1,500 rpm) and resuspended in 5 ml of ice-cold ChIP lysis buffer 2 (10 mM Tris-HCl, pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, and freshly added 1× protease inhibitor cocktail). After incubation and centrifugation (1,500 rpm, 10 min, 4°C), the chromatin was resuspended in 5 ml of ice-cold ChIP lysis buffer 3 (10 mM Tris-HCl, pH 8.0, 140 mM NaCl, 1 mM EDTA, 1 mM EGTA, 0.5% N-lauryl sarcosine, 0.1% sodium deoxycholate, and 1× protease inhibitor cocktail). Acid-washed glass beads (212–300 µm) were added, and the cross-linked chromatin was fragmented by sonication (8 × 30 s at 35% power output, with pulses set to 1 s on/1 s off) in an ice-water bath using a sonicator (250-D; Branson). Micrococcal nuclease (MNase) digestion was performed (10 U MNase/ml and 4 mM CaCl2) for 10 min at 37°C. Adding 40 mM EGTA stopped the reaction. The combination of DNA shearing by sonication and DNA digestion by MNase resulted in a mean fragment size of the bulk genomic DNA of 200–800 bp. Triton X-100 was added to a final concentration of 0.5% (vol/vol), and the lysate was centrifuged (13,200 rpm, 5 min, 4°C). 
The chromatin extract was quantified and diluted to 1 mg/ml with ChIP lysis buffer 3. Pre-clearing of the lysate was performed with 100 µl protein A or G Sepharose beads (pre-absorbed with PBS/0.5% [wt/vol] BSA) for 2 h at 4°C. Pre-cleared extracts were incubated with 10 µg affinity purified antibodies overnight at 4°C (rabbit Orc2 [Schepers et al., 2001], rabbit Mcm3 [Ritzi et al., 2003], and rabbit anti-IgG [Dianova]). 50 µl of blocked protein A or G Sepharose beads were added and incubated for 4 h. The antibody–protein–DNA complexes were collected by centrifugation (1,500 rpm, 2 min, 4°C) and washed twice with 10 ml RIPA (1 mM EDTA, 150 mM NaCl, 0.1% SDS, 0.5% DOC, and 1% NP-40), 10 ml LiCl (250 mM LiCl, 0.1% SDS, 0.5% DOC, 1% NP-40, and 50 mM Tris, pH 8.0), and 10 ml TE, pH 8.0, respectively. Sepharose beads were transferred to 1.5-ml reaction tubes, and the protein–DNA complexes were eluted twice for 10 min using 100 µl ChIP elution buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA, and 1% SDS) at 65°C under constant agitation. Beads were removed (1,500 rpm, 2 min, room temperature) and the supernatant was incubated for 2 h at 37°C with 5 µg DNase-free RNase A. The cross-link was reversed by incubation for 12 h with 80 µg Proteinase K at 56°C. Input DNA (10% of the chromatin material used for ChIP) was prepared in parallel to the ChIP samples. RNase, Proteinase K treatment, and reversion of the cross-link were performed as described above. Co-precipitated and input DNA were purified via NucleoSpin Extract II (Macherey-Nagel) according to the manufacturer’s instructions. The DNA was eluted in 22 µl of elution buffer.

Quantitative real-time PCR

Real-time PCR was performed with the LightCycler (Roche) according to the manufacturer’s instructions. The provided FastStart Reaction Mix was supplemented with MgCl2 to a final concentration of 2 mM. The amplification of PCR products was monitored on-line and usually stopped after 40 cycles. The following settings were used: 10 min at 95°C, cycles with 1 s at 95°C, 10 s at 62°C, and 20 s at 72°C. The sequences of the primers used are shown in Table S11.

Whole genome amplification

Co-precipitated DNA was amplified before microarray hybridization using WGA II (Sigma-Aldrich) according to the manufacturer’s instructions. 10 µl of the purified ChIP DNA and 100 ng of purified input DNA were used for amplification. The PCR was performed in a thermocycler (Mastercycler personal; Eppendorf) using 15 cycles. The amplified DNA was purified via NucleoSpin Extract II according to manufacturer’s instructions.

Array design and control experiments

EBV samples were hybridized on a custom-made 2 × 105,000 array (Agilent Technologies) covering both strands of the EBV genome (EBV strain type I; GenBank/EMBL/DDBJ accession no. NC_007605) in overlapping, melting temperature–optimized 60 mers. For normalization purposes, the array also contained 60 mers of the Adenovirus 5 (Ad5) genome. Probes were tiled every 12 bp along the upper and lower strand, with strand-specific probes being shifted 6 bp relative to each other, resulting in an overall resolution of 6 nucleotides. The Raji genome harbors two deletions located at nt 86,000–89,000 and 163,978–166,635, respectively. The EBV genome contains a series of W repeats as internal repeats. Six copies of the W repeats were plotted onto the array, but the repetitive regions typically vary from genome to genome, and thus could introduce undesired background noise to our analyses. Therefore, we discarded these repeats from all analyses. To minimize errors introduced by the antibodies and the unspecific hybridization of cellular DNA to the EBV genome, we performed two control experiments. First, we hybridized the IgG control to identify probes that are specifically recognized by IgG (Fig. S5 A). No overlap with the pre-RC–specific signals was observed. Second, we analyzed the Orc2 pattern of the EBV-negative cell line DG75 hybridized against the EBV tiling array (Fig. S5 B).
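
The tiling scheme described above (60-mer probes every 12 bp on each strand, with the two strands offset by 6 bp) can be sketched in Python to show why the combined resolution is 6 nucleotides. The function name `tiling_starts`, the 0-based coordinates, the toy genome length, and the choice of which strand carries the offset are assumptions of this sketch; the real array additionally used melting temperature–optimized probes, which is not modeled here.

```python
def tiling_starts(genome_length, probe_length=60, step=12, strand_shift=6):
    """Return probe start positions (0-based) for the upper and lower strand.

    Probes are placed every `step` bp on each strand; the lower-strand probes
    are shifted by `strand_shift` bp relative to the upper-strand probes.
    """
    last_start = genome_length - probe_length
    upper = list(range(0, last_start + 1, step))
    lower = list(range(strand_shift, last_start + 1, step))
    return upper, lower


# Toy genome of 1,200 bp (hypothetical length, chosen for readability).
upper, lower = tiling_starts(1200)
merged = sorted(upper + lower)
spacings = {b - a for a, b in zip(merged, merged[1:])}
print(len(upper), len(lower), spacings)
```

Interleaving the two strand-specific probe sets halves the spacing between neighboring probe starts from 12 bp to 6 bp.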

Microarray hybridization

Before hybridization, the concentrations and absorbance ratios of coimmunoprecipitated DNAs were recorded for all DNA samples using a spectrophotometer (ND-1000 UV-VIS; NanoDrop). For microarray hybridizations, only high-quality DNA samples with an A260/A280 ratio of 1.8–2.0 and a A260/A230 ratio >1.0 were used. 500 ng of DNA from each sample was subjected to restriction digestion with a combination of AluI and RsaI. The digested DNA samples were directly labeled with exo-Klenow polymerase and random primers by using Cyanine-5 dUTP for the experimental samples and Cyanine-3 dUTP for the reference samples (Genomic DNA Enzymatic Labeling kit; Agilent Technologies). After purification, the DNA concentrations and Cyanine-5 and Cyanine-3 dye concentrations (pmol/µl) were recorded for all labeled samples. After clean-up and quantification of labeled DNA, each Cyanine-5–labeled experimental sample was combined with a corresponding Cyanine-3–labeled reference sample. Human Cot-1 DNA was added to block the repetitive sequences in the genomic DNA. The combined samples were prehybridized and prepared for two-color based hybridization (Oligo aCGH Hybridization kit; Agilent Technologies). Each combination of experimental and reference DNA samples was hybridized at 65°C for 40 h on custom-made EBV-specific microarrays (2 × 105,000 format). Microarrays were washed with increasing stringency using Oligo aCGH wash buffers (Agilent Technologies) followed by drying with acetonitrile. Before scanning, each slide was washed in a drying and stabilization solution (Agilent Technologies) to stabilize the fluorescence for future scans. Fluorescent signal intensities for both dyes were detected on an Agilent DNA Microarray Scanner using Scan Control A8.4.1 Software (Agilent Technologies). Images were extracted using Feature Extraction 10.5.1.1 Software (Agilent Technologies). 
After a one-pass scan, an additional scan was performed by applying eXtended Dynamic Range (XDR) at 10% as well as 100% laser capacity. Primary array analyses and data normalization were performed on a GenePix Personal 4100A scanner using GenePix Pro 6.0 software (Axon Instruments).

SNS analysis

2 × 10⁷ cells were washed with PBS and resuspended in PBS with 10% glycerol (Kamath and Leffak, 2001). Cells were lysed for 10 min in slots of a 1.2% alkaline agarose gel (50 mM NaOH and 1 mM EDTA). DNA was separated by electrophoresis overnight (low melting temperature agarose; Biozym). After neutralization with 1× TAE (40 mM Tris, pH 8.0, 20 mM acetic acid, and 1 mM EDTA) for 45 min, the lane containing DNA size markers was separated from the rest of the gel and stained with ethidium bromide (EtBr) for 15 min. The size of SNS DNA in the unstained gel was determined by comparing it with the EtBr-stained DNA size marker. Subsequently, SNS fragments of 800–1,500 nt were extracted from the gel using the QIAquick gel extraction kit. SNS abundance was measured by quantitative real-time PCR using primer pairs for the human HPRT locus (Cohen et al., 2002). The concentration of purified SNS-DNA was determined using a Quant-iT dsDNA high-sensitivity assay kit according to the manufacturer’s instructions. After amplification, Southern blot analysis with an EBV-specific probe was performed to determine the length of EBV-specific nascent strand DNA. Fragments of 100–2,500 bp were detected, indicating that the size of the marker in the alkaline gel does not correspond with the actual length of the isolated SNS fragments.

Mononucleosome preparation

For each sample, 10⁷ cells were harvested, washed with PBS, and resuspended in 4 ml hypotonic buffer A (10 mM Hepes-KOH, pH 7.9, 10 mM KCl, 1.5 mM MgCl2, 0.34 M sucrose, 10% glycerol, 1 mM DTT, and 1× protease inhibitor mix). Cells were lysed by adding 0.04% Triton X-100 and incubated for 10 min on ice. Samples were centrifuged (4 min, 1,300 g, 4°C) to separate soluble cytosolic and nucleosolic proteins from chromatin. Nuclei were washed in 5 ml of ice-cold buffer A supplemented with 200 mM NaCl. After centrifugation (5 min, 1,300 g, 4°C), nuclei were carefully resuspended in 1 ml MNase digestion buffer (10 mM Hepes-KOH, pH 7.6, 120 mM NaCl, 1.5 mM MgCl2, 3 mM CaCl2, 10% [vol/vol] glycerol, 1 mM DTT, and 1× protease inhibitor mix). MNase digestion was performed by adding 30 U MNase and incubating for 3 min at 37°C. The reaction was stopped by adding 40 µl EGTA (0.5 M) on ice. RNA was removed by incubation with 20 U RNase for 2 h at 37°C, followed by incubation with Proteinase K (80 µg/ml) at 56°C overnight. Mononucleosomes were isolated from a 1.2% agarose TAE gel and purified via NucleoSpin Extract II according to the manufacturer’s instructions.

Southern blotting

500 ng of sonicated and MNase-digested ChIP DNA or nascent strand DNA were separated on a 1.0% TAE gel and transferred to a membrane (Immobilon Ny+; EMD Millipore). After prehybridization in Church buffer (0.5 M sodium phosphate, 7% SDS, and 1 mM EDTA, pH 7.2) for 3 h at 65°C, labeled and denatured EBV-specific probes (nt 9,170–9,347, nt 37,794–37,970, and nt 50,012–50,269) were added and hybridized for 16 h at 65°C. After extensive washing with 2× SSC, 0.1% SDS and then 0.5× SSC, 0.1% SDS, the data were obtained by digital imaging.

Bioinformatical methods

Software.

All numerical and statistical analyses were done using R (http://www.r-project.org). Additionally, we used the packages limma (http://www.bioconductor.org/packages/2.8/bioc/html/limma.html), affy (http://www.bioconductor.org/packages/release/bioc/html/affy.html), gplots (http://cran.r-project.org/web/packages/gplots/) and tileHMM (http://cran.r-project.org/web/packages/tileHMM/). All functions were used with default parameters unless stated otherwise.

Orc2 and Mcm3 signal enrichment and IgG-normalization.

Primary array analysis of enriched (Cy5) and input (Cy3) DNA as well as data normalization was performed on a GenePix Personal 4100A scanner using GenePix Pro 6.0 software (Axon Instruments). Enriched (Cy5) and input (Cy3) signal intensities were converted into log2 enrichment ratios (log2(Cy5/Cy3)) individually for each biological replicate. Using these ratios, we then separately normalized the Orc2 and Mcm3 log2 ratios with IgG using the lmFit function of the limma package. Finally, we scaled the resulting normalized log2 enrichments (i.e., so that the set of log2 enrichments have a mean of 0 and a standard deviation of 1) separately for each concentration to allow for an appropriate comparison.
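The ratio and scaling steps described above can be illustrated with a minimal sketch. It is written in Python for illustration only (the original analysis was done in R), it omits the limma-based IgG normalization, and the function names and toy intensities are hypothetical. Scaling with the sample standard deviation (n − 1 denominator) is an assumption.

```python
import math
import statistics

def log2_ratios(cy5, cy3):
    """Per-probe log2(Cy5/Cy3) enrichment for one replicate."""
    return [math.log2(enriched / inp) for enriched, inp in zip(cy5, cy3)]

def scale(values):
    """Center to mean 0 and scale to standard deviation 1."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption
    return [(v - mu) / sd for v in values]

# Toy intensities (hypothetical), not data from the study.
cy5 = [800.0, 400.0, 200.0, 100.0]
cy3 = [200.0, 200.0, 200.0, 200.0]
ratios = log2_ratios(cy5, cy3)
scaled = scale(ratios)
```

After scaling, signals from different experiments share a common location and spread, which is what makes the later comparisons between Orc2, Mcm3, SNS, and MNase profiles meaningful.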

Orc2 and Mcm3 enriched zones calculation.

Enrichment zones for Mcm3 and Orc2 were then calculated with the IgG-normalized ratios using the tileHMM package (Humburg et al., 2008). The package identifies ChIP-enriched regions using a two-state HMM with t distributions. The initial parameters of the HMM were obtained by adjusting the default parameters for the probe length (probe region) and the mean size of the DNA fragments to the values 60 and 800, respectively, which are appropriate choices for our analyses. For the estimation of transition probabilities, we assigned the “enriched” state to those probes with log2 ratios > 0, and the “non-enriched” state to those probes with log2 ratios ≤ 0. Log2 ratios in repetitive and dilution regions in the EBV genome were excluded from the analysis. Before obtaining the enriched zones, we smoothed the data with a moving mean in overlapping windows with a size of 150 bp. After parameter optimization with the Viterbi and EM algorithms, tileHMM suggested 140 enriched zones for Mcm3 and 174 enriched zones for Orc2. Several of these zones were short (minimum zone width = 114 bp) and were therefore suspected of containing mostly background signal. To avoid including false-positive enriched zones in further analyses, we discarded all enriched zones with widths <400 bp. This yielded 64 enriched zones for Mcm3 (mean width = 650 bp) and 76 enriched zones for Orc2 (mean width = 570 bp). Of the 64 Mcm3 zones, 55 were also enriched for Orc2.

Estimation of single pre-RC log2 ratios and enriched zones.

With the normalized Orc2 and Mcm3 log2 ratios, we then estimated a single log2 ratio for pre-RC using the lmFit function of the limma package. This single estimate was then scaled (as described previously) to allow for appropriate comparisons with other concentrations. To obtain pre-RC enriched zones, we observed the overlap of the previously identified Mcm3 and Orc2 enriched zones on a probe-by-probe basis, and defined them as regions with length ≥400 bp, composed of probes enriched with both Orc2 and Mcm3 or with Mcm3 only. This resulted in the identification of 64 pre-RC enriched zones.

SNS DNA signal enrichment.

Signal intensities for the SNS arrays were converted into log2 ratios for each biological replicate as described previously. Using the mean of both replicates, we obtained a single estimated SNS signal. This signal was then smoothed with a moving mean in overlapping windows of size 200 bp, and scaled similarly as for Orc2, Mcm3, and pre-RC.
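A moving mean over overlapping windows, as used here and for the other signals, can be sketched as follows. This is a Python illustration, not the authors' R code. It assumes that each probe is represented by a single genomic coordinate and that the window is centered on the probe; `moving_mean` and the toy values are hypothetical.

```python
def moving_mean(positions, values, window_bp):
    """Smooth values with the mean of all probes within window_bp / 2 of each probe."""
    half = window_bp / 2
    smoothed = []
    for center in positions:
        in_window = [v for p, v in zip(positions, values) if abs(p - center) <= half]
        smoothed.append(sum(in_window) / len(in_window))
    return smoothed

# Toy probe coordinates (bp) and log2 ratios (hypothetical).
positions = [0, 50, 100, 150, 200]
values = [1.0, 3.0, 5.0, 3.0, 1.0]
smoothed = moving_mean(positions, values, window_bp=200)
```

Because neighboring windows share most of their probes, isolated single-probe spikes are damped while broader enrichments are preserved.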

SNS enriched zone calculation.

Using the SNS log2 ratios, we followed three criteria to identify SNS enrichment zones: (1) the probes must have a positive log2 ratio, (2) ratios in the bottom quintile of all positive ratios qualified as an estimate of background noise and were thus removed from further analyses, and (3) a minimum of 58 consecutive probes that fulfill criteria 1 and 2 must be present to be considered an SNS-enriched zone (i.e., minimum length of SNS enriched zone = 402 bp). The number of 58 consecutive probes used to estimate the minimal length of SNS-enriched zones was chosen to avoid an overlap with Okazaki fragments. This resulted in the identification of 57 SNS enriched zones.
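The three criteria can be sketched as a simple run-length filter. The Python code below is an illustration under stated assumptions: the bottom quintile is taken by rank among the positive ratios, probes are assumed to be ordered along the genome, and `call_enriched_zones` is a hypothetical name rather than a function from the original R analysis.

```python
def call_enriched_zones(ratios, min_run):
    """Return (first, last) probe indices of runs that satisfy all three criteria."""
    positives = sorted(r for r in ratios if r > 0)
    # Criterion 2: the lowest fifth of the positive ratios is treated as background.
    n_drop = len(positives) // 5
    cutoff = positives[n_drop - 1] if n_drop > 0 else 0.0
    # Criterion 1 is implied because cutoff is never below zero.
    passes = [r > cutoff for r in ratios]
    zones, start = [], None
    for i, ok in enumerate(passes + [False]):  # sentinel closes a trailing run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_run:  # criterion 3: enough consecutive probes
                zones.append((start, i - 1))
            start = None
    return zones

# Toy ratios (hypothetical); the analysis above uses min_run=58.
ratios = [-1.0, 0.1, 2.0, 3.0, 4.0, -0.5, 5.0, 0.2, 6.0, 7.0, 8.0, 9.0]
zones = call_enriched_zones(ratios, min_run=3)
```

In this toy case the single passing probe at index 6 is discarded because it does not form a long enough run, which mirrors how the length criterion suppresses short, noisy stretches.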

SNS heat map generation.

The heat maps displayed in Fig. 3 C were generated using the image function in R.

G1 MSR estimation.

To estimate the sensitive regions of MNase-digested G1 chromatin, we first converted the signal intensities of the nucleosome G1 array into log2 ratios, and then scaled these ratios as described previously. We then smoothed the data with a moving mean in overlapping windows with a size of 100 bp. The MSRs were estimated according to three criteria: (1) the probes must have a negative log2 ratio, (2) ratios in the bottom decile of all negative ratios qualified as an estimate of background noise and were thus removed from further analyses, and (3) a minimum of 16 adjacent probes that fulfill criteria 1 and 2 must be present to be considered an MSR (i.e., minimum length of an MSR = 150 bp). The number of 16 consecutive probes used to estimate the minimal length of an MSR was chosen based on the typical length of a nucleosome.

S and G2 MSR signal enrichment.

Signal intensities for the nucleosome arrays (S and G2/M phases) were converted to log2 enrichment ratios, scaled to allow for proper comparisons, and smoothed with a moving mean in overlapping windows of size 100 bp.

Mean pre-RC, SNS, and MR log2 enrichment profiles.

The pre-RC, SNS, and MR profiles (Fig. 4, A–C; and Fig. 5, A and B) were created by averaging the pre-RC, SNS, G1 phase MR, S phase MR, and G2/M phase MR log2 ratios in a ±1,000 bp neighborhood centered at the maximum peak of the pre-RC or SNS enriched zones (all, top 30%, or bottom 30%). The profiles were calculated using sliding windows 50 bp in size, sliding the window in steps of 10 bp. The mean log2 enrichments at each step were obtained with a 5% trimmed mean to avoid the effects of outliers. The random profiles were obtained as described previously, except that the neighborhood was centered at 250 randomly selected positions with a uniform distribution across the EBV genome (excluding repetitive and dilution regions).
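The profile calculation can be sketched as follows. This Python illustration makes several assumptions that are not stated in the text: probes are single coordinates, values from all zones are pooled within each window before averaging, and the trimmed mean drops floor(n × trim) values from each end (the behavior of the `trim` argument of R's `mean`). The names `trimmed_mean` and `mean_profile` are hypothetical.

```python
import math

def trimmed_mean(values, trim=0.05):
    """Mean after dropping floor(n * trim) values from each end of the sorted data."""
    ordered = sorted(values)
    k = math.floor(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def mean_profile(probes, peaks, flank=1000, window=50, step=10, trim=0.05):
    """probes: list of (position, log2_ratio); peaks: list of peak positions."""
    profile = []
    for offset in range(-flank, flank + 1, step):
        # Pool the probes of every zone that fall in the window at this offset.
        pooled = [v for peak in peaks for pos, v in probes
                  if abs(pos - (peak + offset)) <= window / 2]
        profile.append((offset, trimmed_mean(pooled, trim) if pooled else None))
    return profile

# Toy signal (hypothetical): a narrow enrichment around position 500.
probes = [(pos, 1.0 if abs(pos - 500) <= 20 else 0.0) for pos in range(0, 1001, 10)]
profile = dict(mean_profile(probes, peaks=[500], flank=100, step=50))
```

The trimming step matters mainly when few zones contribute to a window, because a single extreme probe would otherwise dominate the mean at that offset.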

Standard deviation for mean profiles.

Because of the heterogeneity of the data (in some cases caused by small sample sizes), we also obtained the standard deviation of the mean profiles described previously (Fig. S4, A and B). The dotted lines in each panel represent the mean ± 1 standard deviation for each step of the sliding window.

Mean profiles at TSSs.

The mean log2 enrichment profiles centered at the TSSs were obtained according to the description given previously. We first centered the ±1,000-bp neighborhood at the selected TSSs, and then obtained mean log2 enrichments of SNS, G1 phase MR, G2/M phase MR, and S phase MR with a 5% trimmed mean. The profiles were calculated using sliding windows 50 bp in size, sliding the window in steps of 10 bp.

Cluster analysis and heat map generation.

The hierarchical cluster analysis of the 72 TSSs was performed using Ward’s minimum variance method and is based on the G1 phase MR in the region [TSS − 250 bp, TSS + 250 bp]. The heat map shown in Fig. 6 B was produced using the gplots package and represents the G1 phase MR in the ±1,000 bp neighborhood centered at each TSS. Red indicates high nucleosome occupancy, whereas green indicates low occupancy. The IDs of the 72 TSSs are shown on the right axis of the heat map, and the dendrogram obtained with the hierarchical cluster analysis is shown on the left.

Nucleotide base composition at pre-RC and SNS zones.

The mean nucleotide composition profiles were produced by centering a ±250-bp neighborhood at the maximum peak of the SNS enrichment zones (all, top, or bottom), and then averaging the percentage of nucleotide composition of the probes comprising each zone. The profiles were calculated using sliding windows 50 bp in size, sliding the window in steps of 10 bp.
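A sliding-window base composition of the kind described above can be sketched in a few lines. This Python illustration works directly on a sequence string rather than on array probes, which is a simplification; `base_composition` and the toy sequence are hypothetical.

```python
def base_composition(seq, window=50, step=10):
    """Percentage of A, C, G, and T in each sliding window of a sequence."""
    seq = seq.upper()
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        rows.append({base: 100.0 * chunk.count(base) / window for base in "ACGT"})
    return rows

# Toy sequence (hypothetical): 50 A followed by 10 C gives two windows.
rows = base_composition("A" * 50 + "C" * 10)
```

Averaging such per-window percentages over all zones, aligned at their maximum peaks, gives a mean composition profile comparable to the ones described here.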

Note on p-values and box plots.

All box plots were plotted without outliers. In addition, several p-values in the figures and supplementary figures display the value “P < 2.2 × 10⁻¹⁶.” This value recurs because, for numerical reasons, 2.2 × 10⁻¹⁶ is the smallest p-value that the software reports.
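The value 2.2 × 10⁻¹⁶ corresponds to the machine epsilon of double-precision floating-point numbers, which R uses by default as the reporting floor when formatting p-values. The short Python check below shows the same constant; it is an illustration only and not part of the original analysis.

```python
import sys

# Machine epsilon: the gap between 1.0 and the next representable double.
eps = sys.float_info.epsilon
formatted = f"{eps:.1e}"
print(formatted)
```

Any p-value below this threshold is therefore displayed as an upper bound rather than as an exact number.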

Calculation of simulated microarray signals.

Resolution of fragments with a uniform length: To simulate the resolution of microarray signals, we considered a hypothetical genome/chromosome of length l that contains several discrete sites (b ∈ B = {b1, b2, … , bx}), which can be used to isolate subgenomic fragments. We assume that the fragmentation process results in the generation of subgenomic fragments that are uniformly distributed along the genome. Hence, within a fragment population of length l, the coverage of each nucleotide position is identical, and there are an identical number of fragments that begin or end at this position (this is only strictly true for nucleotides that are at least 1 nucleotide away from either end of the parental molecule). Assuming that the presence of a single selection site is sufficient to retain a given fragment, the selection process isolates all fragments that contain at least one site b ∈ B. We refer to this population as $F_B$. For each probe p, the pool of fragments that contributes to its signal is represented by the subpopulation of $F_B$ that contains the target sequence for p. The maximum possible array signal is observed when probe position and a binding site coincide, and thus 100% of fragments that contain p are retained. Hence, a relative and normalized signal for each probe p is calculated by dividing the number of selected fragments that contain p by the total number of fragments that contain p (see also Fig. S1 E), such that:

$$S(p) = \frac{\left| F_B \cap F_p \right|}{\left| F_p \right|}$$

where $S(p)$ denotes the normalized signal of probe p, $F_B$ is the set of fragments retained during selection, $F_p$ is the set of fragments that contain p, $F_B \cap F_p$ is the intersection of $F_B$ and $F_p$, and thus the set of fragments containing p as well as at least one selection site b ∈ B, and ∣ … ∣ denotes the cardinality, i.e., total number of elements/fragments within a given set.

Resolution of fragments with varying length: in ChIP experiments, the fragmentation process results in the generation of a population of fragments of varying length l. Under nonsaturating conditions, the signal of each probe is represented by the sum of the normalized contribution of each fragment pool. To calculate a normalized signal value, we introduce a normalization factor f(l), represented by the frequency distribution of sequence coverage by individual fragment length pools before selection. For example, if the population consists of fragment lengths l1, l2, and l3 that account for 80%, 19%, and 1% of sequence coverage, their frequency values are 0.8, 0.19, and 0.01, respectively. We are not considering potential competition between probes for longer subgenomic fragments because the Agilent protocol includes a step that generates shorter fragments of 50–200 bp immediately before hybridization. Furthermore, fragments are labeled by synthesis using Cy5- and Cy3-labeled nucleotides, and thus the amount of label is directly proportional to the fragment length. Therefore, if fragments of different lengths equally cover a given probe, competition by adjacent probes for longer fragments is compensated for by the higher fluorescence of those fragments. Based on these considerations:

$$S(p) = \sum_{l = l_{min}}^{l_{max}} f(l) \, \frac{\left| F_B^{l} \cap F_p^{l} \right|}{\left| F_p^{l} \right|}$$

where l is the length of subgenomic fragment population, $l_{min}$ and $l_{max}$ are the lower and upper bounds, respectively, of l, f(l) is the coverage frequency for fragments of length l, $F_B^{l}$ is the selected set of fragments of length l, $F_p^{l}$ is the set of fragments of length l that contain p, $F_B^{l} \cap F_p^{l}$ is the intersection of $F_B^{l}$ and $F_p^{l}$, and thus the set of fragments containing p as well as at least one selection site b ∈ B, and ∣ … ∣ denotes the cardinality, i.e., the total number of elements/fragments within a given set.
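The normalized signal described above (selected fragments containing p divided by all fragments containing p, then weighted by the coverage frequency of each fragment length) can be made concrete with a brute-force sketch. The Python code below is an illustration under simplifying assumptions: the genome is linear, one fragment starts at every possible position, and the probe is treated as a single nucleotide. The names `simulated_signal` and `mixed_signal` and all numbers are hypothetical.

```python
def simulated_signal(genome_len, sites, frag_len, probe):
    """Fraction of fragments containing the probe that also contain a selection site."""
    containing_p = 0
    selected = 0
    for start in range(genome_len - frag_len + 1):
        end = start + frag_len - 1  # inclusive end coordinate
        if start <= probe <= end:
            containing_p += 1
            if any(start <= b <= end for b in sites):
                selected += 1
    return selected / containing_p

def mixed_signal(genome_len, sites, length_freqs, probe):
    """Sum over fragment lengths of f(l) times the single-length signal."""
    return sum(freq * simulated_signal(genome_len, sites, length, probe)
               for length, freq in length_freqs.items())

# One selection site at position 500 on a 1,000-nt toy genome.
at_site = simulated_signal(1000, [500], 100, probe=500)
half_way = simulated_signal(1000, [500], 100, probe=550)
outside = simulated_signal(1000, [500], 100, probe=600)
mixed = mixed_signal(1000, [500], {100: 0.8, 200: 0.2}, probe=550)
```

With 100-nt fragments, the signal falls linearly from 1 at the site to 0 at a distance of one fragment length, and adding a pool of longer fragments broadens the peak. This is the sense in which fragment length limits the spatial resolution of the array data.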

Online supplemental material

Fig. S1 A illustrates a small bias introduced by the whole genome amplification step. Fig. S1 B shows genome-wide profiles for Orc2, Mcm3, G1-MNase, and pre-RC. Fig. S1 C shows Southern blot experiments to determine the fragment length distribution of the ChIP-input DNA and the SNS DNA, and a quantitative analysis of the fragment length distribution of the ChIP DNA. Fig. S1 (D and E) shows figures simulating the influence of the fragment length on the resolution of microarray data. Fig. S2 shows a control agarose gel of an MNase digest (A), box plots of G1-MR at pre-RC and SNS-enriched zones (B), and a χ² test demonstrating that pre-RC zones are not independent of MSRs. Fig. S3 A shows an experiment to control the quality of isolated SNS DNA at the published HPRT locus (Cohen et al., 2002). Fig. S3 B shows the replication initiation activity within the EBV genome in 5-kbp steps to facilitate the comparison with the SMARD results (Norio and Schildkraut, 2001, 2004). Fig. S3 C shows that the majority of SNS zones that do not overlap a pre-RC zone by at least 5% of their width have nearby pre-RC zones. Fig. S3 D shows a linear regression between the mean SNS and pre-RC log2 enrichments at SNS enriched zones. Only a minor correlation between both enrichments can be detected. In contrast to this finding, a χ² test on a two-way contingency table revealed a significant relationship between top-pre-RCs and top-SNSs (Fig. S3 E). Fig. S4 shows means and standard deviations of the mean profiles shown in Fig. 4 and Fig. 5. Fig. S5 shows box plots of different IgG antibodies hybridized on the EBV array (A), and a control experiment in the EBV-negative cell line DG75 (B). Tables S1–S8 contain lists of all pre-RC and SNS zones, their locations, sizes, and efficiencies. Table S9 lists all TSS in the EBV genome and their overlap with SNS zones. Table S10 shows the cluster information of SNS zones located in [TSS − 250 bp, TSS + 250 bp]. 
Table S11 lists all primer pairs used for quantitative PCR experiments. Online supplemental material is available at http://www.jcb.org/cgi/content/full/jcb.201109105/DC1.

64 in total

1. The microRNAs of Epstein-Barr Virus are expressed at dramatically differing levels among cell lines.

Authors: Zachary L Pratt; Malika Kuzembayeva; Srikumar Sengupta; Bill Sugden
Journal: Virology Date: 2009-02-12 Impact factor: 3.616

2. Genome-wide studies highlight indirect links between human replication origins and gene regulation.

Authors: Jean-Charles Cadoret; Françoise Meisch; Vahideh Hassan-Zadeh; Isabelle Luyten; Claire Guillet; Laurent Duret; Hadi Quesneville; Marie-Noëlle Prioleau
Journal: Proc Natl Acad Sci U S A Date: 2008-10-06 Impact factor: 11.205

3. Drosophila ORC localizes to open chromatin and marks sites of cohesin complex loading.

Authors: Heather K MacAlpine; Raluca Gordân; Sara K Powell; Alexander J Hartemink; David M MacAlpine
Journal: Genome Res Date: 2009-12-07 Impact factor: 9.043

4. Schizosaccharomyces pombe genome-wide nucleosome mapping reveals positioning mechanisms distinct from those of Saccharomyces cerevisiae.

Authors: Alexandra B Lantermann; Tobias Straub; Annelie Strålfors; Guo-Cheng Yuan; Karl Ekwall; Philipp Korber
Journal: Nat Struct Mol Biol Date: 2010-01-31 Impact factor: 15.369

Review 5. Poly(dA:dT) tracts: major determinants of nucleosome organization.

Authors: Eran Segal; Jonathan Widom
Journal: Curr Opin Struct Biol Date: 2009-02-07 Impact factor: 6.809

6. Dynamic regulation of nucleosome positioning in the human genome.

Authors: Dustin E Schones; Kairong Cui; Suresh Cuddapah; Tae-Young Roh; Artem Barski; Zhibin Wang; Gang Wei; Keji Zhao
Journal: Cell Date: 2008-03-07 Impact factor: 41.582

7. Transcription initiation activity sets replication origin efficiency in mammalian cells.

Authors: Joana Sequeira-Mendes; Ramón Díaz-Uriarte; Anwyn Apedaile; Derek Huntley; Neil Brockdorff; María Gómez
Journal: PLoS Genet Date: 2009-04-10 Impact factor: 5.917

8. Genomic study of replication initiation in human chromosomes reveals the influence of transcription regulation and chromatin structure on origin selection.

Authors: Neerja Karnani; Christopher M Taylor; Ankit Malhotra; Anindya Dutta
Journal: Mol Biol Cell Date: 2009-12-02 Impact factor: 4.138

9. Parameter estimation for robust HMM analysis of ChIP-chip data.

Authors: Peter Humburg; David Bulger; Glenn Stone
Journal: BMC Bioinformatics Date: 2008-08-18 Impact factor: 3.169

10. Distinct modes of regulation by chromatin encoded through nucleosome positioning signals.

Authors: Yair Field; Noam Kaplan; Yvonne Fondufe-Mittendorf; Irene K Moore; Eilon Sharon; Yaniv Lubling; Jonathan Widom; Eran Segal
Journal: PLoS Comput Biol Date: 2008-11-07 Impact factor: 4.475
