Abstract
Replication of mammalian genomes starts at sites termed replication origins, which historically have been difficult to locate as a result of large genome sizes, limited power of genetic identification schemes, and rareness and fragility of initiation intermediates. However, origins are now mapped by the thousands using microarrays and sequencing techniques. Independent studies show modest concordance, suggesting that mammalian origins can form at any DNA sequence but are suppressed by read-through transcription or that they can overlap the 5' end or even the entire gene. These results require a critical reevaluation of whether origins form at specific DNA elements and/or epigenetic signals or require no such determinants.
Year: 2015 PMID: 25601401 PMCID: PMC4298691 DOI: 10.1083/jcb.201407004
Source DB: PubMed Journal: J Cell Biol ISSN: 0021-9525 Impact factor: 10.539
Figure 1. Replication mapping by DNA fiber techniques. (A and B) DNA fiber autoradiography (Cairns, 1963; Huberman and Riggs, 1968). Cells are labeled with [3H]thymidine, gently lysed on a glass slide, covered with photographic emulsion, and exposed for several months to reveal the stretches of radiolabeled spread DNA as silver grain tracks. (A) Intact E. coli chromosomal DNA labeled for several generations showed θ forms, suggesting replication by the fork mechanism and a single initiation event per chromosome. Brief sequential pulses of low and high activity produced grain tracks denser on both ends than in the middle, providing evidence for bidirectional replication (Prescott and Kuempel, 1972). (B) Pulse-labeled DNA from eukaryotic cells showed tandem tracks, indicating multiple bidirectional origins (Huberman and Riggs, 1968). (C) In DNA fiber fluorography, cells are typically pulsed with chlorodeoxyuridine (CldU) and then iododeoxyuridine (IdU; or vice versa), and the labeled tracks are detected with appropriate fluorescent antibodies, shortening imaging times to seconds. DNA can be spread by direct cell lysis on a glass slide (Jackson and Pombo, 1998), by attachment of purified DNA molecule ends to silanized coverslips before parallel stretching by a receding air/water meniscus (DNA combing; Michalet et al., 1997), or by capillary stretching between slide and coverslip (Norio and Schildkraut, 2001). CldU/IdU detection (green/blue) can be combined with FISH with specific DNA probes (red) to identify and orient target DNA molecules (Norio and Schildkraut, 2001; Anglana et al., 2003). Unlabeled DNA (dotted line) may be simultaneously detected in a fourth color with anti-DNA antibodies. In a variation called SMARD (Norio and Schildkraut, 2001), the labeled DNA is cut with a rare cutter restriction endonuclease, and a large (100–500 kb) target fragment is enriched by pulsed-field gel electrophoresis before stretching and detection. In this case, long CldU/IdU labeling times are needed to chase forks out of the labeled fragments before their electrophoretic separation. Tens to hundreds of single DNA molecules 100–1,000 kb in size identified by FISH are analyzed in a typical SMARD or DNA combing experiment.
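The logic of reading a dual-labeled fiber in panel C can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a published analysis pipeline: the `Track` type, the function names, and the coordinates are hypothetical, tracks are assumed to be already segmented and measured in kb, and the origin is placed at the midpoint between two diverging forks, which assumes that both forks move at the same speed.

```python
from typing import List, NamedTuple, Tuple


class Track(NamedTuple):
    start: float  # kb along the stretched fiber (hypothetical units)
    end: float
    label: str    # "first" (e.g. CldU pulse) or "second" (e.g. IdU pulse)


def fork_directions(tracks: List[Track], tol: float = 1e-6) -> List[Tuple[float, int]]:
    """Return (junction position, direction) for each first/second junction.

    direction is +1 when the fork moved toward increasing coordinates
    (first label on the left, second on the right) and -1 otherwise.
    """
    forks = []
    ordered = sorted(tracks)
    for left, right in zip(ordered, ordered[1:]):
        if abs(left.end - right.start) > tol:
            continue  # unlabeled gap between tracks: not a label junction
        if left.label == "first" and right.label == "second":
            forks.append((left.end, +1))
        elif left.label == "second" and right.label == "first":
            forks.append((left.end, -1))
    return forks


def infer_origins(forks: List[Tuple[float, int]]) -> List[float]:
    """Place an origin midway between a leftward fork and the next rightward fork.

    The midpoint is an assumption (equal fork speeds on both sides).
    """
    origins = []
    for (pos1, dir1), (pos2, dir2) in zip(forks, forks[1:]):
        if dir1 == -1 and dir2 == +1:  # diverging forks
            origins.append((pos1 + pos2) / 2)
    return origins


# Hypothetical fiber: second-first-second pattern, i.e. an origin that fired
# during the first pulse and whose two forks continued during the second pulse.
fiber = [Track(0, 10, "second"), Track(10, 30, "first"), Track(30, 40, "second")]
print(infer_origins(fork_directions(fiber)))  # [20.0]
```

The opposite arrangement, a rightward fork followed by a leftward fork, would mark a termination event and could be scored the same way.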
Figure 2. Replication mapping by restriction fragment shape or strand composition analysis. (A) Neutral/alkaline 2D gel technique (Huberman et al., 1987). A restriction digest of total DNA is enriched for partially single-stranded, replication fork–containing fragments, by chromatography on BND (benzoylated naphthoylated DEAE)-cellulose. The enriched material is first separated in neutral agarose so that replication intermediates (RIs) of each fragment are resolved according to mass (horizontal arrows). Parental and nascent strands are then melted and resolved in an orthogonal direction in alkaline agarose (vertical arrows). After membrane transfer, center or end probes (noted left [L], middle [M], and right [R]) are used to reveal whether nascent strands grow from the center by internal initiation or from either end by entry of outside-initiated forks. The diagonal smear of nascent strands detected by each probe is indicated in the same color as the probe. (B) Neutral/neutral 2D gel technique (Brewer and Fangman, 1987). A restriction digest of total DNA is enriched in replication intermediates and separated in a first electrophoresis as in A. Branched fragments of similar mass but various shapes are then resolved in an orthogonal neutral electrophoresis using conditions that maximize contribution of shape to migration rate. Transfer and hybridization reveal whether the fragment’s replication intermediates contain two diverging forks (bubbles, internal initiation), one fork (simple Ys, passive replication), or two converging forks (double Ys, termination; not depicted). Panels illustrate the patterns obtained in case of centered, off-centered, or random initiation within the restriction fragment. (C) Bubble trap (Mesner et al., 2006). A restriction digest of total DNA enriched in replication intermediates by isolation on the nuclear matrix and chromatography on BND-cellulose is mixed with molten agarose, allowed to solidify, and electrophoresed out of the agarose plug. Bubbles become topologically trapped in the gel as a result of agarose fiber polymerization through their circular structure, whereas replication intermediates of other shapes can migrate out of the plug. Trapped bubbles are then cloned in a plasmid library and either hybridized to microarrays (Mesner et al., 2011) or sequenced (Mesner et al., 2013). Library purity is estimated at >80% by 2D gel analysis of the trapped material or by probing 2D gels of total replication intermediates with individual clones and scoring for a bubble arc.
Figure 3. Replication mapping by nascent strand abundance or polarity analysis. (A) Schematic drawing of nascent strands, Okazaki fragments, and leading strands synthesized at an origin. (B) Simplified flowcharts for isolating short nascent strands (SNSs), Okazaki fragments, or leading strands. (C) Principles of origin mapping by replicative strand analysis. In SNS abundance assays (B, 1–3; and C, 1), total DNA is denatured and short single strands (SSS) in the 0.5–3-kb range are isolated on sucrose gradients or agarose gels, taking care to exclude the smaller Okazaki fragments (<0.5 kb). SSS include SNS synthesized specifically at origins as well as inadvertently sheared or nicked strands, which sample the entire genome and typically form the vast majority of SSS molecules. SNS enrichment has been achieved (a) by lysing cells directly into the well of an alkaline agarose gel to minimize breakage before size fractionation (B, 1; in-gel lysis [IGL]-SNS), (b) by labeling newly synthesized DNA with BrdU and purifying the Br-DNA by immunoprecipitation (B, 2) or isopycnic centrifugation (B, 3; BrdU-SNS), and (c) by treating SSS with λ-exonuclease, a 5′-exonuclease that digests DNA but not RNA, to eliminate all DNA strands except nascent strands with an attached RNA primer (B, 4; λ-SNS). The latter strategy requires heat denaturation and neutral gradient purification of SSS to avoid RNA primer hydrolysis. Origins can be mapped by determining relative SNS abundance at closely spaced genomic positions (C, 1) by quantitative PCR (Vassilev et al., 1990), microarray hybridization (Lucas et al., 2007), or high-throughput sequencing (Besnard et al., 2012). Alternatively, SNS have been metabolically labeled with radioactive precursors and used to probe macroarrays representing highly amplified genomic loci (Dijkwel et al., 2002). (B, 4; and C, 2) Lagging-strand polarity assay (Hay and DePamphilis, 1982). Cells are pulse labeled with BrdUTP and radioactive precursors. Labeled Okazaki fragments are purified by size and immunoprecipitation and hybridized to immobilized, strand-specific probes spanning the locus of interest to determine the lagging-strand template (Burhans et al., 1990; Wang et al., 1998). Alternatively, Okazaki fragments accumulated after ligase inactivation are size purified and sequenced (Smith and Whitehouse, 2012). Template switches of opposite direction are observed at initiation and termination sites. The length of DNA over which the switch occurs indicates the size of the initiation or termination zone. (B, 5; and C, 3) Leading-strand polarity assay (Handeli et al., 1989). Cells are treated with emetine to prevent lagging-strand synthesis. Leading strands are density labeled with BrdU, isolated by isopycnic centrifugation, and hybridized with strand-specific probes spanning the locus of interest to determine the template of leading strand synthesis. The precise mechanism by which emetine, a protein synthesis inhibitor, specifically inhibits lagging-strand synthesis is still unclear (Burhans et al., 1991). (C, 4) Replication initiation point mapping (Bielinsky and Gerbi, 1998). (top) SNS 5′ ends can be mapped at nucleotide resolution by extension of a labeled downstream primer (red + blue arrows) followed by sequencing gel electrophoresis. (bottom) The leading-strand start sites are distinguished from the 5′ end of upstream joined Okazaki fragments by preventing joining in yeast ligase mutants (Bielinsky and Gerbi, 1999) or lagging-strand synthesis in mammalian cells treated with emetine (Abdurashidova et al., 2000). Ligation-mediated (Abdurashidova et al., 2000) or one-way (Romero and Lee, 2008) PCR amplification of SNS has been used to increase the sensitivity to the level required for the human genome.
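The template-switch principle of the lagging-strand polarity assay (C, 2) can be illustrated with a short Python sketch for the sequencing-based variant. It assumes binned counts of Okazaki fragment reads on the two strands, here called Watson (forward) and Crick (reverse); these names, the bin layout, and the counts are hypothetical and not taken from the studies cited above. Because Okazaki fragments are lagging-strand products, rightward-moving forks yield Crick-strand fragments and leftward-moving forks yield Watson-strand fragments, so a Watson-to-Crick switch read from left to right marks initiation and the opposite switch marks termination.

```python
from typing import List, Tuple


def fork_directionality(watson: List[int], crick: List[int]) -> List[float]:
    """Per-bin bias (C - W) / (C + W): +1 means all forks move rightward,
    -1 means all forks move leftward. Empty bins are reported as 0.0."""
    rfd = []
    for w, c in zip(watson, crick):
        total = w + c
        rfd.append((c - w) / total if total else 0.0)
    return rfd


def classify_switches(rfd: List[float]) -> List[Tuple[str, int]]:
    """Report sign changes between adjacent bins.

    Negative-to-positive marks initiation (diverging forks);
    positive-to-negative marks termination (converging forks).
    """
    switches = []
    for i in range(1, len(rfd)):
        if rfd[i - 1] < 0 < rfd[i]:
            switches.append(("initiation", i))
        elif rfd[i - 1] > 0 > rfd[i]:
            switches.append(("termination", i))
    return switches


# Hypothetical read counts in six consecutive bins across a locus.
watson = [90, 80, 20, 10, 15, 85]
crick = [10, 20, 80, 90, 85, 15]
print(classify_switches(fork_directionality(watson, crick)))
# [('initiation', 2), ('termination', 5)]
```

This sketch only flags abrupt sign changes between neighboring bins. In real data the switch may extend over many bins, and, as the caption notes, the length of DNA over which it occurs indicates the size of the initiation or termination zone.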
Summary of origin features reported in mammalian genome-wide mapping studies
| Study | Origin purification | Detection | Genome span | Cell type | Origin number | Mean origin spacing | Mean origin size | Main origin features |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | SSS | Microarray | 1,425 | 11365 | 32 | <50 | <5 | 80% within transcription units. Correlated with chromatin acetylation. |
| | λ-SNS | Microarray | 30 (ENCODE) | HeLa | 283 | 63 | <5 | Clustered in GC-rich regions. Rare in GC-poor regions. Associated with CGI, TRE, and c-JUN and c-FOS BSs, with open chromatin due to CGIs, with DNase HSSs and with evolutionarily conserved regions. |
| | λ-SNS/BrdU-SNS | Microarray | 30 (ENCODE) | HeLa | 150 | 58/28 | 1.4/1.7 | AT-rich but within GC-rich regions, associated with conserved evolutionary elements. λ-SNS and λ-SNS + BrdU SNS intersects enriched in TSSs, but BrdU-SNS specific peaks depleted in TSSs. |
| | Bubble trap | Microarray | 30 (ENCODE) | Early S HeLa/HeLa/GM06990 | 111 (646)/128 (657)/177 (988) | 58/69/41 | 15.2/18.1/14.5 | Broad initiation zones covering 15–22% of the genome, within intergenic regions as well as within or overlapping active and inactive genes. 20% encompass 5′ end of or entire active genes and activating histone marks. Overlap by only ∼1/3 between cell types and affected by synchronization. |
| | SSS, λ-SNS/λ-SNS | Microarray sequencing | 34/3,000 (WG) | MCF-7, BT474, H520/MCF-7 | 8,281 | 4 | NR | >70% conserved in all cell lines. Enriched at active TSSs and in H3K4me3 and Pol-II. Associated with conserved evolutionary elements. |
| | λ-SNS | Sequencing | 3,000 (WG) | K562, MCF-7 | NR | NR | NR | Clustered near regions of moderate transcription. Rare in highly transcribed or nontranscribed regions. Excluded from TSSs but enriched ∼0.5 kb downstream. Strongly associated with meCpGs and DNase HSSs./Weakly associated with umCpGs, miRNA transcripts, CTCF, Pol-II and c-JUN BSs, H3K4 me, H3K29ac, and H3K27ac. |
| | λ-SNS | Sequencing | 3,000 (WG) | HeLa, IMR-90, hESC H9, iPSCs from IMR-90 | 250,000 | 11 | 0.5 | Often grouped in clusters (mean size of 11 kb). At saturation cover ∼10% of the genome. One half within genes, <18% with TSS and CGI. 65–84% pairwise overlap between cell lines, few and inefficient cell type–specific origins. Density correlated with percentage of GC, timing, and efficiency. 91% associated with G4. Strand asymmetric distribution of G, C, and G4. |
| | Bubble trap | Sequencing | 3,000 (WG) | GM06990 | 72,812 (123,297) | NR | 20 | Broad initiation zones covering 24% of the genome. 17,999 early, 25,735 mid-, and 29,020 late-replicating zone of mean size 27 kb, 18 kb, and 16 kb./Early zones more focused and efficient than late zones. Majority in nontranscribed DNA regardless of firing time. Early but not mid- and late zones associated with transcribed genes and activating marks DNase I HSSs (58%), H3K4me3, H3K27me3, H3K36me3, and CTCF BSs. At megabase scale, late zones anticorrelated with both activating and repressive marks. Densities were highest in both highly accessible and highly compact chromatin. |
| | ORC1-ChIP | Sequencing | 3,000 (WG) | HeLa | 13,600 | NR | NR | Mostly associated with TSSs of coding and noncoding RNAs. 39% of all expressed TSSs in HeLa cells. Most and least transcribed sites associated with coding and noncoding RNAs, respectively. No consensus sequence. |
| | λ-SNS | Sequencing | 3,000 (WG) | K562 | 59,185 | NR | 3.4 | Reanalyzed data of |
| | λ-SNS/BrdU-SNS | Sequencing | 3,000 (WG) | Primary basophilic erythroblasts | 100,000 | NR | NR | Association with G4 (37%), CGIs (7%), and TSSs (13%). DNase I HSSs associated with but not required for origin formation. |
| | λ-SNS | Microarray | 10.1 | mESC PGK12, MEFs, NIH-3T3 | 97 | 103 | NR | Most within transcription units. Half at CGI promoters. Efficiency conserved across cell types and correlated with embryonic TSSs. |
| | λ-SNS | Microarray | 60.4/118.3 | mESC GCR8, mTC P19, MEFs/Kc ( | 2,748/6,184 | 21/19 | NR | 44% conserved between cell types. Spacing fivefold smaller than IOD on combed DNA. Inferred firing efficiency 20%. Preferentially intragenic./Bimodal distribution of SNS around CGI. G-rich motifs and local nucleotide skew./ |
WG, whole genome; NR, not reported; iPSCs, induced pluripotent stem cells; mESC, mouse embryonic stem cell; hESC, human embryonic stem cell; MEFs, mouse embryonic fibroblasts; mTC, mouse teratocarcinoma; CGI, CpG islands; TRE, transcriptional regulatory elements; ChIP, chromatin immunoprecipitation; HSS, hypersensitive site; CTCF, CCCTC-binding factor; Pol II, RNA polymerase II; meCpG, methylated CpG dinucleotide; umCpG, unmethylated CpG; G4, G-quadruplex elements; BS, binding site; IOD, inter-origin distance; HP1, heterochromatin protein 1. For Mesner et al. (2011, 2013), the numbers in parentheses indicate the number of individual EcoRI fragments clustering into the indicated number of initiation zones.
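The pairwise overlap percentages quoted in the table depend on how overlap is counted. As a minimal sketch, assuming origins are given as half-open (start, end) intervals on a single chromosome and that any shared base counts as an overlap, the fraction of one set found in another could be computed as below. The function name and all coordinates are hypothetical, and the studies summarized above may have used different criteria.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end), same chromosome assumed


def overlap_fraction(a: List[Interval], b: List[Interval]) -> float:
    """Fraction of intervals in `a` that share at least one base with `b`."""
    if not a:
        return 0.0
    hits = 0
    for start, end in a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(b_start < end and start < b_end for b_start, b_end in b):
            hits += 1
    return hits / len(a)


# Hypothetical origin calls from two cell lines.
line_1 = [(100, 200), (500, 600), (900, 1000), (1500, 1600)]
line_2 = [(150, 250), (580, 620), (2000, 2100)]

print(overlap_fraction(line_1, line_2))  # 0.5
print(overlap_fraction(line_2, line_1))  # 0.6666666666666666
```

The two calls give different values because the measure is not symmetric: the denominator is the size of the first set, so a comparison between studies with very different origin numbers reads differently depending on which set is taken as the reference.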