
Recent Advances in the Identification of Replication Origins Based on the Z-curve Method.

Abstract

Precise DNA replication is critical for the maintenance of genetic integrity in all organisms. In all three domains of life, DNA replication starts at a specialized locus, termed the replication origin (oriC or ORI), and its identification is vital to understanding the complex replication process. In bacteria and eukaryotes, replication initiates from single and multiple origins, respectively, while archaea can adopt either of the two modes. The Z-curve method has been successfully used to identify replication origins in genomes of various species, including multiple oriCs in some archaea. Based on the Z-curve method and comparative genomics analysis, we have developed a web-based system, Ori-Finder, for finding oriCs in bacterial genomes with high accuracy. Predicted oriC regions in bacterial genomes are organized into an online database, DoriC. Recently, archaeal oriC regions identified by both in vivo and in silico methods have also been included in the database. Here, we summarize recent advances in the in silico prediction of oriCs in bacterial and archaeal genomes using the Z-curve-based method.


Keywords: Archaea; Bacteria; Comparative genomics; Origins of replication; Z-curve.

Year: 2014 PMID: 24822028 PMCID: PMC4009838 DOI: 10.2174/1389202915999140328162938

Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.236

INTRODUCTION

In 1963, Jacob, Brenner, and Cuzin proposed the replicon model, in which the replicon was defined as the fundamental unit of replication [1]. The initiator protein (bacterial DnaA or archaeal Orc1/Cdc6) binds a sequence within the replicon called the replicator (the bacterial DnaA box or archaeal ORB element), and DNA synthesis then initiates from a specific site called the origin of replication [1]. The events that occur at the replication origin (oriC or ORI) are central to the regulation of DNA replication and the cell cycle. Therefore, it is important to precisely identify the replication origins within the analyzed genomes. This critical information allows us to better understand not only the structure and function of the replication origins, but also the mechanisms of DNA replication [2, 3]. The oriC regions can be identified by several experimental methods, including construction of replicative oriC plasmids [4, 5], microarray-based [6] or high-throughput sequencing-based [7] marker frequency analysis, and two-dimensional gel electrophoresis analysis [8]. The experimental methods for identifying replication origins in vivo are reliable, but time-consuming and labor-intensive. The identification of replication origins by in silico analysis has been the subject of intensive study in the last two decades. The pioneering in silico approach to identifying oriCs was GC-skew analysis [9, 10], and the cumulative GC-skew was later proposed to provide better resolution [11]. An oligomer-skew method was also proposed to predict oriC regions in bacterial genomes [12]. The same method was later used to identify oriCs in more than 200 prokaryotic chromosomes [13]. Combining GC-skew analysis with the location of the dnaA gene and the distribution of DnaA boxes led to more accurate prediction of oriC regions [14]. The Z-curve method was developed in 1994 as a way to display base composition distributions along DNA sequences [15]. The x and y components of the Z-curve are related to the distributions of RY (purine/pyrimidine) and MK (amino/keto), as well as GC and AT bases, and can be used to identify oriC regions in bacterial and archaeal genomes [16]. For instance, Z-curve analysis predicted a single oriC in the archaeal genomes of Methanosarcina mazei Go1 [17] and Methanocaldococcus jannaschii DSM 2661 [18], two oriCs in Halobacterium species NRC-1 [19], and three oriCs in the Sulfolobus solfataricus P2 genome [19]. These predictions were consistent with later in vivo experimental evidence, e.g., that obtained in studies of Halobacterium species NRC-1 [20, 21] and Sulfolobus solfataricus P2 [6, 8]. Based on the Z-curve method, a web-based system, Ori-Finder [22], has been developed to find oriCs in over 2,000 bacterial genomes, including Sorangium cellulosum 'So ce 56', Microcystis aeruginosa NIES-843 [23] and Cyanothece 51142 [24]. The predicted oriC regions have been organized into DoriC [25], a database of oriC regions in bacterial genomes. Recently, the database has been updated to include the oriC regions in archaeal genomes [26]. With the advent of the post-genomic era, genomic data have been accumulating exponentially [27]. However, the locations of a large number of oriCs in sequenced bacterial and archaeal genomes remain unknown. This has created challenges as well as opportunities for identifying these oriCs by in silico analysis. Clarification of the archaeal replication mechanism is particularly important, as it may provide insight into the replication mechanisms of eukarya.
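The cumulative GC-skew mentioned above can be illustrated with a minimal Python sketch. It assumes the common definition of GC skew, (G - C)/(G + C), computed in non-overlapping windows and summed along the sequence; the function name, the window size and the toy sequence are illustrative assumptions and do not reproduce any published tool. In many bacterial genomes the minimum of the cumulative curve lies near the origin and the maximum near the terminus, but this should be treated as a heuristic rather than a rule.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return the cumulative GC skew, one value per non-overlapping window.

    Skew per window is (G - C) / (G + C); windows without G or C contribute 0.
    """
    seq = seq.upper()
    curve = []
    total = 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        curve.append(total)
    return curve


if __name__ == "__main__":
    # Toy sequence (an assumption for illustration): C-rich, then G-rich.
    toy = "C" * 20 + "G" * 20
    curve = cumulative_gc_skew(toy, window=10)
    lowest = min(range(len(curve)), key=curve.__getitem__)
    print("cumulative skew:", curve)
    print("window index of minimum:", lowest)
```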

METHODS

Z-curve and RY, MK, AT or GC Disparity Curves

The Z-curve is a three-dimensional curve that constitutes a unique representation of a DNA sequence, such that the Z-curve and the given DNA sequence can each be uniquely reconstructed from the other [15]. The three components of the Z-curve, xn, yn and zn, represent three independent distributions that completely describe the DNA sequence being studied. The components xn, yn and zn display the distributions of purine versus pyrimidine (R vs. Y), amino versus keto (M vs. K) and strong H-bond versus weak H-bond (S vs. W) bases, respectively, along the DNA sequence. The xn and yn components are termed the RY and MK disparity curves, respectively. The AT and GC disparity curves are defined by (xn + yn)/2 and (xn – yn)/2, which show the excess of A over T and of G over C, respectively, along the genome. The RY and MK disparity curves, as well as the AT and GC disparity curves, can be used to predict replication origins [16]. For instance, Z-curves (that is, the RY, MK, AT and GC disparity curves) show a single oriC in the genome of the bacterium Cyanothece sp. PCC 7425 (Fig. ) and one, two, and three oriCs in the genomes of the archaea Pyrococcus abyssi GE5, Halobacterium sp. NRC-1, and Sulfolobus acidocaldarius DSM 639, respectively (Fig. ).
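The definitions above can be written out as a short Python sketch. It assumes the standard Z-curve formulas from [15], in which, for the first n bases, xn = (An + Gn) - (Cn + Tn), yn = (An + Cn) - (Gn + Tn) and zn = (An + Tn) - (Gn + Cn), where An, Cn, Gn and Tn are cumulative base counts. The function names are illustrative, and skipping ambiguous bases such as N is an assumption of this sketch.

```python
def z_curve(seq):
    """Return the x, y and z components of the Z-curve as three lists."""
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    xs, ys, zs = [], [], []
    for base in seq.upper():
        if base in counts:  # ambiguous bases leave the counts unchanged
            counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        xs.append((a + g) - (c + t))  # purine vs. pyrimidine (RY disparity)
        ys.append((a + c) - (g + t))  # amino vs. keto (MK disparity)
        zs.append((a + t) - (g + c))  # weak vs. strong H-bond
    return xs, ys, zs


def at_gc_disparity(xs, ys):
    """Return the AT and GC disparity curves, (x + y)/2 and (x - y)/2."""
    at = [(x + y) / 2 for x, y in zip(xs, ys)]  # excess of A over T
    gc = [(x - y) / 2 for x, y in zip(xs, ys)]  # excess of G over C
    return at, gc


if __name__ == "__main__":
    xs, ys, zs = z_curve("AGCT")
    at, gc = at_gc_disparity(xs, ys)
    print("x:", xs, "y:", ys, "z:", zs)
    print("AT disparity:", at, "GC disparity:", gc)
```

Because (xn + yn)/2 reduces to An - Tn and (xn - yn)/2 to Gn - Cn, the two derived curves can be checked directly against cumulative base counts.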

Ori-Finder and DoriC

Ori-Finder is an online system for finding oriCs in bacterial genomes based on an integrated method that combines analysis of base composition asymmetry using the Z-curve method, the distribution of DnaA boxes, and the occurrence of genes frequently adjacent to oriCs. Currently, Ori-Finder version 1.0, available at http://tubic.tju.edu.cn/Ori-Finder/, is designed only for the identification of oriCs in bacterial genomes. Ori-Finder has been used to analyze roughly 50 newly sequenced bacterial genomes, such as Corynebacterium pseudotuberculosis FRC41 [28], Orientia tsutsugamushi Ikeda [29], Bacillus pseudofirmus OF4 [30], Klebsiella pneumoniae subsp. pneumoniae HS11286 [31], Streptococcus parasanguinis FW213 [32], Acinetobacter baumannii MDR-TJ [33], Streptococcus infantarius subsp. infantarius CJ18 [34] and Streptococcus equi ssp. zooepidemicus strain ATCC35246 [35]. The oriC regions predicted by Ori-Finder in bacterial genomes have been organized into an online database, DoriC, which has been publicly available at http://tubic.tju.edu.cn/doric since 2007. In the six years since DoriC was constructed, the number of bacterial genomes it covers has increased about four-fold. Additionally, oriC regions in archaeal genomes identified by in vivo experiments as well as in silico analyses have been added to the database. Consequently, the latest release, DoriC 6.5, contains oriCs for more than 2,000 bacterial genomes and 100 archaeal genomes. Each entry contains detailed information about the oriC, such as the sequence, repeats, and DnaA box or ORB motifs, together with graphical representations of the oriC, such as the various disparity curves (RY, MK, AT and GC). Users can browse the database by species name or by GenBank or DoriC accession number, search for oriCs by organism name, accession number, lineage, or keyword, and explore the genomic context around the oriC regions via NCBI Map Viewer or the UCSC Archaeal Genome Browser by clicking the corresponding links provided by DoriC. In addition, users can select the ‘BLAST’ option to compare a query sequence or even a whole genome against DoriC to find homologous oriCs. DoriC has been widely used as a source of data in comparative genomics analysis [36-42].

RESULTS AND DISCUSSION

Replication Origins in Cyanobacteria

Based on DoriC, the relationships between the conserved features associated with the oriC regions, such as adjacent genes and DnaA boxes, and the taxonomic levels of the corresponding bacteria can be summarized. For example, detailed analyses have shown that the consensus sequence of the DnaA boxes in oriC regions, and the distribution of genes around oriCs, are strongly conserved among the bacteria in the phylum cyanobacteria [24]. The position of the oriC, adjacent to dnaN gene which encodes the beta clamp processivity factor, has been found to be universal among the bacteria within the phylum cyanobacteria. The ‘species-specific’ DnaA box motif for the phylum cyanobacteria is ‘TTTTCCACA’ instead of ‘TTATCCACA’, the DnaA box motif of Escherichia coli [43]. These strongly conserved features indicate that the in silico identified oriCs are reliable, as they have been confirmed by comparative genomics approaches. As we expected, the experimentally confirmed replication origins of Anabaena sp. PCC 7120 [4] and S. elongatus PCC 7942 [44] in the phylum cyanobacteria are all adjacent to the dnaN gene. Recently, coverage of the cyanobacterial phylum has improved significantly using diversity-driven genome sequencing [45], and some exceptions to the proposed rules have been uncovered in the process. For example, a cluster of DnaA boxes with perfect matches to the motif ‘TTTTCCACA’ has been found adjacent to dnaA gene instead of dnaN gene in Anabaena sp. 90, Geitlerinema sp. PCC 7407 and Synechococcus sp. PCC 6312. For Dactylococcopsis salina PCC 8305, Halothece sp. PCC 7418, Leptolyngbya sp. PCC 7376 and Thermosynechococcus elongatus BP-1, a cluster of DnaA boxes with perfect matches to the motif ‘TTTTCCACA’ has been found adjacent to neither dnaA nor dnaN (Table ). 
Perhaps the ancestral position of the replication origins in the phylum cyanobacteria was within the dnaA-dnaN intergenic region, and the translocation of the dnaA or dnaN gene from the putative origin of replication to another place on the chromosome has led to some origins linked only to dnaN or dnaA gene. If the oriC region instead of dnaA or dnaN gene had translocated away from its ancestral position, origins would be linked to neither dnaA nor dnaN genes.
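A cluster of DnaA boxes such as the one described above can be located with a simple motif scan. The following Python sketch searches both strands for the cyanobacterial motif ‘TTTTCCACA’ and optionally tolerates a number of mismatches; the function names, the mismatch parameter and the toy sequence are assumptions made for illustration and are not the procedure used by Ori-Finder.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_dnaa_boxes(seq, motif="TTTTCCACA", max_mismatches=0):
    """Return sorted (start, strand, mismatches) hits, 0-based, on both strands."""
    seq = seq.upper()
    hits = []
    patterns = (("+", motif.upper()), ("-", reverse_complement(motif)))
    for strand, pattern in patterns:
        for start in range(len(seq) - len(pattern) + 1):
            window = seq[start:start + len(pattern)]
            mismatches = sum(a != b for a, b in zip(window, pattern))
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return sorted(hits)


if __name__ == "__main__":
    # Toy region (an assumption): one box on each strand.
    region = "GG" + "TTTTCCACA" + "CC" + "TGTGGAAAA" + "GG"
    for start, strand, mismatches in find_dnaa_boxes(region):
        print(start, strand, mismatches)
```

With max_mismatches=1, the same scan also reports the E. coli motif ‘TTATCCACA’, which differs from the cyanobacterial motif at a single position.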

Replication Origins in Some Intracellular Bacteria

Some bacteria are intracellular parasites or symbionts. Recently, the genome of Blattabacterium cuenoti, the primary endosymbiont of the omnivorous cockroach Blatta orientalis, has been completely sequenced [46]. In their report, Patiño-Navarrete et al. concluded that ‘Similar to previously sequenced Blattabacterium strains, the strain from Blatta orientalis does not possess any features determining replication origin.’ Based on the results of Ori-Finder and DoriC for the genomes of Blattabacterium strains, we have identified candidate oriC regions adjacent to the gidA gene, which encodes glucose-inhibited division protein A. These regions contain putative DnaA boxes and repeat elements. The location of oriCs adjacent to the gidA gene is common among intracellular bacteria such as the secondary endosymbiont of Heteropsylla cubana, the secondary endosymbiont of Ctenarytaina eucalypti, Wigglesworthia glossinidia endosymbiont of Glossina morsitans morsitans (Yale colony), and Wigglesworthia glossinidia endosymbiont of Glossina brevipalpis. However, for Wolbachia endosymbionts (Wolbachia endosymbiont of Drosophila melanogaster, Wolbachia endosymbiont strain TRS of Brugia malayi, Wolbachia pipientis, Wolbachia sp. wRi, and Wolbachia endosymbiont of Onchocerca ochengi), we have identified candidate oriC regions adjacent to the hemE gene, which encodes uroporphyrinogen decarboxylase. The replication origin of Orientia tsutsugamushi, an obligate intracellular bacterium belonging to the family Rickettsiaceae, is also predicted by Ori-Finder to be adjacent to the hemE gene. For Mollicutes, whose genomes have undergone considerable reduction because of a parasitic lifestyle, the oriCs are adjacent to the dnaA gene. Interestingly, for Chlamydiae, a phylum of bacteria whose members are obligate intracellular pathogens, oriCs are adjacent to the hemB gene, which encodes delta-aminolevulinic acid dehydratase, instead of the dnaA gene, although their genomes contain two dnaA genes according to annotations in GenBank.

Multiple Replication Origins in Pyrobaculum calidifontis JCM 11548

The number of oriCs in archaea has been found to correlate with the phylogeny. For example, all the archaea within the phylum Crenarchaeota examined to date contain multiple origins [7]. Recently, four chromosome replication origins in the archaeon Pyrobaculum calidifontis JCM 11548 have been mapped by using high-throughput sequencing-based marker frequency analysis [7]. However, only one origin (oriC1) among the four can be mapped in detail to a precise location, which is within an intergenic region between the gene Pcal_0001 and a cdc6 gene, from 309 nt to 378 nt (Fig. ). Within the oriC1, there are two palindromic sequences (blue) annotated as Orb-1 elements [7]. The location of oriC, flanked by tRNA genes, is universal among the archaea in the class Thermoprotei within the phylum Crenarchaeota. For example, we found that the origins were adjacent to tRNA genes in Sulfolobus solfataricus P2, Sulfolobus tokodaii str. 7, Sulfolobus acidocaldarius DSM 639, Sulfolobus islandicus Y.N.15.51, Sulfolobus solfataricus 98/2, Metallosphaera cuprina Ar-4, Acidianus hospitalis W1, and Thermofilum pendens Hrk 5. Based on this conserved feature, the other three putative origins of replication in Pyrobaculum calidifontis JCM 11548 have been identified at the sequence level (Fig. ). The putative oriC2 is within an intergenic region between the gene Pcal_0541 and Pcal_0542, from 514,406 nt to 514,741 nt (Fig. ). The putative oriC3 is within an intergenic region between the gene Pcal_1006 and Pcal_1007, from 950,832 nt to 951,332 nt (Fig. ). The putative oriC4 is within an intergenic region between the gene Pcal_1820 and Pcal_1821, from 1,687,883 nt to 1,688,541 nt (Fig. ). Among the predicted oriCs, the putative oriC2 shares a long sequence, ‘atcccgtccccgttcagggggcgtgggttcaaatcccacccccggcgtgt’, with the putative oriC3. 
These three putative oriC regions all contain a 13-mer consensus element, ‘GGGTTCAAATCCC’, which has also been found in the oriCs of closely related species such as Sulfolobus solfataricus P2, Acidianus hospitalis W1, and Metallosphaera cuprina Ar-4. We also found that the putative oriC2 and oriC3 share a common sequence, ‘gccggggtggccgagcggcccaaggcg’, with the putative origin of Thermofilum pendens Hrk 5, and that the putative oriC4 shares a sequence, ‘atcccgggttcaaatcccggccg’, with the origins of Sulfolobus solfataricus P2, Acidianus hospitalis W1, and Metallosphaera cuprina Ar-4. Some conserved genes associated with oriCs, such as the copG gene encoding a plasmid copy number control protein, were also found around the predicted oriCs. The replication origin was flanked by a tRNA gene and a copG gene, which could play a fundamental role in shaping the origin-containing loci [47]. Around the putative oriC2, there is a tRNA-Ser gene (514,425..514,522 nt) recognizing UCA codons and a gene, Pcal_0536 (510,064..510,240 nt), encoding a CopG/Arc/MetJ family transcriptional regulator. Around the putative oriC3, there is a tRNA-Ser gene (951,001..951,098 nt) recognizing UCC codons and a gene, Pcal_1012 (953,989..954,357 nt), encoding a CopG family transcriptional regulator. Around the putative oriC4, there is a tRNA-Cys gene (1,687,978..1,688,071 nt) recognizing UGC codons. Therefore, these origins may also have been introduced by an extrachromosomal element. In addition, we found an intergenic region (1,957,398..1,957,754 nt) that also contains the 13-mer consensus element ‘GGGTTCAAATCCC’ and a tRNA gene. However, this region is in close proximity to the putative oriC1, so it is not believed to function as a replication origin. Furthermore, the locations of all the predicted replication origins are in accordance with those determined by high-throughput sequencing-based marker frequency analysis (Fig. ). Therefore, the predicted replication origins should be useful for further experimental study of the replication origins in Pyrobaculum calidifontis JCM 11548.
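Two of the statements above can be checked directly from the values quoted in the text. The Python sketch below tests whether the quoted shared sequences contain the 13-mer consensus element and whether each reported tRNA gene lies inside the corresponding putative oriC interval. The coordinates and sequences are copied from this section; the variable and function names are illustrative assumptions.

```python
CONSENSUS = "GGGTTCAAATCCC"  # 13-mer consensus element quoted in the text

# Shared sequences quoted in the text (lowercase as printed).
shared_sequences = {
    "oriC2/oriC3": "atcccgtccccgttcagggggcgtgggttcaaatcccacccccggcgtgt",
    "oriC4": "atcccgggttcaaatcccggccg",
}

# Putative origin intervals and tRNA gene coordinates (nt) from the text.
putative_origins = {
    "oriC2": (514406, 514741),
    "oriC3": (950832, 951332),
    "oriC4": (1687883, 1688541),
}
trna_genes = {
    "oriC2": (514425, 514522),    # tRNA-Ser
    "oriC3": (951001, 951098),    # tRNA-Ser
    "oriC4": (1687978, 1688071),  # tRNA-Cys
}


def contains_consensus(seq, consensus=CONSENSUS):
    """Return True if the consensus occurs in seq (case-insensitive)."""
    return consensus.upper() in seq.upper()


def is_within(inner, outer):
    """Return True if interval inner lies entirely inside interval outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    for name, seq in shared_sequences.items():
        print(name, "contains 13-mer:", contains_consensus(seq))
    for name, origin in putative_origins.items():
        print(name, "tRNA inside origin:", is_within(trna_genes[name], origin))
```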

Mc-pRIP-adjacent Replication Origins in Methanococcales

While formulating our hypothesis, we found that the locations of other putative replication initiator genes would be helpful in predicting oriC. For example, in the genome of M. jannaschii, an ORF (MJ_0774) annotated as a ‘hypothetical protein’ is in fact a distant homolog of the Cdc6 protein [18]. The name Mc-pRIP, for the putative replication initiator protein in Methanococcales, has been used for MJ0774 and related proteins to distinguish them from bona fide orthologous Cdc6 [26]. We also found that the genes encoding Mc-pRIP in the other thirteen genomes within the order Methanococcales (Methanococcus aeolicus Nankai-3, Methanocaldococcus fervens AG86, Methanococcus maripaludis C5, M. maripaludis C6, M. maripaludis C7, M. maripaludis S2, M. maripaludis X1, Methanococcus vannielii SB, Methanococcus voltae A3, Methanocaldococcus vulcanius M7, Methanocaldococcus sp. FS406-22, Methanothermococcus okinawensis IH1, Methanocaldococcus infernus ME) were annotated as ‘LysR family protein’, ‘regulatory protein ArsR’, ‘MarR family transcriptional regulator’, etc. No cdc6 gene was annotated in the above genomes. All of the Mc-pRIP genes have been assigned the COG identification number COG1474 (Cdc6-related protein, AAA superfamily ATPase), and belong to the COG functional categories L (Replication, recombination and repair) and O (Posttranslational modification, protein turnover, chaperones). In addition, helix-turn-helix domains, which are believed to be involved in DNA binding, were found in the Mc-pRIP proteins. Conserved domain annotation of the Mc-pRIP protein sequence in M. jannaschii, using the CD-Search web service [48], also confirms the above results (Fig. ). 
The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily, multi-domains of Arch_ATPase (pfam01637, Archaeal ATPase), CDC6 (COG1474, Cdc6-related protein, AAA superfamily ATPase), TIGR02928 (orc1/cdc6 family replication initiation protein), and putative DNA binding sites have been found on Mc-pRIP protein in M. jannaschii. Similar results have also been obtained for the other Mc-pRIP proteins. Consequently, based on the locations of Mc-pRIP genes, the oriCs in the aforementioned genomes were predicted reliably and contain almost all the features of known replication origins in archaeal genomes.

Table 1.

The statistics of adjacent genes for the bacteria in the phylum Cyanobacteria.

RefSeq	Organism	Lineage	Adjacent Genes
NC_009925	Acaryochloris marina MBIC11017	Acaryochloris	dnaA, dnaN
NC_008312	Trichodesmium erythraeum IMS101	Oscillatoriales, Trichodesmium	dnaA, dnaN
NC_019776	Cyanobacterium aponinum PCC 10605 (a)	Chroococcales, Cyanobacterium	dnaN
NC_019778	Cyanobacterium stanieri PCC 7202 (a)	Chroococcales, Cyanobacterium	dnaN
NC_013771	Cyanobacterium UCYN-A (a)	Chroococcales	dnaN
NC_019675	Cyanobium gracile PCC 6307	Chroococcales, Cyanobium	dnaN
NC_010546	Cyanothece sp. ATCC 51142 (b)	Chroococcales, Cyanothece	dnaN
NC_011729	Cyanothece sp. PCC 7424	Chroococcales, Cyanothece	dnaN
NC_011884	Cyanothece sp. PCC 7425	Chroococcales, Cyanothece	dnaA, dnaN
NC_014501	Cyanothece sp. PCC 7822	Chroococcales, Cyanothece	dnaN
NC_011726	Cyanothece sp. PCC 8801	Chroococcales, Cyanothece	dnaN
NC_013161	Cyanothece sp. PCC 8802	Chroococcales, Cyanothece	dnaN
NC_019780	Dactylococcopsis salina PCC 8305	Chroococcales, Dactylococcopsis	others
NC_019779	Halothece sp. PCC 7418	Chroococcales, Halothece cluster, Halothece	others
NC_010296	Microcystis aeruginosa NIES-843	Chroococcales, Microcystis	dnaN
NC_006576	Synechococcus elongatus PCC 6301	Chroococcales, Synechococcus	dnaN
NC_007604	Synechococcus elongatus PCC 7942	Chroococcales, Synechococcus	dnaN
NC_008319	Synechococcus sp. CC9311	Chroococcales, Synechococcus	dnaN
NC_007516	Synechococcus sp. CC9605	Chroococcales, Synechococcus	dnaN
NC_007513	Synechococcus sp. CC9902	Chroococcales, Synechococcus	dnaN
NC_007776	Synechococcus sp. JA-2-3B'a(2-13)	Chroococcales, Synechococcus	dnaN
NC_007775	Synechococcus sp. JA-3-3Ab	Chroococcales, Synechococcus	dnaN
NC_019680	Synechococcus sp. PCC 6312	Chroococcales, Synechococcus	dnaA
NC_010475	Synechococcus sp. PCC 7002	Chroococcales, Synechococcus	dnaN
NC_019702	Synechococcus sp. PCC 7502	Chroococcales, Synechococcus	dnaN
NC_009482	Synechococcus sp. RCC307	Chroococcales, Synechococcus	dnaN
NC_009481	Synechococcus sp. WH 7803	Chroococcales, Synechococcus	dnaN
NC_005070	Synechococcus sp. WH 8102	Chroococcales, Synechococcus	dnaN
NC_000911	Synechocystis sp. PCC 6803	Chroococcales, Synechocystis	dnaN
NC_017277	Synechocystis sp. PCC 6803	Chroococcales, Synechocystis	dnaN
NC_017038	Synechocystis sp. PCC 6803 substr. GT-I	Chroococcales, Synechocystis	dnaN
NC_017052	Synechocystis sp. PCC 6803 substr. PCC-N	Chroococcales, Synechocystis	dnaN
NC_017039	Synechocystis sp. PCC 6803 substr. PCC-P	Chroococcales, Synechocystis	dnaN
NC_004113	Thermosynechococcus elongatus BP-1	Chroococcales, Thermosynechococcus	others
NC_005125	Gloeobacter violaceus PCC 7421	Gloeobacteria, Gloeobacterales, Gloeobacter	dnaN
NC_019427	Anabaena sp. 90 (b)	Nostocales, Nostocaceae, Anabaena	dnaA
NC_007413	Anabaena variabilis ATCC 29413	Nostocales, Nostocaceae, Anabaena	dnaA, dnaN
NC_014248	'Nostoc azollae' 0708	Nostocales, Nostocaceae, Anabaena	dnaA, dnaN
NC_010628	Nostoc punctiforme PCC 73102	Nostocales, Nostocaceae, Nostoc	dnaN
NC_019676	Nostoc sp. PCC 7107	Nostocales, Nostocaceae, Nostoc	dnaA, dnaN
NC_003272	Nostoc sp. PCC 7120	Nostocales, Nostocaceae, Nostoc	dnaA, dnaN
NC_019684	Nostoc sp. PCC 7524	Nostocales, Nostocaceae, Nostoc	dnaA, dnaN
NC_019751	Calothrix sp. PCC 6303	Nostocales, Rivulariaceae, Calothrix	dnaN (c)
NC_019682	Calothrix sp. PCC 7507	Nostocales, Rivulariaceae, Calothrix	dnaA, dnaN
NC_019678	Rivularia sp. PCC 7116	Nostocales, Rivulariaceae, Rivularia	dnaA, dnaN
NC_019753	Crinalium epipsammum PCC 9333	Oscillatoriales, Crinalium	dnaA, dnaN
NC_019703	Geitlerinema sp. PCC 7407	Oscillatoriales, Geitlerinema	dnaA
NC_019683	Leptolyngbya sp. PCC 7376	Oscillatoriales, Leptolyngbya	others
NC_019738	Microcoleus sp. PCC 7113	Oscillatoriales, Microcoleus	dnaA, dnaN
NC_019693	Oscillatoria acuminata PCC 6304	Oscillatoriales, Oscillatoria	dnaA, dnaN
NC_019729	Oscillatoria nigro-viridis PCC 7112	Oscillatoriales, Oscillatoria	dnaA, dnaN
NC_019701	Pseudanabaena sp. PCC 7367	Oscillatoriales, Pseudanabaena	dnaN
NC_019695	Chroococcidiopsis thermalis PCC 7203	Pleurocapsales, Chroococcidiopsis	dnaN
NC_019689	Pleurocapsa sp. PCC 7327	Pleurocapsales, Pleurocapsa	dnaA, dnaN
NC_008816	Prochlorococcus marinus str. AS9601	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_009976	Prochlorococcus marinus str. MIT 9211	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_009840	Prochlorococcus marinus str. MIT 9215	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_009091	Prochlorococcus marinus str. MIT 9301	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_008820	Prochlorococcus marinus str. MIT 9303	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_007577	Prochlorococcus marinus str. MIT 9312	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_005071	Prochlorococcus marinus str. MIT 9313	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_008817	Prochlorococcus marinus str. MIT 9515	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_008819	Prochlorococcus marinus str. NATL1A	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_007335	Prochlorococcus marinus str. NATL2A	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_005042	Prochlorococcus marinus subsp. marinus str. CCMP1375	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN
NC_005072	Prochlorococcus marinus subsp. pastoris str. CCMP1986	Prochlorales, Prochlorococcaceae, Prochlorococcus	dnaN

(a) Note that no dnaA gene is annotated in these genomes.

(b) Note that only chromosome 1 (I) or the circular chromosome was counted if the bacterium has multiple chromosomes.

(c) Note that the oriC region is about 5 kb away from the dnaN gene.

47 in total

1. Light-dependent and asynchronous replication of cyanobacterial multi-copy chromosomes.

Authors: Satoru Watanabe; Ryudo Ohbayashi; Yuh Shiwa; Aska Noda; Yu Kanesaki; Taku Chibazakura; Hirofumi Yoshikawa
Journal: Mol Microbiol Date: 2012-02 Impact factor: 3.501

2. Where does bacterial replication start? Rules for predicting the oriC region.

Authors: Pawel Mackiewicz; Jolanta Zakrzewska-Czerwinska; Anna Zawilak; Miroslaw R Dudek; Stanislaw Cebrat
Journal: Nucleic Acids Res Date: 2004-07-16 Impact factor: 16.971

3. Nucleotide compositional asymmetry between the leading and lagging strands of eubacterial genomes.

Authors: Hongzhu Qu; Hao Wu; Tongwu Zhang; Zhang Zhang; Songnian Hu; Jun Yu
Journal: Res Microbiol Date: 2010-09-22 Impact factor: 3.992

4. Genome of alkaliphilic Bacillus pseudofirmus OF4 reveals adaptations that support the ability to grow in an external pH range from 7.5 to 11.4.

Authors: Benjamin Janto; Azad Ahmed; Masahiro Ito; Jun Liu; David B Hicks; Sarah Pagni; Oliver J Fackelmayer; Terry-Ann Smith; Joshua Earl; Liam D H Elbourne; Karl Hassan; Ian T Paulsen; Anne-Brit Kolstø; Nicolas J Tourasse; Garth D Ehrlich; Robert Boissy; D Mack Ivey; Gang Li; Yanfen Xue; Yanhe Ma; Fen Z Hu; Terry A Krulwich
Journal: Environ Microbiol Date: 2011-09-27 Impact factor: 5.491

5. Identification of the oriC region and its influence on heterocyst development in the filamentous cyanobacterium Anabaena sp. strain PCC 7120.

Authors: Yin Zhou; Wen-Li Chen; Li Wang; Cheng-Cai Zhang
Journal: Microbiology Date: 2011-04-14 Impact factor: 2.777

Review 6. Regulation of the replication cycle: conserved and diverse regulatory systems for DnaA and oriC.

Authors: Tsutomu Katayama; Shogo Ozaki; Kenji Keyamura; Kazuyuki Fujimitsu
Journal: Nat Rev Microbiol Date: 2010-03 Impact factor: 60.633

7. DNA thermodynamic stability and supercoil dynamics determine the gene expression program during the bacterial growth cycle.

Authors: Patrick Sobetzko; Monika Glinkowska; Andrew Travers; Georgi Muskhelishvili
Journal: Mol Biosyst Date: 2013-03-14

8. Single replication origin of the archaeon Methanosarcina mazei revealed by the Z curve method.

Authors: Ren Zhang; Chun-Ting Zhang
Journal: Biochem Biophys Res Commun Date: 2002-09-20 Impact factor: 3.575

9. An archaeal chromosomal autonomously replicating sequence element from an extreme halophile, Halobacterium sp. strain NRC-1.

Authors: Brian R Berquist; Shiladitya DasSarma
Journal: J Bacteriol Date: 2003-10 Impact factor: 3.490

10. Complete genome and transcriptomes of Streptococcus parasanguinis FW213: phylogenic relations and potential virulence mechanisms.

Authors: Jianing Geng; Cheng-Hsun Chiu; Petrus Tang; Yaping Chen; Hui-Ru Shieh; Songnian Hu; Yi-Ywan M Chen
Journal: PLoS One Date: 2012-04-18 Impact factor: 3.240
