Abstract
DNA replication initiates from multiple replication origins (ORIs) in eukaryotes. Discovering and characterizing replication origins is essential for a better understanding of the molecular mechanism of DNA replication. In this study, we comprehensively analyzed the features of autonomously replicating sequences (ARSs) in Saccharomyces cerevisiae. First, we analyzed the ARSs available in S. cerevisiae S288C. By evaluating the sequence similarity of experimentally established ARSs, we found that 94.32% of ARSs are unique across the whole genome of S. cerevisiae S288C and that those with high sequence similarity tend to be located in subtelomeres. Subsequently, we built a non-redundant dataset of 520 ARSs, based on the ARS annotation of S. cerevisiae S288C from SGD and supplemented with entries from the OriDB and DeOri databases. We then conducted a large-scale comparison of ORIs among diverse budding yeast strains from a population genomics perspective. We found that 82.7% of ARSs are conserved not only in genomic sequence but also, relatively, in chromosomal position. The non-conserved ARSs tend to be distributed in the subtelomeric regions. We also conducted a pan-genome analysis of ARSs among the S. cerevisiae strains and identified a total of 183 core ARSs present in all yeast strains. We extracted the genes adjacent to replication origins in the 104 yeast strains to examine whether their functions differ. The results showed that genes involved in the initiation of DNA replication, such as orc3, mcm2, mcm4, mcm6, and cdc45, retain a conserved location adjacent to the replication origins. 
Furthermore, we found that the genes adjacent to conserved ARSs are significantly enriched in DNA binding, enzyme activity, transport, and energy, whereas the genes adjacent to non-conserved ARSs are significantly enriched in response to environmental stress, metabolite biosynthetic processes, and biosynthesis of antibiotics. Overall, we characterized the replication origins from genome-wide and population genomics perspectives, which may provide new insights into the replication mechanism of S. cerevisiae and facilitate the design of algorithms to identify genome-wide replication origins in yeast.
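The notion of a core ARS used in the abstract (an ARS present in every strain) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `core_arss`, the strain labels, and the ARS identifiers are hypothetical, and the sketch assumes that homolog detection has already produced a presence/absence set for each strain.

```python
from typing import Dict, Set


def core_arss(presence: Dict[str, Set[str]]) -> Set[str]:
    """Return the ARS identifiers found in every strain (the core set)."""
    strain_sets = list(presence.values())
    if not strain_sets:
        return set()
    # An ARS is "core" only if it appears in the set of every strain.
    return set.intersection(*strain_sets)


# Toy presence/absence data; all names here are made up for illustration.
presence = {
    "strain_A": {"ars_a", "ars_b", "ars_c"},
    "strain_B": {"ars_a", "ars_c"},
    "strain_C": {"ars_a", "ars_c", "ars_d"},
}

core = core_arss(presence)
```

In this toy example only the identifiers shared by all three strains end up in `core`; everything else belongs to the variable part of the pan-genome.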
Keywords: DNA replication; Saccharomyces cerevisiae; autonomously replicating sequence; genome-wide analysis; replication origin
Year: 2019 PMID: 31572328 PMCID: PMC6753640 DOI: 10.3389/fmicb.2019.02122
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
FIGURE 1. Circos plot showing the distribution of ARSs in the S. cerevisiae S288C reference genome. The circles are described from outermost to innermost. (1) The outermost circle represents the S. cerevisiae S288C chromosomes in kb, with the subtelomeric regions colored in lighter gray; (2) GC-skew (window=3 kb, step=3 kb); (3) AT content (window=3 kb, step=3 kb); (4) Conservation heatmap of 352 ARSs. The location of each bar in the heatmap denotes the position of the ARS on the corresponding chromosome of S. cerevisiae S288C, and the number of homologous ARSs in 104 yeast strains is represented by color, with green indicating conserved ARSs and red indicating non-conserved ARSs; (5) Links show similar ARS sequences. Orange links connect similar ARS sequences that map to the same chromosome, and blue links connect similar ARS sequences located on different chromosomes.
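The windowed GC-skew and AT-content tracks in Figure 1 can be approximated with the short Python sketch below. It assumes the common definition GC-skew = (G - C) / (G + C), an uppercase DNA string as input, and that a trailing partial window is dropped; the caption does not state these details, and the function names are illustrative. With window = step = 3000 the windows do not overlap, as in the figure.

```python
from typing import Iterator, List, Tuple


def windows(seq: str, window: int = 3000, step: int = 3000) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full window; a partial tail is dropped."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


def gc_skew(s: str) -> float:
    """(G - C) / (G + C); returns 0.0 when the window has no G or C (an assumption)."""
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def at_content(s: str) -> float:
    """Fraction of A and T bases in the window."""
    return (s.count("A") + s.count("T")) / len(s) if s else 0.0


def profile(seq: str, window: int = 3000, step: int = 3000) -> List[Tuple[int, float, float]]:
    """Return (start, GC-skew, AT content) for every window along the sequence."""
    return [(start, gc_skew(w), at_content(w)) for start, w in windows(seq, window, step)]


# Toy sequence with a tiny window so the values are easy to check by hand.
toy_profile = profile("GGGCAATT", window=4, step=4)
```

For a real chromosome, `profile(chromosome_sequence)` with the default arguments would give one value per 3 kb window, which is the resolution stated in the caption.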
FIGURE 2. Functional analysis of genes adjacent to replication origins. (A) GO enrichment analysis of 662 conserved genes adjacent to conserved ARSs. GO terms with an adjusted p-value ≤ 0.05 are shown. Statistical significance was assessed by Fisher's exact test with false discovery rate (FDR) correction. BP, biological process; MF, molecular function; CC, cellular component. (B) Scatterplot of significantly enriched KEGG pathways for the 662 conserved genes adjacent to conserved ARSs. KEGG pathways with an adjusted p-value ≤ 0.05 are shown. Statistical significance was assessed by Fisher's exact test with FDR correction. The size and color of the dots represent the gene number and the adjusted p-value, respectively. Gene ratio is the proportion of enriched genes among all conserved genes neighboring ORIs. (C) GO enrichment analysis of genes adjacent to non-conserved ARSs. (D) Scatterplot of significantly enriched KEGG pathways for genes adjacent to non-conserved ARSs.
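The statistics named in the Figure 2 caption (Fisher's exact test followed by FDR correction) can be sketched in plain Python as follows. Two assumptions are made that the caption does not confirm: the test is taken as one-sided (over-representation), and the FDR correction is taken to be the Benjamini-Hochberg procedure. The function and parameter names are illustrative, and the authors may well have used a dedicated enrichment tool instead.

```python
from math import comb
from typing import List


def fisher_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact p-value for over-representation.

    k: genes in the study set annotated with the term
    n: size of the study set
    K: genes in the background annotated with the term
    N: size of the background
    Returns P(X >= k) for X hypergeometric with parameters (N, K, n).
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy numbers only: one term hit by the single study gene in a background of two.
p_toy = fisher_enrichment_p(k=1, n=1, K=1, N=2)
adj_toy = benjamini_hochberg([0.01, 0.04, 0.03])
```

Under these assumptions, a term would be reported as enriched when its adjusted p-value is at most 0.05, matching the threshold given in the caption.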