Dashiell J Massey, Dongsung Kim, Kayla E Brooks, Marcus B Smolka, Amnon Koren.
Abstract
Centromeres serve a critical function in preserving genome integrity across sequential cell divisions, by mediating symmetric chromosome segregation. The repetitive, heterochromatic nature of centromeres is thought to be inhibitory to DNA replication, but has also led to their underrepresentation in human reference genome assemblies. Consequently, centromeres have been excluded from genomic replication timing analyses, leaving their time of replication unresolved. However, the most recent human reference genome, hg38, included models of centromere sequences. To establish the experimental requirements for achieving replication timing profiles for centromeres, we sequenced G₁- and S-phase cells from five human cell lines, and aligned the sequence reads to hg38. We were able to infer DNA replication timing profiles for the centromeres in each of the five cell lines, which showed that centromere replication occurs in mid-to-late S phase. Furthermore, we found that replication timing was more variable between cell lines in the centromere regions than expected, given the distribution of variation in replication timing genome-wide. These results suggest the potential of these, and future, sequence models to enable high-resolution studies of replication in centromeres and other heterochromatic regions.
Keywords: DNA replication timing; centromeres; heterochromatin; next-generation sequencing
Year: 2019; PMID: 30987063; PMCID: PMC6523654; DOI: 10.3390/genes10040269
Source DB: PubMed; Journal: Genes (Basel); ISSN: 2073-4425; Impact factor: 4.096
Figure 1. Centromere replication timing can be consistently measured in human cell lines for most chromosomes. (A) Unsmoothed replication timing data for the breast cancer cell line HCC1954 across all centromeres (tan) and flanking regions. Each dot represents a single window, defined by 200 reads in the G₁-phase sample. Chromosomes labeled in black contain at least 10 centromeric windows and were included in the analyses for Figures 3, 4, and 5. (B) Replication timing inference is successful in the same subset of centromeres across cell lines. Bars represent the number of cell lines in which that chromosome’s centromere contained enough windows to be included in further analyses.
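The windowing described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `g1_defined_windows` and `count_s_reads` are hypothetical, read positions are assumed to be plain integer coordinates on one chromosome, and the step that converts S-phase counts into replication timing values is not described in this record and is left out.

```python
from bisect import bisect_left


def g1_defined_windows(g1_positions, reads_per_window=200):
    """Split G1 read positions into consecutive windows holding an equal number of G1 reads.

    Returns half-open (start, end) intervals. A trailing group with fewer than
    `reads_per_window` reads is dropped. Assumption: ties in position across a
    window boundary are rare enough to ignore.
    """
    pos = sorted(g1_positions)
    windows = []
    for i in range(0, len(pos) - reads_per_window + 1, reads_per_window):
        nxt = i + reads_per_window
        # A window ends where the next group of G1 reads begins.
        end = pos[nxt] if nxt < len(pos) else pos[-1] + 1
        windows.append((pos[i], end))
    return windows


def count_s_reads(s_positions, windows):
    """Count S-phase reads that fall inside each half-open window."""
    s = sorted(s_positions)
    return [bisect_left(s, end) - bisect_left(s, start) for start, end in windows]


def has_enough_windows(centromeric_windows, minimum=10):
    """Inclusion rule from the caption: at least 10 centromeric windows."""
    return len(centromeric_windows) >= minimum
```

Because every window holds the same number of G₁ reads, the S-phase count per window serves as a relative copy-number signal: regions that replicate earlier tend to be over-represented in the S-phase sample.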
Figure 2. Paired-end sequencing is critical for obtaining centromere replication timing. Single-end data were generated by considering only the first read of each pair. Read (or read-pair) counts were averaged across the G₁- and S-phase fractions for each cell line. (A,B) Single-end alignment had a negligible effect on read depth genome-wide, but eliminated almost all of the reads in the centromeres. (C,D) The difference between single- and paired-end sequencing is largely driven by the difficulty in discriminating true sequence repeats from PCR and optical duplicates with single-end reads. All chromosomes/centromeres were considered for this analysis.
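One way to see why duplicate handling matters here is a coordinate-based duplicate criterion, sketched below. This is an assumption for illustration only: the record does not name the duplicate-marking tool or its rules, and the tuple layout and the function `count_unique_fragments` are hypothetical.

```python
def count_unique_fragments(pairs, paired_end=True):
    """Count fragments that survive coordinate-based duplicate removal.

    `pairs` is an iterable of (chrom, read1_start, read1_strand, read2_start).
    With paired-end data, a duplicate must match on both mate positions.
    With single-end data (first read only), matching the first read is enough.
    """
    seen = set()
    for chrom, r1_start, strand, r2_start in pairs:
        if paired_end:
            key = (chrom, r1_start, strand, r2_start)
        else:
            key = (chrom, r1_start, strand)
        seen.add(key)
    return len(seen)


# Three fragments whose first reads align to the same position in a repeat.
example = [
    ("chr1", 100, "+", 350),
    ("chr1", 100, "+", 420),  # distinct fragment: the mate lands elsewhere
    ("chr1", 100, "+", 350),  # same coordinates as the first fragment
]
paired_count = count_unique_fragments(example, paired_end=True)
single_count = count_unique_fragments(example, paired_end=False)
```

In this toy case the paired-end key keeps two fragments while the single-end key keeps one, which mirrors the loss of centromeric reads described in the caption.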
Figure 3. Centromere replication timing is more variable between cell lines than chromosome-wide replication timing. Blue: whole chromosome (excluding centromere); gold: centromeres. Pearson correlations were calculated for each pair of cell lines, separately for each mappable centromere (see Figure 1) and for each chromosome. Dots represent individual pairwise comparisons. The difference in the distribution of correlation coefficients between the centromeres and whole chromosomes is robust when controlling for the size of the centromeres (Figure S5).
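The pairwise comparison in the Figure 3 caption can be sketched as below. It assumes that the replication timing profiles of all cell lines for a given region have already been placed on a shared set of windows, which this record does not detail; the names `pearson` and `pairwise_correlations` are illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(profiles):
    """Correlate one region's profile between every pair of cell lines.

    `profiles` maps a cell line name to its timing values on shared windows.
    """
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }
```

Five cell lines give ten pairwise coefficients per region, so each centromere or chromosome contributes ten dots to the comparison.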
Figure 4. Average replication timing is more consistent within the centromeres of a given cell line than in the surrounding pericentromeres (or the whole genome). For each cell line, the replication timing profile for each mappable centromere (see Figure 1) is shown, overlaid with an averaged “aggregate” profile for that cell line’s centromeres. The shaded area indicates the minimum and maximum values, and the dashed line indicates the genome average. Each centromere was divided into 100 bins for the purpose of aggregation.
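The 100-bin aggregation in the Figure 4 caption could look roughly like this. It is a sketch under stated assumptions: bins are taken to be of equal width within each centromere, values falling in the same bin are averaged, and empty bins are skipped; none of these details are given in this record, and the function names are hypothetical.

```python
def bin_profile(positions, values, start, end, n_bins=100):
    """Average timing values into `n_bins` equal-width bins spanning [start, end).

    Returns one mean per bin, or None for a bin that received no windows.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    width = (end - start) / n_bins
    for p, v in zip(positions, values):
        if not (start <= p < end):
            continue
        b = min(int((p - start) / width), n_bins - 1)  # guard against float rounding
        sums[b] += v
        counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def aggregate(binned_profiles):
    """Per bin, return (mean, min, max) across centromeres, or None if all are empty."""
    out = []
    for column in zip(*binned_profiles):
        vals = [v for v in column if v is not None]
        out.append((sum(vals) / len(vals), min(vals), max(vals)) if vals else None)
    return out
```

Rescaling every centromere to the same number of bins lets centromeres of different lengths be averaged position by position; the minimum and maximum per bin correspond to the shaded area in the figure.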
Figure 5. Centromere replication timing is variable between cell lines, occurring between mid- and mid-late S phase. Each line represents the average replication timing of mappable centromeres (see Figure 1) in the indicated cell line.