Riccardo Gamba1, Giulia Mazzucco2, Therese Wilhelm1, Leonid Velikovsky1, Catalina Salinas-Luypaert1, Florian Chardon1, Julien Picotto1, Mylène Bohec3, Sylvain Baulande3, Ylli Doksani2, Daniele Fachinetti1.
Abstract
Centromeres are key elements for chromosome segregation. Canonical centromeres are built over long stretches of tandem repetitive arrays. Despite being quite abundant compared to other loci, centromere sequences overall still represent only 2 to 5% of the human genome; studying their genetic and epigenetic features is therefore a major challenge. Furthermore, sequencing of centromeric regions requires high coverage to fully analyze length and sequence variations, and this can be extremely costly. To bypass these issues, we have developed a technique, named CenRICH, to enrich for centromeric DNA from human cells based on selective restriction digestion and size fractionation. Combining restriction enzymes that cut at high frequency throughout the genome, except within most human centromeres, with size selection of fragments >20 kb resulted in over 25-fold enrichment in centromeric DNA. High-throughput sequencing revealed that up to 60% of the DNA in the enriched samples consists of centromeric repeats. We show that this method can be used in combination with long-read sequencing to investigate the DNA methylation status of certain centromeres and, with a specific enzyme combination, also of their surrounding regions (mainly HSATII). Finally, we show that CenRICH facilitates single-molecule analysis of replicating centromeric fibers by DNA combing. This approach has great potential for making sequencing of centromeric DNA more affordable and efficient and for single DNA molecule studies.
Year: 2022 PMID: 35853083 PMCID: PMC9295943 DOI: 10.1371/journal.pgen.1010306
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 6.020
Fig 2. Centromeric DNA is enriched in the high-molecular weight fractions.
A. Schematic representation of the experimental design. B. Dot-blot detecting the abundance of centromeric DNA (measured by signal intensity with a CENP-B box DNA probe, left membrane) in different sucrose gradient fractions (F1 to F4; F5+F6 is a pool of fractions F5 and F6) and in unfractionated genomic DNA (gDNA). A probe for the Alu repeat was used as a control (right membrane). On both membranes, increasing amounts of DNA were loaded (50, 100 and 200 ng). C. Quantification of the dot-blot shown in B; signal is reported as a ratio to gDNA. The average across the different amounts of DNA is reported. Error bars represent the standard error of the three DNA amounts. D. Left: agarose gel electrophoresis of a molecular weight marker (Gene Ruler 1 kb) separated in the sucrose gradient, showing efficient size separation; “tot” represents the unfractionated marker and F1 to F6 represent the different fractions. Middle and right: agarose gel electrophoresis of the sucrose fractions of a genomic DNA sample digested with the SNE combination and the corresponding Southern blot after hybridization with an alpha satellite probe. “gDNA” represents the digested unfractionated sample and F1 to F6 represent the different fractions. Lambda DNA digested with HindIII was also used as a size control. E. Bar graph showing the ratio of CenDNA (from the Southern blot) to total DNA (from the agarose gel electrophoresis) in fractions F1-F6.
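The quantification described for panel C (signal as a ratio to gDNA, averaged over the three loading amounts, with the standard error as error bar) can be sketched as follows. This is only an illustration: the function name `ratio_to_gdna` and the intensity values are hypothetical, and the use of the sample standard deviation (n - 1) for the standard error is an assumption, since the caption does not specify it.

```python
from math import sqrt
from statistics import mean, stdev


def ratio_to_gdna(fraction_signal, gdna_signal):
    """Return (mean ratio, standard error) across DNA loading amounts.

    Both arguments are sequences of dot-blot intensities, one value per
    loading amount, in the same order (e.g. 50, 100 and 200 ng).
    """
    ratios = [f / g for f, g in zip(fraction_signal, gdna_signal)]
    # Assumption: standard error = sample standard deviation / sqrt(n).
    sem = stdev(ratios) / sqrt(len(ratios))
    return mean(ratios), sem


# Hypothetical intensities (arbitrary units) for 50, 100 and 200 ng.
gdna = [10.0, 20.0, 40.0]
pooled_f5_f6 = [200.0, 440.0, 760.0]

avg, sem = ratio_to_gdna(pooled_f5_f6, gdna)
print(f"ratio to gDNA: {avg:.2f} +/- {sem:.2f}")
```

Taking the ratio separately at each loading amount before averaging keeps the comparison independent of how much DNA was spotted.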
Fig 3. The CenRICH method provides high enrichment of alpha satellite and HSAT II DNA.
A. Schematic representation of the experimental design. B. Quantification of Illumina reads mapping to centromeric regions (red) and to other families of repetitive DNA after CenRICH (digestion with SNE or SEB enzyme combinations) and in an undigested unfractionated sample (WGS). F1-2 represents a pool of fractions 1 and 2 (LMW); F4-6 represents a pool of fractions 4 to 6 (HMW). Data from RPE-1 (SNE and SEB) and DLD-1 (SNE only) are shown. Read counts are reported as a percentage of total mapped reads. C. Enrichment in centromere-derived reads after Illumina sequencing across the different centromeres in fractions F1-2, F3 and F4-6 (for RPE-1 cells) and fraction F4-6 (for DLD-1) after CenRICH with SNE digestion. Enrichment is expressed as a ratio to the read counts in the corresponding WGS samples. D. Examples of enrichment profiles in different fractions (F1-2, F3, and F4-6) after SNE digestion and sucrose gradient fractionation of RPE-1 DNA. In the left panel, the centromere of chromosome 9 does not show enrichment in any fraction. In the right panel, centromere 18 shows high enrichment in F4-6 and depletion in fractions F1-2. Enrichment is plotted as log2 ratio over WGS in 2-kb-wide genomic bins. Y-axis ranges between -8 and +8. Genomic coordinates on the T2T-CHM13v1.0 reference are reported on top. Boundaries of centromeric regions (Cen Region, black bars) and HORs (grey bars) are described in S1 and S3 Tables, respectively. E. Scatter plot and linear regression reporting the correlation in fold enrichment (ratio to WGS) between the F4-6 fractions of RPE-1 and DLD-1 cells (SNE digestion, Illumina sequencing). Each of the 23 dots represents a centromere. The dashed lines represent the 95% confidence interval of the linear regression. R-squared = 0.97, p-value < 0.0001. F. Example of enrichment profile and identification of enrichment domains on centromere 15, for fractions F4-6 after CenRICH with the SNE enzyme combination on RPE-1 (red) and DLD-1 (green) cells. Enrichment is plotted as log2 ratio compared to WGS along 2-kb bins (y-axis range -4 to +6). Bars below the enrichment profile identify enrichment domains where fold enrichment is > 5. Purple and cyan profiles report CENP-A CUT&RUN-seq profiles as ratio to WGS, identifying the enrichment domain as corresponding to the active HOR (y-axis range from 0 to 15). Centromeric region (Cen region, black bar) and HOR boundaries (grey bar) are defined in S2 and S3 Tables. G. Estimation of the variation in centromere length in DLD-1 cells compared to RPE-1, calculated from WGS or from F4-6 after CenRICH with the SNE enzyme combination. Y-axis reports the percentage variation in the number of reads mapping to centromeric regions (DLD-1 over RPE-1), which is used as a proxy for centromere length.
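The binned enrichment profiles (panels D and F) and the > 5-fold enrichment domains (panel F) can be illustrated with a minimal sketch. The caption states only that enrichment is a log2 ratio over WGS in 2-kb bins and that domains mark fold enrichment above 5. Normalizing by total mapped reads, the pseudocount, the merging of adjacent bins into one domain, the function names and the toy read counts are all assumptions made here for illustration, not the authors' pipeline.

```python
from math import log2


def log2_enrichment(sample_counts, wgs_counts, sample_total, wgs_total,
                    pseudocount=1.0):
    """Per-bin log2 ratio of an enriched sample over WGS.

    Counts are reads per genomic bin; totals are the total mapped reads
    of each library (assumed normalization). The pseudocount (an
    assumption) avoids division by zero in empty bins.
    """
    ratios = []
    for s, w in zip(sample_counts, wgs_counts):
        s_norm = (s + pseudocount) / sample_total
        w_norm = (w + pseudocount) / wgs_total
        ratios.append(log2(s_norm / w_norm))
    return ratios


def enrichment_domains(log2_ratios, bin_size=2000, min_fold=5.0):
    """Merge consecutive bins above min_fold into (start, end) intervals.

    Coordinates are relative to the start of the first bin.
    """
    threshold = log2(min_fold)
    domains, start = [], None
    for i, value in enumerate(log2_ratios):
        if value > threshold:
            if start is None:
                start = i  # open a new domain
        elif start is not None:
            domains.append((start * bin_size, i * bin_size))
            start = None
    if start is not None:  # domain runs to the last bin
        domains.append((start * bin_size, len(log2_ratios) * bin_size))
    return domains


# Toy example: seven 2-kb bins, both libraries with 1,000,000 mapped reads.
sample = [5, 8, 600, 900, 700, 6, 4]
wgs = [100, 100, 100, 100, 100, 100, 100]
profile = log2_enrichment(sample, wgs, 1_000_000, 1_000_000)
domains = enrichment_domains(profile)
print(domains)
```

In this toy input, only the three central bins exceed 5-fold over WGS, so they are reported as a single domain.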