Noa Gilat (1), Dena Fridman (1), Hila Sharim (1), Sapir Margalit (1), Natalie R. Gassman (2), Yael Michaeli (1), Yuval Ebenstein (1)
(1) School of Chemistry, Center for Nanoscience and Nanotechnology, Center for Light-Matter Interaction, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, Israel.
(2) Department of Physiology and Cell Biology, University of South Alabama College of Medicine, Mobile, Alabama.
Abstract
Mapping DNA damage and its repair has immense potential in understanding environmental exposures, their genotoxicity, and their impact on human health. Monitoring changes in genomic stability also aids in the diagnosis of numerous DNA-related diseases, such as cancer, and assists in monitoring their progression and prognosis. Developments in recent years have enabled unprecedented sensitivity in quantifying the global DNA damage dose in cells via fluorescence-based analysis down to the single-molecule level. However, genome-wide maps of DNA damage distribution are challenging to produce. Here, we describe the localization of DNA damage and repair loci by repair-assisted damage detection sequencing (RADD-seq). Based on the enrichment of damage lesions coupled with a pull-down assay and followed by next-generation sequencing, this method is easy to perform and can produce compelling results with minimal coverage. RADD-seq enables the localization of both DNA damage and repair sites for a wide range of single-strand damage types. Using this technique, we created a genome-wide map of the oxidation DNA damage lesion 8-oxo-7,8-dihydroguanine before and after repair. Oxidation lesions were heterogeneously distributed along the human genome, with less damage occurring in tight chromatin regions. Furthermore, we showed repair is prioritized for highly expressed, essential genes and in open chromatin regions. RADD-seq sheds light on cellular repair mechanisms and is capable of identifying genomic hotspots prone to mutation.
We are constantly exposed to molecules that damage our DNA; if the damage is not repaired, it leads to DNA defects and disease. Repair-assisted damage detection (RADD) is a method for attaching a chemical tag to damaged DNA. The tag can be a light-emitting molecule for measuring the level of damage or a molecular handle for selectively capturing the damaged DNA for sequencing. We expose cells to an oxidizing agent and sequence damaged DNA before and after the cells repair their DNA. We find where DNA damage and repair accumulate in the genome and try to link these observations to the packing density of DNA and the level of gene expression. Repair-assisted damage detection sequencing provides a simple and modular method for mapping DNA damage coupled with quantitative assessment of global damage levels.
Introduction
Cellular DNA is continuously exposed to various exogenous and endogenous damaging agents (1,2). Damage accumulation on the DNA backbone or bases has cytotoxic and genotoxic effects, leading to mutations, genomic instability and, consequently, cancer (3). Hence, DNA repair is vital to the integrity of the genome and the organism’s health and survival. Single-strand lesions represent the most common class of damage incurred by DNA molecules, and these lesions are repaired extensively by cellular mechanisms. Specifically, oxidation DNA damage, manifested as single-strand breaks (SSBs) and 8-oxo-7,8-dihydroguanine modifications, is typically repaired by either the nucleotide excision repair or the base excision repair pathways. In both processes, the DNA lesions are removed and replaced in three core steps: 1) the damaged DNA base is removed by a glycosylase (base excision repair, BER), or a small fragment of the damaged single-stranded DNA is removed by an endonuclease (nucleotide excision repair, NER); 2) incision of the DNA backbone is performed by an endonuclease or by the bifunctional activity of the glycosylase; and 3) repair is completed by gap filling and ligation of the nicked backbone (3,4). Because of the importance of understanding and monitoring DNA damage and repair processes, various methods have been developed to quantify the overall damage level in a cell population and genomic DNA (5, 6, 7). In the last 5 years, advances in the single-molecule field have paved the way to the development of new techniques for DNA damage quantification (8, 9, 10, 11, 12, 13). Such assays allow the quantification of single-stranded damage adducts through their excision by repair enzymes and the incorporation of fluorescent nucleotides into the formed gaps. Imaging the fluorescently labeled DNA molecules in a single-molecule manner enabled sensitivity down to 15 damage adducts per Mbp.
The single-molecule assay for damage detection naturally evolved to the single-cell measurement of DNA damage (14,15). In this assay, damaged lesions are repaired within the cells themselves, and by measuring the fluorescent intensity of the cell, an accurate assessment of the overall damage in the cell is obtained. By using the same biochemistry, we recently introduced a method for the bulk quantification of multiple samples over a single multiwell glass slide (16). In this method, termed rapid quantification of DNA damage by repair-assisted damage detection (Rapid-RADD), an absolute quantification of the DNA damage in a sample is easily obtained. Although the above-mentioned methods allow for determining the extent of damage in a DNA sample in bulk, they fail to provide information regarding the local distribution of genomic DNA damage. Mapping damage along the genome reveals the propensity of different genomic regions to accumulate damage and identifies associations to other genomic features, such as transcription sites. Here, we introduce a method for mapping single-strand DNA lesions that is a direct extension of our single-cell and single-molecule repair-assisted damage detection (RADD) assays. This method, termed RADD sequencing (RADD-seq), uses repair enzymes to excise the DNA lesions leaving a single-strand gap. The damage sites are then repaired in vitro with biotinylated nucleotides, followed by DNA fragmentation, immunoprecipitation, and sequencing. Reads are mapped back to the reference genome, indicating the locations of damage sites. RADD-seq can identify the genomic regions prone to damage accumulation (“damage hotspots”) and also detects the loci favorable for repair and reveals their repair dynamics. There are several other short-read next-generation sequencing (NGS)-based methods designed to locate and assess damage levels along the genome (Table 1; 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27). 
Existing methods fall into two main categories; the first uses affinity capture of the damaged lesion itself for pull down and sequencing, and the other takes advantage of the free 3′-OH terminus created at the damage site during excision repair.
Table 1
A comparison of short-read genome-wide damage-mapping techniques
| Method | Detection concept | Damage type | DNA source | Resolution | Reference |
| --- | --- | --- | --- | --- | --- |
| RADD-seq | excision repair | oxidation | human | ∼150 bp | – |
| OxiDIP-seq | affinity capture | oxidation | mouse, human | ∼150 bp | (17,18) |
| AP-seq | affinity capture | oxidation | human | ∼250 bp | (19) |
| GLOE-seq | affinity capture | SSB | yeast, human | single-nucleotide | (20,21) |
| Cisplatin-seq | affinity capture | cisplatin | mouse, human | single-nucleotide | (22) |
| SSB-seq | excision repair | SSB | human | 200–400 bp | (23) |
| CPD-seq | excision repair | UV | yeast | single-nucleotide | (20) |
| Click-code seq | excision repair | oxidation | yeast | single-nucleotide | (24) |
| XR-seq | excision repair | cisplatin, UV | human | single-nucleotide | (22,25,26) |
| SSiNGLe | terminal transferase | SSB | human | single-nucleotide | (27) |
One of the shortcomings of all NGS-based approaches is that the absolute level of DNA damage lesions is lost because of the need for PCR amplification for sequencing library preparation. RADD-seq is based on an established and validated biochemical procedure developed for single-molecule and single-cell optical detection of DNA damage adducts. It uses an identical procedure for detecting damage level before and after cellular repair, thus allowing one to map the dynamics of the repair process in a localized manner. We note that RADD-seq is a platform technique, which is easily tailored to detect any type or combination of single-strand DNA lesions. When coupled with Rapid-RADD, RADD-seq enables the facile and accurate mapping of DNA damage loci and repair dynamics for various damage types alongside the absolute quantification of the damage amount in the sample.
To demonstrate the capability of our method to map and monitor repair dynamics, the osteosarcoma cell line U2OS was exposed to the oxidizing agent potassium bromate, KBrO3. RADD-seq was then used to monitor DNA lesion induction and repair in a time-dependent manner. For this demonstration, we map the landscape of DNA oxidation damage using the human 8-oxoguanine glycosylase 1 (hOGG1) repair enzyme to specifically excise oxidation DNA lesions. However, RADD-seq is compatible with any repair enzyme of interest, allowing any desired damage type to be monitored. We found that KBrO3-induced oxidation DNA damage was heterogeneously distributed along the genome. Moreover, we show that DNA repair occurs more readily in genomic regions with higher levels of gene expression and open chromatin than low expression, closed chromatin genomic regions. Finally, we identified specific gene clusters in which repair tends to occur more extensively, presumably due to their role in cell viability.
Materials and methods
Cell culture
U2OS (human osteosarcoma) cells were cultured in Dulbecco’s Modified Eagle medium, supplemented with 10% fetal bovine serum (Gibco, Amarillo, TX), l-glutamine (2 mM), and 1% penicillin-streptomycin (10,000 U/mL; Gibco). Cells were incubated at 37°C with 5% CO2.
KBrO3 treatment
KBrO3 was added to the culture medium to a final concentration of 50 mM for 1 h. Cells were washed with PBS twice, and genomic DNA was immediately extracted (for damage samples). Alternatively, cell medium was replaced, and cells were allowed to repair for 1 h at 37°C with 5% CO2 before DNA extraction (for repaired samples). Cells were allowed to grow up to a week after KBrO3 treatment to validate their viability. Over 80% of the cells survived this treatment and continued to divide 1 week posttreatment.
DNA extraction
DNA was extracted from ∼10^6 cells using the GenElute Mammalian Genomic DNA Miniprep Kit (Sigma-Aldrich, St. Louis, MO) according to manufacturer’s instructions.
Labeling oxidation DNA damage
KBrO3-treated DNA samples were labeled for oxidation damage in three consecutive enzymatic reactions. The sequence of events during the repair process is as follows: 1) hOGG1 excises the damage lesion, leaving an abasic site; 2) Endonuclease IV cleaves the abasic site, forming a single base gap; 3) Bst DNA Polymerase performs a displacement synthesis; and 4) Taq DNA Polymerase performs exonucleolytic activity and elongates the chain.
In the first step, each reaction tube contained 2.2 μg of DNA sample, 2 μL of 10× buffer 4 (New England Biolabs (NEB), Ipswich, MA), 2 μL of 1 mg/mL Bovine Serum Albumin Solution (NEB), 1 μL of 0.5 mg/mL hOGG1 (ProSpec-Tany TechnoGene, Rehovot, Israel), and ultrapure water to a final volume of 20 μL. The reaction mixture was incubated for 30 min at 37°C. In the second step, 1 μL of Endonuclease IV (10,000 U/mL; NEB) was added to the reaction, and it was incubated for an additional 30 min at 37°C. In the final step, the following were added into each reaction tube: 0.5 μL of 10× NEBuffer 4 (NEB), 0.5 μL of 1 mg/mL Bovine Serum Albumin Solution (NEB), 1.2 μL of 50 mM NAD+ (NEB), 0.3 μL of 1 mM deoxynucleotides mix (A,G,C; Sigma-Aldrich and biotinylated dUTP; Thermo Fisher Scientific; Waltham, MA), 0.5 μL of Bst DNA Polymerase, Large Fragment (8000 U/mL; NEB), 0.5 μL of Taq DNA Polymerase (5000 U/mL; NEB), 0.4 μL of Taq DNA Ligase (40,000 U/mL; NEB) and ultrapure water to a final volume of 30 μL. The reaction mixture was incubated for 30 min at 65°C.
DNA fragmentation
After damage labeling, 120 μL of hydrated DNA was transferred to a Covaris microTUBE (AFA Fiber Pre-Slit Snap Cap 6 × 16 mm; Woburn, MA) and sheared using a probe sonicator (Covaris S220 Focused-ultrasonicator instrument; operation system SonoLab 7.0; prechilled to 4°C and degassed for 30 min). Samples were sonicated for 10 min at a peak power of 175 W, duty factor 20%, and 200 cycles per burst. The target fragmentation size was 150 bp. The fragmented and labeled DNA samples were purified from excess nucleotides using QIAquick PCR Purification Kit columns (QIAGEN, Hilden, Germany), according to manufacturer’s recommendations.
Immunoprecipitation
Fragmented DNA was immunoprecipitated using anti-biotin antibodies (1 mg/mL; Abcam, Cambridge, UK) and protein G beads (Invitrogen, Waltham, MA). 70 μL of protein G beads were washed twice with 70 μL of IP buffer (10 mM Tris (pH 7), 1 mM EDTA, 150 mM NaCl) and were resuspended in 70 μL of IP buffer. The 70 μL of protein G beads were divided into two tubes: 1) 50 μL beads were incubated with 5 μg of anti-biotin antibodies; 2) 20 μL beads were incubated with 500 μL of the fragmented DNA (to allow nonspecific DNA binding to the beads). Both tubes were incubated for 2 h at 4°C with rotation and vibration. After incubation and magnetic pull down for both tubes, the supernatant from tube 1 was discarded, and the supernatant from tube 2 was added to the beads in tube 1. The beads and DNA were incubated overnight at 4°C with rotation and vibration. After incubation, the beads were washed seven times with 700 μL of IP buffer for 5 min at 4°C with rotation and vibration. To elute DNA, the beads solution was incubated with 40 μL of elution buffer (10 mM TE buffer, 1 mg/mL proteinase K, and 0.5% SDS), for 3 h at 50°C with shaking at 900 rpm, followed by magnetic pull down. The pulled-down DNA was purified using QIAquick PCR Purification Kit columns (QIAGEN), according to manufacturer’s recommendations.
Proof of concept experiment with Nt.BspQI nicking enzyme
DNA from human keratinocyte cells was extracted using the GenElute Mammalian Genomic DNA Miniprep Kit (Sigma-Aldrich) according to manufacturer’s instructions. DNA was nicked using the Nt.BspQI enzyme (10,000 U/mL; NEB). 10 μg of DNA were mixed with 11.6 μL Nt.BspQI enzyme, 10 μL NEBuffer 3.1 (NEB), and ultrapure water to a final volume of 70 μL. The reaction mixture was incubated for 2 h at 50°C. After the nicking reaction, DNA samples were labeled with 10 μL of a cocktail of repair enzymes (PreCR Repair Mix; NEB), fragmented, and assayed according to RADD-seq procedure as described above, with biotinylated nucleotides.
NGS
Sequencing libraries were prepared using TruSeq DNA Nano Library Prep Kit (Illumina, San Diego, CA), according to manufacturer’s instructions, without additional fragmentation and without size selection. All sequencing libraries were sequenced on HiSeq 2500 platform by the Technion Genome Center. Two biological replicates of each data type, damage and repaired, were prepared for sequencing, as well as two replicates with no repair enzyme as controls for each reaction. In addition, two replicates of input DNA containing nondamaged DNA were prepared. For each data type, 62M 50 bp single reads were collected.
Sequence analysis
Sequencing reads were aligned to the hg38 human reference using Bowtie 2 (version 2.3.4.2) with default parameters. After alignment, reads with MAPQ less than 30 were filtered out with SAMtools, and the remaining reads were deduplicated with MateCigar (version 2.18.2.1) to eliminate PCR duplication bias. Genomic coverage of the damaged DNA was calculated using BEDTools genomecov (version 2.27.1). All software mentioned above was used through the UseGalaxy site (https://usegalaxy.org/). Each of the data sets (damage, repaired, and input) was generated using an in-house sliding window script with a 200 bp window size and a 1 bp step size. Correlation of biological replicates was assessed using Pearson correlation in 1 kb bins (Fig. S5). The two biological replicates for damage and repair data sets were combined by averaging the number of reads in each region. The data in both sample types (“damage” and “repaired”) were normalized to the input DNA to eliminate any bias resulting from the pull-down assay and misalignment to repetitive areas in the genome. In the first step, we divided the input DNA into 200 bp sized windows, and the read count in each window falling in the range between mean + 1 SD and mean + 2 SD was divided by mean + 1 SD (the resulting values ranged between 1 and 1.14). In the second step, values lower than mean + 1 SD were set to 1, and all regions higher than mean + 2 SD were discarded from the data because they represent areas with a high background level. Next, the read count for damage and repaired data in each 200 bp window was divided by the value of the respective normalized input DNA window. The regions discarded from the input data were discarded from these data sets as well. Finally, reads mapping to the ENCODE-defined blacklist regions (https://sites.google.com/site/anshulkundaje/projects/blacklists) were discarded. This normalization method rescales only regions that aligned to input DNA regions with a read count between mean + 1 SD and mean + 2 SD.
All the regions below the mean value fall into the background noise level, and all the areas above the mean + 2 SD are considered outliers and mainly represent repetitive areas in the genome. The relative percentage of repair was calculated as the normalized “damage” data divided by the normalized “repaired” data, in 200 bp sized windows. To allow for a valid comparison between the two data sets, we used the transcription start sites (TSS) of each sample as a reference point. The TSS provides a robust region for minimal damage and remains at a constant level throughout the entire data set. The TSS in both data sets showed the same damage level; therefore, the two samples were correlated in their TSS damage level.
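
The input normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the in-house script used in the study: the function names `input_factors` and `normalize` are hypothetical, the windows are assumed to be already counted, and the population standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import fmean, pstdev

def input_factors(input_counts):
    """Per-window normalization factors derived from input-DNA read counts.

    Windows above mean + 2 SD are marked None (discarded), windows between
    mean + 1 SD and mean + 2 SD are scaled by mean + 1 SD, and all other
    windows get a factor of 1.
    """
    mean, sd = fmean(input_counts), pstdev(input_counts)  # population SD: an assumption
    low, high = mean + sd, mean + 2 * sd
    factors = []
    for count in input_counts:
        if count > high:
            factors.append(None)         # high-background window, discarded
        elif count >= low:
            factors.append(count / low)  # value between 1 and high / low
        else:
            factors.append(1.0)
    return factors

def normalize(sample_counts, factors):
    """Divide damage or repaired counts by the input factors, window by window."""
    return [None if f is None else s / f for s, f in zip(sample_counts, factors)]
```

Windows marked `None` in the input are dropped from the damage and repaired tracks as well, which mirrors the statement that regions discarded from the input data were discarded from the other data sets.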
Correlation of DNA damage and repair to gene expression and chromatin state
Gene expression was assessed using RNA-sequencing data of U2OS cells, obtained from the National Center for Biotechnology Information (NCBI; Bethesda, MD) BioProject (accession number PRJNA668283). For the damage analysis, genes were classified into four groups: high expression (20% top expressed, ∼4000 genes), medium expression (∼4000 genes with average expression level), low expression (20% bottom expressed, ∼4000 genes), and no expression (FPKM = 0, ∼3000 genes). For the repaired analysis, genes were classified into six expression deciles (each with ∼2000 genes) and another group of nonexpressed genes (FPKM = 0, ∼3000 genes).
Open and closed chromatin regions were determined using ATAC-seq data of U2OS cells, obtained from NCBI BioProject (accession number PRJNA486188). Genes were divided into two groups based on their “openness” level; genes for which over 50% of the gene body was found in open chromatin areas were defined as open (∼16,000 genes), and the rest of the genes were defined as closed (∼4000 genes).
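
The two gene classifications above can be illustrated with a short Python sketch. The function names are hypothetical, and several details are assumptions rather than statements from the text: expressed genes (FPKM > 0) are ranked separately from nonexpressed genes, the medium group is taken as the central fifth of that ranking, and open-chromatin intervals are assumed to be non-overlapping, half-open, and on the same chromosome as the gene.

```python
def expression_groups(fpkm_by_gene):
    """Split genes into none / low / medium / high expression groups."""
    silent = [g for g, v in fpkm_by_gene.items() if v == 0]
    expressed = sorted((g for g, v in fpkm_by_gene.items() if v > 0),
                       key=fpkm_by_gene.get)
    n = len(expressed)
    fifth = n // 5                    # 20% of the expressed genes
    mid_start = (n - fifth) // 2      # central fifth of the ranking (assumption)
    return {
        "none": silent,
        "low": expressed[:fifth],
        "medium": expressed[mid_start:mid_start + fifth],
        "high": expressed[n - fifth:],
    }

def open_fraction(gene_start, gene_end, open_intervals):
    """Fraction of the gene body [gene_start, gene_end) covered by open chromatin."""
    covered = 0
    for start, end in open_intervals:
        overlap = min(end, gene_end) - max(start, gene_start)
        if overlap > 0:
            covered += overlap
    return covered / (gene_end - gene_start)

def chromatin_state(gene_start, gene_end, open_intervals, threshold=0.5):
    """Label a gene 'open' when more than `threshold` of its body is open."""
    fraction = open_fraction(gene_start, gene_end, open_intervals)
    return "open" if fraction > threshold else "closed"
```

For example, a gene spanning positions 100 to 200 with a single open interval from 90 to 160 has 60% of its body in open chromatin and would be labeled open.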
Gene ontology analysis
Gene ontology analysis was performed with the web-based tools Panther Gene Ontology Consortium web tool (http://pantherdb.org/) and STRING (https://string-db.org/). Sorting genes into different gene groups was done using online databases (housekeeping, https://www.tau.ac.il/∼elieis/HKG/; repair pathways, https://www.mdanderson.org/documents/Labs/Wood-Laboratory/human-dna-repair-genes.html; oncogenes and tumor suppressor, https://cancerres.aacrjournals.org/content/canres/suppl/2012/01/23/0008-5472.CAN-11-2266.DC1/T3_74K.pdf; and translation-related, smell perception, and nervous system, Panther Gene Ontology).
Optical validation experiment: Rapid-RADD
DNA samples extracted for the RADD-seq protocol were also used for the Rapid-RADD procedure, as described by Gilat et al. (16). In short, KBrO3-treated DNA samples were labeled for oxidation DNA damage as described above, but instead of biotinylated dUTP, the same quantity of a fluorescent ATTO-550-UTP (Jena Bioscience, Jena, Germany) was added to the nucleotide mixture. Next, labeled DNA samples were purified from excess fluorophores using Oligo Clean & Concentrator columns (Zymo Research, Irvine, CA), according to manufacturer’s recommendations, with two washing steps for optimal results. Upon preparing the multiwell slide according to the Rapid-RADD protocol, 1 μL of labeled DNA samples was placed in each well and incubated according to protocol instructions. The slide was scanned, stained for total DNA quantification, and scanned again, and the results were analyzed to quantify the amount of damage in each sample.
Results
RADD-seq exploits the incorporation of biotinylated nucleotides into DNA damage sites to provide a quick and straightforward pull-down-based method for genome-wide sequencing of DNA damage loci (Fig. 1). Using extracted genomic DNA, base lesions are recognized and excised from the double strand by a specific repair enzyme, leaving single-strand gaps (Fig. 2, A and B). Next, a DNA polymerase fills the gaps using biotinylated nucleotides, incorporating these nucleotides into the original damage sites (Fig. 2 C). DNA is then sheared into ∼150 bp fragments by sonication and immunoprecipitated using anti-biotin antibodies, enriching the sample for damaged DNA fragments (Fig. 2 D). Illumina sequencing is conducted, and reads are mapped back to a reference genome. Genomic coverage is then examined to identify trends along the genome, such as damage hotspots (Fig. 2 E).
Figure 1
RADD applications. By harnessing the same biochemistry, RADD enables the accurate and fast detection of damage over single molecules and single cells, bulk measurement, and mapping of damage loci by RADD-seq.
Figure 2
RADD-seq workflow. (A) DNA damage lesion is recognized by a specific repair enzyme. (B) The damaged lesion is excised, creating a gap in the DNA chain. (C) DNA polymerase fills the created gap with biotinylated nucleotides. (D) DNA is fragmented to ∼150 bp fragments and pulled down by anti-biotin antibody-coated magnetic beads. (E) Enriched DNA is processed for sequencing and mapped to the reference genome. The damage level across the genome is quantified by assessing the number of mapped reads.
First, validation of RADD-seq was achieved by showing that damaged regions are specifically detected, enriched, and accurately sequenced using this method. The sequence motif-specific Nt.BspQI nicking enzyme was used to induce sequence-specific single-strand breaks in genomic DNA extracted from human keratinocyte cells. A map with the expected damage sites was generated from the human genome reference according to the nicking enzyme’s recognition sequence motif (GCTCTTCN^). Nt.BspQI damaged DNA was “repaired” in a nick translation process, using biotinylated nucleotides and subsequently pulled down and utilized for library construction and sequencing (Fig. 3 A). As shown in Fig. 3 B, the damage sites revealed by RADD-seq correlated well with the expected nicking sites, demonstrating the ability of the method to construct reliable DNA damage maps on a genome-wide scale. With an average peak coverage of 30×, the median of the mapping resolution of single-strand breaks was ∼20 bp (Fig. S1). The false positive rate of this proof of concept experiment, defined as the percentage of peaks that do not overlap with expected Nt.BspQI sites, was calculated and found to be ∼2%. The use of Nt.BspQI enzyme to induce “damage-like” lesions in the DNA chain provides a modular method suitable for all damage types.
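
As an illustration of this validation logic, the Python sketch below derives expected nick positions from the GCTCTTCN^ motif on both strands and computes the fraction of peaks that miss every expected site. It is a hedged sketch, not the analysis code of the study: the function names, the coordinate convention (0-based, marking the first base to the right of the nicked bond on the top strand), and the `tolerance` parameter are assumptions introduced here.

```python
import re
from bisect import bisect_left

FORWARD = "GCTCTTC"  # Nt.BspQI recognition motif read on the top strand
REVERSE = "GAAGAGC"  # its reverse complement (motif located on the bottom strand)

def expected_nick_sites(seq):
    """Sorted expected nick coordinates for motif GCTCTTCN^ on both strands."""
    seq = seq.upper()
    sites = set()
    for m in re.finditer("(?=" + FORWARD + ")", seq):
        n_index = m.start() + len(FORWARD)   # position of the N base
        if n_index < len(seq):
            sites.add(n_index + 1)           # nick falls right after N
    for m in re.finditer("(?=" + REVERSE + ")", seq):
        n_index = m.start() - 1              # N lies left of the motif on the top strand
        if n_index >= 0:
            sites.add(n_index)               # nick falls left of N
    return sorted(sites)

def false_positive_rate(peaks, sites, tolerance=0):
    """Fraction of (start, end) peaks with no expected site within tolerance."""
    if not peaks:
        raise ValueError("at least one peak is required")
    sites = sorted(sites)
    misses = 0
    for start, end in peaks:
        k = bisect_left(sites, start - tolerance)
        if k == len(sites) or sites[k] > end + tolerance:
            misses += 1
    return misses / len(peaks)
```

In practice the expected-site map would be built per chromosome from the reference sequence, and the peak list would come from a peak caller applied to the RADD-seq coverage; neither step is specified in the text.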
Figure 3
RADD-seq identifies damage sites correctly. (A) Illustration of the proof of concept experiment. The nicking enzyme Nt.BspQI creates expected nicks along the genome. DNA polymerase is then used to fill the created nicks with biotinylated nucleotides. Finally, DNA is fragmented and pulled down, and the enriched fragments are sequenced. (B) Comparison between high-coverage regions in the Nt.BspQI sample (top) and the expected nicking site positions of Nt.BspQI (bottom).
Oxidation damage is heterogeneously distributed
After its validation, we utilized RADD-seq to reveal DNA damage and repair dynamics in a biological model system. Little is known about the distribution of oxidation DNA damage throughout the human genome. RADD-seq has the potential to link DNA damage to environmental exposures and to specific biological processes such as gene expression and chromosomal packing. To demonstrate this, oxidation DNA damage was induced by KBrO3, which generates 8-oxoG in the context of GG and GGG sequences (28). U2OS cells were exposed to 50 mM KBrO3 for 1 h. The time of exposure as well as the concentration of the KBrO3 were chosen based on optimized damage results obtained in Rapid-RADD experiments (16). The genomic DNA was extracted, and RADD-seq was performed using the oxidation damage repair enzyme hOGG1.
The genome-wide damage distribution was obtained from the number of reads aligned to each genomic locus after normalizing to data from a DNA sample that was not subjected to KBrO3 (see Materials and methods). The genome-wide oxidation DNA damage distribution is illustrated in Fig. 4 A. Evidently, oxidation damage hotspots are distributed heterogeneously along the genome, indicating that damage induction is not random, in agreement with Poetsch et al. (19).
Figure 4
DNA damage distribution revealed by RADD-seq. (A) Chromosomal view of the oxidation DNA damage landscape, in which darker blue color represents areas with an increased damage level. (B) A zoomed-in view of the oxidation DNA damage distribution along chromosome 2 (blue) alongside the distribution of lamina-associated domains (LADs; red). (C) Oxidation DNA damage in genes with different expression levels, namely high (20% top expressed genes, blue), medium (20% genes with average expression level, green), low (20% bottom expressed genes, yellow), or no expression (red).
Interestingly, when comparing the DNA damage distribution along the genome with the locations of lamina-associated domains (LADs) genome-wide, an anticorrelation of −66% is found. LADs are condensed areas in the chromatin, found in close contact with the nuclear lamina (3,29). Fig. 4 B shows the anticorrelation between the oxidation damage and the LAD distribution along chromosome 2. Hence, oxidation DNA damage is less abundant in these more compacted areas. Moreover, DNA damage accumulated along gene bodies and exhibited a drastic decrease near the TSS (Fig. 4 C), in line with the findings of Amente and co-workers, Poetsch et al., and Wu et al. (18,19,24). We hypothesize that the decrease in damage at the TSS is due to the tendency of these areas to attract multiple DNA binding proteins, which may shield the TSS sequence from damaging agents. When inspecting the damage level in genes with different expression levels, we found that higher expression levels correlate with more damaged gene bodies (Fig. 4 C).
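
The text does not state how the anticorrelation between damage and LADs was computed. One plausible way to obtain such a figure, shown here purely as an assumption, is a Pearson correlation between the binned damage signal and the fraction of each bin covered by LADs. The helper name `pearson` and the toy values in the usage note are illustrative only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

With a damage track such as `[5, 4, 1, 0]` and a LAD coverage track such as `[0, 0, 1, 1]` over the same four bins, `pearson` returns a negative value, the pattern expected when damage is depleted in compacted regions. The same function applies to the replicate comparison in 1 kb bins mentioned in Materials and methods.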
Gene expression and chromatin state affect repair dynamics
We further explored the application of RADD-seq for gaining insights into the DNA repair process. U2OS cells exposed to 50 mM KBrO3 for 1 h were compared with cells that were washed and allowed to repair for 1 h after damage induction. Genomic DNA was extracted from both groups and subjected to RADD-seq (Fig. 5 A).
Figure 5
DNA repair of oxidation damage revealed by RADD-seq. (A) Illustration of repair dynamics experiment. U2OS cells were treated with 50 mM KBrO3 for 1 h, followed by immediate DNA extraction (damage samples), or washed and left for 1 h to repair before DNA extraction (repaired samples). (B) Oxidation DNA damage level along all human genes in the damage sample (blue) and the repaired sample (green). (C) The average repair level in genes according to their expression levels. Each dot represents the averaged repair level of the specific gene decile (∼2000 genes per dot). Deciles 7–10 correspond to no expression.
We first evaluated the global DNA damage levels in the two samples optically using Rapid-RADD (16). The optical experiment showed an ∼30% decrease in the global damage level in the repaired samples compared with the damaged samples, verifying that DNA is indeed repaired during the 1 h experimental repair period (Fig. 6). Briefly, both types of samples were treated with a cocktail of repair enzymes specific to oxidation DNA damage, followed by the addition of DNA polymerase that incorporates fluorescent nucleotides into the formed gaps. The labeled DNA was deposited onto a partitioned poly-l-lysine-coated glass slide that was scanned using a slide scanner to evaluate the damage level in each well.
Figure 6
Rapid-RADD before and after repair of oxidation damage. (A) U2OS cells were incubated with 50 mM KBrO3, then washed and allowed to undergo native DNA repair. The DNA was extracted immediately post-treatment and 1 h post-treatment, labeled using the hOGG1 repair cocktail, and further assayed according to the Rapid-RADD protocol (Gilat et al. (16)). Representative fluorescence images of wells for each time point (immediately post-treatment and 1 h post-treatment) are shown. (B) Quantification results derived from the slide images and showing significant repair of the damaged DNA. Data represent mean ± SD of two independent experiments. Each bar represents the averaged results of two independent biological repeats; in each such repeat, five wells were averaged.
In contrast to Rapid-RADD, which quantifies DNA damage in native unamplified DNA, RADD-seq uses a predefined number of reads for sequencing and therefore did not show a significant difference in the global number of unique reads mapped for each sample. Cells that were not allowed to go through repair (termed “damage samples” hereafter) differed by only 0.7% in the global number of mapped reads as compared with cells that were allowed to repair (termed “repaired samples” hereafter). Nevertheless, RADD-seq was able to reveal repair patterns in specific regions of the human genome. When comparing gene bodies, we measured an average decrease of 5% in damage level for repaired samples (Fig. 5B). More interestingly, when grouping genes by expression level deciles, we found a clear positive correlation between repair level and gene expression level: a higher expression level is associated with a higher level of repair (Fig. 5C; Fig. S2).
In light of the observed relation between LADs and damage accumulation, we were motivated to observe the repair occurring in these areas. In general, a significant decrease in the damage level was found inside LADs compared with regions up- and downstream of LADs for both damage and repaired samples (Fig. 7A). Moreover, the damage level inside LADs in the damage sample is lower than its corresponding genome-wide average. Interestingly, whereas outside the LAD regions damage levels decrease after repair, the damage level inside LADs is higher in the repaired sample, presumably because of the small size of the damaging KBrO3 molecules, which may penetrate through condensed chromatin areas, such as LADs, and accumulate with time. Furthermore, damaged LADs may not be easily accessible for the cellular DNA repair machinery, which is composed of bulky proteins. This hypothesis was also suggested by Wu et al. (24). We note that the observed differences in repair in Fig. 7A might partly reflect the continued activity of the KBrO3 molecules sequestered at these sites.
Figure 7
Repair in closed versus open chromatin genomic regions by RADD-seq analysis. (A) The averaged damage level in and outside of LADs in the damage sample (blue) and repaired sample (green). Red line represents the average genome-wide damage level in the damage sample. (B) The averaged repair level in genes with different expression levels divided into open chromatin genes (green) and closed chromatin genes (red). Each dot represents the averaged repair level of the specific gene decile. Deciles 7–10 correspond to no expression.
Next, we examined whether the level of repair in gene bodies is affected not only by gene expression but also by the state of chromatin. To this end, we divided each of the above-mentioned expression groups into open chromatin or closed chromatin genes. Fig. 7B shows the repair level in each group across expression deciles. The repair level in open chromatin genes is higher than in closed chromatin genes from the same decile. Furthermore, in the case of low-expression genes, the difference in repair between the two groups becomes more distinct. As seen for LADs, we note that the observed differences in repair between open and closed chromatin genes in Fig. 7B may partly reflect the continued activity of the KBrO3 molecules sequestered at the more condensed chromatin sites in closed chromatin genes.
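The grouping by expression decile and chromatin state described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in this study: the definition of repair level as the relative percent decrease in damage signal, the dictionary keys (name, expression, chromatin, damage, repaired), and the toy values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def repair_level(damage, repaired):
    """Relative decrease in damage signal, in percent (assumed definition).

    Assumes a nonzero damage signal; negative values mean the damage
    signal is higher after the repair period.
    """
    return 100.0 * (damage - repaired) / damage


def assign_deciles(genes):
    """Rank genes by expression (highest first) and label them 1..10.

    Ties keep their input order because Python's sort is stable.
    """
    ranked = sorted(genes, key=lambda g: g["expression"], reverse=True)
    n = len(ranked)
    return {g["name"]: (i * 10) // n + 1 for i, g in enumerate(ranked)}


def mean_repair_by_group(genes):
    """Average repair level for each (decile, chromatin state) pair."""
    deciles = assign_deciles(genes)
    groups = defaultdict(list)
    for g in genes:
        key = (deciles[g["name"]], g["chromatin"])
        groups[key].append(repair_level(g["damage"], g["repaired"]))
    return {key: mean(values) for key, values in groups.items()}


# Toy input: 20 hypothetical genes, alternating open/closed chromatin.
toy_genes = [
    {
        "name": f"gene{i}",
        "expression": 20 - i,
        "chromatin": "open" if i % 2 == 0 else "closed",
        "damage": 100.0,
        "repaired": 88.0 if i % 2 == 0 else 96.0,
    }
    for i in range(20)
]

for (decile, state), level in sorted(mean_repair_by_group(toy_genes).items()):
    print(decile, state, round(level, 1))
```

Each printed row corresponds to one dot in a plot of repair level versus expression decile, with one series per chromatin state.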
Essential genes are repaired more extensively
We next evaluated the relation between DNA repair and functionally different gene groups (Fig. 8A). Generally, genes essential for basic cellular functions tended to be more extensively repaired than other genes. Housekeeping and translation-related genes showed the highest repair levels. Fig. 8B shows an example of the damage level before and after repair for a housekeeping gene, CDK1 (top), and a translation-related gene, EIF4A2 (bottom). These data clearly demonstrate the distinct reduction in damage level in these genes after repair. In contrast, genes related to the nervous system and smell perception, not essential for the function of U2OS cells, presented a reduced level of repair (Fig. 8A).
Figure 8
Repair in gene groups. (A) Box plot comparison of the repair level in seven different gene groups: housekeeping genes (mean repair, 12.5%; number of genes, 931), translation-related genes (mean repair, 12.5%; number of genes, 160), repair pathway genes (mean repair, 8.8%; number of genes, 57), tumor suppressor genes (mean repair, 6.8%; number of genes, 63), oncogenes (mean repair, 6.4%; number of genes, 82), nervous system genes (mean repair, −3.4%; number of genes, 1286), and smell perception genes (mean repair, −11.3%; number of genes, 384). (B) Representative damage and repaired signals of a housekeeping gene (CDK1, top) and a translation-related gene (EIF4A2, bottom) as obtained by RADD-seq analysis.
We note that most of the genes in the top repaired groups (housekeeping genes and translation-related genes) belong to the top gene expression decile. To evaluate the impact of gene expression level on DNA repair level, we compared these two most highly repaired gene groups with the rest of the genes in the top decile of gene expression level. We found that the repair level of genes belonging to these gene groups was significantly higher than the overall repair level of the rest of the genes with the same gene expression level, indicating that additional factors influence the level of repair (Fig. S3).
Finally, we show that the highest and lowest repaired gene groups include genes with specific functionalities. The top 500 repaired genes consist of a large group of translation-related genes and a group of cell-cycle-related genes (both part of the housekeeping gene group; Fig. S4A). The bottom 500 repaired genes (which mostly include genes with a higher postrepair damage level) consist of two large clusters of genes, one involving thiol-dependent ubiquitinyl hydrolase activity and another related to smell perception (Fig. S4B). These findings are in line with the fact that the studied cells originate from bone tissue, which is unlikely to require the activity of the bottom 500 repaired genes.
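The comparison above, between a functional gene group and the remaining genes of the same expression decile, can be illustrated with a simple one-sided permutation test. The statistical test used for Fig. S3 is not stated in this section, so the permutation test here is a stand-in chosen for illustration, and the function name, parameters, and toy repair values are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group, background, n_perm=10000, seed=0):
    """One-sided p-value for mean(group) exceeding mean(background).

    Group labels are shuffled n_perm times; the p-value is the fraction
    of shuffles whose mean difference is at least the observed one
    (with the usual +1 correction so it is never exactly zero).
    """
    rng = random.Random(seed)
    observed = mean(group) - mean(background)
    pooled = list(group) + list(background)
    k = len(group)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy repair levels (percent) for a hypothetical gene group and for the
# other genes in the same expression decile.
group_repair = [12.0, 13.5, 11.0, 14.0, 12.5, 13.0, 11.5, 12.0]
other_repair = [6.0, 7.5, 5.0, 8.0, 6.5, 7.0, 9.0, 5.5, 6.0, 7.0]

print(permutation_p_value(group_repair, other_repair, n_perm=2000))
```

A small p-value in such a test would indicate that the higher repair level of the group is unlikely to be explained by expression level alone, which is the kind of conclusion drawn above.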
Discussion
Oxidation DNA damage poses a serious threat to genome integrity and has been linked to the development of cancer, neurodegeneration, and cell senescence (30, 31, 32, 33). Yet, to date, little is known about the genome-wide distribution of oxidation DNA damage loci and the repair dynamics of this damage type. RADD-seq provides rapid and inexpensive mapping of DNA damage and repair based on a pull-down and sequencing technique. An oxidation DNA damage map was obtained with as little as 60 million reads per sample. Damage was found to accumulate in various regions along the human genome, and was more abundant in highly expressed genes and in regions with less condensed chromatin. We found that repair tends to occur primarily for highly expressed genes. Moreover, specific gene groups vital to the normal functioning of the cells, such as housekeeping genes, were repaired more extensively than other less crucial genes. This last observation indicates that DNA repair enzymes are directed to their targets not only by physical properties, such as chromatin architecture, but also by other, as yet unknown, factors. Both basic research and clinical practice rely upon understanding the type and extent of genomic DNA damage. The detection of damage hotspots can help determine disease predisposition, whereas the targeting of areas with high repair levels can aid in monitoring therapeutic response. From an analytical perspective, RADD-seq offers an adaptor-free amplification for a variety of DNA damage types. We note that with a much higher sequencing depth, and while accounting only for unique reads, the genome can be fully sampled. This would result in more quantitative results, despite the nonlinearity and sequence dependence of PCR amplification. However, one of the strengths of our method is that we could recapitulate the relative damage and repair levels via global optical measurements.
When coupled with Rapid-RADD for global quantification, these two methods provide means for determining both quantities and locations of DNA damage, the combination of which is essential for determining dose-response relationships of DNA-damaging agents. Here, we demonstrate RADD-seq applicability toward oxidation DNA damage; however, by choosing different repair enzymes, this protocol can be easily adjusted for other types of DNA lesions.
Availability
All raw sequencing data generated in this study have been submitted to the NCBI SRA BioProject and Biosample (SRA; https://www.ncbi.nlm.nih.gov/sra/) under accession number PRJNA697255. All processed sequencing data generated in this study have been submitted to the NCBI GEO (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE166843.
Author contributions
Y.E. and Y.M. conceived and supervised the project. N.G., Y.M., and H.S. performed the experiments. All authors analyzed the data. All authors wrote and edited the manuscript.