Jun Xiong, Ping Wang, Wen-Xuan Shao, Gaojie Li, Jiang-Hui Ding, Neng-Bin Xie, Min Wang, Qing-Yun Cheng, Conghua Xie, Yu-Qi Feng, Weimin Ci, Bi-Feng Yuan.
Abstract
N4-methylcytosine (4mC) is a natural DNA modification that occurs in thermophiles and plays important roles in restriction-modification (R-M) systems in bacterial genomes. However, information on the precise location and sequence context of 4mC across whole genomes is limited. In this study, we developed an APOBEC3A-mediated deamination sequencing (4mC-AMD-seq) method for genome-wide mapping of 4mC at single-base resolution. In the 4mC-AMD-seq method, cytosine and 5-methylcytosine (5mC) are deaminated by APOBEC3A (A3A) protein to generate uracil and thymine, respectively, both of which are read as thymine in sequencing, whereas 4mC is resistant to deamination and is therefore read as cytosine. Thus, the cytosines read out in sequencing mark the original 4mC sites in genomes. With the 4mC-AMD-seq method, we achieved genome-wide mapping of 4mC in Deinococcus radiodurans (D. radiodurans). In addition, we confirmed that 4mC, but not 5mC, was the major modification in the D. radiodurans genome. We identified 1586 4mC sites in the genome of D. radiodurans, among which 564 sites were located in the CCGCGG motif. The average methylation levels in the CCGCGG motif and in non-CCGCGG sequences were 70.0% and 22.8%, respectively. We envision that the 4mC-AMD-seq method will facilitate the investigation of 4mC functions, including the 4mC-involved R-M systems, in uncharacterized but potentially useful strains. This journal is © The Royal Society of Chemistry.
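The readout logic described in the abstract can be illustrated with a minimal Python sketch. It assumes complete deamination of every unmodified cytosine (and 5mC) and looks at a single strand only; the function names, the toy sequence, and the 4mC position are hypothetical and are not taken from the paper.

```python
def simulate_a3a_read(seq, m4c_positions):
    """Return the read expected after A3A treatment of one strand.

    Assumption: every C that is not 4mC (including 5mC) is deaminated and
    read as T, while 4mC stays C. Positions are 0-based.
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in m4c_positions:
            out.append("T")  # C -> U (or 5mC -> T), both sequenced as T
        else:
            out.append(base)  # 4mC and non-cytosine bases are unchanged
    return "".join(out)


def call_4mc_candidates(reference, read):
    """List reference cytosine positions that are still read as C."""
    return [
        i
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "C"
    ]


# Hypothetical toy strand with a single 4mC at the second C of CCGCGG.
reference = "ACCGCGGTAC"
read = simulate_a3a_read(reference, {2})
candidates = call_4mc_candidates(reference, read)
```

In real data the conversion is not perfectly complete, so a retained C in one read is only a candidate; the abstract's methylation levels are fractions over many reads rather than single-read calls.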
Year: 2022 PMID: 36128236 PMCID: PMC9430668 DOI: 10.1039/d2sc02446b
Source DB: PubMed Journal: Chem Sci ISSN: 2041-6520 Impact factor: 9.969
Fig. 1 Illustration of the deamination of cytosine, 4mC and 5mC by A3A protein. (A) Cytosine and 5mC are deaminated to form uracil and thymine that pair with adenine. 4mC is resistant to deamination by A3A and thus still pairs with guanine. (B) Schematic illustration of the single-base resolution mapping of 4mC in DNA by the 4mC-AMD-seq method. A3A treatment leads to the deamination of C and 5mC to form U and T, respectively. Both U and T are read as T in sequencing. However, A3A will not deaminate 4mC. Thus, 4mC is read as C in sequencing.
Fig. 2 Evaluation of the deaminase activity of A3A protein on cytosine, 4mC and 5mC by LC-MS/MS. Three synthesized 215-bp dsDNAs (DNA-C, DNA-4mC, and DNA-5mC) were used for the evaluation. (A) Extracted-ion chromatograms of the dC standard, dC from DNA-C without or with A3A treatment. (B) Extracted-ion chromatograms of the dU standard, dU from DNA-C without or with A3A treatment. (C) Extracted-ion chromatograms of 4mdC and 5mdC standards, 5mdC from DNA-5mC without or with A3A treatment. (D) Extracted-ion chromatograms of 4mdC and 5mdC standards, 4mdC from DNA-4mC without or with A3A treatment.
Fig. 3 Quantitative evaluation of the deamination activity of A3A toward cytosine, 4mC and 5mC with steady-state kinetics analysis. (A) Kinetic constants of A3A acting on cytosine, 4mC and 5mC. (B–D) Rate versus substrate concentration curves of the substrates of DNA-C, DNA-5mC, and DNA-4mC. Data were fit with the Michaelis–Menten equation.
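For reference, the Michaelis–Menten rate law named in the Fig. 3 caption is v = Vmax·[S] / (Km + [S]). The short sketch below only evaluates that equation; the parameter values are hypothetical placeholders, not the kinetic constants reported in the figure, and no curve fitting is attempted.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate v at substrate concentration s for given Vmax and Km."""
    if s < 0 or km <= 0:
        raise ValueError("s must be non-negative and km must be positive")
    return vmax * s / (km + s)


# Hypothetical parameters, chosen only to show the shape of the curve.
VMAX = 10.0  # arbitrary rate units
KM = 2.0     # arbitrary concentration units

rate_at_km = michaelis_menten(KM, VMAX, KM)         # half-maximal rate
rate_at_high_s = michaelis_menten(2000.0, VMAX, KM)  # approaches Vmax
```

By construction the rate at [S] = Km is half of Vmax, which is the usual way to read Km off a rate versus substrate concentration curve such as those in panels B–D.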
Fig. 4 Single-base resolution analysis of 4mC in DNA by Sanger sequencing. (A–C) 215-bp dsDNA substrates (DNA-C, DNA-5mC, and DNA-4mC) were employed for the development of the 4mC-AMD-seq method by Sanger sequencing. Both C and 5mC were deaminated upon A3A treatment, and all cytosine and 5mC sites were read as thymine. 4mC was resistant to deamination by A3A protein and was still read as cytosine. Arrows denote deamination events (C-to-T conversion). (D) Evaluation of the C-to-T conversion upon A3A treatment by colony sequencing. (E–H) Quantitative evaluation of the level of 4mC at individual sites by 4mC-AMD-seq with colony sequencing. Various ratios of DNA-C and DNA-4mC were mixed and used as the DNA template for 4mC-AMD-seq. The X-axis indicates the position of cytosines (53 cytosines per strand). Detailed sequences are shown in ESI Table S1. (I) Linear regression analysis of the ratios of C/(C + T) measured by colony sequencing plotted against the theoretical percentages of 4mC/(4mC + C).
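The quantification in panels E–I amounts to computing C/(C + T) at a cytosine site across sequenced colonies and regressing it on the expected 4mC fraction. A minimal sketch follows, using an ordinary least-squares line written out by hand; the mixing ratios and clone calls are hypothetical illustration data, not values from the paper.

```python
def non_conversion_ratio(calls):
    """C/(C + T) at one cytosine site, given per-clone base calls."""
    c = calls.count("C")
    t = calls.count("T")
    if c + t == 0:
        raise ValueError("no informative clones at this site")
    return c / (c + t)


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: expected 4mC/(4mC + C) fractions and 8 clones per mix.
theoretical = [0.0, 0.25, 0.5, 1.0]
clone_calls = [
    ["T"] * 8,
    ["C"] * 2 + ["T"] * 6,
    ["C"] * 4 + ["T"] * 4,
    ["C"] * 8,
]
measured = [non_conversion_ratio(calls) for calls in clone_calls]
slope, intercept = linear_fit(theoretical, measured)
```

A slope near 1 and an intercept near 0 would indicate that the measured C/(C + T) tracks the true 4mC fraction; the idealized data above give exactly that, whereas real measurements would scatter around the line.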
Fig. 5 Determination of 4mC in D. radiodurans DNA by LC-MS/MS. (A) Extracted-ion chromatograms for the detection of 4mC from different samples. The mass transition (m/z 242.2 → 126.1) was chosen for monitoring 4mdC and 5mdC. “Enzyme only” represents the sample only containing the enzymes used for digestion and omitting D. radiodurans DNA. “Synthetic DNA control” represents the sample of synthesized 215-bp DNA-C. (B) D3-Methionine was added to the TGY medium and metabolically labelled the methyl group of 4mC in D. radiodurans DNA. (C) Extracted-ion chromatograms of the 4mdC (in black) and D3-4mdC (in red) from D. radiodurans DNA supplemented with 0, 50, and 500 μg mL⁻¹ of D3-methionine. (D) Histogram of the percentages of D3-4mdC/(D3-4mdC + 4mdC) from D. radiodurans DNA supplemented with 50 and 500 μg mL⁻¹ of D3-methionine. (E) LC-MS/MS quantification of the level of 4mC in D. radiodurans DNA at different growth stages. Error bars represent standard deviation from three independent replicates.
Fig. 6 Single-base resolution mapping of 4mC in D. radiodurans DNA by 4mC-AMD-seq. (A) Non-conversion rate of unmodified, 5mC, and 4mC containing spike-in DNA. (B) Venn diagram showing the identified 4mC sites in three replicates. (C) Distribution of 4mC in the gene body and 100 bp upstream and downstream of genes. (D) Motif sequence profile and sequence conservation analysis. (E) The theoretical numbers of CCGCGG motifs in D. radiodurans DNA and the detected numbers of C(4mC)GCGG motifs in D. radiodurans DNA by 4mC-AMD-seq. (F) A representative map for 4mC sites on chromosome 1 (position: 558 179–558 228) in D. radiodurans DNA by 4mC-AMD-seq. The sequenced + and − strands were mapped to the reference genome of D. radiodurans, and the CCGCGG context on both of the + and − strands was highly modified with 4mC (86.1% and 88.0%, respectively). (G) Methylation level distributions of 4mC sites in the CCGCGG motif in D. radiodurans DNA.
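The "theoretical numbers of CCGCGG motifs" in panel E can be obtained by scanning a reference sequence for the motif. The sketch below is one plausible way to do so, not the authors' pipeline: it relies on CCGCGG being palindromic, so each occurrence contributes one candidate C(4mC)GCGG cytosine on each strand. The toy sequence and function names are hypothetical, and coordinates are 0-based on the + strand.

```python
MOTIF = "CCGCGG"  # palindromic: its reverse complement is also CCGCGG


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def motif_starts(seq, motif=MOTIF):
    """0-based start positions of every motif occurrence on the + strand."""
    starts = []
    i = seq.find(motif)
    while i != -1:
        starts.append(i)
        i = seq.find(motif, i + 1)
    return starts


def candidate_4mc_sites(seq):
    """+ strand coordinates of the second C of CCGCGG on each strand."""
    starts = motif_starts(seq)
    plus = [s + 1 for s in starts]   # second C read on the + strand
    minus = [s + 4 for s in starts]  # G on +, i.e. second C on the - strand
    return plus, minus


# Hypothetical toy sequence containing two motif occurrences.
toy = "AACCGCGGTTCCGCGGA"
plus_sites, minus_sites = candidate_4mc_sites(toy)
```

Comparing such a list of candidate positions with the sites actually called by sequencing gives the detected versus theoretical counts that panel E contrasts.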
Fig. 7 Distribution and level of 4mC sites in D. radiodurans DNA. (A) Circos plot of the distribution and level of 4mC sites across chromosomes 1 and 2, and plasmids pMP1 and pSP1 of D. radiodurans. Outer two circles: distribution and level of all the identified 4mC sites in the + and − strands. Middle two circles: distribution and level of all the identified 4mC sites in the CCGCGG motif in the + and − strands. Inner circle: distribution of genes on the + (blue) and − (yellow) strands of D. radiodurans DNA. (B) Methylation levels of 4mC sites in the CCGCGG motif and non-CCGCGG motif in three replicates.