Olaf Werner1, Ángela S Prudencio2, Elena de la Cruz-Martínez1, Marta Nieto-Lugilde1, Pedro Martínez-Gómez2, Rosa M Ros1.
Abstract
Reference-free reduced representation bisulfite sequencing uses enzymatic digestion to reduce genome complexity and allows detection of markers to study DNA methylation in a large number of individuals in natural populations of non-model organisms. Current methods like epiGBS require the use of a large number of methylated DNA oligos at a significant cost (especially for small labs and first pilot studies). In this paper, we present a modification of the epiGBS protocol that requires only one hemimethylated P2 (common) adapter, which is combined with unmethylated barcoded adapters. The unmethylated cytosines of one chain of the barcoded adapter are replaced by methylated cytosines using nick translation with methylated cytosines in the dNTP solution. The basic version of our technique uses only one restriction enzyme, and as a result, genomic fragments are integrated in two orientations with respect to the adapter sequences. Comparing the sequences of the two chain orientations makes it possible to reconstruct the original sequence before bisulfite treatment with the help of standard software and newly developed software written in C and described here. We provide a proof of concept using data obtained from almond (Prunus dulcis). Example data and a detailed description of the complete software pipeline, from the raw reads to the final differentially methylated cytosines, are given in the Supplementary Material, making this technique accessible to non-expert computer users. The adapter design shown in this paper should allow the use of a two restriction enzyme approach with minor changes in software parameters.
Keywords: DNA methylation; Prunus dulcis; epi genotyping by sequencing; non-model organisms; population genetics; reduced representation bisulfite sequencing
Year: 2020 PMID: 32547585 PMCID: PMC7270828 DOI: 10.3389/fpls.2020.00694
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
FIGURE 1 Simplified epiGBS scheme using our protocol with I as an example. (A) Adapter P1 is a barcoded standard GBS adapter; adapter P2 is hemimethylated (only the lower strand has 5-methyl cytosine incorporated; indicated by a thick line). Both adapters are unphosphorylated at the 5′-termini. As a result, two nicks remain after the ligation reaction. The genomic DNA fragment is incorporated in two different orientations with respect to the adapter sequences, which are the reverse complement (rc) of each other. (B) After nick translation, the top chain of adapter P1 keeps unmethylated cytosines (thin line). The adapter sequences of the bottom chains are completely methylated (thick lines). (C) The bisulfite treatment converts cytosine to uracil unless the cytosines are protected by methylation. The top chain of adapter P1 contains a high number of converted unmethylated cytosines. (D) During the PCR step, uracil is read as thymine by a specially engineered polymerase. (E) Illumina sequence reads correspond to the complement of the bottom chain. (F) The software encodes DNA bases as either purines (R) or pyrimidines (Y). The program takes one arbitrarily defined Watson purine/pyrimidine sequence and tries to find the corresponding Crick sequence with an identical reverse complement purine/pyrimidine sequence. (G) If the software finds a Watson/Crick sequence pair, it compares the original Watson sequence with the reverse complement of the original Crick sequence. A cytosine in one sequence and a thymine in the other indicate an unmethylated cytosine in the original sequence. Two cytosines indicate a methylated cytosine in the original sequence. A guanine and an adenine indicate a guanine with an unmethylated cytosine in the opposite strand of the original sequence, and two guanines indicate a guanine with a methylated cytosine in the opposite strand of the original sequence.
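The logic of panels (C) to (G) can be illustrated with a small toy model. The sketch below is not the authors' C software; all function names, the example sequence, and the chosen methylated positions are hypothetical. It assumes error-free reads of equal length and marks any base combination not listed in panel (G) as a mismatch.

```python
# Toy model of Figure 1, panels (C)-(G). Hypothetical names throughout;
# this is an illustration, not the published creepi implementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_read(strand, methylated):
    """Panels C-D: an unmethylated C becomes U, which is read as T after PCR."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(strand)
    )


def to_ry(seq):
    """Panel F: purines (A, G) -> R, pyrimidines (C, T) -> Y."""
    return "".join("R" if base in "AG" else "Y" for base in seq)


def reverse_complement_ry(ry):
    return ry.translate(str.maketrans("RY", "YR"))[::-1]


def is_pair(watson_read, crick_read):
    """C->T conversion never changes the R/Y code, so a true pair still matches."""
    return to_ry(watson_read) == reverse_complement_ry(to_ry(crick_read))


# Panel G: (Watson base, base in rc of Crick read) -> (original base, call)
RULES = {
    ("A", "A"): ("A", None),
    ("T", "T"): ("T", None),
    ("C", "C"): ("C", "methylated C"),
    ("T", "C"): ("C", "unmethylated C"),
    ("G", "G"): ("G", "G opposite methylated C"),
    ("G", "A"): ("G", "G opposite unmethylated C"),
}


def reconstruct(watson_read, crick_read):
    """Return the inferred pre-bisulfite Watson sequence and per-position calls."""
    original, calls = [], []
    for w, c in zip(watson_read, reverse_complement(crick_read)):
        base, call = RULES.get((w, c), ("N", "mismatch"))
        original.append(base)
        calls.append(call)
    return "".join(original), calls


if __name__ == "__main__":
    watson = "ACGTCCGA"                      # made-up fragment
    crick = reverse_complement(watson)
    w_read = bisulfite_read(watson, methylated={1})
    c_read = bisulfite_read(crick, methylated={5})
    print(is_pair(w_read, c_read))
    print(reconstruct(w_read, c_read))
```

Because bisulfite conversion only turns one pyrimidine into another, the R/Y code survives the treatment, which is why the pairing step can run before any methylation call is made.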
Sample identification, barcode, and adapter sequences for the top and bottom strand of the barcoded P1 adapters.
| Sample | Barcode | Adapter sequence top 5′->3′ | Adapter sequence bottom 5′->3′ |
| AlDA1 | CATCTGCCG | cacgacgctcttccgatctCATCTGCCGtgca | CGGCAGATGagatcggaagagcgtcgtg |
| AlDA2 | GGACAG | cacgacgctcttccgatctGGACAGtgca | CTGTCCagatcggaagagcgtcgtg |
| AlDB1 | ATCTGT | cacgacgctcttccgatctATCTGTtgca | ACAGATagatcggaagagcgtcgtg |
| AlDB2 | AAGACGCT | cacgacgctcttccgatctAAGACGCTtgca | AGCGTCTTagatcggaagagcgtcgtg |
| AlPA1 | GAATGCAATA | cacgacgctcttccgatctGAATGCAATAtgca | TATTGCATTCagatcggaagagcgtcgtg |
| AlPA2 | TAGCAG | cacgacgctcttccgatctTAGCAGtgca | CTGCTAagatcggaagagcgtcgtg |
| AlPB1 | ATCCG | cacgacgctcttccgatctATCCGtgca | CGGATagatcggaagagcgtcgtg |
| AlPB2 | CTTAG | cacgacgctcttccgatctCTTAGtgca | CTAAGagatcggaagagcgtcgtg |
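The rows above follow a regular pattern: each top strand is a constant lowercase stem, the barcode, and a trailing `tgca`, while each bottom strand is the reverse complement of the barcode followed by the reverse complement of the stem. The sketch below rebuilds both strands from a barcode as a consistency check. The names `STEM`, `OVERHANG`, and `p1_adapter` are hypothetical, and reading `tgca` as the overhang matching the restriction site is an assumption drawn only from the table layout.

```python
# Rebuild the P1 adapter strands from a barcode, following the pattern
# visible in the table. Names are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

STEM = "cacgacgctcttccgatct"   # constant part shared by all top strands
OVERHANG = "tgca"              # assumed sticky end; absent from the bottom strand


def reverse_complement(seq):
    """Reverse complement that preserves upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]


def p1_adapter(barcode):
    """Return (top, bottom) strands, both written 5'->3'."""
    top = STEM + barcode + OVERHANG
    bottom = reverse_complement(barcode) + reverse_complement(STEM)
    return top, bottom


if __name__ == "__main__":
    for sample, barcode in [("AlDA1", "CATCTGCCG"), ("AlPB2", "CTTAG")]:
        print(sample, *p1_adapter(barcode))
```

A check like this is a cheap way to catch transcription errors when new barcodes are ordered.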
Sequences of the P2 (common) adapter.
| Adapter | Sequence 5′->3′ |
| cre-epiGBS P2 top strand | |
| cre-epiGBS P2 bottom strand | t5gg5att55tg5tgaa55g5t5tt55gat5tDDDDDAA5T |
FIGURE 2 Flowchart illustrating the software pipeline. A selection of freely available software (Stacks, USEARCH, Bismark, methylKit) and our own software (creepi, merge_sequences, seek_fragments) was used. The freely available software was chosen for its ease of installation and good documentation. The function of each program is specified.