Kenta Shirasawa, Hideki Hirakawa, Sachiko Isobe.
Abstract
Double-digest restriction site-associated DNA sequencing (ddRAD-Seq) enables high-throughput genome-wide genotyping with next-generation sequencing technology. Consequently, this method has become popular in plant genetics and breeding. Although in silico prediction of restriction sites from the genome sequence is recognized as an effective approach for choosing the restriction enzymes to be used, few reports have evaluated in silico predictions against actual experimental data. In this study, we designed and demonstrated a workflow for in silico and empirical ddRAD-Seq analysis in tomato, as follows: (i) in silico prediction of optimum restriction enzymes from the reference genome, (ii) verification of the prediction with actual ddRAD-Seq data from four restriction enzyme combinations, (iii) establishment of a computational data processing pipeline for high-confidence single nucleotide polymorphism (SNP) calling, and (iv) validation of SNP accuracy by construction of genetic linkage maps. The quality of SNPs called against a de novo assembly of the ddRAD-Seq reads was comparable with that of SNPs obtained using the published reference genome of tomato. Comparisons of SNP calls in diverse tomato lines revealed that SNP density in the genome influenced the detectability of SNPs by ddRAD-Seq. In silico prediction prior to actual analysis contributed to optimization of the experimental conditions for ddRAD-Seq, e.g., the choice of enzymes and plant materials. Following optimization, this ddRAD-Seq pipeline could help accelerate genetics, genomics, and molecular breeding in both model and non-model plants, including crops.
Keywords: genetic linkage map; in silico prediction; restriction-associated DNA sequencing; single nucleotide polymorphism; tomato (Solanum lycopersicum)
Year: 2016 PMID: 26932983 PMCID: PMC4833422 DOI: 10.1093/dnares/dsw004
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Sequences of oligonucleotides used in ddRAD-Seq
| Names | Sequence (5′ – 3′) |
|---|---|
| Restriction enzyme | |
| | TCTTTCCCTACACGACGCTCTTCCGATCTGCA |
| | GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT |
| | CTGGAGTTCAGACGTGTGCTCTTCCGATCT |
| | AATTAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC |
| | TCTTTCCCTACACGACGCTCTTCCGATCT |
| | AGCTAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT |
| | CTGGAGTTCAGACGTGTGCTCTTCCGATC |
| | TCGAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC |
| | CTGGAGTTCAGACGTGTGCTCTTCCGATCT |
| | CGAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC |
| Indexed primers for PCR^a | |
| Forward primer | AATGATACGGCGACCACCGAGATCTACACXXXXXXXXACACTCTTTCCCTACACGACGCTCTTCC |
| Reverse primer | CAAGCAGAAGACGGCATACGAGATXXXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTC |
^a Index bases are indicated by X; the index sequences are listed in Supplementary Table S1.
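The X placeholders in the indexed primers can be filled programmatically. The following is a minimal sketch, assuming the index is written into the primer verbatim in the 5′ to 3′ direction; the function name `fill_index` and the example index `ACGTACGT` are hypothetical, and the real index sequences and their orientation are those given in Supplementary Table S1.

```python
import re

# Primer templates copied from the table above; X marks the index bases.
FORWARD_TEMPLATE = "AATGATACGGCGACCACCGAGATCTACACXXXXXXXXACACTCTTTCCCTACACGACGCTCTTCC"
REVERSE_TEMPLATE = "CAAGCAGAAGACGGCATACGAGATXXXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTC"


def fill_index(template: str, index: str) -> str:
    """Replace the run of X placeholders in a primer template with an index.

    Hypothetical helper for illustration: the index is inserted as given,
    without reverse-complementing it.
    """
    placeholder = re.search(r"X+", template)
    if placeholder is None:
        raise ValueError("template has no X placeholder")
    width = placeholder.end() - placeholder.start()
    if len(index) != width:
        raise ValueError(f"index must be {width} bases long")
    if set(index) - set("ACGT"):
        raise ValueError("index may contain only A, C, G and T")
    return template[:placeholder.start()] + index + template[placeholder.end():]


if __name__ == "__main__":
    # "ACGTACGT" is a made-up index, not one from Supplementary Table S1.
    print(fill_index(FORWARD_TEMPLATE, "ACGTACGT"))
    print(fill_index(REVERSE_TEMPLATE, "ACGTACGT"))
```

Validating the index length against the placeholder width guards against silently producing a primer of the wrong length.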
Figure 1. Numbers of restriction sites and restriction fragments in the tomato genome (SL2.50). Bars indicate the numbers of restriction sites (A) and 300–900 bp restriction fragments (B) predicted from the SL2.50 tomato genome sequence by in silico analysis.
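An in silico double digest of this kind can be sketched in a few lines. The example below is illustrative only and is not the authors' pipeline: it assumes that a usable fragment is one flanked by cut sites of two different enzymes, uses the PstI/MspI pair as an example, and runs on a made-up toy sequence rather than SL2.50. The function names are hypothetical; the recognition sites (CTGCA^G for PstI, C^CGG for MspI) are the standard ones.

```python
import re

# Recognition sequence and cut offset on the top strand for each enzyme.
# Both sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "PstI": ("CTGCAG", 5),  # CTGCA^G
    "MspI": ("CCGG", 1),    # C^CGG
}


def cut_positions(seq, enzymes):
    """Return sorted (position, enzyme) pairs for every cut site in seq."""
    cuts = []
    for name, (site, offset) in enzymes.items():
        # The lookahead lets overlapping occurrences be found as well.
        for match in re.finditer(f"(?={site})", seq):
            cuts.append((match.start() + offset, name))
    return sorted(cuts)


def double_digest_fragments(seq, enzymes, min_len=300, max_len=900):
    """List fragments between adjacent cuts made by two different enzymes.

    Assumption for this sketch: only fragments with one end from each
    enzyme and a length within [min_len, max_len] are counted.
    """
    cuts = cut_positions(seq, enzymes)
    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right and min_len <= end - start <= max_len:
            fragments.append((start, end, left, right))
    return fragments


if __name__ == "__main__":
    # Toy sequence (not tomato): two PstI sites and two MspI sites.
    toy = ("A" * 100 + "CTGCAG" + "A" * 400 + "CCGG" + "A" * 50
           + "CCGG" + "A" * 1000 + "CTGCAG" + "A" * 100)
    print(len(cut_positions(toy, ENZYMES)), "cut sites")
    print(double_digest_fragments(toy, ENZYMES))
```

For a real genome the same functions would be applied per chromosome; a dedicated library would be a better choice for enzymes with non-palindromic or degenerate sites.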
Figure 2. Number of SNPs detected from empirical ddRAD-Seq analysis. Line charts indicate the numbers of SNPs between Micro-Tom and Regina with four combinations of restriction enzymes (A) and the numbers of SNPs in six cultivars with respect to SL2.50 using the PstI/MspI combination (B).
Figure 3. Proportions of SNPs detected from empirical ddRAD-Seq analysis. SNPs from empirical ddRAD-Seq libraries are distributed in genic and intergenic regions (A) and repeat and non-repeat sequences (B). Proportions of SNPs between Micro-Tom and Regina (MT vs REG) detected from WGS data are shown as a control.
Number of mapped loci and length of genetic linkage maps
| Linkage group | Reference-based map: #Mapped loci | Reference-based map: map length (cM) | De novo assembly-based map: #Mapped loci | De novo assembly-based map: map length (cM) |
|---|---|---|---|---|
| 1 | 151 | 230.1 | 86 | 253.6 |
| 2 | 58 | 120.9 | 32 | 60.6 |
| 3 | 126 | 176.7 | 66 | 177.8 |
| 4 | 240 | 203.3 | 139 | 199.0 |
| 5 | 85 | 98.7 | 38 | 103.0 |
| 6 | 25 | 26.6 | 13 | 28.1 |
| 7 | 212 | 176.4 | 99 | 169.4 |
| 8 | 25 | 94.1 | 11 | 107.6 |
| 9 | 70 | 145.5 | 44 | 174.6 |
| 10 | 68^a | 111.8^a | 43^a | 135.4^a |
| 11 | 93 | 147.7 | 50^a | 130.2^a |
| 12 | 104 | 161.3 | 65 | 152.5 |
| Total | 1,257 | 1,693.2 | 686 | 1,691.8 |
^a These numbers reflect the total values of divided linkage groups.
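The totals in this table can be re-derived from the per-group rows, and dividing map length by the number of loci gives a crude marker density for each map. The sketch below copies the values from the table; the function name `summarize` is hypothetical, and cM per locus is only a rough summary, not the mean inter-marker interval.

```python
# (linkage group, reference loci, reference cM, de novo loci, de novo cM),
# copied from the table above.
ROWS = [
    (1, 151, 230.1, 86, 253.6),
    (2, 58, 120.9, 32, 60.6),
    (3, 126, 176.7, 66, 177.8),
    (4, 240, 203.3, 139, 199.0),
    (5, 85, 98.7, 38, 103.0),
    (6, 25, 26.6, 13, 28.1),
    (7, 212, 176.4, 99, 169.4),
    (8, 25, 94.1, 11, 107.6),
    (9, 70, 145.5, 44, 174.6),
    (10, 68, 111.8, 43, 135.4),
    (11, 93, 147.7, 50, 130.2),
    (12, 104, 161.3, 65, 152.5),
]


def summarize(rows, loci_col, length_col):
    """Return (total loci, total cM, cM per locus) for one map."""
    loci = sum(row[loci_col] for row in rows)
    length = round(sum(row[length_col] for row in rows), 1)
    return loci, length, round(length / loci, 2)


if __name__ == "__main__":
    print("reference-based:", summarize(ROWS, 1, 2))
    print("de novo-based:  ", summarize(ROWS, 3, 4))
```

Summing the rounded per-group lengths can differ from the reported total in the last decimal place, since the published total was presumably computed from unrounded values.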
Figure 4. Genetic linkage maps of RMF2, an F2 population derived from a cross between Micro-Tom and Regina. Bars on the left and right sides indicate linkage group maps based on SNP loci detected in the tomato reference genome (red lines) and a de novo assembly of ddRAD-Seq data (blue lines). Bars between the two maps indicate the physical map of the tomato genome. The density of SNPs detected using WGS data for the two cultivars is indicated by the darkness of green lines. Loci that are identical between the genetic and physical maps are connected by lines.
Figure 5. The ddRAD-Seq analytical workflow based on empirical and in silico optimization.