Ignazio Verde, Nahla Bassil, Simone Scalabrin, Barbara Gilmore, Cynthia T Lawley, Ksenija Gasic, Diego Micheletti, Umesh R Rosyara, Federica Cattonaro, Elisa Vendramin, Dorrie Main, Valeria Aramini, Andrea L Blas, Todd C Mockler, Douglas W Bryant, Larry Wilhelm, Michela Troggio, Bryon Sosinski, Maria José Aranzana, Pere Arús, Amy Iezzoni, Michele Morgante, Cameron Peace.
Abstract
Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome-wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs. The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species.
Year: 2012 PMID: 22536421 PMCID: PMC3334984 DOI: 10.1371/journal.pone.0035668
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Accessions of peach, almond and peach x almond hybrid sequenced at the Istituto di Genomica Applicata (IGA, Udine, Italy) (pools 1–5), the Center for Genome Research and Biocomputing (CGRB, Oregon State University, Corvallis, OR, USA) (pools 6–11), and IRTA (Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB, Spain) (pool 12).
| Pool | Accession | Adaptors | Read length (bp) | Read count (million) | Coverage of peach genome |
| 1 | ‘Armking’ | | 94 | 5.85 | 2.42 |
| 1 | ‘Big Top’ | | 94 | 3.55 | 1.47 |
| 1 | ‘Fidelia’ | | 94 | 6.47 | 2.68 |
| 1 | ‘Flordastar’ | | 94 | 7.27 | 3.01 |
| 1 | ‘Silver Rome’ | | 94 | 8.35 | 3.45 |
| 1 | ‘Weinberger’ | | 94 | 9.72 | 4.02 |
| 2 | ‘Babygold 8’ | | 93 | 5.60 | 2.29 |
| 2 | ‘Elberta’ | | 93 | 5.63 | 2.30 |
| 2 | ‘Maruja’ | | 93 | 8.52 | 3.49 |
| 2 | ‘Maycrest’ | | 93 | 8.61 | 3.52 |
| 2 | ‘Oro A’ | | 93 | 7.20 | 2.95 |
| 2 | ‘Stark Red Gold’ | | 93 | 6.37 | 2.61 |
| 3 | ‘Circe’ | | 93 | 9.23 | 3.78 |
| 3 | ‘Imera’ | | 93 | 5.92 | 2.42 |
| 3 | ‘Percoca di Romagna 7’ | | 93 | 4.27 | 1.75 |
| 3 | ‘Pillar’ | | 93 | 1.40 | 0.57 |
| 3 | ‘S 2678’ | | 93 | 10.15 | 4.15 |
| 3 | ‘Stark Saturn’ | | 93 | 7.45 | 3.05 |
| 4 | ‘Kamarat’ | | 93 | 9.63 | 3.94 |
| 4 | ‘Leonforte 1’ | | 93 | 2.32 | 0.95 |
| 4 | ‘Sahua Hong Pantao’ | | 93 | 19.20 | 7.86 |
| 4 | ‘Shen Zhou Mitao’ | | 93 | 12.54 | 5.13 |
| 4 | ‘Tabacchiera’ | | 93 | 0.56 | 0.23 |
| 4 | ‘Tudia’ | | 93 | 7.43 | 3.04 |
| 5 | ‘GF677’ | | 93 | 9.22 | 3.77 |
| 5 | ‘Kurakata Wase’ | | 93 | 6.75 | 2.76 |
| 5 | ‘Quetta’ | | 93 | 12.76 | 5.22 |
| 5 | ‘S6699’ | | 93 | 4.90 | 2.01 |
| 6 | ‘Admiral Dewey’ | | 80 | 2.42 | 0.85 |
| 6 | ‘Babcock’ | | 80 | 3.19 | 1.12 |
| 6 | ‘Elberta’ | | 80 | 0.64 | 0.23 |
| 6 | ‘Slappey’ | | 80 | 2.02 | 0.71 |
| 7 | ‘Bolinha’ | | 80 | 3.55 | 1.25 |
| 7 | ‘Carmen’ | | 80 | 1.66 | 0.58 |
| 7 | ‘Chinese Cling’ | | 80 | 2.50 | 0.88 |
| 7 | ‘Mayflower’ | | 80 | 1.35 | 0.47 |
| 8 | ‘Diamante’ | | 80 | 2.11 | 0.74 |
| 8 | ‘J.H. Hale’ | | 80 | 3.18 | 1.12 |
| 8 | ‘Rio Oso Gem’ | | 80 | 2.57 | 0.91 |
| 8 | ‘Yellow St. John’ | | 80 | 1.35 | 0.48 |
| 9 | ‘Dixon’ | | 80 | 1.25 | 0.44 |
| 9 | ‘Early Crawford’ | | 80 | 3.89 | 1.37 |
| 9 | ‘Florida Prince’ | | 80 | 1.85 | 0.65 |
| 9 | ‘Nonpareil’ | | 80 | 2.52 | 0.89 |
| 10 | ‘Dr. Davis’ | | 80 | 2.31 | 0.81 |
| 10 | ‘Nemaguard’ | | 80 | 2.38 | 0.84 |
| 10 | ‘O'Henry’ | | 80 | 4.28 | 1.51 |
| 10 | ‘Okinawa’ | | 80 | 2.15 | 0.76 |
| 11 | ‘Georgia Belle’ | | 80 | 14.42 | 5.08 |
| 11 | ‘Lovell’ | | 80 | 6.55 | 2.30 |
| 11 | ‘Lovell’ | | 80 | 0.03 | 0.01 |
| 11 | ‘Oldmixon Free’ | | 80 | 3.26 | 1.15 |
| 12 | ‘Big Top’ | | 330 | 0.20 | 0.29 |
| 12 | ‘Binaced’ | | 355 | 0.16 | 0.26 |
| 12 | ‘Catherina’ | | 288 | 0.17 | 0.22 |
| 12 | ‘Elegant Lady’ | | 243 | 0.19 | 0.20 |
| 12 | ‘Nectaross’ | | 275 | 0.19 | 0.23 |
| 12 | ‘O'Henry’ | | 289 | 0.15 | 0.18 |
| 12 | ‘Sweet Cap’ | | 251 | 0.16 | 0.18 |
| 12 | ‘Venus’ | | 278 | 0.15 | 0.19 |
Peach x almond hybrid;
Almond accession.
Pools 1–11 were sequenced with the Illumina Genome Analyzer while pool 12 was sequenced with the Roche 454 platform. Adaptors were used for retrieving accession-specific sequences from pools.
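The coverage column in the table above appears to follow from read length and read count. The sketch below illustrates that relationship under one assumption that the table does not state: a peach genome size of roughly 227 Mb. The constant `ASSUMED_GENOME_SIZE_MB` and the function name are illustrative only, and small differences from the reported values are expected from rounding.

```python
# Assumption (not given in the table): peach genome size of about 227 Mb.
ASSUMED_GENOME_SIZE_MB = 227.0


def fold_coverage(read_length_bp, read_count_million,
                  genome_size_mb=ASSUMED_GENOME_SIZE_MB):
    """Estimate fold coverage as total sequenced bases divided by genome size."""
    # bp per read * millions of reads = megabases sequenced
    total_sequenced_mb = read_length_bp * read_count_million
    return total_sequenced_mb / genome_size_mb


# A few rows copied from the table:
# (accession, read length in bp, read count in millions, reported coverage)
rows = [
    ("Armking", 94, 5.85, 2.42),
    ("Sahua Hong Pantao", 93, 19.20, 7.86),
    ("Admiral Dewey", 80, 2.42, 0.85),
    ("Big Top (pool 12)", 330, 0.20, 0.29),
]

for name, length, count, reported in rows:
    estimate = fold_coverage(length, count)
    print(f"{name}: estimated {estimate:.2f}x, reported {reported:.2f}x")
```

With the assumed genome size, the estimates for these rows stay within about 0.01 of the reported coverage.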
Figure 1. Workflow for SNP detection, validation, filtering, and final choice employed for development of the International Peach SNP Consortium (IPSC) peach 9 K SNP array v1.
Validation outcomes for 96 SNPs with the GoldenGate® assay.
| SNP parameter | Total | Proportion failed | Proportion monomorphic | Proportion polymorphic | MAF <5% (validation panel) | MAF 5–10% (validation panel) | MAF >10% (validation panel) |
| Evenly spaced | 74 | 0.22 | 0.14 | 0.65 | 0.01 | 0.04 | 0.59 |
| | 14 | 0.29 | | | | | |
| Other trait loci | 8 | | | | 0.00 | 0.00 | |
| Accession-specific | 19 | 0.26 | 0.11 | 0.63 | | | 0.42 |
| Genomic location: | | | | | | | |
| LG1 | 13 | 0.08 | 0.15 | | 0.00 | 0.00 | |
| LG2 | 7 | | 0.00 | | 0.00 | 0.00 | |
| LG3 | 7 | 0.14 | 0.14 | | 0.00 | 0.00 | |
| LG4 | 9 | 0.33 | 0.11 | 0.56 | 0.00 | 0.11 | 0.44 |
| LG5 | 7 | 0.14 | 0.29 | 0.57 | 0.00 | 0.00 | 0.57 |
| LG6 | 12 | 0.00 | 0.17 | | 0.00 | 0.00 | |
| LG7 | 10 | 0.30 | 0.10 | 0.60 | 0.00 | 0.00 | 0.60 |
| LG8 | 9 | 0.22 | 0.11 | 0.67 | | | |
| Genic location: | | | | | | | |
| Exonic | 30 | 0.13 | 0.10 | | 0.03 | 0.03 | |
| Intronic | 19 | 0.21 | 0.05 | | 0.00 | 0.05 | |
| UTR | 16 | 0.25 | 0.25 | 0.50 | 0.00 | 0.00 | 0.50 |
| Intergenic | 31 | | | | 0.03 | 0.10 | 0.19 |
| ADT score: | | | | | | | |
| <0.2 | 2 | | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 0.2–0.4 | 4 | | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 0.4–0.5 | 2 | 0.00 | | 0.50 | 0.00 | 0.00 | 0.50 |
| 0.5–0.6 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 |
| 0.6–0.7 | 8 | | 0.13 | 0.50 | 0.00 | | |
| 0.7–0.8 | 10 | 0.20 | 0.30 | 0.50 | | 0.00 | 0.40 |
| 0.8–0.9 | 18 | 0.17 | 0.22 | 0.61 | 0.06 | 0.00 | |
| >0.90 | 54 | 0.22 | 0.17 | 0.61 | 0.00 | 0.07 | |
| MAF (detection panel): | | | | | | | |
| 1–10% | 8 | 0.25 | 0.25 | 0.50 | 0.00 | | 0.38 |
| 11–20% | 20 | 0.25 | 0.15 | 0.60 | | | 0.35 |
| 21–30% | 9 | 0.00 | | 0.44 | 0.00 | 0.00 | 0.44 |
| 31–40% | 25 | 0.24 | 0.29 | 0.56 | 0.00 | 0.04 | |
| 41–50% | 34 | 0.20 | 0.09 | 0.62 | 0.00 | 0.00 | |
| Total | 96 | 0.24 | 0.19 | 0.57 | 0.02 | 0.05 | 0.50 |
Observed minor allele frequency (MAF) of polymorphic SNPs among 23 accessions of the detection panel and 119 non-seedling accessions of the validation panel are indicated. Values in bold represent considerably better than average SNP performance (e.g., high polymorphism), while values in italics are worse than average (high failure and monomorphism).
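As an illustration of how a SNP could be assigned to the outcome classes used in the table above, the following minimal sketch computes the minor allele frequency of one biallelic SNP from genotype calls and bins it. The genotype encoding (`'AA'`, `'AB'`, `'BB'`, with `'--'` or `None` for missing calls) and both function names are assumptions made for this example, not the consortium's pipeline; in particular, treating a SNP with no usable calls as "failed" is a simplification.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic SNP, or None if no calls are usable.

    genotypes: iterable of two-character calls such as 'AA', 'AB', 'BB';
    None or '--' marks a missing call (hypothetical encoding).
    """
    allele_counts = Counter()
    for call in genotypes:
        if call is None or call == "--":
            continue  # skip missing calls
        allele_counts.update(call)  # counts each allele character
    total = sum(allele_counts.values())
    if total == 0:
        return None
    if len(allele_counts) < 2:
        return 0.0  # only one allele observed
    return min(allele_counts.values()) / total


def maf_class(maf):
    """Bin a MAF value into the classes used in the validation table."""
    if maf is None:
        return "failed"
    if maf == 0:
        return "monomorphic"
    if maf < 0.05:
        return "<5%"
    if maf <= 0.10:
        return "5-10%"
    return ">10%"


calls = ["AA", "AB", "BB", "AA", "--"]
maf = minor_allele_frequency(calls)
print(maf, maf_class(maf))
```

Here the usable calls contain five A alleles and three B alleles, so the minor allele frequency is 3/8 and the SNP falls in the >10% class.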
Chromosome distribution and performance of SNPs on the IPSC peach 9 K SNP array v1.
| Chromosome | Stage 2 SNPs | SNPs on peach array | SNPs polymorphic on the peach array | Rate of polymorphism (%) | Average gap between SNPs (kb) | Largest gap between SNPs (kb) | No. of gaps >150 kb |
| 1 | 5573 | 1114 | 919 | 82.5 | 42.0 (50.9) | 672.9 (1254.0) | 78 (78) |
| 2 | 7205 | 1396 | 1193 | 85.5 | 19.2 (22.3) | 521.6 (531.5) | 30 (31) |
| 3 | 4031 | 811 | 680 | 83.8 | 27.1 (32.3) | 393.1 (398.0) | 24 (26) |
| 4 | 8149 | 1619 | 1391 | 85.9 | 18.6 (21.6) | 500.6 (661.2) | 26 (25) |
| 5 | 2692 | 546 | 459 | 84.1 | 33.3 (39.3) | 915.8 (915.8) | 19 (23) |
| 6 | 4674 | 933 | 802 | 86.0 | 30.6 (35.6) | 515.2 (515.2) | 27 (30) |
| 7 | 3943 | 793 | 668 | 84.2 | 28.6 (33.3) | 484.8 (564.9) | 27 (29) |
| 8 | 4527 | 913 | 743 | 81.4 | 23.7 (28.8) | 491.4 (634.5) | 21 (22) |
| Total on chromosomes | 40794 | 8125 | 6855 | 84.4 | 26.7 (31.5) | 915.8 (1254.0) | 252 (264) |
| | | | | | - | - | - |
| Total | 41800 | 8144 | 6869 | 84.3 | - | - | - |
Two >150 kb gaps collapsed into one (∼600 kb) after removal of monomorphic SNPs.
Chromosomes and distances refer to pseudomolecules of the whole genome Peach v1.0 assembly. Gaps in brackets refer to those obtained when only polymorphic SNPs are considered.
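The rate of polymorphism in the table above is the share of array SNPs that were polymorphic. The short sketch below recomputes it from the per-chromosome counts as a consistency check; the dictionary and function names are illustrative only.

```python
# Per-chromosome counts copied from the table above:
# chromosome -> (SNPs on peach array, SNPs polymorphic on the peach array)
array_counts = {
    1: (1114, 919),
    2: (1396, 1193),
    3: (811, 680),
    4: (1619, 1391),
    5: (546, 459),
    6: (933, 802),
    7: (793, 668),
    8: (913, 743),
}


def polymorphism_rate(on_array, polymorphic):
    """Percentage of array SNPs that were polymorphic."""
    return 100.0 * polymorphic / on_array


for chrom, (on_array, polymorphic) in array_counts.items():
    rate = polymorphism_rate(on_array, polymorphic)
    print(f"Chromosome {chrom}: {rate:.1f}%")

total_on_array = sum(on_array for on_array, _ in array_counts.values())
total_polymorphic = sum(poly for _, poly in array_counts.values())
print(f"Total on chromosomes: {total_on_array} SNPs, "
      f"{polymorphism_rate(total_on_array, total_polymorphic):.1f}% polymorphic")

# Whole-array figures quoted in the abstract (8,144 SNPs, 6,869 polymorphic).
print(f"Whole array: {polymorphism_rate(8144, 6869):.1f}% polymorphic")
```

The chromosome totals sum to 8,125 SNPs on the array and 6,855 polymorphic SNPs, matching the "Total on chromosomes" row.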
Figure 2. Distribution of SNPs along the Peach v1.0 pseudomolecules.
All tracks are plotted in 100 kb windows. The inner blue track represents the frequency of coding DNA sequence (CDS); its y axis ranges from 0 to 100%. The red, yellow and green tracks represent, respectively, the absolute number of SNPs discovered within pools 1–5, the 40,789 Stage 2 SNPs in exons, and the 9,000 SNPs chosen for the array; values on the y axes are capped at 2000, 100, and 30, respectively.
Figure 3. Frequency distribution of size of gaps between SNPs included on the IPSC peach 9 K SNP array v1. Gap sizes were based on SNP physical locations in the Peach v1.0 assembly.
Figure 4. Distribution of minor allele frequencies (MAF) in two independent germplasm sets.
A. EU evaluation panel (n = 232); B. US evaluation panel (n = 115; cultivars and advanced selections only).
Figure 5. Polymorphic SNPs detected in EU (n = 232) and US (n = 477) evaluation panels from genome scans with the IPSC peach 9 K SNP array v1.
Figure 6. Distribution and physical spacing of polymorphic SNPs across the eight peach chromosomes and comparison of SNP minor allele frequencies between peach and non-peach samples, including almond, and peach and almond wild relatives, in US data set.
The data set comprises cultivars and advanced selections only. The coefficient of regression (r) between the MAF in the peach and non-peach sets is 0.437.
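The caption does not say how r was calculated. Assuming it is a Pearson correlation coefficient between per-SNP MAF values in the two sample sets, it could be computed as in the sketch below. The MAF values shown are hypothetical placeholders, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-SNP MAF values (same SNP order in both lists).
maf_peach = [0.10, 0.25, 0.40, 0.05, 0.30]
maf_non_peach = [0.20, 0.15, 0.35, 0.10, 0.45]

print(f"r = {pearson_r(maf_peach, maf_non_peach):.3f}")
```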