John P Hamilton, Candice N Hansey, Brett R Whitty, Kevin Stoffel, Alicia N Massa, Allen Van Deynze, Walter S De Jong, David S Douches, C Robin Buell.
Abstract
BACKGROUND: Current breeding approaches in potato rely almost entirely on phenotypic evaluations; molecular markers, with the exception of a few linked to disease resistance traits, are not widely used. Large-scale sequence datasets generated primarily through Sanger Expressed Sequence Tag projects are available from a limited number of potato cultivars, and access to next-generation sequencing technologies permits rapid generation of sequence data for additional cultivars. When coupled with the advent of high-throughput genotyping methods, an opportunity now exists for potato breeders to incorporate considerably more genotypic data into their decision-making.
Year: 2011 PMID: 21658273 PMCID: PMC3128068 DOI: 10.1186/1471-2164-12-302
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Genotypes and sequence datasets used in this study.
| Cultivar | Market Class | Platform | Comments |
|---|---|---|---|
| Kennebec | Fresh market | Sanger ESTs | 1948 release |
| Bintje | Fresh market | Sanger ESTs | 1905 release |
| Shepody | French fry processing | Sanger ESTs | 1980 release |
| Premier Russet | French fry processing | GA2 ESTs | 2008 release |
| Snowden | Chip processing | GA2 ESTs | 1990 release |
| Atlantic | Chip processing | GA2 ESTs | 1978 release |
| DM | Diploid Andean Fresh Market | NA | Used in Genome Project<sup>a</sup> |
<sup>a</sup> The DM genome is available at http://potatogenome.net.
Potato sequence and assembly statistics.
| Statistic | Bintje (Sanger) | Kennebec (Sanger) | Shepody (Sanger) | Atlantic (GA2) | Snowden (GA2) | Premier Russet (GA2) |
|---|---|---|---|---|---|---|
| Total No. sequences | 15,866 | 83,549 | 86,341 | 36,291,638 | 38,981,546 | 39,556,178 |
| Total No. Gb sequences | 0.0079 | 0.0544 | 0.0543 | 2.2 | 2.4 | 2.4 |
| No. sequences passed quality filters | 14,588 | 78,386 | 83,611 | 30,185,186 | 31,949,096 | 33,288,120 |
| No. of Gb of sequences passed quality filters | 0.0077 | 0.0533 | 0.053 | 1.8 | 2.0 | 2.0 |
| Total No. contigs & singletons | 7,510 | 25,330 | 51,459 | NA | NA | NA |
| No. contigs | 2,332 | 10,318 | 10,716 | 45,214 | 58,754 | 54,917 |
| No. singletons | 5,178 | 15,012 | 40,743 | NA | NA | NA |
| Total No. Mb contigs & singletons | 4.27 | 19.89 | 36.33 | 29.45 | 28.55 | 28.93 |
| No. Mb contigs | 1.61 | 10.6 | 8.68 | 29.45 | 28.55 | 28.93 |
| No. Mb singletons | 2.66 | 9.29 | 27.65 | NA | NA | NA |
| N50 contig size (bp) | 711 | 1,097 | 847 | 1,192 | 775 | 826 |
| Max contig size (bp) | 2,255 | 4,081 | 2,517 | 11,317 | 7,012 | 6,675 |
| Min contig size (bp) | 278 | 272 | 847 | 150 | 150 | 150 |
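The N50 contig size reported in the table can be illustrated with a short sketch. It assumes the common definition of N50 (the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembled length); the source does not state which variant was used, and the lengths below are hypothetical toy values, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumed definition: sort contigs from longest to shortest and return
    the length of the contig at which the cumulative sum first reaches
    at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths, not taken from the study.
toy_lengths = [150, 300, 450, 700, 1200]
print(n50(toy_lengths))
```

Because N50 is weighted by length, a few long contigs can raise it even when most contigs are short, which is why it is reported alongside the minimum and maximum contig sizes.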
Alignment of contigs to the A. thaliana proteome.
| Cultivar | No. contigs with alignment<sup>a</sup> | No. non-redundant alignments<sup>b</sup> |
|---|---|---|
| Atlantic | 27,934 | 13,752 |
| Premier Russet | 32,369 | 14,563 |
| Snowden | 33,503 | 14,608 |
| Bintje | 2,111 | 1,793 |
| Kennebec | 9,320 | 6,193 |
| Shepody | 9,163 | 6,202 |
<sup>a</sup> Contigs were searched against the A. thaliana proteome using an E value cutoff of <10<sup>-5</sup>. Only the top alignment was retained.
<sup>b</sup> Multiple alignments to the same A. thaliana protein were condensed to provide a non-redundant estimate of representation of the A. thaliana proteome.
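The two counts described in the footnotes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input format (tuples of contig ID, protein ID, and E value), the function name `summarize_hits`, and all identifiers in the toy data are assumptions made for the example.

```python
def summarize_hits(hits, evalue_cutoff=1e-5):
    """Count contigs with an alignment and distinct proteins hit.

    hits: iterable of (contig_id, protein_id, evalue) tuples (assumed format).
    Only hits with an E value below the cutoff are considered, and only
    the best (lowest E value) hit is kept per contig.
    """
    best = {}
    for contig, protein, evalue in hits:
        if evalue >= evalue_cutoff:
            continue  # fails the E value cutoff
        if contig not in best or evalue < best[contig][1]:
            best[contig] = (protein, evalue)
    # Condense contigs that hit the same protein into one entry.
    proteins = {protein for protein, _ in best.values()}
    return len(best), len(proteins)


# Hypothetical toy hits, not data from the study.
toy_hits = [
    ("contig1", "protA", 1e-30),
    ("contig1", "protB", 1e-10),  # weaker hit for contig1, discarded
    ("contig2", "protA", 1e-8),   # same protein as contig1's top hit
    ("contig3", "protC", 0.5),    # above the cutoff, discarded
]
print(summarize_hits(toy_hits))
```

In the toy data, two contigs have a qualifying alignment but both match the same protein, so the non-redundant count is one.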
Figure 1. Overlap of potato transcriptomes. Contigs from each of the cultivars were searched against the A. thaliana proteome and the non-redundant A. thaliana proteome matches are shown. A. GA2-generated transcript datasets; B. Sanger-generated transcript datasets; C. Nested Venn diagram with all six datasets. The small Venn diagram within C shows the overlap between contigs found only within the Sanger datasets.
Figure 2. Workflow used for SNP discovery in potato transcriptomes and design of the BeadXpress SNP array. SNPs identified in RNA-Seq reads were called and filtered using the Maq SNP pipeline. Sanger ESTs were clustered by cultivar using TGICL [40], and SNPs were called and filtered using custom Perl scripts. Filtered SNPs were linked to positions in the potato DM genomic sequence and filtered again to eliminate those close to an intron as well as SNPs that were not biallelic. SNPs for the BeadXpress SNP array were selected randomly from the Atlantic, Premier Russet, and Snowden datasets.
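The second filtering step in this workflow (dropping SNPs near an intron and SNPs that are not biallelic) might look like the sketch below. The record layout, the function name `passes_filters`, and the 50 bp distance are all hypothetical; the caption does not give the distance threshold that was actually used.

```python
def passes_filters(snp, min_intron_distance):
    """Return True if a SNP is biallelic and far enough from an intron.

    snp: dict with 'pos' (genomic position), 'alleles' (observed bases),
    and 'exon' (start, end of the enclosing exon). This layout is an
    assumption made for the example.
    """
    start, end = snp["exon"]
    far_from_intron = (
        snp["pos"] - start >= min_intron_distance
        and end - snp["pos"] >= min_intron_distance
    )
    is_biallelic = len(set(snp["alleles"])) == 2
    return is_biallelic and far_from_intron


# Hypothetical toy SNPs, not data from the study.
toy_snps = [
    {"id": "s1", "pos": 100, "alleles": "AG", "exon": (1, 300)},
    {"id": "s2", "pos": 10, "alleles": "AG", "exon": (1, 300)},    # near exon edge
    {"id": "s3", "pos": 150, "alleles": "ACG", "exon": (1, 300)},  # three alleles
]
# 50 bp is an illustrative threshold only.
kept = [s["id"] for s in toy_snps if passes_filters(s, min_intron_distance=50)]
print(kept)
```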
Figure 3. Graphical display of population substructure for 248 genotypes at a population size of four. Population substructure was determined using STRUCTURE [47] with 82 high quality SNP markers. Each genotype is represented by a vertical line. Color segments within the vertical line indicate the proportion of membership in each of the four population substructure groups. Population substructure groups are color-coded as population one (red), population two (green), population three (blue), and population four (yellow). Numbers in parentheses indicate the number of genotypes with majority membership (greater than 50%) in each population group and the total number of genotypes for each market class.
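The majority-membership counts mentioned in this caption can be derived from per-genotype membership proportions, as in the sketch below. The input layout (one list of four proportions per genotype) and the names used are assumptions; the toy proportions are hypothetical and are not STRUCTURE output from the study.

```python
from collections import Counter


def majority_group(memberships, threshold=0.5):
    """Return the 0-based index of the group whose membership proportion
    exceeds the threshold, or None if no group holds a majority."""
    for index, proportion in enumerate(memberships):
        if proportion > threshold:
            return index
    return None


# Hypothetical membership proportions for three genotypes in four groups.
toy_memberships = {
    "g1": [0.70, 0.10, 0.10, 0.10],
    "g2": [0.40, 0.40, 0.10, 0.10],  # no group above 50%
    "g3": [0.05, 0.05, 0.80, 0.10],
}
assignments = {name: majority_group(q) for name, q in toy_memberships.items()}
counts = Counter(g for g in assignments.values() if g is not None)
print(assignments)
print(counts)
```

Genotypes without a group above 50% are left unassigned rather than forced into the largest group, which matches the "greater than 50%" wording of the caption.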
Figure 4. Unweighted Pair Group Method with Arithmetic Mean (UPGMA) tree of 244 genotypes categorized by market class based on 82 high quality SNP markers. The numbers above each branch give the branch length, which relates to the genetic distance between groups.
Total and cultivar-restricted SNPs in six potato cultivars.
| Cultivar | Total SNPs | Cultivar-Restricted SNPs |
|---|---|---|
| Atlantic | 42,928 | 19,442 |
| Snowden | 46,074 | 21,559 |
| Premier Russet | 45,772 | 18,764 |
| Bintje | 1,155 | 576 |
| Kennebec | 8,773 | 5,533 |
| Shepody | 2,823 | 1,532 |
Total and cultivar-restricted SNPs were determined by aligning reads to the DM reference genome and calling SNPs using SAMTools. Cultivar-restricted SNPs are those found in only a single cultivar, with the other cultivars either lacking the SNP or having no sequence data at that genomic position.
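The definition of a cultivar-restricted SNP maps naturally onto set operations. The sketch below assumes each SNP is keyed by a (chromosome, position, alternate allele) tuple relative to the reference; that key, the function name, and the toy data are illustrative assumptions rather than the authors' implementation.

```python
def cultivar_restricted(snps_by_cultivar):
    """Map each cultivar to the SNPs found in it and in no other cultivar.

    snps_by_cultivar: dict of cultivar name -> set of SNP keys, where a
    key is assumed to be (chromosome, position, alternate_allele).
    """
    restricted = {}
    for name, snps in snps_by_cultivar.items():
        others = set().union(
            *(s for other, s in snps_by_cultivar.items() if other != name)
        )
        restricted[name] = snps - others
    return restricted


# Hypothetical toy SNP calls, not data from the study.
toy_calls = {
    "A": {("chr01", 100, "G"), ("chr01", 200, "T")},
    "B": {("chr01", 200, "T"), ("chr02", 50, "C")},
    "C": {("chr01", 200, "T")},
}
for cultivar, snps in cultivar_restricted(toy_calls).items():
    print(cultivar, sorted(snps))
```

A plain set difference does not distinguish a cultivar that lacks the SNP from one with no sequence data at that position, which is consistent with the definition above treating both cases the same way.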
Pairwise comparison of SNPs between potato accessions.
| Cultivar 1 | Cultivar 2 | Total SNPs | Cultivar-restricted SNPs |
|---|---|---|---|
| Atlantic | Premier Russet | 14,955 | 5,087 |
| Atlantic | Snowden | 17,531 | 7,570 |
| Atlantic | Bintje | 192 | 40 |
| Atlantic | Shepody | 506 | 128 |
| Atlantic | Kennebec | 1,459 | 388 |
| Premier Russet | Snowden | 18,537 | 8,365 |
| Premier Russet | Bintje | 212 | 42 |
| Premier Russet | Shepody | 535 | 106 |
| Premier Russet | Kennebec | 1,689 | 424 |
| Snowden | Bintje | 215 | 31 |
| Snowden | Shepody | 567 | 121 |
| Snowden | Kennebec | 1,665 | 349 |
| Bintje | Shepody | 136 | 31 |
| Bintje | Kennebec | 329 | 141 |
| Shepody | Kennebec | 566 | 276 |
Pairwise comparison of SNPs across the six potato cultivars. Cultivar-restricted SNPs are SNPs found exclusively in the two cultivars based on alignment to the reference genome.
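A pairwise version of the comparison can be sketched in the same spirit. It assumes that "Total SNPs" for a pair means SNPs present in both cultivars, and that the restricted count is the subset of those seen in no other cultivar; the source does not spell out the first point, so treat it as an assumption. The SNP identifiers are hypothetical.

```python
from itertools import combinations


def pairwise_snps(snps_by_cultivar):
    """For every cultivar pair, return (shared count, pair-restricted count).

    Shared SNPs are those present in both cultivars; pair-restricted SNPs
    are shared SNPs absent from every other cultivar.
    """
    results = {}
    for first, second in combinations(sorted(snps_by_cultivar), 2):
        shared = snps_by_cultivar[first] & snps_by_cultivar[second]
        elsewhere = set().union(
            *(s for name, s in snps_by_cultivar.items()
              if name not in (first, second))
        )
        results[(first, second)] = (len(shared), len(shared - elsewhere))
    return results


# Hypothetical toy SNP sets, not data from the study.
toy_sets = {
    "A": {"snp1", "snp2", "snp3"},
    "B": {"snp2", "snp3", "snp4"},
    "C": {"snp3", "snp5"},
}
for pair, (total, restricted) in pairwise_snps(toy_sets).items():
    print(pair, total, restricted)
```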