Venkatramana Pegadaraju, Rick Nipper, Brent Hulke, Lili Qi, Quentin Schultz.
Abstract
BACKGROUND: Application of single nucleotide polymorphism (SNP) marker technology in sunflower breeding programs offers enormous potential to improve sunflower genetics and to speed the release of sunflower hybrids to the marketplace. Through a National Sunflower Association (NSA) funded initiative, we report on the process of SNP discovery through reductive genome sequencing and local assembly of six diverse sunflower inbred lines that represent oil as well as confection types.
Year: 2013 PMID: 23947483 PMCID: PMC3765701 DOI: 10.1186/1471-2164-14-556
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Paired-end RAD Sequencing Overview. A. Genomic DNA is digested with a restriction endonuclease. B. After ligation with a primary adapter, the fragments are sheared, then ligated with a secondary adapter. C. A composite mixture of variable-length fragments is recovered from each restriction enzyme digestion site. These fragments are size selected, amplified, and sequenced on a next-generation DNA sequencing platform using paired-end chemistry. D. Development of the genomic assembly around each digestion site is then completed bioinformatically.
Figure 2. Sequencing & Analysis Pipeline. The de novo assembly was developed using paired-end sequence reads from RHA 464. Bowtie alignments of paired-end data from the Helianthus population were used to identify putative SNPs using the SAMtools software suite. A panel of 16,464 variants was ultimately selected for Illumina Infinium Genotyping Design.
Figure 3. RAD-Seq Assembly Results and Repeat Element Contribution. A. The length distribution of RAD-Seq contigs is plotted as a histogram. B. The contribution of known repetitive elements in the H. annuus RAD sequence assemblies is shown. Results were obtained through RepeatMasker analysis using the Repbase Arabidopsis database.
Paired-end RAD-Seq assembly statistics
| Number of RHA 464 contigs assembled | 42267 |
| Contigs removed due to plastid homology | 154 |
| Number of contigs retained | 42113 |
| Total assembly length (bp) | 15181868 |
| Minimum contig length (bp) | 200 |
| Maximum contig length (bp) | 920 |
| GC% | 36.1 |
| N50 Contig Length (bp) | 393 |
| N90 Contig Length (bp) | 254 |
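The N50 and N90 rows above are length-weighted summary statistics: N50 is the contig length at which contigs of that length or longer account for at least half of the total assembly length, and N90 uses a 90% threshold. A minimal sketch of that standard definition follows. The function name `nx_length` and the toy contig lengths are illustrative assumptions, not data or code from the study.

```python
def nx_length(lengths, fraction=0.5):
    """Return the length L such that contigs of length >= L together
    cover at least `fraction` of the total assembly length."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length


# Toy contig lengths (hypothetical; not the RHA 464 assembly).
toy_lengths = [920, 500, 400, 393, 300, 254, 200]
n50 = nx_length(toy_lengths, 0.5)
n90 = nx_length(toy_lengths, 0.9)
print(n50, n90)
```

Applied to the real list of 42113 contig lengths, the same function would be expected to return the table's N50 and N90 values, provided the authors used this conventional definition.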
SNP filtering and statistics
| Number of inbred lines in population: | 6 |
| Total number of SNP variants identified: | 105662 |
| Total possible SNP genotypes in population: | 633972 |
| SNP genotypes with high confidence call: | 502837 (79.3%) |
| SNP genotypes with missing or low quality data: | 131135 (20.7%) |
| SNP loci with < 50% genotype data: | 11614 |
| SNPs passing initial filters: | 94048 |
| Number of fixed genotype calls: | 448289 (89.2%) |
| Number of heterozygous genotype calls: | 54548 (10.8%) |
| SNP Loci with insufficient flanking sequence for IIGT*: | 3445 |
| SNP Loci with nearby polymorphism (< 50 bp): | 74136 |
| SNP Loci meeting all defined IIGT assay design criteria: | 16467 |
* Illumina Infinium Genotyping Technology.
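The counts in the SNP filtering table can be cross-checked with simple arithmetic. The sketch below assumes the population size is the six inbred lines named in the abstract and that the filters were applied as successive subtractions; that reading is an assumption, although it reproduces the table's totals. All variable names are illustrative.

```python
# Values transcribed from the SNP filtering table.
# n_lines = 6 is taken from the abstract ("six diverse sunflower inbred lines").
n_lines = 6
total_snps = 105_662
high_conf_calls = 502_837
low_quality_calls = 131_135
low_coverage_loci = 11_614         # loci with < 50% genotype data
short_flank_loci = 3_445           # insufficient flanking sequence for IIGT
nearby_polymorphism_loci = 74_136  # another polymorphism within 50 bp
fixed_calls = 448_289
het_calls = 54_548

# Every SNP locus can be genotyped once per line.
possible_genotypes = total_snps * n_lines

# Assumed filter order: drop low-coverage loci, then loci failing assay design.
passing_initial = total_snps - low_coverage_loci
assay_ready = passing_initial - short_flank_loci - nearby_polymorphism_loci

# Percentages as reported in the table (one decimal place).
high_conf_pct = round(100 * high_conf_calls / possible_genotypes, 1)
het_pct = round(100 * het_calls / high_conf_calls, 1)

print(possible_genotypes, passing_initial, assay_ready, high_conf_pct, het_pct)
```

Under these assumptions the derived values agree with the rows for total possible genotypes, SNPs passing initial filters, and loci meeting all IIGT design criteria.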
Figure 4. SNP Discovery. A. The number and ratio of SNP transitions and transversions observed in the Helianthus population is graphed. B. The frequency of SNPs by position in each respective contig is plotted. C. The number of sequence variations observed across each RAD-Seq contig is shown.
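Panel A of Figure 4 rests on the standard distinction between transitions (purine to purine or pyrimidine to pyrimidine, that is A<->G or C<->T) and transversions (any purine/pyrimidine exchange). The sketch below shows one way such a tally could be computed from a list of allele pairs. The function names and the toy input are hypothetical, and this is not the authors' pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Classify a biallelic SNP as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    # Both alleles in the same chemical class means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_counts(snps):
    """Return (transitions, transversions) for an iterable of (ref, alt) pairs."""
    ts = tv = 0
    for ref, alt in snps:
        if classify_snp(ref, alt) == "transition":
            ts += 1
        else:
            tv += 1
    return ts, tv


# Toy allele pairs (hypothetical; not taken from the study).
toy_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
ts, tv = ts_tv_counts(toy_snps)
ratio = ts / tv if tv else float("inf")
print(ts, tv, ratio)
```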