M Michelle Malmberg1,2, Luke W Pembleton1, Rebecca C Baillie1, Michelle C Drayton1, Shimna Sudheesh1, Sukhjiwan Kaur1, Hiroshi Shinozuka1, Preeti Verma1, German C Spangenberg1,2, Hans D Daetwyler1,2, John W Forster1,2, Noel O I Cogan1,2.
Abstract
The application of genomics in crops can significantly improve genetic gain for agriculture. Many marker-dense tools have been developed, but few have seen broad adoption in plant genomics because of wide variation in genome size, ploidy level, single nucleotide polymorphism (SNP) frequency and reproductive habit. When combined with limited breeding activities, small research communities and scant sequence resources, popular systems are often poorly suited and routinely fail to balance cost-effectiveness against sample throughput. Genotyping-by-sequencing (GBS) encompasses a range of protocols, including resequencing of the transcriptome. This study describes a skim GBS-transcriptomics (GBS-t) approach developed to be broadly applicable, cost-effective and high-throughput while still assaying a significant number of SNP loci. A range of crop species with differing levels of ploidy and degrees of inbreeding/outbreeding were chosen: perennial ryegrass, a diploid outbreeding forage grass; phalaris, a putative segmental allotetraploid outbreeding forage grass; lentil, a diploid inbreeding grain legume; and canola, an allotetraploid partially outbreeding oilseed. GBS-t was validated as a simple, largely automated and cost-effective method that generates sufficient SNPs (from 89 738 to 231 977) with acceptable levels of missing data and even genome coverage from c. 3 million sequence reads per sample. GBS-t is therefore a broadly applicable system suitable for many crops, offering advantages over other systems. The correct choice of subsequent sequence analysis software is important, and the bioinformatics process should be iterative and tailored to the specific challenges posed by ploidy variation and extent of heterozygosity.
Keywords: Brassica napus; Lens culinaris; Lolium perenne; Phalaris aquatica; RNA
Year: 2017 PMID: 28913899 PMCID: PMC5866951 DOI: 10.1111/pbi.12835
Source DB: PubMed Journal: Plant Biotechnol J ISSN: 1467-7644 Impact factor: 9.803
Figure 1. Overview of the GBS-t workflow.
Figure 2. View of read alignments using Tablet for the gene BnaA08g04820D in the sample AG-Comet BAM file. The SNP called at position 93 is putatively misaligned due to local similarity between the A and C subgenomes of canola, while the SNP at position 240 is putatively homozygous for an alternate base.
Figure 3. Percentage missing data and number of reads generated per sample for perennial ryegrass (blue) and canola (green).
Figure 4. NJ dendrograms for phalaris and lentil calculated using Nei's pairwise genetic distance. For phalaris, relationships between populations (a) and individuals (b) are displayed. The dendrogram showing individuals displays samples from the Advanced AT cultivar in blue, Landmaster in green, Holdfast GT in orange and PWA in purple. Relationships between lentil parental cultivars are also displayed, based on calculations using the same method (c).
Figure 5. Canola SNP and gene density heatmaps. Red indicates high density, and blue indicates low density. Tracks displayed are as follows: (a) the canola karyotype; (b) density of SNPs discovered through GBS-t; and (c) gene density based on the CDS reference genome.
Figure 6. Frequency histogram of distribution of canola genes containing assayed SNPs. Black bars indicate the frequency of consecutive genes without SNPs. Grey bars indicate the frequency of consecutive genes with at least one SNP.
Figure 7. Comparative alignment of loci on the perennial ryegrass genetic map with genome locations of matching DNA sequences in the genomes of rice and Brachypodium distachyon.
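
The Figure 6 caption describes counting runs of consecutive genes with and without assayed SNPs. The record does not say how the authors computed these frequencies, so the following is only a minimal sketch of one way to do it. It assumes a hypothetical input, `has_snp`, holding one boolean per gene in genome order; the function name and the toy data are illustrative and do not come from the study.

```python
from collections import Counter
from itertools import groupby


def run_length_frequencies(has_snp):
    """Count runs of consecutive genes with and without SNPs.

    has_snp: iterable of booleans, one per gene in genome order
             (hypothetical input; True = at least one assayed SNP).
    Returns two Counters mapping run length -> number of runs.
    """
    with_snp, without_snp = Counter(), Counter()
    # groupby yields one group per maximal run of identical flags.
    for flag, group in groupby(has_snp):
        length = sum(1 for _ in group)
        if flag:
            with_snp[length] += 1
        else:
            without_snp[length] += 1
    return with_snp, without_snp


# Toy data only, not taken from the canola results.
genes = [True, True, False, True, False, False, False, True, True]
with_snp, without_snp = run_length_frequencies(genes)
```

In practice the gene order would presumably be taken per chromosome, with the function applied to each chromosome separately so that runs do not span chromosome boundaries.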