Kyle M Gardner, Patrick Brown, Thomas F Cooke, Scott Cann, Fabrizio Costa, Carlos Bustamante, Riccardo Velasco, Michela Troggio, Sean Myles.
Abstract
Next-generation DNA sequencing (NGS) produces vast amounts of DNA sequence data, but it is not specifically designed to generate data suitable for genetic mapping. Recently developed DNA library preparation methods for NGS have helped solve this problem, however, by combining the use of reduced representation libraries with DNA sample barcoding to generate genome-wide genotype data from a common set of genetic markers across a large number of samples. Here we use such a method, called genotyping-by-sequencing (GBS), to produce a data set for genetic mapping in an F1 population of apples (Malus × domestica) segregating for skin color. We show that GBS produces a relatively large, but extremely sparse, genotype matrix: over 270,000 SNPs were discovered but most SNPs have too much missing data across samples to be useful for genetic mapping. After filtering for genotype quality and missing data, only 6% of the 85 million DNA sequence reads contributed to useful genotype calls. Despite this limitation, using existing software and a set of simple heuristics, we generated a final genotype matrix containing 3967 SNPs from 89 DNA samples from a single lane of Illumina HiSeq and used it to create a saturated genetic linkage map and to identify a known QTL underlying apple skin color. We therefore demonstrate that GBS is a cost-effective method for generating genome-wide SNP data suitable for genetic mapping in a highly diverse and heterozygous agricultural species. We anticipate future improvements to the GBS analysis pipeline presented here that will enhance the utility of next-generation DNA sequence data for the purposes of genetic mapping across diverse species.
Keywords: Malus; QTL; SNP; apple; genotyping-by-sequencing; next-generation DNA sequencing
Year: 2014 PMID: 25031181 PMCID: PMC4169160 DOI: 10.1534/g3.114.011023
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Results of alignment of GBS reads to the apple reference genome. For each sample, the number of reads mapped and number of reads unmapped to the reference genome are shown. The read counts for the parents of the F1 mapping population, Golden Delicious and Scarlet Spur, are indicated.
Figure 2. SNP and genotype counts from GBS data. (A) Cumulative count of SNPs identified across varying missing data thresholds. More than 200,000 SNPs are called with a very liberal missing data threshold of 90%, but only 30,393 SNPs remain if only SNPs with <20% missing data are retained. (B) The number of genotypes called at increasing levels of sequencing depth, after retaining only SNPs with <20% missing data. (C) The number of SNPs retained at increasing minimum thresholds of sequence depth while retaining only SNPs with <20% missing data. Here, we chose a minimum depth of coverage of six reads. Thus, only SNPs with at least six supporting reads and <20% missing genotypes were retained, resulting in a set of 3967 SNPs.
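The two filters described in the Figure 2 caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function name `filter_snps`, the input layout (one row of per-sample read depths per SNP, with `None` for an absent genotype), and the interpretation that a genotype supported by fewer than six reads counts as missing are all assumptions.

```python
from typing import List, Optional, Sequence


def filter_snps(
    depths: Sequence[Sequence[Optional[int]]],
    min_depth: int = 6,          # minimum supporting reads per genotype (from the caption)
    max_missing: float = 0.20,   # SNPs must have < 20% missing genotypes (from the caption)
) -> List[int]:
    """Return the row indices of SNPs that pass both filters.

    Assumption: a genotype with no reads (None) or with fewer than
    `min_depth` reads is treated as missing for that sample.
    """
    kept = []
    for i, row in enumerate(depths):
        if not row:
            continue  # skip SNPs with no samples at all
        n_missing = sum(1 for d in row if d is None or d < min_depth)
        if n_missing / len(row) < max_missing:
            kept.append(i)
    return kept


# Hypothetical toy matrix: 3 SNPs x 5 samples of read depths.
toy_depths = [
    [10, 8, 7, 6, 9],     # no missing genotypes
    [10, 2, None, 6, 9],  # 2 of 5 genotypes missing (40%)
    [6, 6, 6, 6, 5],      # 1 of 5 genotypes missing (exactly 20%, not < 20%)
]
kept_snps = filter_snps(toy_depths)
```

On the toy matrix only the first SNP is retained, because the threshold is a strict "less than 20%" comparison.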
Figure 3. Manhattan plot of a single marker association analysis for apple skin color. Each of the 3967 SNPs is plotted according to its physical position from the "Golden Delicious" reference genome and the −log10 P value of the single marker association test. The horizontal dotted line represents the Bonferroni-corrected P value significance threshold. The vertical dotted line represents the location of the MYB transcription factor gene known to be responsible for skin color variation.
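As a worked illustration of the Bonferroni-corrected threshold drawn in Figure 3, the short Python sketch below divides a family-wise significance level by the number of tests and converts the result to the −log10 scale used on a Manhattan plot. The significance level of 0.05 is an assumption (the caption does not state the value used), and the function name is hypothetical.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


N_SNPS = 3967   # number of SNPs tested (from the caption)
ALPHA = 0.05    # assumed family-wise significance level

p_threshold = bonferroni_threshold(ALPHA, N_SNPS)
neg_log10_threshold = -math.log10(p_threshold)  # height of the line on the plot

print(f"P threshold: {p_threshold:.3g}")
print(f"-log10(P) threshold: {neg_log10_threshold:.2f}")
```

Under this assumed significance level, a SNP would need a −log10 P value of roughly 4.9 to exceed the line.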
Figure 4. Result of QTL analysis across the linkage group corresponding to chromosome 9 of the apple genome. The left panel indicates the genetic map positions in cM of each of the SNPs or groups of SNPs. Each SNP’s ID indicates its physical position according to the reference genome, i.e., the physical coordinate it was assigned through alignment and SNP calling (e.g., SNP 9_449878 mapped to position 449878 on chromosome 9 of the "Golden Delicious" v1.0 reference genome). Note that many SNPs genetically mapping to the linkage group corresponding to chromosome 9 are assigned to other chromosomes according to the reference genome (e.g., SNP 13_30028140). The right panel displays the LOD scores from a QTL analysis for skin color for markers that segregate in the Scarlet Spur genetic background. The horizontal dashed line represents the significance threshold determined by permutation. LOD scores across all linkage groups are shown in Figure S3.