M Longeri, A Chiodi, M Brilli, A Piazza, L A Lyons, G Sofronidis, M C Cozzi, C Bazzocchi.
Abstract
Targeted genotyping-by-sequencing (GBS) is a recent approach for effectively characterizing hundreds to thousands of markers. Moreover, the high throughput of next-generation sequencing technologies allows sample multiplexing. The aims of this study were to (i) define a panel of single nucleotide polymorphisms (SNPs) in the cat, (ii) use GBS for profiling 16 cats, and (iii) evaluate the performance of genotype inference using standard approaches at different coverage thresholds, thereby providing useful information for designing similar experiments. Probes for sequencing 230 variants were designed based on the Felis_catus_8.0 genome. The regions comprised anonymous and non-anonymous SNPs. Sixteen cat samples were analysed, some of which had already been genotyped at a large group of loci and one of which had been whole-genome sequenced in the 99_Lives Cat Genome Sequencing Project. The accuracy of the method was assessed by comparing the GBS results with the genotypes already available. Overall, GBS achieved good performance, with 92-96% correct assignments, depending on the coverage threshold used to define the set of trustable genotypes. Analyses confirmed that (i) the reliability of the inference of each genotype depends on the coverage at that locus and (ii) the fraction of target loci whose genotype can be inferred correctly is a function of the total coverage. GBS proves to be a valid alternative to other methods. Data suggested a depth of less than 11× is required for greater than 95% accuracy. However, sequencing depth must be adapted to the total size of the targets to ensure proper genotype inference.
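The accuracy assessment described in the abstract, comparing GBS genotypes with previously available genotypes at different coverage thresholds, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Call` record, the `concordance_by_threshold` function, the thresholds and the toy genotypes are all hypothetical and serve only to show how the fraction of correct assignments can change as low-coverage loci are filtered out.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One locus genotyped by both methods (hypothetical record layout)."""
    locus: str
    coverage: int             # read depth at the locus in the GBS data
    gbs_genotype: str         # e.g. "AG"
    reference_genotype: str   # genotype from the comparison method


def normalize(genotype: str) -> str:
    """Treat genotypes as unordered allele pairs, so "AG" equals "GA"."""
    return "".join(sorted(genotype.upper()))


def concordance_by_threshold(calls, thresholds):
    """For each coverage threshold, return (loci kept, fraction concordant).

    The fraction is None when no locus reaches the threshold.
    """
    results = {}
    for threshold in thresholds:
        kept = [c for c in calls if c.coverage >= threshold]
        if not kept:
            results[threshold] = (0, None)
            continue
        correct = sum(
            normalize(c.gbs_genotype) == normalize(c.reference_genotype)
            for c in kept
        )
        results[threshold] = (len(kept), correct / len(kept))
    return results


# Made-up toy data, for illustration only (not taken from the study).
toy_calls = [
    Call("SNP01", 5, "AG", "GA"),
    Call("SNP02", 8, "AA", "AG"),
    Call("SNP03", 12, "CC", "CC"),
    Call("SNP04", 30, "CT", "TC"),
]

summary = concordance_by_threshold(toy_calls, thresholds=[1, 10, 50])
for threshold, (n_loci, fraction) in summary.items():
    print(threshold, n_loci, fraction)
```

In this toy example, raising the threshold discards the one discordant low-coverage call, which mirrors the trade-off noted in the abstract: stricter coverage filters give more reliable genotypes but fewer loci with an inferred genotype.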
Keywords: Felis catus; DNA profiling; genotyping-by-sequencing; single nucleotide polymorphisms
Year: 2019 PMID: 31512748 PMCID: PMC6899796 DOI: 10.1111/age.12838
Source DB: PubMed Journal: Anim Genet ISSN: 0268-9146 Impact factor: 3.169
Figure 1. Reads processed by FLASH. For the loci, number of sequenced (fragmentSeq), combined (fragmentExtended), not combined (fragmentNotCombined) and mapped (fragmentMapped) fragments for each sample under analysis. Combined fragments are those merged by the overlapping region; not combined fragments are those without an overlapping region.
Figure 2. Plot of each genomic position by coverage (x‐axis) and −log10 of the likelihood of the second most likely genotype. The line represents the trend of the distribution. The grey area represents the 95% confidence interval for the trend, calculated under the generalized additive model.
Figure 3. Histogram: number of SNPs with the likelihood of the second most probable genotype higher than 15 (the chosen likelihood threshold) and lying in a coverage interval. Scatter plot: proportion of SNPs over the total within each coverage interval that passed the likelihood threshold.
Figure 4. Mean and standard deviation (y‐axis) of target loci with a coverage exceeding the threshold (x‐axis) across the 16 samples.
Heatmap of the performance of genotyping by sequencing compared with the genotyping results of the Reference for anonymous SNPs from SNP01 to SNP120 at different coverage levels in the samples genotyped with both methods.
Figure 5. Scatter plot of the performances at different coverage levels. GBS: performance of the 13 samples genotyped with both GBS and Mass Array (data in Table 1A); WGS: performance of Cat_16 genotyped with both GBS and WGS (data on the left in the figure).