Haigang Qi1,2,3, Kai Song1,3,4, Chunyan Li1,2,3, Wei Wang1,2,3, Busu Li1,2,3, Li Li1,3,4, Guofan Zhang1,2,3.
Abstract
Single nucleotide polymorphisms (SNPs) are widely used in genetics and genomics research. The Pacific oyster (Crassostrea gigas) is an economically and ecologically important marine bivalve, and it possesses one of the highest levels of genomic DNA variation among animal species. Pacific oyster SNPs have been extensively investigated; however, how these SNPs can be used in a high-throughput, transferable, and economical manner remains to be elucidated. Here, we constructed an oyster 190K SNP array using Affymetrix Axiom genotyping technology. We designed 190,420 SNPs on the chip; these SNPs were selected from 54 million SNPs identified through re-sequencing of 472 Pacific oysters collected in China, Japan, Korea, and Canada. Our genotyping results indicated that 133,984 (70.4%) SNPs were polymorphic and successfully converted on the chip. The SNPs were distributed evenly throughout the oyster genome, located in 3,595 scaffolds with a total length of ~509.4 million bp; the average interval spacing was 4,210 bp. In addition, 111,158 SNPs were distributed in 21,050 coding genes, with an average of 5.3 SNPs per gene. In comparison with genotypes obtained through re-sequencing, ~69% of the converted SNPs had a concordance rate of >0.971; the mean concordance rate was 0.966. Evaluation based on genotypes of full-sib family individuals revealed that the average genotyping accuracy rate was 0.975. Carrying 133 K polymorphic SNPs, our oyster 190K SNP array is the first commercially available high-density SNP chip for mollusks, with the highest throughput. It represents a valuable tool for oyster genome-wide association studies, fine linkage mapping, and population genetics.
Year: 2017 PMID: 28328985 PMCID: PMC5362100 DOI: 10.1371/journal.pone.0174007
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Flow diagram for the SNP selection steps with major criteria.
Fig 2. Distribution of the p-convert values for candidate and on-chip probes.
Counts of probes and SNPs by clustering category.
| SNP Category | Probe No. | Percent of probes | SNP No. | Percent of SNPs |
|---|---|---|---|---|
| PolyHighResolution | 101,722 | 52.8 | 101,193 | 53.1 |
| NoMinorHom | 33,033 | 17.1 | 32,791 | 17.2 |
| MonoHighResolution | 4,177 | 2.2 | 4,145 | 2.2 |
| OTV | 3,472 | 1.8 | 3,409 | 1.8 |
| CallRateBelowThreshold | 17,336 | 9.0 | 17,015 | 8.9 |
| Other | 33,049 | 17.1 | 31,867 | 16.7 |
| Total | 192,789 | 100.0 | 190,420 | 100.0 |
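The SNP counts in this table can be cross-checked against the abstract. The sketch below sums the two polymorphic categories and compares the result with the reported 133,984 converted SNPs. It assumes that "polymorphic and successfully converted" corresponds to exactly PolyHighResolution plus NoMinorHom, which the matching totals suggest but this record does not state outright; the variable names are illustrative.

```python
# SNP counts per clustering category, copied from the table above.
snp_counts = {
    "PolyHighResolution": 101_193,
    "NoMinorHom": 32_791,
    "MonoHighResolution": 4_145,
    "OTV": 3_409,
    "CallRateBelowThreshold": 17_015,
    "Other": 31_867,
}

total_snps = sum(snp_counts.values())

# Assumption: "converted" means the two polymorphic categories only.
converted = snp_counts["PolyHighResolution"] + snp_counts["NoMinorHom"]
conversion_pct = round(100 * converted / total_snps, 1)

print(total_snps, converted, conversion_pct)
```

Under that assumption the sum reproduces both the 190,420 on-chip total and the 70.4% conversion figure.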
Counts of on-chip and converted SNPs by type.
| SNP type | On-chip | Percent of on-chip | Converted | Percent of converted | Conversion Rate |
|---|---|---|---|---|---|
| A/G | 68,655 | 36.1 | 49,108 | 36.7 | 0.72 |
| C/T | 67,800 | 35.6 | 48,442 | 36.2 | 0.71 |
| G/T | 22,345 | 11.7 | 15,047 | 11.2 | 0.67 |
| A/C | 22,241 | 11.7 | 14,863 | 11.1 | 0.67 |
| A/T | 6,753 | 3.5 | 4,437 | 3.3 | 0.66 |
| C/G | 2,626 | 1.4 | 2,087 | 1.6 | 0.79 |
| Total | 190,420 | 100 | 133,984 | 100 | 0.70 |
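The conversion-rate column is simply converted divided by on-chip for each SNP type. A minimal sketch that recomputes it from the counts in the table, and also derives the share of transitions (A/G and C/T) among on-chip SNPs; the function and variable names are illustrative and not taken from the paper.

```python
# (on-chip, converted) counts per SNP type, copied from the table above.
snp_types = {
    "A/G": (68_655, 49_108),
    "C/T": (67_800, 48_442),
    "G/T": (22_345, 15_047),
    "A/C": (22_241, 14_863),
    "A/T": (6_753, 4_437),
    "C/G": (2_626, 2_087),
}


def conversion_rate(on_chip: int, converted: int) -> float:
    """Fraction of on-chip SNPs that converted."""
    return converted / on_chip


rates = {t: conversion_rate(*counts) for t, counts in snp_types.items()}

# Transitions are purine<->purine or pyrimidine<->pyrimidine substitutions.
transitions = {"A/G", "C/T"}
ts_on_chip = sum(snp_types[t][0] for t in transitions)
ts_share = ts_on_chip / sum(counts[0] for counts in snp_types.values())

for snp_type, rate in rates.items():
    print(f"{snp_type}: {rate:.2f}")
print(f"transition share on chip: {ts_share:.3f}")
```

Rounded to two decimals, the recomputed rates agree with the table.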
Summary of the SNPs according to their positions or functions.
| SNP region/function | On-chip | Converted |
|---|---|---|
| Coding region | ||
| | 46,285 | 39,579 |
| | 24,215 | 19,604 |
| | 245 | 186 |
| | 57 | 41 |
| | 45 | 30 |
| | 45 | 40 |
| | 6 | 5 |
| Intron region | ||
| | 43,304 | 27,790 |
| | 83 | 61 |
| | 63 | 45 |
| | 2,745 | 2,259 |
| 2Kbp-up/down-stream of a gene | ||
| | 17,630 | 11,880 |
| | 15,133 | 9,638 |
| Intergenic region | 40,564 | 22,826 |
Fig 3. Distribution of the interval spacing of the SNPs on the array.
Fig 4. Distribution of the concordance rate of the SNPs on the array.
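For readers unfamiliar with the metric, a per-SNP concordance rate can be sketched as the share of samples whose array genotype matches the re-sequencing genotype. This record does not say how missing calls were handled, so skipping them is an assumption here, and the function name and sample data are hypothetical.

```python
from typing import Optional, Sequence


def concordance_rate(
    array_calls: Sequence[Optional[str]],
    reseq_calls: Sequence[Optional[str]],
) -> Optional[float]:
    """Share of samples whose array genotype matches the re-sequencing genotype.

    Samples with a missing call (None) on either platform are skipped
    (an assumption); returns None if no sample is comparable.
    """
    if len(array_calls) != len(reseq_calls):
        raise ValueError("both call lists must cover the same samples")
    pairs = [
        (a, r)
        for a, r in zip(array_calls, reseq_calls)
        if a is not None and r is not None
    ]
    if not pairs:
        return None
    matches = sum(a == r for a, r in pairs)
    return matches / len(pairs)


# Hypothetical calls for one SNP across five samples.
array_calls = ["AA", "AB", "BB", None, "AB"]
reseq_calls = ["AA", "AB", "AB", "AA", "AB"]
print(concordance_rate(array_calls, reseq_calls))  # 3 of 4 comparable samples match
```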
Estimation of genotyping accuracy by family data.
| Parental genotypes | All no. | Error no. | Error no. (%) | Call rate | Concordance rate | Error rate |
|---|---|---|---|---|---|---|
| AA × AA | 73,246 | 2,040 | 2.8 | 0.996 | 0.971 | 0.002 |
| AA × AB | 20,716 | 2,199 | 10.6 | 0.995 | 0.964 | 0.025 |
| AA × BB | 8,842 | 3,342 | 37.8 | 0.994 | 0.943 | 0.235 |
| AB × AA | 20,814 | 2,276 | 10.9 | 0.995 | 0.962 | 0.026 |
| AB × AB | 10,271 | 0 | 0.0 | 0.995 | 0.962 | 0.000 |
| Total | 133,889 | 9,857 | 7.4 | 0.995 | 0.966 | 0.025 |
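The idea behind a family-based accuracy check is that each offspring genotype must be obtainable from one allele of each parent. The sketch below illustrates that Mendelian-consistency test; it is not the authors' pipeline, whose exact procedure is not given in this record, and the function names and sample calls are hypothetical. It also shows why the AB × AB row reports zero errors: every genotype is possible from two heterozygous parents, so no inconsistency can be detected.

```python
from itertools import product
from typing import Iterable, Optional, Set


def possible_offspring(parent1: str, parent2: str) -> Set[str]:
    """Genotypes a child can inherit, taking one allele from each parent."""
    return {"".join(sorted(a + b)) for a, b in product(parent1, parent2)}


def mendelian_errors(
    parent1: str, parent2: str, offspring: Iterable[Optional[str]]
) -> int:
    """Count offspring calls that are impossible given the parental genotypes.

    Missing calls (None) are ignored; "BA" is treated the same as "AB".
    """
    allowed = possible_offspring(parent1, parent2)
    return sum(
        1
        for g in offspring
        if g is not None and "".join(sorted(g)) not in allowed
    )


# Hypothetical offspring calls at a single SNP.
calls = ["AB", "AB", "AA", None, "BB"]
print(possible_offspring("AA", "BB"))           # only the heterozygote is possible
print(mendelian_errors("AA", "BB", calls))      # "AA" and "BB" are inconsistent
print(mendelian_errors("AB", "AB", calls))      # nothing is detectable here
```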
Fig 5. Principal component analysis of all samples.
The first principal component (PC1) is plotted on the x-axis and the second principal component (PC2) on the y-axis. “China-North”, “China-South”, “Japan”, “Korea”, and “Canada” denote Pacific oysters collected in northern China, southern China, Japan, Korea, and Canada, respectively. “Family” denotes the parents and 24 offspring of a full-sib family; the parents were also collected in northern China.