Antonio M Ramos, Richard P M A Crooijmans, Nabeel A Affara, Andreia J Amaral, Alan L Archibald, Jonathan E Beever, Christian Bendixen, Carol Churcher, Richard Clark, Patrick Dehais, Mark S Hansen, Jakob Hedegaard, Zhi-Liang Hu, Hindrik H Kerstens, Andy S Law, Hendrik-Jan Megens, Denis Milan, Danny J Nonneman, Gary A Rohrer, Max F Rothschild, Tim P L Smith, Robert D Schnabel, Curt P Van Tassell, Jeremy F Taylor, Ralph T Wiedmann, Lawrence B Schook, Martien A M Groenen.
Abstract
BACKGROUND: The dissection of complex traits of economic importance to the pig industry requires the availability of a significant number of genetic markers, such as single nucleotide polymorphisms (SNPs). This study was conducted to discover several hundred thousand porcine SNPs using next-generation sequencing technologies and to use these SNPs, together with others from different public sources, to design a high-density SNP genotyping assay.
Year: 2009 PMID: 19654876 PMCID: PMC2716536 DOI: 10.1371/journal.pone.0006524
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Number of Illumina Genome Analyzer reads generated, filters applied to the dataset and final number of reads used for SNP discovery from the four RRLs.
| | [RRL name missing] | [RRL name missing] | [RRL name missing] | [RRL name missing] | Total |
| Reads generated | 87,962,916 | 145,926,417 | 67,057,081 | 69,507,210 | 370,453,624 |
| Reads removed, filter [name missing] | 3,276,584 | 48,401,718 | 6,854,265 | 16,555,378 | 75,087,945 |
| Reads removed, filter [name missing] | 260,648 | 656,411 | 278,435 | 202,585 | 1,398,079 |
| Reads removed, filter [name missing] | 1,004,585 | 1,085,621 | 2,138,675 | 402,036 | 4,630,917 |
| Reads removed, filter [name missing] | 10,167,678 | 14,193,925 | 9,559,415 | 7,622,490 | 41,543,508 |
| Reads used for SNP discovery | 73,253,421 | 81,588,742 | 48,226,291 | 44,724,721 | 247,793,175 |
| Reads used (%) | 83.3 | 55.9 | 71.9 | 64.4 | 66.9 |
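The rows of the read-count table above are arithmetically consistent: subtracting the four filter counts from the reads generated gives the number of reads used, and dividing that by the reads generated gives the percentage in the last row. A minimal Python check, using the first RRL column and the Total column as printed (the function and variable names are illustrative and do not come from the paper):

```python
# Values copied from the first data column and the Total column of the table above.
columns = {
    "first RRL": {
        "generated": 87_962_916,
        "filtered": [3_276_584, 260_648, 1_004_585, 10_167_678],
    },
    "total": {
        "generated": 370_453_624,
        "filtered": [75_087_945, 1_398_079, 4_630_917, 41_543_508],
    },
}


def reads_used(generated, filtered):
    """Reads left after subtracting every filter's discard count."""
    return generated - sum(filtered)


def percent_used(generated, filtered):
    """Share of generated reads that survive all filters, to one decimal."""
    return round(100 * reads_used(generated, filtered) / generated, 1)


for name, col in columns.items():
    print(name, reads_used(**col), percent_used(**col))
```

Both columns reproduce the tabulated values for reads used and percentage used, which suggests that the four middle rows are per-filter discard counts.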
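Summary of the SNPs discovered from the four analyzed RRLs.
| | [RRL name missing] | [RRL name missing] | [RRL name missing] | [RRL name missing] | Total |
| [row label missing] | 2,625,323 | 2,854,329 | 2,377,571 | 1,180,640 | 9,037,863 |
| [row label missing] | 106,456 | 124,578 | 56,817 | 27,279 | 315,130 |
| [row label missing] | 11,149 | 39,096 | 5,620 | 1,891 | 57,756 |
| Sum of the two rows above | 117,605 | 163,674 | 62,437 | 29,170 | 372,886 |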
SNPs detected with only two reads carrying the minor allele among the sequence reads.
Figure 1. SNP distribution on each of the GA read positions.
The distribution represents all de novo identified SNPs, from the RRLs generated. The number of transitions and transversions identified is also illustrated.
Figure 2. Distribution of the low minor allele count SNPs on each of the GA read positions.
For these SNPs two reads were identified for the minor allele. The number of transitions and transversions identified is also illustrated.
Summary of SNPs included on the Beadchip, assay conversion, Infinium SNP type, wave number and estimated distances between SNPs using builds 7 and 8 of the pig genome (total numbers and breakdown by chromosome).
| SSC | SNPs on Beadchip | Assay: working | Assay: not working | Infinium type I | Infinium type II | Wave 1 | Wave 2 | Wave 3 | Wave ≥4 | Build 7: average distance (kb) | Build 7: largest distance (kb) | Build 7: gaps ≥250 kb | Build 8: average distance (kb) | Build 8: largest distance (kb) | Build 8: gaps ≥250 kb |
| 1 | 6,584 | 6,477 | 107 | 138 | 6,446 | 88 | 4,705 | 512 | 1,279 | 37.7 | 461.7 | 14 | 39.0 | 545.3 | 19 |
| 2 | 2,441 | 2,381 | 60 | 33 | 2,408 | 69 | 2,038 | 150 | 184 | 32.4 | 222.3 | 0 | 36.9 | 409.3 | 13 |
| 3 | 1,907 | 1,856 | 51 | 15 | 1,892 | 54 | 1,577 | 128 | 148 | 33.7 | 224.1 | 0 | 38.4 | 445.8 | 11 |
| 4 | 3,690 | 3,631 | 59 | 60 | 3,630 | 107 | 2,798 | 259 | 526 | 34.7 | 355.4 | 6 | 34.8 | 355.3 | 11 |
| 5 | 2,186 | 2,137 | 49 | 16 | 2,170 | 49 | 1,705 | 156 | 276 | 34.6 | 272.8 | 2 | 37.1 | 367.5 | 10 |
| 6 | 1,411 | 1,375 | 36 | 5 | 1,406 | 32 | 1,214 | 67 | 98 | 32.2 | 162.1 | 0 | 37.1 | 375.0 | 6 |
| 7 | 3,669 | 3,567 | 102 | 70 | 3,599 | 70 | 2,764 | 235 | 600 | 36.3 | 438.4 | 12 | 36.4 | 598.2 | 14 |
| 8 | 1,924 | 1,888 | 36 | 13 | 1,911 | 32 | 1,597 | 149 | 146 | 34.8 | 331.3 | 3 | 40.1 | 618.8 | 16 |
| 9 | 2,390 | 2,348 | 42 | 26 | 2,364 | 42 | 2,042 | 159 | 147 | 33.4 | 281.4 | 2 | 37.7 | 519.9 | 13 |
| 10 | 1,270 | 1,221 | 49 | 6 | 1,264 | 18 | 1,167 | 39 | 46 | 32.7 | 169.9 | 0 | 35.2 | 377.9 | 3 |
| 11 | 1,827 | 1,786 | 41 | 10 | 1,817 | 16 | 1,533 | 122 | 156 | 34.9 | 281.6 | 1 | 36.1 | 360.7 | 4 |
| 12 | 932 | 913 | 19 | 3 | 929 | 34 | 834 | 26 | 38 | 31.3 | 169.0 | 0 | 33.7 | 294.6 | 2 |
| 13 | 3,297 | 3,229 | 68 | 24 | 3,273 | 69 | 2,649 | 255 | 324 | 36.2 | 449.6 | 3 | 38.1 | 449.5 | 5 |
| 14 | 4,153 | 4,062 | 91 | 73 | 4,080 | 95 | 3,185 | 238 | 635 | 35.8 | 371.8 | 36 | 35.8 | 371.8 | 0 |
| 15 | 2,426 | 2,370 | 56 | 53 | 2,373 | 34 | 1,853 | 184 | 355 | 39.6 | 449.6 | 8 | 42.8 | 579.6 | 20 |
| 16 | 1,458 | 1,429 | 29 | 14 | 1,444 | 16 | 1,290 | 71 | 81 | 35.0 | 198.2 | 0 | 37.2 | 327.0 | 1 |
| 17 | 1,659 | 1,622 | 37 | 11 | 1,648 | 26 | 1,370 | 104 | 159 | 33.6 | 259.0 | 1 | 34.2 | 295.3 | 2 |
| 18 | 1,037 | 1,020 | 17 | 5 | 1,032 | 37 | 802 | 98 | 100 | 34.1 | 161.2 | 0 | 37.6 | 306.6 | 3 |
| X | 1,228 | 1,168 | 60 | 65 | 1,163 | 9 | 740 | 115 | 364 | 59.2 | 447.8 | 27 | 67.5 | 955.5 | 54 |
| Y | 21 | 19 | 2 | 0 | 21 | 0 | 0 | 0 | 21 | - | - | - | - | - | - |
| Unmapped | 14,529 | 14,050 | 479 | 118 | 14,411 | 132 | 14,100 | 135 | 162 | - | - | - | - | - | - |
| Unmapped (predicted position) | 4,193 | 4,072 | 121 | 74 | 4,119 | 259 | 3,045 | 448 | 441 | - | - | - | - | - | - |
| Total | 64,232 | 62,621 | 1,611 | 832 | 63,400 | 1,288 | 53,008 | 3,650 | 6,286 | - | - | 115 | - | - | 207 |
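The three spacing statistics reported per chromosome (average distance, largest distance, and number of gaps of at least 250 kb) can be derived from a sorted list of SNP positions. The sketch below is one plausible way to compute them, not the authors' procedure; the function name, the input format (base-pair coordinates on a single chromosome) and the example positions are assumptions made for illustration.

```python
def spacing_summary(positions_bp, gap_threshold_kb=250.0):
    """Summarise distances between adjacent SNPs on one chromosome.

    positions_bp: SNP coordinates in base pairs (any order).
    Returns the average and largest inter-SNP distance in kb and the
    number of intervals at or above gap_threshold_kb.
    """
    ordered = sorted(positions_bp)
    distances_kb = [(b - a) / 1000 for a, b in zip(ordered, ordered[1:])]
    if not distances_kb:
        raise ValueError("need at least two SNP positions")
    return {
        "average_kb": sum(distances_kb) / len(distances_kb),
        "largest_kb": max(distances_kb),
        "gaps": sum(d >= gap_threshold_kb for d in distances_kb),
    }


# Hypothetical positions (bp), for illustration only; not data from the study.
example = [10_000, 45_000, 80_000, 400_000, 430_000]
summary = spacing_summary(example)
print(summary)
```

Applying the same function to the positions from two assembly builds would give the two sets of columns shown in the table.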
Figure 3. Distances between the SNPs included on the 60K+ porcine Beadchip.
The distances (x axis) were calculated using builds 7 (blue) and 8 (red) of the pig genome sequence assembly.
Description, by SNP source, of the number of working SNPs, SNPs by minor allele frequency and monomorphic SNPs.
| SNP source | SNPs | Working SNPs | Non working SNPs | MAF>0.05 | MAF>0.01 | MAF = 0 |
| ALGA | 20,144 | 19,593 | 551 | 18,236 | 18,436 | 1,133 |
| ASGA | 15,310 | 14,994 | 316 | 14,530 | 14,687 | 299 |
| BGIS | 117 | 115 | 2 | 34 | 43 | 71 |
| CADI | 21 | 20 | 1 | 12 | 13 | 7 |
| CAHM | 32 | 26 | 6 | 6 | 7 | 3 |
| CAIL | 17 | 15 | 2 | 6 | 7 | 8 |
| CAMB | 9 | 8 | 1 | 0 | 0 | 7 |
| CAPE | 6 | 5 | 1 | 1 | 1 | 4 |
| CASI | 550 | 542 | 8 | 444 | 493 | 37 |
| DBKK | 20 | 19 | 1 | 10 | 14 | 5 |
| DBMA | 21 | 21 | 0 | 20 | 20 | 1 |
| DBNP | 45 | 45 | 0 | 24 | 34 | 11 |
| DBUN | 39 | 37 | 2 | 15 | 21 | 13 |
| DBWU | 96 | 94 | 2 | 90 | 93 | 1 |
| DIAS | 1,202 | 1,171 | 31 | 1,144 | 1,155 | 15 |
| DRGA | 3,422 | 3,347 | 75 | 2,912 | 3,151 | 177 |
| H3GA | 6,300 | 6,135 | 165 | 5,740 | 5,809 | 321 |
| INRA | 2,528 | 2,493 | 35 | 1,802 | 2,174 | 244 |
| ISU | 37 | 37 | 0 | 34 | 36 | 1 |
| M1GA | 1,828 | 1,779 | 49 | 1,719 | 1,748 | 29 |
| MARC | 12,121 | 11,760 | 361 | 9,976 | 10,438 | 1,235 |
| SIRI | 324 | 323 | 1 | 313 | 316 | 4 |
| UMB | 35 | 35 | 0 | 35 | 35 | 0 |
| WUR | 8 | 7 | 1 | 6 | 6 | 1 |
| Total | 64,232 | 62,621 | 1,611 | 57,109 | 58,737 | 3,627 |
The SNP source acronyms are described in Table S1.
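The MAF columns in the table above bin each working SNP by its minor allele frequency, with MAF = 0 meaning the SNP was monomorphic in the genotyped animals. A minimal sketch of how such a frequency could be computed from genotype calls follows. It assumes biallelic SNPs with calls written as two-character strings such as "AB" and missing calls given as None; this encoding and the function name are illustrative and are not taken from the paper.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Minor allele frequency for one biallelic SNP.

    genotypes: iterable of two-character calls (e.g. "AA", "AB", "BB");
    None marks a missing call. Returns None when no call is present.
    """
    alleles = [allele for call in genotypes if call for allele in call]
    if not alleles:
        return None
    counts = Counter(alleles)
    if len(counts) == 1:
        return 0.0  # only one allele observed: monomorphic
    return min(counts.values()) / len(alleles)


# Hypothetical calls, for illustration only.
polymorphic = ["AA", "AB", "BB", "AA"]   # 5 A alleles, 3 B alleles
monomorphic = ["AA", "AA", None]
print(minor_allele_frequency(polymorphic), minor_allele_frequency(monomorphic))
```

Counting SNPs whose value exceeds 0.05, exceeds 0.01, or equals 0 would then give the three MAF columns for each source.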
Figure 4. Correlation between sequence-derived and genotype-derived allele frequencies.
The scatter plot was determined using the frequencies of the PorcineSNP60 SNPs derived from the RRLs generated.
Figure 5. Relationship between sequence depth and the correlation between sequence-derived and genotype-derived allele frequencies.
This relationship was determined for the PorcineSNP60 SNPs derived from the RRLs generated.
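The comparison in Figures 4 and 5 can be illustrated with a short sketch. It assumes that the sequence-derived frequency of an allele is its read count divided by the read depth at the SNP, and that the correlation is a Pearson correlation computed over SNPs at or above a minimum depth. Both are assumptions; the record fields, function names and example values below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_at_depth(snps, min_depth):
    """Correlate read-based and genotype-based allele frequencies
    for SNPs whose read depth is at least min_depth."""
    kept = [s for s in snps if s["depth"] >= min_depth]
    seq_freq = [s["allele_reads"] / s["depth"] for s in kept]
    geno_freq = [s["genotype_freq"] for s in kept]
    return pearson(seq_freq, geno_freq)


# Hypothetical SNP records, for illustration only; not data from the study.
snps = [
    {"depth": 10, "allele_reads": 2, "genotype_freq": 0.2},
    {"depth": 20, "allele_reads": 10, "genotype_freq": 0.5},
    {"depth": 4, "allele_reads": 3, "genotype_freq": 0.1},
    {"depth": 30, "allele_reads": 9, "genotype_freq": 0.3},
]
r_all = correlation_at_depth(snps, min_depth=1)
r_deep = correlation_at_depth(snps, min_depth=10)
print(r_all, r_deep)
```

In this toy data the shallow SNP is the one whose read-based estimate is far from the genotype-based frequency, so excluding it raises the correlation.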
Values for the correlation between sequence-derived and genotype-derived allele frequencies for different read depths and SNP sources.
| Read Depth | SNP source | |||||||||
| All SNPs | Duroc | Landrace | Large White | Pietrain | Wild Boar |
| [read depth missing] | 0.53 | 0.61 | 0.49 | 0.49 | 0.52 | 0.49 | 0.59 | 0.48 | 0.59 | 0.61 |
| [read depth missing] | 0.68 | 0.73 | 0.69 | 0.64 | 0.66 | 0.69 | 0.67 | 0.66 | 0.71 | 0.76 |
| [read depth missing] | 0.74 | 0.77 | 0.75 | 0.70 | 0.72 | 0.76 | 0.73 | 0.72 | 0.75 | 0.81 |
| [read depth missing] | 0.79 | 0.80 | 0.82 | 0.74 | 0.77 | 0.78 | 0.79 | 0.76 | 0.79 | 0.86 |
| [read depth missing] | 0.79 | 0.79 | 0.86 | 0.71 | 0.78 | 0.66 | 0.79 | 0.76 | 0.78 | 0.91 |