Gwenola Tosser-Klopp, Philippe Bardou, Olivier Bouchez, Cédric Cabau, Richard Crooijmans, Yang Dong, Cécile Donnadieu-Tonon, André Eggen, Henri C M Heuven, Saadiah Jamli, Abdullah Johari Jiken, Christophe Klopp, Cynthia T Lawley, John McEwan, Patrice Martin, Carole R Moreno, Philippe Mulsant, Ibouniyamine Nabihoudine, Eric Pailhoux, Isabelle Palhière, Rachel Rupp, Julien Sarry, Brian L Sayre, Aurélie Tircazes, Wen Wang, Wenguang Zhang.
Abstract
The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high-throughput SNP genotyping assays in livestock species. The primary goals are QTL detection and genomic selection. The purpose here was the design of a 50,000-60,000 SNP chip for goats. The success of a moderate-density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable for use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high-quality SNP variants in the goat genome, stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed): Alpine, Boer, Creole, Katjang, Saanen and Savanna, comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc) and published in dbSNP along with flanking sequences and map positions on goat assemblies (i.e. scaffolds and pseudo-chromosomes), the sheep genome V2 assembly and the cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content, and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools, proved to be efficient in producing this GoatSNP50 chip. Advances in the use of molecular markers are expected to accelerate goat genomic studies in coming years.
Year: 2014 PMID: 24465974 PMCID: PMC3899236 DOI: 10.1371/journal.pone.0086227
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Goat genome scaffolds assembly.
The goat genome scaffolds were sorted by decreasing size (x-axis) and the cumulative proportion of the assembled genome was plotted on the y-axis for all the scaffolds. The vertical line shows that >10kb scaffolds represent 97.2% of the assembled goat genome.
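The curve described in this caption can be recomputed from a plain list of scaffold lengths. The following is a minimal sketch, not the consortium's pipeline: the function names and the toy lengths are illustrative assumptions.

```python
from itertools import accumulate


def cumulative_proportion(scaffold_lengths):
    """Sort scaffolds by decreasing size and return (sorted lengths,
    cumulative fraction of all assembled bases)."""
    ordered = sorted(scaffold_lengths, reverse=True)
    total = sum(ordered)
    cumulative = [running / total for running in accumulate(ordered)]
    return ordered, cumulative


def fraction_above(scaffold_lengths, min_length=10_000):
    """Fraction of assembled bases carried by scaffolds longer than min_length."""
    total = sum(scaffold_lengths)
    kept = sum(length for length in scaffold_lengths if length > min_length)
    return kept / total


# Toy scaffold lengths in bp (hypothetical, not the goat assembly).
toy_lengths = [500_000, 8_000, 120_000, 2_000, 370_000]
ordered, cumulative = cumulative_proportion(toy_lengths)
share = fraction_above(toy_lengths)
```

With real scaffold lengths, `fraction_above` would give the quantity marked by the vertical line in the figure, and `cumulative` the plotted y-values.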
SNPs identified in the five breeds or breed pool and in ESTs.
| | Alpine | Boer | Creole | Katjang/Savanna | Saanen | ESTs |
| Alpine | | 701 503 | 832 959 | 774 144 | 3 154 048 | 3 133 |
| Boer | | | 184 853 | 1 321 410 | 594 634 | 1 092 |
| Creole | | | | 200 083 | 794 951 | 644 |
| Katjang/Savanna | | | | | 662 060 | 1 163 |
| Saanen | | | | | | 2 714 |
| ESTs | | | | | | |
The number on the diagonal is the number of SNPs found in a breed (Alpine, Boer, Creole, Saanen), breed pool (Katjang/Savanna) or in ESTs. Off-diagonal values are the numbers of SNPs shared between the two respective breeds.
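A sharing table of this kind can be derived from per-group SNP sets by set intersection. The sketch below assumes each SNP is identified by a (scaffold, position) pair; the function name, the group names and the toy data are hypothetical.

```python
from itertools import combinations


def sharing_matrix(snps_by_group):
    """Return {(group_a, group_b): count}. Diagonal entries hold the number
    of SNPs found in a group; off-diagonal entries hold the number shared."""
    counts = {}
    for name, snps in snps_by_group.items():
        counts[(name, name)] = len(snps)
    for a, b in combinations(snps_by_group, 2):
        counts[(a, b)] = len(snps_by_group[a] & snps_by_group[b])
    return counts


# Toy SNP sets keyed by (scaffold, position); hypothetical data.
toy = {
    "BreedA": {("scaf1", 100), ("scaf1", 250), ("scaf2", 40)},
    "BreedB": {("scaf1", 250), ("scaf2", 40), ("scaf3", 7)},
    "ESTs": {("scaf2", 40)},
}
counts = sharing_matrix(toy)
```

Only the upper triangle is stored, which mirrors the layout of the table.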
Figure 2. SNP spacing on the goat scaffolds.
Spacing between the selected SNPs was calculated and the percentage of gaps (total number of gaps is 59,030 on goat scaffolds and 62,693 on UMD3.1 cattle assembly) is shown (y-axis) in each 5 kb class ranging from 5 to 150 kb (x-axis).
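The gap distribution in this figure can be approximated by binning the distances between consecutive SNPs. This is a minimal sketch under two assumptions that the caption does not state: gaps are measured only between neighbouring SNPs on the same scaffold, and each class is labelled by its lower bound. All names and data are illustrative.

```python
from collections import Counter


def gap_class_percentages(positions_by_scaffold, bin_size=5_000):
    """Percentage of inter-SNP gaps falling in each bin_size-wide class,
    keyed by the lower bound of the class (in bp)."""
    classes = Counter()
    for positions in positions_by_scaffold.values():
        ordered = sorted(positions)
        # Consecutive pairs on the same scaffold define one gap each.
        for left, right in zip(ordered, ordered[1:]):
            gap = right - left
            classes[(gap // bin_size) * bin_size] += 1
    total = sum(classes.values())
    if total == 0:
        return {}
    return {start: 100.0 * n / total for start, n in sorted(classes.items())}


# Toy SNP positions in bp (hypothetical).
toy_positions = {
    "scaf1": [1_000, 43_000, 90_000, 131_000],
    "scaf2": [5_000, 52_000],
}
percentages = gap_class_percentages(toy_positions)
```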
Figure 3. SNPs by category in final design.
The number of selected SNPs is indicated for each of the following categories. 1: SNP detected in an EST. 2: two alleles detected in the five considered breeds. 3: two alleles detected in Alpine and Saanen and Creole and (Boer or Savanna). 4: two alleles detected in two of the three milk and mixed breeds (Alpine, Saanen, Creole) and in Boer and Savanna. 5: two alleles detected in Alpine and Saanen and Creole. 6: two alleles detected in three out of the five breeds. 10: two alleles detected in each of the two milk breeds (Saanen and Alpine). 11: two alleles detected in one milk breed (Saanen or Alpine) and one meat breed (Creole or Boer or Katjang/Savanna). 12: two alleles detected in at least two meat breeds (Creole and Boer or Katjang/Savanna). 13: two alleles detected in one milk breed (Saanen or Alpine).
Figure 4. Distribution of estimated MAFs of the selected SNPs.
The MAF for all the 60,000 selected SNPs was estimated based on the read counts for the two alleles.
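The caption does not give the exact estimator. The simplest reading, shown here as an assumption, is the share of reads supporting the less frequent allele; the function name is hypothetical.

```python
def maf_from_read_counts(allele_a_reads, allele_b_reads):
    """Estimate the minor allele frequency of a biallelic SNP as the
    fraction of reads supporting the less frequent allele."""
    total = allele_a_reads + allele_b_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return min(allele_a_reads, allele_b_reads) / total


example_maf = maf_from_read_counts(30, 10)
balanced_maf = maf_from_read_counts(5, 5)
```

An estimate of this kind depends on sequencing depth and on how many animals contributed reads, so it is only a rough proxy for the population frequency.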
Average call rate and >5% MAF SNPs for the cluster file breeds.
| Breed | Samples | SNPs with MAF ≥ 0.05 | Average call rate |
| | 53 | 51339 | 0.9990 |
| | 26 | 47195 | 0.9986 |
| | 30 | 48494 | 0.9989 |
| | 38 | 50216 | 0.9988 |
| | 13 | 45648 | 0.9983 |
| | 13 | 33873 | 0.9987 |
| | 57 | 51689 | 0.9989 |
| | 20 | 46629 | 0.9990 |
| | 27 | 50908 | 0.9987 |
| | 1 | 17335 | 0.9995 |
| | 281 | | 0.9988 |
For each breed used for the chip validation and cluster file definition, the number of samples, the number of >5% MAF SNPs and the average call rate are indicated.
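The two per-breed statistics in this table can be computed from a genotype matrix. The sketch below assumes genotypes are coded as 0, 1 or 2 copies of one allele, with `None` for a failed call; this encoding, the function names and the toy data are assumptions, not the format of the actual cluster file.

```python
def call_rate(genotypes):
    """Fraction of successful calls over all samples and SNPs.
    genotypes: one list per sample, one entry per SNP (0/1/2 or None)."""
    calls = [g for sample in genotypes for g in sample]
    return sum(g is not None for g in calls) / len(calls)


def count_snps_with_maf(genotypes, threshold=0.05):
    """Number of SNPs whose minor allele frequency is >= threshold,
    computed over the samples with a successful call."""
    kept = 0
    for j in range(len(genotypes[0])):
        called = [sample[j] for sample in genotypes if sample[j] is not None]
        if not called:
            continue
        freq = sum(called) / (2 * len(called))  # frequency of the counted allele
        if min(freq, 1 - freq) >= threshold:
            kept += 1
    return kept


# Toy matrix: 4 samples x 4 SNPs (hypothetical).
toy_genotypes = [
    [0, 1, 2, None],
    [0, 1, 2, 0],
    [0, 0, 2, 0],
    [0, 1, 2, 0],
]
rate = call_rate(toy_genotypes)
n_polymorphic = count_snps_with_maf(toy_genotypes)
```

In the toy matrix only the second SNP is polymorphic, which illustrates why a breed represented by a single sample yields far fewer SNPs above the MAF threshold.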