Design of a bovine low-density SNP array optimized for imputation.

Didier Boichard, Hoyoung Chung, Romain Dassonneville, Xavier David, André Eggen, Sébastien Fritz, Kimberly J Gietzen, Ben J Hayes, Cynthia T Lawley, Tad S Sonstegard, Curtis P Van Tassell, Paul M VanRaden, Karine A Viaud-Martinez, George R Wiggans.

Abstract

The Illumina BovineLD BeadChip was designed to support imputation to higher density genotypes in dairy and beef breeds by including single-nucleotide polymorphisms (SNPs) that had a high minor allele frequency as well as uniform spacing across the genome except at the ends of the chromosome where densities were increased. The chip also includes SNPs on the Y chromosome and mitochondrial DNA loci that are useful for determining subspecies classification and certain paternal and maternal breed lineages. The total number of SNPs was 6,909. Accuracy of imputation to Illumina BovineSNP50 genotypes using the BovineLD chip was over 97% for most dairy and beef populations. The BovineLD imputations were about 3 percentage points more accurate than those from the Illumina GoldenGate Bovine3K BeadChip across multiple populations. The improvement was greatest when neither parent was genotyped. The minor allele frequencies were similar across taurine beef and dairy breeds as was the proportion of SNPs that were polymorphic. The new BovineLD chip should facilitate low-cost genomic selection in taurine beef and dairy cattle.

Year:  2012        PMID: 22470530      PMCID: PMC3314603          DOI: 10.1371/journal.pone.0034130

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Genetic improvement of several key agricultural species is accelerating with the adoption of genomic selection [1], [2], [3]. With this method, animals or plants can be selected for breeding on the basis of their genetic merit predicted by markers spanning the entire genome. Particularly in dairy cattle, this method has been shown to be more efficient than conventional progeny testing of bulls (up to double the rate of genetic gain) as well as substantially less expensive [4]. Moreover, genomic selection opens new opportunities for sustainable management of populations by more efficiently selecting for traits that have low heritability, e.g. fitness traits, or traits that are difficult to measure. This method is also useful for managing the accumulation of inbreeding within breeds with a small effective population size. In dairy cattle, genomic selection has been deployed at a rapid pace, and most countries with major dairy breeding programs now rely heavily on this new technology [5]. A major challenge in implementing genomic selection in most species is the cost of genotyping. The expected value of the information gained by genotyping must exceed the cost of obtaining the genotypes. During the early stages of genomic selection in the dairy industry, the cost of high-density genotyping could be justified. The primary application was to evaluate bulls that were potential candidates for production of commercial semen. Using SNP information for those evaluations resulted in more accurate selection of bulls to acquire and extensively market. Once increased accuracies of genome-enhanced breeding values had been demonstrated, breeders and buyers quickly adopted this technology to improve accuracy of selection [6]. This example of a genomic-selection application has extreme value compared with other animal food production paradigms. In contrast, profit from genomic selection is likely to be much lower for beef bulls and dairy females [5], [7]. 
An appealing approach in situations with much lower returns from genotyping is to use a more economical, reduced-density SNP chip with markers optimized for imputation. Imputation is the process of predicting unknown genotypes for animals from observed genotypes and often uses information from a reference population with dense genotypes to predict missing genotypes for animals with lower density genotypes. It is also applied to merge genotypes of similar densities but different SNPs. Most imputation algorithms use information from relatives and population linkage disequilibrium. A number of software programs for imputation have been developed based originally on human genetics [8], [9] and more recently on animal genetics [10], [11], [12], [13]. The limited effective population sizes and population structures in livestock allow the possibility of imputation of high-density genotypes from quite low-density genotypes [11], [14], [15], [16]. In 2010, a low-density bovine SNP chip, the Illumina GoldenGate Bovine3K Genotyping Beadchip (http://www.illumina.com/documents/products/datasheets/datasheet_bovine3K.pdf), was developed and made commercially available. That product offered a significant advance toward low-cost genomic selection in cattle; however, imputation accuracy was highly dependent on the relationship of the individual genotyped with the Bovine3K chip to the reference population genotyped at a higher density [17]. In addition, some samples failed to provide genotypes of adequate quality for use in genomic predictions. The SNP call rate performance of the Bovine3K chip was slightly reduced compared with the BovineSNP50 chip [18] because GoldenGate chemistry relies on two hybridization events for proper SNP detection as opposed to a single event for Infinium chemistry. 
In this study, the Illumina Infinium BovineLD Genotyping Beadchip (http://www.illumina.com/documents/products/datasheets/datasheet_bovineLD.pdf) was developed to provide high imputation accuracy for higher density SNP genotypes in taurine dairy and beef populations. The main objective was to provide a tool that would enable genomic estimated breeding values to be calculated from accurately imputed genotype data from an Infinium-based SNP array with very low rates of failed samples. The main features of the new BovineLD chip are presented along with its imputation performance in a range of breeds and reference populations.

Materials and Methods

SNP selection

To provide highly accurate imputation to BovineSNP50 genotypes in global taurine breeds, SNPs were selected from validated assays from existing higher density chips and similar SNP detection technology, i.e. the Illumina BovineSNP50 and BovineHD (http://www.illumina.com/documents/products/datasheets/datasheet_bovineHD.pdf) SNP arrays, with priority given to BovineSNP50 content. From the known and validated SNPs, selection priority was 1) high minor allele frequencies (MAFs) in targeted breeds, 2) uniform spacing at a minimum of 2 SNPs per Mbp, with increased SNP density within 500 kbp of chromosomal ends, 3) inclusion of SNPs for determination of sex, parentage, Y haplotypes, and subspecies and maternal lineages, 4) SNP quality and fidelity criteria for robust reproducibility (>98% call rate and <0.01% Mendelian inconsistency), and 5) a target overlap of 2,000 SNPs with the Bovine3K chip to ensure backward compatibility. The anticipated SNP spacing (2 SNPs per Mbp) obviated the need to check for highly correlated SNPs. The SNPs were selected to be highly informative with a high MAF over a large range of breeds from around the world (Table 1). The reference MAF estimates were from breeds in 10 countries from North America, Europe, and Oceania. Content selection was optimized using taurine allele frequencies. To achieve regular spacing, the UMD3 bovine genome assembly (http://www.cbcb.umd.edu/research/bos_taurus_assembly.shtml) was used to define 500-kbp segments over the 29 autosomes. A lack of flanking information at the end of each chromosome had resulted in lower imputation efficiency in preliminary tests. To correct that problem, the SNP density was doubled in the first and last segments of each chromosome. Reflecting the diverse membership of the Bovine LD Consortium, initial SNP selection was made by one member and updated by the others. The initial SNP selection was based on two independent criteria. 
First, SNPs with the highest mean MAF in each 500-kbp segment were selected over a broad range of European breeds including European Holstein, Montbéliarde, Normande, Jersey, Brown Swiss, Norwegian Red, Swedish Red and White, Finnish Ayrshire, Charolais, Limousine, Blonde d'Aquitaine, and Maine Anjou, with Holstein receiving double weight; the top two SNPs were selected in the segment at each end of the chromosome. Second, SNPs with the highest mean minimum MAF for six major European dairy breeds (European Holstein, Montbéliarde, Normande, Jersey, Brown Swiss, and Norwegian Red) were selected for each 500-kbp segment, with again 2 SNPs selected at each end of the chromosome. Selecting those SNPs with the highest mean of the two selection criteria within each 500-kbp segment (with doubling at the chromosome ends) resulted in 8,000 SNPs. Those 8,000 SNPs were subjected to a similar selection process using MAFs from North America and Oceania along with the European populations. For Holstein and Jersey breeds, the MAF used was the mean across the 3 populations; for Brown Swiss, only North America and Europe were included. The mean MAF was computed from Holstein, Jersey, Brown Swiss, Angus, and Brahman. The minimum MAF was from Jersey, Brown Swiss, and Angus. Again, the SNPs with the highest mean of the two selection criteria were selected with doubling at the chromosome ends.
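The segment-based selection described above can be sketched in a few lines of Python. This is a simplified illustration, not the consortium's actual procedure: the input layout (`name`, `chrom`, `pos`, `mafs`), the function name `select_snps`, and the scoring rule (the average of the mean MAF and the minimum MAF across the supplied breeds, with no double weight for Holstein) are assumptions made for the example.

```python
from collections import defaultdict

SEGMENT_BP = 500_000  # segment length stated in the text


def select_snps(snps, chrom_lengths, segment_bp=SEGMENT_BP):
    """Keep the best-scoring SNP per segment, two in each end segment.

    snps: iterable of dicts with keys 'name', 'chrom', 'pos', 'mafs',
          where 'mafs' maps breed -> minor allele frequency (hypothetical layout).
    chrom_lengths: dict mapping chromosome -> length in bp.
    """
    by_segment = defaultdict(list)
    for snp in snps:
        mafs = list(snp["mafs"].values())
        mean_maf = sum(mafs) / len(mafs)
        min_maf = min(mafs)
        # Assumed score: mean of the two selection criteria.
        score = (mean_maf + min_maf) / 2
        segment = snp["pos"] // segment_bp
        by_segment[(snp["chrom"], segment)].append((score, snp["name"]))

    selected = []
    for (chrom, segment), candidates in sorted(by_segment.items()):
        last_segment = (chrom_lengths[chrom] - 1) // segment_bp
        # Density is doubled in the first and last segment of each chromosome.
        n_keep = 2 if segment in (0, last_segment) else 1
        candidates.sort(reverse=True)
        selected.extend(name for _, name in candidates[:n_keep])
    return selected
```

Keeping the score computation separate from the per-segment quota makes it easy to swap in a different weighting of breeds without touching the spacing logic.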
Table 1

Number of DNA samples, minor allele frequencies (MAFs), and estimated frequency of loci that were polymorphic by breed and region.

| Breed | Region | DNA samples (n) | MAF, mean | MAF, median | Loci that are polymorphic (%) |
|---|---|---|---|---|---|
| Angus | United States | 6,400 | 0.33 | 0.35 | 98.3 |
| Angus | Australia | 282 | 0.31 | 0.33 | 97.4 |
| Ayrshire | North America | 434 | 0.31 | 0.33 | 96.7 |
| Beefmaster | United States | 23 | 0.32 | 0.35 | 97.9 |
| Blonde d'Aquitaine | Europe | 160 | 0.34 | 0.37 | 98.5 |
| Brahman | Australia | 80 | 0.21 | 0.18 | 89.7 |
| Brown Swiss | North America, Europe | 2,039 | 0.31 | 0.34 | 96.2 |
| Charolais | Europe | 60 | 0.35 | 0.37 | 99.0 |
| Fleckvieh | Europe | 800 | 0.37 | 0.39 | 99.5 |
| Friesian | New Zealand | 17 | 0.35 | 0.38 | 98.8 |
| Gelbvieh | North America | 14 | 0.35 | 0.38 | 98.9 |
| Guernsey | Global | 61 | 0.29 | 0.30 | 93.2 |
| Hereford | United States | 24 | 0.31 | 0.33 | 96.1 |
| Holstein | Australia | 2,257 | 0.36 | 0.38 | 98.7 |
| Holstein | North America | 72,824 | 0.35 | 0.37 | 98.5 |
| Holstein | Europe | 16,000 | 0.36 | 0.38 | 98.9 |
| Jersey | Australia | 545 | 0.30 | 0.32 | 95.6 |
| Jersey | North America | 5,958 | 0.29 | 0.31 | 94.0 |
| Limousin | Europe | 90 | 0.35 | 0.37 | 98.4 |
| Montbeliard | Europe | 1,500 | 0.34 | 0.36 | 98.7 |
| N'Dama | Africa | 23 | 0.30 | 0.28 | 76.3 |
| Normande | Europe | 1,200 | 0.34 | 0.36 | 98.4 |
| Norwegian Red | Norway | 17 | 0.33 | 0.35 | 97.9 |
| Red Angus | Angus | 55 | 0.32 | 0.34 | 98.1 |
| Red Danish | Europe | 30 | 0.35 | 0.38 | 99.0 |
| Santa Gertrudis | United States | 21 | 0.32 | 0.33 | 97.2 |

Next, some of the selected SNPs were replaced by Bovine3K SNPs that were in nearby locations to ensure backward compatibility. In addition, SNPs used for breed determination and parentage testing that had not already been selected were included, and some SNPs were added to fill gaps generated by map inconsistencies. For the X chromosome, Bovine3K SNPs with high MAFs were selected and supplemented with BovineSNP50 SNPs, with consideration given to spacing, MAF, and fidelity. Because large gaps remained after that initial selection, additional X-chromosome SNPs were chosen from the BovineHD assay. For the Y chromosome and mitochondrial DNA (mtDNA), 9 Y-specific and 13 mtDNA SNP markers were identified from the BovineHD chip based on assay fidelity and performance across 27 breeds, MAF across those breeds, and ability of a SNP to discern subspecies and geographic locations of breed origins.

Imputation

Imputation efficiency was assessed in 10 populations (North American, French, and Australian Holsteins; North American and Australian Jerseys; North American Brown Swiss; Australian Angus; French Montbéliarde; French Normande; and French Blonde d'Aquitaine). Beagle software (http://faculty.washington.edu/browning/beagle/beagle.html) [9] was used for the Australian and French populations and findhap.f90 (http://aipl.arsusda.gov/software/findhap/) [13] for the North American populations. These imputation programs have similar performance in large dairy cattle data sets [19]. Using existing genotypes from the BovineSNP50 chip, imputation efficiency was determined by comparing imputed and observed genotypes. Part of the population was retained as a “reference,” while target individuals for imputation had their genotypes reduced in silico to either BovineLD or Bovine3K genotypes. Results were assessed as the proportion of genotypes that were correct in the target population. For example, if the imputed genotype was a heterozygote and the BovineSNP50 genotype was a homozygote, that genotype was counted as incorrectly imputed. The count of correct genotypes included both observed and imputed genotypes to measure the overall success of a lower density genotype in approximating a BovineSNP50 genotype.
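The accuracy measure described above (the proportion of target genotypes that match the observed BovineSNP50 genotypes) can be illustrated with a short Python sketch. The data layout (nested dictionaries with genotypes coded 0, 1, or 2), the function name, and the choice to skip no-calls are assumptions for this example; the optional `panel_snps` argument restricts the count to SNPs that were actually imputed by excluding those present on the low-density panel.

```python
def imputation_accuracy(true_genotypes, imputed_genotypes, panel_snps=None):
    """Percentage of genotypes that agree with the observed genotypes.

    true_genotypes, imputed_genotypes: dict animal -> dict snp -> genotype,
        coded 0, 1, or 2 (None marks a no-call in the observed data).
    panel_snps: optional set of SNP names on the low-density panel; when
        given, those SNPs are excluded so only imputed SNPs are counted.
    """
    correct = 0
    total = 0
    for animal, truth in true_genotypes.items():
        for snp, observed in truth.items():
            if observed is None:
                continue  # assumed: no-calls are not compared
            if panel_snps is not None and snp in panel_snps:
                continue
            total += 1
            if imputed_genotypes[animal][snp] == observed:
                correct += 1
    return 100.0 * correct / total


# Toy example with one animal and four SNPs; s1 and s2 are on the panel.
observed = {"cow1": {"s1": 0, "s2": 1, "s3": 2, "s4": 1}}
imputed = {"cow1": {"s1": 0, "s2": 1, "s3": 1, "s4": 1}}
all_snps = imputation_accuracy(observed, imputed)
imputed_only = imputation_accuracy(observed, imputed, panel_snps={"s1", "s2"})
```

Because panel SNPs are observed rather than imputed, including them raises the reported figure, so the all-SNP accuracy is never lower than the imputed-only accuracy.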

Content validation

The SNP assays for 6,914 loci were validated using data from 290 samples that represented 26 global dairy and beef breeds (Table 2) and included Bovine Hapmap samples [20]. The 290 samples (234 males, 56 females) included 286 unrelated samples, 2 trios, and 2 replicates. All markers were assessed for clustering of the genotypes using Illumina GenomeStudio genotyping software (version 2010.3; http://www.illumina.com/documents/products/datasheets/datasheet_genomestudio_software.pdf). A total of 6,909 clearly identifiable and scorable clusters were retained for robust utility of the panel. The cluster positions were defined with priority given first to data from dairy breeds and second to beef breeds. The purpose of the resulting cluster position file is to apply known robust cluster positions to future genotyping data for high-throughput genotype calling. For phylogenetic analysis based on Y and mtDNA SNPs, individual sequences for each breed were clustered to construct consensus sequences using SNPs from 9 Y-chromosome loci and 13 mtDNA loci with the DNASTAR SeqMan program (version 6.1; http://www.dnastar.com/t-sub-products-lasergene-seqmanpro.aspx). There were 236 X-chromosome SNPs on the final BovineLD chip. Flanking sequences and base calls for the 6,909 SNPs are given in Table S1.
Table 2

Numbers of samples, call rates, and BovineSNP50 concordance for validation of BovineLD single-nucleotide polymorphisms (SNPs) by breed.

| Breed | Call rate: samples (n) | Call rate (%) | Concordance: samples (n) | Concordance (a) with BovineSNP50 SNPs (%) |
|---|---|---|---|---|
| Angus | 10 | 99.98 | 10 | 99.997 |
| Ayrshire | 10 | 99.97 | 0 | NA (b) |
| Beefmaster | 10 | 99.85 | 10 | 99.974 |
| Blonde d'Aquitaine | 10 | 99.97 | 10 | 99.996 |
| Brahman | 10 | 99.5 | 10 | 99.972 |
| Brown Swiss | 10 | 100 | 10 | 99.999 |
| Charolais | 10 | 99.99 | 9 | 99.995 |
| Fleckvieh | 20 | 99.98 | 0 | NA |
| Friesian | 17 | 99.93 | 0 | NA |
| Gelbvieh | 5 | 99.97 | 0 | NA |
| Guernsey | 10 | 99.86 | 10 | 100 |
| Hereford | 10 | 99.86 | 10 | 99.997 |
| Holstein | 18 | 99.96 | 18 | 99.999 |
| Jersey (United States) | 19 | 99.96 | 19 | 100 |
| Jersey (Denmark) | 10 | 99.91 | 0 | NA |
| Limousin | 10 | 99.97 | 10 | 100 |
| Montbeliard | 10 | 100 | 9 | 99.995 |
| N'Dama | 10 | 99.85 | 10 | 100 |
| Normande | 10 | 99.98 | 10 | 99.997 |
| Norwegian Red | 11 | 99.88 | 11 | 100 |
| Red Angus | 10 | 99.99 | 10 | 100 |
| Red Dairy (Angler) | 10 | 99.99 | 0 | NA |
| Red Danish (Denmark) | 10 | 99.92 | 0 | NA |
| Red Danish (Finland) | 10 | 99.93 | 0 | NA |
| Red Danish (Sweden) | 10 | 99.84 | 0 | NA |
| Santa Gertrudis | 10 | 99.83 | 10 | 99.988 |
| All breeds | 290 | 99.93 | 186 | 99.995 |

(a) Concordance was included for animals with BovineSNP50 genotypes; “no calls” (null genotypes) on either BovineSNP50 or BovineLD were excluded from comparison.

(b) NA = not applicable.


Results

SNP call rates and accuracy

The BovineLD chip, consisting of 6,909 final loci, was validated for 290 individuals from 26 major dairy and beef breeds (Table 2). The mean call rate was 99.94% among dairy breeds, 99.90% among beef breeds, and 99.93% among all samples. For taurine breeds, discordant calls compared to BovineSNP50 represented <0.01% of all genotyping calls (Table 2). Mendelian consistency was examined using two Holstein trios, which showed a single error on BTB-01149046 out of 13,797 total possible comparisons. Reproducibility was 100% across two Holstein replicated samples. Based on the nearly perfect concordance between the BovineLD and the BovineSNP50 genotypes reported in Table 2 and the similar concordance between BovineSNP50 and BovineHD genotypes, Mendelian consistency and reproducibility were also examined for the overlapping 6,844 SNPs from BovineHD genotypes. Those data included 8 parent-progeny, 24 parent-parent-progeny, and 10 replicate comparisons that represented 11 taurine, 2 indicine, and 1 hybrid breeds (Table 3). Mendelian consistency was 99.95%, and reproducibility was 99.99%.
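A Mendelian consistency check for parent-parent-progeny trios can be sketched as follows. This is an illustrative implementation rather than the software used in the study; it assumes genotypes are coded as the count of one allele (0, 1, or 2), that `None` marks a no-call, and that no-calls are skipped. The function names are hypothetical.

```python
def transmissible(genotype):
    """Allele counts (0 or 1) a parent with this genotype can pass on."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def trio_consistent(sire, dam, child):
    """True if the child's genotype can arise from one allele per parent."""
    return any(a + b == child
               for a in transmissible(sire)
               for b in transmissible(dam))


def mendelian_consistency(trio_genotypes):
    """Return (comparisons, errors, percent consistent) over per-SNP trios.

    trio_genotypes: iterable of (sire, dam, child) genotype tuples, one per SNP.
    """
    checked = 0
    errors = 0
    for sire, dam, child in trio_genotypes:
        if None in (sire, dam, child):
            continue  # assumed: SNPs with a no-call are not compared
        checked += 1
        if not trio_consistent(sire, dam, child):
            errors += 1
    return checked, errors, 100.0 * (checked - errors) / checked


# Toy data: the third and fifth SNPs violate Mendelian inheritance.
example = [(0, 0, 0), (0, 2, 1), (2, 2, 0), (1, 1, 2), (0, 1, 2)]
result = mendelian_consistency(example)

# The single error in 13,797 comparisons reported above corresponds to:
holstein_trios_pct = round(100.0 * (13_797 - 1) / 13_797, 2)
```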
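Table 3

Mendelian consistency and reproducibility comparisons for a set of 6,844 SNPs in common for the BovineHD and BovineLD BeadChips.

| Statistic | Comparison | Breed | Comparisons (n) | SNPs genotyped (n) | Incorrectly genotyped SNPs (n) | Correctly genotyped SNPs (n) | Correctly genotyped SNPs (%) |
|---|---|---|---|---|---|---|---|
| Mendelian consistency | Parent-progeny pair | Angus | 2 | 13,636 | 3 | 13,633 | 99.98 |
| Mendelian consistency | Parent-progeny pair | Holstein | 3 | 20,508 | 0 | 20,508 | 100 |
| Mendelian consistency | Parent-progeny pair | Jersey | 1 | 6,833 | 0 | 6,833 | 100 |
| Mendelian consistency | Parent-progeny pair | N'Dama | 1 | 6,720 | 0 | 6,720 | 100 |
| Mendelian consistency | Parent-progeny pair | Red Angus | 1 | 6,807 | 1 | 6,806 | 99.99 |
| Mendelian consistency | Parent-parent-progeny trio | Angus | 3 | 20,473 | 2 | 20,471 | 99.99 |
| Mendelian consistency | Parent-parent-progeny trio | Beefmaster | 1 | 6,803 | 10 | 6,793 | 99.85 |
| Mendelian consistency | Parent-parent-progeny trio | Brahman | 3 | 20,279 | 42 | 20,237 | 99.79 |
| Mendelian consistency | Parent-parent-progeny trio | Brown Swiss | 2 | 13,597 | 0 | 13,597 | 100 |
| Mendelian consistency | Parent-parent-progeny trio | Charolais | 3 | 20,325 | 7 | 20,318 | 99.97 |
| Mendelian consistency | Parent-parent-progeny trio | Hereford | 2 | 13,607 | 3 | 13,604 | 99.98 |
| Mendelian consistency | Parent-parent-progeny trio | Holstein | 4 | 27,283 | 2 | 27,281 | 99.99 |
| Mendelian consistency | Parent-parent-progeny trio | Jersey | 3 | 20,438 | 5 | 20,433 | 99.98 |
| Mendelian consistency | Parent-parent-progeny trio | Santa Gertrudis | 3 | 20,410 | 43 | 20,367 | 99.79 |
| Mendelian consistency | Overall | | 32 | 217,719 | 118 | 217,601 | 99.95 |
| Reproducibility | Replicates | Hereford | 1 | 6,792 | 1 | 6,791 | 99.99 |
| Reproducibility | Replicates | Holstein | 4 | 27,320 | 1 | 27,319 | 100 |
| Reproducibility | Replicates | Jersey | 4 | 6,824 | 2 | 6,822 | 99.97 |
| Reproducibility | Replicates | Limousin | 1 | 6,824 | 2 | 6,822 | 99.97 |
| Reproducibility | Overall | | 10 | 47,760 | 6 | 47,754 | 99.99 |

The concordance rate for 2,088 SNPs in common between BovineLD and Bovine3K assays was 98.78% for 281 females genotyped with both chips. The most likely cause of the differential performance between the BovineLD and Bovine3K chips is the chemistry difference between the Infinium and GoldenGate assays.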

Performance for MAF, mean spacing, and paternal and maternal lineages

Data for calculating mean MAF (Table 1) were primarily BovineLD markers extracted from BovineSNP50 data. However, if BovineSNP50 data were not available, BovineLD markers from the validation data were used. That method allowed MAFs to be calculated more accurately. Mean MAF for the 6,909 SNPs was ≥0.29 for all taurine breeds (Table 1). For Brahman (a Bos primigenius indicus breed), mean MAF was lower (0.18). Overall, >89% of the SNPs were polymorphic in Brahman, which suggested that the BovineLD chip may be useful for imputation in this breed. For the 6,909 SNPs selected for the BovineLD chip, median spacing was 0.348 Mbp, with only 82 (1.1%) of intervals greater than 1 Mbp (Fig. 1). These gaps originate either from the X chromosome, or from regions not covered by the BovineSNP50. The strategy of increasing SNP density at chromosome ends substantially improved imputation accuracy for those regions compared with the Bovine3K array (Fig. 2).
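The spacing statistics quoted above (median interval and the share of intervals longer than 1 Mbp) can be computed from SNP map positions along the following lines. The sketch assumes positions are supplied per chromosome in base pairs; the function name and input layout are hypothetical, and the toy positions are not taken from the chip.

```python
from statistics import median


def gap_summary(positions_by_chrom, threshold_bp=1_000_000):
    """Return (median gap, number of large gaps, percent large gaps).

    positions_by_chrom: dict chromosome -> list of SNP positions in bp.
    Gaps are measured between adjacent SNPs on the same chromosome only.
    """
    gaps = []
    for positions in positions_by_chrom.values():
        ordered = sorted(positions)
        gaps.extend(b - a for a, b in zip(ordered, ordered[1:]))
    n_large = sum(gap > threshold_bp for gap in gaps)
    return median(gaps), n_large, 100.0 * n_large / len(gaps)


# Toy map with two chromosomes and one gap above 1 Mbp.
toy_map = {1: [0, 300_000, 700_000, 2_000_000], 2: [100_000, 500_000]}
summary = gap_summary(toy_map)
```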
Figure 1

BovineLD single-nucleotide polymorphism (SNP) gap distribution.

Figure 2

Imputation accuracy for Bovine3K and BovineLD genotypes.

Imputation was performed for A) Bovine3K and B) BovineLD genotypes using Beagle software (http://faculty.washington.edu/browning/beagle/beagle.html); imputation accuracy is reported by single-nucleotide polymorphism (SNP).

The sex-specific and lineage identification SNPs also appeared to perform well. The nine Y-chromosome SNPs had a 100% call rate across 230 males of different breeds and no genotype calls for the 55 females. We investigated the frequency of the haplotypes formed by the alleles of these 9 SNPs both within and across breeds. Four unique haplotypes were observed, which differed dramatically in frequency across breeds (Table 4). One haplotype, CGCCGCAAC (haplotype 1), was observed only in cattle with indicine lineage (e.g. Brahman, Beefmaster, and Santa Gertrudis). The second haplotype (TCTCCTCAC) was associated with central European lineage; haplotype 3 (TCTCCTCAT) differed from haplotype 2 by 1 base and appeared to be associated with breeds that came to the island of Jersey from France or Spain; and haplotype 4 (TCTTGTCGC) was associated with northern European lineage, including islands. Only a few breeds had more than one haplotype, e.g. Santa Gertrudis and Beefmaster, both of which are taurine×indicine hybrids. Common haplotypes across breeds appeared to reflect a common origin. Phylogenetic analysis separated the 26 breeds into four distinctive clades, which agrees with a previous report on the dual origins of dairy cattle breeds in Europe [21]. For mtDNA SNPs (Table 5), seven unique mitochondrial haplotypes were found; however, 259 of the animals sampled had the same mitochondrial haplotype. Haplotype 7 (AAGAGCAAAAAAG) was at highest frequency in indicine cattle. Most taurine×indicine cattle were derived from taurine cows. Therefore, the lack of haplotype 7 for taurine breeds in most regions is not unexpected.
While more research is required, these preliminary results suggest the BovineLD markers could be useful in determining lineage origin between taurine and indicine breeds or identifying potential admixture within a population of locally adapted animals.
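As a small illustration of how the Y-chromosome haplotypes could be used in practice, the sketch below concatenates the nine allele calls of a male and looks the result up among the four haplotypes reported in the text. The lookup table uses only the haplotype strings and lineage descriptions given above; the function names, and the assumption that alleles arrive in the same SNP order used to write the haplotype strings, are illustrative.

```python
# Haplotype strings and lineage labels as reported in the text.
Y_HAPLOTYPES = {
    "CGCCGCAAC": (1, "indicine lineage"),
    "TCTCCTCAC": (2, "central European lineage"),
    "TCTCCTCAT": (3, "breeds that came to the island of Jersey"),
    "TCTTGTCGC": (4, "northern European lineage, including islands"),
}


def classify_y_haplotype(alleles):
    """Map nine ordered allele calls to (haplotype number, lineage).

    Returns None when the combination is not one of the four observed
    haplotypes (for example, after a genotyping error).
    """
    return Y_HAPLOTYPES.get("".join(alleles))


def hamming(a, b):
    """Number of positions at which two equal-length haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must have the same length")
    return sum(x != y for x, y in zip(a, b))


brahman_like = classify_y_haplotype(list("CGCCGCAAC"))
# Haplotypes 2 and 3 differ at a single base, as noted in the text.
diff_2_vs_3 = hamming("TCTCCTCAC", "TCTCCTCAT")
```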
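Table 4

Animal counts for Y-chromosome haplotypes (a) by breed.

| Breed | Haplotype 1 (b) | Haplotype 2 (c) | Haplotype 3 (d) | Haplotype 4 (e) |
|---|---|---|---|---|
| Angus | 0 | 0 | 0 | 9 |
| Ayrshire | 0 | 0 | 0 | 9 |
| Beefmaster | 2 | 0 | 0 | 5 |
| Blonde d'Aquitaine | 0 | 9 | 1 | 0 |
| Brahman | 7 | 0 | 0 | 3 |
| Brown Swiss | 0 | 10 | 0 | 0 |
| Charolais | 0 | 11 | 0 | 0 |
| Fleckvieh | 0 | 18 | 0 | 2 |
| Friesian | 0 | 0 | 5 | 12 |
| Gelbvieh | 0 | 4 | 0 | 1 |
| Hereford | 0 | 0 | 0 | 9 |
| Holstein | 0 | 0 | 0 | 15 |
| Jersey | 0 | 0 | 14 | 5 |
| Limousin | 0 | 10 | 0 | 0 |
| Montbeliard | 0 | 10 | 0 | 0 |
| N'Dama | 0 | 2 | 0 | 0 |
| Normande | 0 | 0 | 0 | 10 |
| Norwegian Red | 0 | 0 | 0 | 7 |
| Red Angus | 0 | 0 | 0 | 10 |
| Red Dairy (Angler) | 0 | 0 | 0 | 10 |
| Red Danish | 0 | 0 | 0 | 15 |
| Santa Gertrudis | 8 | 0 | 0 | 0 |
| All breeds | 17 | 74 | 20 | 122 |

(a) Haplotypes defined by SNP BovineHD310000-0048, -0099, -0103, -0210, -0515, -0517, -1188, -1404, and -1406.

(b) CGCCGCAAC.

(c) TCTCCTCAC.

(d) TCTCCTCAT.

(e) TCTTGTCGC.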

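Table 5

Animal counts for mtDNA-chromosome haplotypes (a) by breed.

| Breed | Haplotype 1 (b) | Haplotype 2 (c) | Haplotype 3 (d) | Haplotype 4 (e) | Haplotype 5 (f) | Haplotype 6 (g) | Haplotype 7 (h) | Could not be determined |
|---|---|---|---|---|---|---|---|---|
| Angus | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Ayrshire | 9 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Beefmaster | 7 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| Blonde d'Aquitaine | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Brahman | 8 | 0 | 0 | 0 | 0 | 0 | 3 | 0 |
| Brown Swiss | 9 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| Charolais | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Fleckvieh | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Friesian | 16 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| Gelbvieh | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| Guernsey | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Hereford | 9 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| Holstein | 16 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
| Jersey | 21 | 0 | 0 | 0 | 0 | 1 | 5 | 1 |
| Limousin | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Montbeliard | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| N'Dama | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Normande | 9 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Norwegian Red | 6 | 0 | 0 | 0 | 1 | 0 | 0 | 4 |
| Red Angus | 9 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Red Dairy (Angler) | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Red Danish | 28 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| Santa Gertrudis | 9 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| All breeds | 259 | 1 | 1 | 2 | 6 | 1 | 12 | 8 |

(a) Haplotypes defined by SNP BovineHD320000-0141, -0145, -0180, -0226, -0252, -0312, -0332, -0342, -0354, -0358, -0368, -0384, and -0406.

(b) CCGCAACCGCCCG.

(c) CCGCAAACGCCCG.

(d) CCGCAACAGCCCG.

(e) CCGCAACCACCCG.

(f) CCGCAACCGCCCA.

(g) CAACAACCGCCCG.

(h) AAGAGCAAAAAAG.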


Accuracy of imputation

Imputation accuracy was assessed in Australian, French, and North American cattle populations. In all cases, the accuracy of imputation to BovineSNP50 genotypes was ≥95% (Table 6). Most imputation results were >97%, particularly for dairy breeds. The results were lower for some breeds, likely because of the limited reference population size used. For example, the considerably larger size of the North American reference set of Holsteins compared with the Australian set could explain why the North American imputation accuracy was 1.1 percentage points higher than for Australia. The effect of a smaller reference set of genotypes on imputation accuracy was further demonstrated by imputation from BovineLD genotypes for Australian Angus, which had the smallest reference population in the data set. For French populations, imputation efficiency also varied, with the highest accuracy for Holsteins and the lowest for Blondes d'Aquitaine (Table 6); imputation accuracy for Normandes and Montbéliardes was slightly lower than for Holsteins. Again, much of the variation is likely explained by reference population size.
Table 6

Accuracy of imputation from BovineLD genotypes to BovineSNP50 genotypes for Australian, French, and North American breeds.

| Country/region (a) | Breed | Reference (n) | Target (n) | Genotypes correctly imputed (%) (b) | Known genotypes without error (%) (c) |
|---|---|---|---|---|---|
| Australia | Angus | 200 | 82 | 92.3 | 93.1 |
| Australia | Holstein | 1,831 | 360 | 97.5 | 97.8 |
| Australia | Jersey | 454 | 86 | 94.9 | 95.7 |
| France | Blonde d'Aquitaine | 753 | 237 | 95.2 | 95.8 |
| France | Holstein | 3,505 | 966 | 98.5 | 98.7 |
| France | Montbéliarde | 1,170 | 222 | 98.1 | 98.4 |
| France | Normande | 1,176 | 248 | 98.4 | 98.6 |
| North America | Brown Swiss | 1,994 | 168 | 97.4 | 97.9 |
| North America | Holstein | 63,288 | 19,506 | 98.8 | 98.9 |
| North America | Jersey | 8,687 | 1,140 | 98.0 | 98.3 |

(a) Beagle software (http://faculty.washington.edu/browning/beagle/beagle.html) was used for Australian and French imputations and findhap.f90 (http://aipl.arsusda.gov/software/findhap/) for North American imputations.

(b) The 6,909 SNPs on the BovineLD chip were excluded from the calculation of imputation accuracy.

(c) All SNPs included, i.e. the 6,909 SNPs on the BovineLD chip.

For Australian and North American Holsteins, accuracy of imputation to BovineSNP50 genotypes was better for BovineLD genotypes than for Bovine3K genotypes. For Australian Holsteins, imputation accuracies were up to almost 6 percentage points higher with the BovineLD chip than with the Bovine3K chip using the same data (Table 7). Mean imputation accuracy was 92.8% for Australian Holstein Bovine3K genotypes compared with 97.6% for BovineLD genotypes. For North American Holsteins, accuracies of imputation to BovineSNP50 genotypes from Bovine3K genotypes ranged from 93.0 to 96.7% (depending on number of parents genotyped) for 2,456 animals genotyped with both Bovine3K and BovineSNP50 chips [17]. Corresponding values for BovineLD genotypes (Table 8) are 96.6 to 99.3%.
Table 7

Accuracy of imputation (a) from BovineLD or Bovine3K genotypes to BovineSNP50 genotypes for Australian Holsteins with and without a sire in the reference population (b).

| Sire status | Genotyping chip | Animals imputed (n) | Imputation accuracy (%) |
|---|---|---|---|
| Included in reference population | BovineLD | 240 | 98.3 |
| Included in reference population | Bovine3K | 240 | 94.2 |
| Not included in reference population | BovineLD | 120 | 97.0 |
| Not included in reference population | Bovine3K | 120 | 91.3 |

(a) Imputation was done using Beagle software (http://faculty.washington.edu/browning/beagle/beagle.html).

(b) Reference population included 1,831 animals.

Table 8

Accuracy of imputation (a) from BovineLD genotypes to BovineSNP50 genotypes for North American Brown Swiss, Holsteins, and Jerseys with and without parents in the reference population (b).

| Sire genotype | Dam genotype | Jersey: animals with imputed genotypes (n) | Jersey: genotypes imputed correctly (%) | Holstein: animals with imputed genotypes (n) | Holstein: genotypes imputed correctly (%) | Brown Swiss: animals with imputed genotypes (n) | Brown Swiss: genotypes imputed correctly (%) |
|---|---|---|---|---|---|---|---|
| BovineSNP50 | BovineSNP50 | 345 | 99.1 | 9,319 | 99.3 | 13 | 99.0 |
| BovineSNP50 | None | 593 | 98.1 | 9,383 | 98.7 | 145 | 97.9 |
| None | BovineSNP50 | 6 | 98.1 | 135 | 98.5 | 1 | 97.2 |
| BovineSNP50 | Bovine3K | 158 | 98.3 | 158 | 98.8 | NA (c) | NA |
| Bovine3K | None | 3 | 96.9 | NA | NA | NA | NA |
| None | Bovine3K | 1 | 96.6 | 8 | 97.8 | NA | NA |
| None | None | 34 | 92.7 | 389 | 96.6 | 9 | 95.1 |
| All comparisons | | 1,140 | 98.3 | 19,506 | 98.9 | 168 | 97.9 |

(a) Imputation was done using findhap.f90 (http://aipl.arsusda.gov/software/findhap/), which includes both population- and pedigree-based haplotypes.

(b) Reference population included 63,288 animals.

(c) NA = not applicable.

The greatest improvement in imputation for BovineLD genotypes compared with Bovine3K genotypes was for individuals with no genotyped parents. For Australian Holsteins, the difference in mean imputation accuracy with and without a sire in the reference population was 2.9 percentage points for Bovine3K genotypes but only 1.3 percentage points for BovineLD genotypes. The improvement was smaller for North American Holsteins: a difference of 2.7 percentage points between animals with both parents genotyped and animals with no genotyped parents for BovineLD genotypes (Table 8) compared with 3.7 percentage points for Bovine3K genotypes [17]. Compared with North American Holsteins, BovineLD imputation accuracy for animals without a parent in the reference population was slightly poorer for North American Jersey and Brown Swiss populations (Table 8). However, the more than doubling of markers and the different SNP selection criteria [22] compared with the Bovine3K chip allowed high imputation accuracies across a wider range of dairy breeds as well as some beef breeds.
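The percentage-point comparisons for Australian Holsteins can be reproduced directly from the accuracies in Table 7. The short Python sketch below is only a worked check of that arithmetic; the dictionary layout and function names are chosen for illustration.

```python
# Imputation accuracy (%) from Table 7, keyed by (chip, sire in reference).
TABLE7 = {
    ("BovineLD", True): 98.3,
    ("Bovine3K", True): 94.2,
    ("BovineLD", False): 97.0,
    ("Bovine3K", False): 91.3,
}


def sire_effect(chip):
    """Percentage-point loss in accuracy when the sire is not in the reference."""
    return round(TABLE7[(chip, True)] - TABLE7[(chip, False)], 1)


def chip_gain(sire_in_reference):
    """Percentage-point gain of BovineLD over Bovine3K for a given sire status."""
    return round(TABLE7[("BovineLD", sire_in_reference)]
                 - TABLE7[("Bovine3K", sire_in_reference)], 1)


effect_3k = sire_effect("Bovine3K")   # 2.9
effect_ld = sire_effect("BovineLD")   # 1.3
gain_without_sire = chip_gain(False)  # 5.7
gain_with_sire = chip_gain(True)      # 4.1
```

The gain of 5.7 percentage points for animals without a sire in the reference population is the "almost 6 percentage points" cited for Australian Holsteins.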

Discussion

The Illumina BovineLD BeadChip includes 6,909 SNPs selected to provide optimized imputation to BovineSNP50 genotypes in dairy breeds. The SNPs have MAFs of >0.3 in most breeds and nearly uniform spacing across the genome except at the ends of the chromosome where densities were increased. The chip also includes SNPs on the Y chromosome and mtDNA loci that are useful for gender checking, determining subspecies classification, and identifying certain paternal and maternal breed lineages. Accuracy of imputation to BovineSNP50 genotypes using the BovineLD chip was >99% when both parents were genotyped in the North American BovineSNP50 reference population. That high accuracy suggests that the design criteria for the BovineLD chip would be useful to consider in other species for which an “imputation chip” could dramatically lower the cost of implementing genomic selection. BovineLD imputation was about 3 percentage points more accurate across multiple populations compared with Bovine3K imputation. The improvement was greatest when neither parent had been genotyped. The gain in imputation accuracy is attributed primarily to the increased overall density of the BovineLD chip compared with the Bovine3K chip and also to the even further increased density at the ends of chromosomes. The high MAFs also contribute to the improved imputation accuracy. The MAFs were similar across taurine beef and dairy breeds, as was the proportion of SNPs that were polymorphic. Although it would be expected that accuracies of imputation would be highest for those breeds that were included in the design of the chip, which was dominated by dairy breeds, the similar SNP characteristics (particularly the high MAF across many beef and dairy taurine breeds) suggest that the BovineLD chip will perform well in imputation of taurine beef cattle. Our results suggest that the imputation accuracy will also be quite dependent on the size of the population genotyped with a higher density SNP assay.
Overall, the new BovineLD BeadChip should facilitate low-cost genomic selection in Bos primigenius taurus beef and dairy cattle.

Table S1. Genomic locations, flanking sequences, and base calls for the 6,909 SNPs on the BovineLD array (CSV).