Warren M Snelling1, Jesse L Hoff2, Jeremiah H Li2, Larry A Kuehn1, Brittney N Keel1, Amanda K Lindholm-Perry1, Joseph K Pickrell2.
Abstract
Decreasing costs are making low-coverage sequencing with imputation to a comprehensive reference panel an attractive alternative to obtain functional variant genotypes that can increase the accuracy of genomic prediction. To assess the potential of low-pass sequencing, genomic sequence of 77 steers sequenced to >10X coverage was downsampled to 1X and imputed to a reference of 946 cattle representing multiple Bos taurus and Bos indicus-influenced breeds. Genotypes for nearly 60 million variants detected in the reference were imputed from the downsampled sequence. The imputed genotypes strongly agreed with the SNP array genotypes (mean r = 0.99) and the genotypes called from the transcript sequence (mean r = 0.97). Effects of BovineSNP50 and GGP-F250 variants on birth weight, postweaning gain, and marbling were solved without the steers' phenotypes and genotypes, then applied to their genotypes, to predict the molecular breeding values (MBV). The steers' MBV were similar when using imputed and array genotypes. Replacing array variants with functional sequence variants might allow more robust MBV. Imputation from low-coverage sequence offers a viable, low-cost approach to obtain functional variant genotypes that could improve genomic prediction.
Keywords: beef cattle; genomic prediction; imputation; sequence
Year: 2020 PMID: 33167493 PMCID: PMC7716200 DOI: 10.3390/genes11111312
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
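The abstract describes downsampling sequence from >10X to 1X coverage before imputation. The record does not say which tool was used, so the following is only a minimal sketch of the general idea, assuming independent random subsampling of reads. The function names, the 12.5X starting coverage, and the placeholder read IDs are all hypothetical.

```python
import random


def keep_fraction(original_coverage: float, target_coverage: float) -> float:
    """Fraction of reads to retain so mean coverage drops to the target."""
    if original_coverage <= 0 or target_coverage <= 0:
        raise ValueError("coverage values must be positive")
    # Never "upsample": cap the fraction at 1.0.
    return min(1.0, target_coverage / original_coverage)


def downsample_reads(reads, original_coverage, target_coverage, seed=0):
    """Keep each read independently with probability target/original."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    p = keep_fraction(original_coverage, target_coverage)
    return [read for read in reads if rng.random() < p]


# Hypothetical example: 10,000 placeholder read IDs from a 12.5X sample.
reads = [f"read{i}" for i in range(10_000)]
subset = downsample_reads(reads, original_coverage=12.5, target_coverage=1.0)
print(keep_fraction(12.5, 1.0), len(subset))
```

With paired-end data, a real pipeline would keep or drop both mates of a pair together; this sketch ignores pairing for brevity.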
Animals genotyped in the Germplasm Evaluation Project.
| SNP Array | N |
|---|---|
| BovineSNP50 a | 9930 |
| BovineHD b | 1547 |
| GGP-F250 c | 2339 |
| GGP-50K | 3068 |
| GGP d | 5083 |
a BovineSNP50 (Illumina, Inc.) versions 1 and 2; ~54,000 SNP. b BovineHD (Illumina, Inc.); ~780,000 SNP. c GeneSeek Genomic Profiler (GGP) F250 (Neogen, Inc.); ~220,000 putative functional SNP. d GGP versions 1 to 4; ~20,000 to 75,000 SNP.
Figure 1. Principal component (PC) analysis of the haplotype reference panel. (a) Overlap among projects sequenced with different platforms; (b) depicts PC1 separating the Bos indicus from Bos taurus breeds, and PC2 separating Holstein from Angus, with other Bos taurus breeds intermediate between Holstein and Angus. The first two PC explained 11% of genomic relationships among the reference, 7% by PC1, and 4% by PC2.
Functional classification of variants detected in the cattle haplotype reference panel.
| Classification a | Reference b: Variants | Reference b: Genes | SNP Array c: Variants | SNP Array c: Genes |
|---|---|---|---|---|
| Protein-changing | 332,714 | 21,066 | 29,519 | 10,673 |
| High impact | 14,773 | 9084 | 545 | 509 |
| Non-synonymous SNP | 318,269 | 20,978 | 29,011 | 10,576 |
| Potentially regulatory | 327,357 | 18,110 | 13,072 | 8076 |
| Untranslated region (UTR) | 318,495 | 15,288 | 12,447 | 7557 |
| Non-coding RNA | 8940 | 2822 | 627 | 519 |
| Intergenic | 38,694,029 | | 396,306 | |
| Intronic | 19,533,912 | | 272,510 | |
| Total | 59,198,026 | 21,334 | 715,402 | 10,683 |
a Variants classified with snpEff v4.3 using ensembl ARS-UCD1.2.96 annotation. b Variants detected in the cattle haplotype reference panel and imputed from the low-pass sequence. c Autosomal and pseudo-autosomal variants detected in the reference panel and with usable SNP array genotypes in Germplasm Evaluation Project cattle.
Figure 2. Relationship between imputation accuracy, expressed as a correlation (r) between genotypes imputed from sequence and called from SNP arrays, and call confidence, a function of imputed genotype probabilities. Accuracy and call confidence were lowest for the known crossbred (XB) steers, which were sequenced with DNA extracted from blood; another low-confidence, low-accuracy steer was suspected to be a twin. The purebred (PB) Bos taurus steer with lowest accuracy had the lowest call confidence of any Bos taurus and was a known twin. Bos indicus-influenced steers (>0.1 Brahman) tended to have lower call confidence and accuracy than Bos taurus steers.
Figure 3. Relationship between imputation accuracy, expressed as a correlation (r) between genotypes imputed from sequence and called from SNP arrays (a) or transcript sequence (b), and minor allele frequency (MAF). Mean correlation between imputed and called genotypes within 0.01 MAF increments is shown by blue lines, and the green lines show mean concordance within the 0.01 MAF increments.
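The summary plotted in Figure 3 (mean correlation and mean concordance within 0.01 MAF increments) can be sketched as follows. This is an illustration under assumptions, not the authors' code: genotypes are coded as 0/1/2 allele counts, MAF is computed from the called genotypes, correlation is computed per variant across animals, and the two toy variants are invented for the example.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation; None when either vector has no variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def concordance(imputed, called):
    """Fraction of animals whose imputed genotype equals the called one."""
    return sum(i == c for i, c in zip(imputed, called)) / len(called)


def maf(genotypes):
    """Minor allele frequency from 0/1/2 allele counts."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def summarize_by_maf(variants, width=0.01):
    """Map MAF bin index -> (mean correlation, mean concordance)."""
    n_bins = round(0.5 / width)
    bins = defaultdict(list)
    for imputed, called in variants:
        r = pearson(imputed, called)
        if r is None:  # monomorphic in one source: correlation undefined
            continue
        index = min(int(maf(called) / width), n_bins - 1)
        bins[index].append((r, concordance(imputed, called)))
    return {
        index: (
            sum(r for r, _ in vals) / len(vals),
            sum(c for _, c in vals) / len(vals),
        )
        for index, vals in sorted(bins.items())
    }


# Toy data (hypothetical): (imputed, called) genotype vectors per variant.
variants = [
    ([0, 0, 0, 0, 0, 0, 1, 1], [0, 0, 0, 0, 0, 0, 1, 1]),
    ([0, 1, 1, 2, 0, 1, 2, 2], [0, 1, 1, 2, 0, 1, 2, 1]),
]
summary = summarize_by_maf(variants)
print(summary)
```

Correlation and concordance can diverge at low MAF: a rare variant can show high concordance simply because most animals are homozygous for the major allele, which is one reason to report both.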
Sire-breed differences among correlations between genotypes imputed from downsampled sequence and called from transcript sequence.
| Sire Breed | r scale: Effect a | r scale: SE | r scale: p-value | −log(1−r) scale: Effect a | −log(1−r) scale: SE | −log(1−r) scale: p-value |
|---|---|---|---|---|---|---|
| Red Angus | 4.60 × 10⁻⁴ | 4.92 × 10⁻³ | 9.26 × 10⁻¹ | −0.07 | 0.21 | 7.52 × 10⁻¹ |
| Brahman | −2.79 × 10⁻² | 3.81 × 10⁻³ | 1.59 × 10⁻⁹ | 1.86 | 0.16 | 8.17 × 10⁻¹⁶ |
| Beefmaster | −2.08 × 10⁻² | 3.36 × 10⁻³ | 1.01 × 10⁻⁷ | 1.60 | 0.14 | 1.78 × 10⁻¹⁵ |
| Brangus | −1.05 × 10⁻² | 3.11 × 10⁻³ | 1.37 × 10⁻³ | 1.15 | 0.13 | 1.25 × 10⁻¹¹ |
| Charolais | −2.02 × 10⁻³ | 3.36 × 10⁻³ | 5.50 × 10⁻¹ | 0.36 | 0.14 | 1.55 × 10⁻² |
| ChiAngus | −2.48 × 10⁻³ | 4.92 × 10⁻³ | 6.16 × 10⁻¹ | 0.40 | 0.21 | 6.38 × 10⁻² |
| South Devon | −2.54 × 10⁻³ | 6.60 × 10⁻³ | 7.02 × 10⁻¹ | 0.44 | 0.28 | 1.22 × 10⁻¹ |
| Gelbvieh | −1.63 × 10⁻³ | 3.81 × 10⁻³ | 6.71 × 10⁻¹ | 0.29 | 0.16 | 7.83 × 10⁻² |
| Hereford | −6.70 × 10⁻⁴ | 3.55 × 10⁻³ | 8.51 × 10⁻¹ | 0.13 | 0.15 | 4.10 × 10⁻¹ |
| Limousin | −1.80 × 10⁻³ | 6.60 × 10⁻³ | 7.86 × 10⁻¹ | 0.34 | 0.28 | 2.34 × 10⁻¹ |
| Maine-Anjou | −3.20 × 10⁻⁴ | 4.92 × 10⁻³ | 9.48 × 10⁻¹ | 0.09 | 0.21 | 6.72 × 10⁻¹ |
| Salers | −2.75 × 10⁻³ | 3.81 × 10⁻³ | 4.74 × 10⁻¹ | 0.47 | 0.16 | 5.95 × 10⁻³ |
| Braunvieh | −3.89 × 10⁻³ | 4.92 × 10⁻³ | 4.33 × 10⁻¹ | 0.61 | 0.21 | 5.66 × 10⁻³ |
| Simmental | −2.57 × 10⁻⁴ | 4.21 × 10⁻³ | 9.52 × 10⁻¹ | 0.07 | 0.18 | 6.79 × 10⁻¹ |
| Shorthorn | −1.25 × 10⁻³ | 3.55 × 10⁻³ | 7.62 × 10⁻¹ | 0.24 | 0.15 | 1.22 × 10⁻¹ |
| Santa Gertrudis | −2.21 × 10⁻² | 3.36 × 10⁻³ | 2.34 × 10⁻⁸ | 1.66 | 0.14 | 5.55 × 10⁻¹⁵ |
a Difference from Angus.
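The table reports breed effects on both the raw correlation scale and a −log(1−r) scale. A short sketch of that transformation shows why it can help when correlations cluster near 1: small differences in r close to 1 become much larger differences after the transform. The base of the logarithm is not stated in this record; the natural log is assumed here, and the function name is hypothetical.

```python
from math import log


def neg_log_one_minus_r(r: float) -> float:
    """Transform a correlation r to -log(1 - r); natural log is an assumption."""
    if not -1.0 <= r < 1.0:
        raise ValueError("r must satisfy -1 <= r < 1")
    return -log(1.0 - r)


# Correlations near 1 are spread out on the transformed scale.
for r in (0.90, 0.97, 0.99):
    print(r, round(neg_log_one_minus_r(r), 2))
```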
Restricted maximum likelihood heritability (h²) estimates for birth weight, postweaning gain, and marbling score using pedigree and different genomic relationship matrices.
| Relationship a | Birth Weight: h² (SE) | Birth Weight: N | Postweaning Gain: h² (SE) | Postweaning Gain: N | Marbling Score: h² (SE) | Marbling Score: N |
|---|---|---|---|---|---|---|
| Pedigree a | 0.595 (0.008) | 78,625 | 0.526 (0.010) | 68,846 | 0.538 (0.018) | 33,850 |
| Gall b | 0.573 (0.011) | 16,512 | 0.474 (0.013) | 16,144 | 0.508 (0.017) | 10,898 |
| GF250 c | 0.545 (0.011) | 16,440 | 0.442 (0.012) | 16,068 | 0.471 (0.016) | 10,822 |
| GF250s d | 0.380 (0.023) | 16,440 | 0.270 (0.019) | 16,068 | 0.342 (0.021) | 10,822 |
| GF250r e | 0.066 (0.007) | 16,440 | 0.062 (0.007) | 16,068 | 0.105 (0.009) | 10,822 |
| G50K f | 0.519 (0.011) | 16,440 | 0.437 (0.012) | 16,068 | 0.466 (0.016) | 10,822 |
a Pedigree BLUP including downsampled steers. b 748,804 autosomal and pseudo-autosomal variants from GGP-F250 and BovineHD arrays, filtered for >0.95 call rate and pedigree imputation accuracy. Genomic BLUP (GBLUP) included downsampled steers. c 116,472 filtered variants from GGP-F250. GBLUP excluded downsampled steers. d GGP-F250 subsets selected for trait-specific effects: 551 birth weight; 585 postweaning gain; and 698 marbling score. GBLUP excluded downsampled steers. e Randomly selected GGP-F250 subsets, same size as trait-specific subsets. GBLUP excluded downsampled steers. f 51,496 BovineHD variants common with BovineSNP50 array. GBLUP excluded downsampled steers.
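The heritability estimates above depend on genomic relationship matrices built from different variant sets. This record does not state how those matrices were constructed. As an illustration only, the sketch below uses one widely used construction (VanRaden's first method), with allele frequencies estimated from the genotypes themselves; the function name and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Build G = Z Z' / (2 * sum(p * (1 - p))) from 0/1/2 allele counts.

    genotypes: list of animals, each a list of allele counts per marker.
    Assumes VanRaden's first method; the source does not confirm this.
    """
    n_animals = len(genotypes)
    n_markers = len(genotypes[0])
    # Allele frequency per marker, estimated from the sample itself.
    p = [
        sum(animal[j] for animal in genotypes) / (2 * n_animals)
        for j in range(n_markers)
    ]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Center each genotype by twice the allele frequency.
    z = [[animal[j] - 2 * p[j] for j in range(n_markers)] for animal in genotypes]
    return [
        [sum(a * b for a, b in zip(zi, zk)) / scale for zk in z]
        for zi in z
    ]


# Toy genotypes (hypothetical): 4 animals x 3 markers.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
G = genomic_relationship_matrix(genotypes)
for row in G:
    print([round(value, 3) for value in row])
```

Restricting the marker list to a subset, such as the trait-specific or random GGP-F250 subsets in the table, changes only which columns enter the calculation.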
Correlations (SE) between molecular breeding values (∑ (marker effect estimates × genotypes)) and predicted breeding values.
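| Genotypes | Birth Weight: Pedigree BLUP a | Birth Weight: Genomic BLUP b | Postweaning Gain: Pedigree BLUP a | Postweaning Gain: Genomic BLUP b | Marbling Score: Pedigree BLUP a | Marbling Score: Genomic BLUP b |
|---|---|---|---|---|---|---|
| Predictions Using Imputed SNP Array Genotypes | | | | | | |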
| GF250 c | 0.738 (0.061) | 0.904 (0.037) | 0.779 (0.055) | 0.881 (0.041) | 0.770 (0.057) | 0.926 (0.032) |
| GF250s d | 0.555 (0.079) | 0.681 (0.067) | 0.653 (0.069) | 0.714 (0.063) | 0.655 (0.069) | 0.750 (0.059) |
| GF250r e | 0.379 (0.090) | 0.481 (0.083) | 0.344 (0.093) | 0.385 (0.090) | 0.629 (0.070) | 0.741 (0.058) |
| G50K f | 0.710 (0.063) | 0.888 (0.039) | 0.785 (0.055) | 0.886 (0.040) | 0.794 (0.053) | 0.950 (0.026) |
| Predictions Using Genotypes Imputed from Low-Coverage Sequence | ||||||
| GF250 c | 0.680 (0.067) | 0.866 (0.044) | 0.779 (0.055) | 0.887 (0.040) | 0.769 (0.057) | 0.936 (0.030) |
| GF250s d | 0.531 (0.081) | 0.634 (0.073) | 0.635 (0.071) | 0.722 (0.063) | 0.641 (0.071) | 0.738 (0.062) |
| GF250r e | 0.286 (0.100) | 0.390 (0.094) | 0.332 (0.096) | 0.395 (0.094) | 0.649 (0.071) | 0.776 (0.057) |
| G50K f | 0.676 (0.067) | 0.866 (0.044) | 0.805 (0.052) | 0.903 (0.037) | 0.760 (0.058) | 0.941 (0.029) |
a Breeding values predicted by BLUP with pedigree relationships, including downsampled steers. b Breeding values predicted by BLUP with genomic relationships computed using genotypes of 748,804 autosomal and pseudo-autosomal variants from the GGP-F250 and BovineHD arrays, including downsampled steers. c Molecular breeding values of the downsampled steers predicted with effects of 116,472 GGP-F250 variants solved from GBLUP, excluding downsampled steer records. d GGP-F250 subsets selected for trait-specific effects: 551 birth weight; 585 postweaning gain; and 698 marbling score. GBLUP excluded downsampled steers. e Randomly selected GGP-F250 subsets, same size as trait-specific subsets. GBLUP excluded downsampled steers. f 51,496 BovineHD variants common with BovineSNP50 array. GBLUP excluded downsampled steers.
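The table caption defines a molecular breeding value as ∑ (marker effect estimates × genotypes) and compares it with predicted breeding values by correlation. A minimal sketch of that calculation follows. The marker effects, genotypes, and breeding values are invented toy numbers, and the function names are hypothetical; genotypes may be integer allele counts or fractional imputed dosages.

```python
from math import sqrt


def molecular_breeding_value(effects, genotypes):
    """MBV for one animal: sum over markers of effect * genotype."""
    if len(effects) != len(genotypes):
        raise ValueError("effects and genotypes must have the same length")
    return sum(e * g for e, g in zip(effects, genotypes))


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy inputs (hypothetical): 3 marker effects and 3 steers.
effects = [0.5, -0.2, 0.1]
steer_genotypes = [[2, 0, 1], [1, 1, 0], [0, 2, 2]]
predicted_bv = [1.0, 0.4, -0.1]  # e.g. from a pedigree or genomic BLUP

mbv = [molecular_breeding_value(effects, g) for g in steer_genotypes]
print(mbv, pearson(mbv, predicted_bv))
```

In the study design described in the abstract, the marker effects are solved without the evaluated steers' phenotypes and genotypes, so the MBV act as out-of-sample predictions.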