Mirjam Frischknecht1,2, Hubert Pausch3,4,5, Beat Bapst6, Heidi Signer-Hasler7, Christine Flury7, Dorian Garrick8, Christian Stricker9, Ruedi Fries3, Birgit Gredler-Grandl6.
Abstract
BACKGROUND: Within the last few years a large amount of genomic information has become available in cattle. Densities of genomic information vary from a few thousand variants up to whole genome sequence information. In order to combine genomic information from different sources and infer genotypes for a common set of variants, genotype imputation is required.
Keywords: Accuracy; Brown Swiss; Dairy cattle; Genome-wide association study; Imputation; Milk traits; QTL discovery; Whole genome sequencing
Year: 2017 PMID: 29284405 PMCID: PMC5747239 DOI: 10.1186/s12864-017-4390-2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 PCA plot showing the population structure of 1577 sequenced animals of the 1000 Bull Genomes Project Run 5. Different colours and symbols separate the animals by breed. Symbols in black indicate individuals selected as reference population in scenario S1 (REF in legend); HOL: Holstein; JER: Jersey; AAN: Angus; BSW: Brown Swiss; GUE: Guernsey; MAR: Marchigiana; REN: Norwegian Red; SIM: Simmental; MON: Montbeliarde; NOR: Normandes; UNK: unknown; CHA: Charolais; DBC: Dairy-Beef Crosses; AYF: Finnish Ayrshire; HER: Hereford; STA: Stabilizer; RES: Swedish Red; GVH: Gelbvieh; RED: Danish Red; BCO: Beef Composites; BBR: Beef Booster; LIM: Limousin; PIE: Piedmontese; SAL: Salers; BCR: Beef Crosses; ERI: Eringer; GLW: Galloway; ROM: Romagnola; SCO: Scottish Highland; TGR: Tyrolean Grey; HIN: Hinterwalder; ANG: Angler; VOR: Vorderwalder; BELB: Belgian Blue
Overview of imputation scenarios and number of reference and validation animals within scenario
| Scenario | Reference | Validation |
|---|---|---|
| S1 | 49 animals out of Run 5 | 1528 animals out of Run 5 |
| S2 | 20 BSW | 103 BSW |
| S3 | 50 BSW | 73 BSW |
| S4 | 20 random BSW | 103 BSW |
| S5 | 103 BSW | 20 random BSW from S4 |
| S6 | BSW + HOL + SIM (855 animals) | 20 random BSW from S4 |
| S7 | All Run 5 animals (1557 animals) | 20 random BSW from S4 |
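The relationship between S4 and S5 in the table above (the same 20 randomly chosen BSW animals serve as reference in S4 and as validation in S5) can be sketched as a simple random hold-out. This is only an illustration: the animal IDs, the seed, and the helper name `split_reference_validation` are hypothetical, and the total of 123 BSW animals is inferred from 20 + 103 in the table.

```python
import random

def split_reference_validation(animal_ids, n_validation, seed=0):
    """Randomly hold out n_validation animals; the rest form the reference panel."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    validation = set(rng.sample(sorted(animal_ids), n_validation))
    reference = [a for a in animal_ids if a not in validation]
    return reference, sorted(validation)

# Hypothetical IDs for 123 sequenced BSW animals (20 + 103 in the table).
bsw_ids = [f"BSW_{i:03d}" for i in range(1, 124)]

# S5: 103 reference animals, 20 randomly chosen validation animals.
s5_reference, s5_validation = split_reference_validation(bsw_ids, n_validation=20)

# S4 swaps the roles: the same 20 animals are the reference,
# and the remaining 103 animals are the validation set.
s4_reference, s4_validation = s5_validation, s5_reference
```

Keeping the 20 held-out animals fixed across S5 to S7 is what makes those scenarios comparable: only the reference panel changes.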
Mean (and standard deviation) genotype concordance, genotype correlation and allele dosage correlation for validation animals in S1
| | Genotype concordance rate | Genotype correlation | Dosage correlation |
|---|---|---|---|
| BEAGLE BSWᵃ | 0.964 (0.004) | 0.829 (0.019) | 0.945 (0.011) |
| FImpute BSW | 0.953 (0.005) | 0.766 (0.021) | – |
| IMPUTE2 BSW | 0.971 (0.004) | 0.872 (0.021) | 0.956 (0.009) |
| Minimac BSW | 0.972 (0.004) | 0.872 (0.021) | 0.959 (0.008) |
| BEAGLE Allᵇ | 0.967 (0.007) | 0.840 (0.035) | 0.965 (0.010) |
| FImpute All | 0.956 (0.008) | 0.775 (0.038) | – |
| IMPUTE2 All | 0.974 (0.007) | 0.879 (0.035) | 0.972 (0.009) |
| Minimac All | 0.975 (0.007) | 0.879 (0.035) | 0.974 (0.008) |
ᵃ Imputation accuracy evaluated for BSW validation animals only
ᵇ Imputation accuracy evaluated for all validation animals
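The three accuracy measures reported in the table can be illustrated with a small sketch. It assumes genotypes are coded as allele counts 0/1/2 and dosages as expected allele counts between 0 and 2, and it uses the usual definitions (share of identical genotype calls, and Pearson correlation of true genotypes with best-guess genotypes or with dosages). Whether the study computed these per animal or per variant is not stated in this record, and the toy data below are invented.

```python
from math import sqrt

def concordance(true_gt, imputed_gt):
    """Fraction of positions where the imputed genotype equals the true one."""
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Invented toy data for one animal across six variants.
true_genotypes = [0, 1, 2, 1, 0, 2]                 # called from sequence
imputed_genotypes = [0, 1, 2, 2, 0, 2]              # best-guess imputed calls
imputed_dosages = [0.1, 1.0, 1.9, 1.6, 0.0, 2.0]    # expected allele counts

genotype_concordance = concordance(true_genotypes, imputed_genotypes)
genotype_correlation = pearson(true_genotypes, imputed_genotypes)
dosage_correlation = pearson(true_genotypes, imputed_dosages)
```

In this toy case the dosage correlation exceeds the genotype correlation because the dosage of 1.6 at the miscalled variant retains some of the uncertainty that the hard call of 2 discards.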
Mean (and standard deviation or range for cross validation scenarios) genotype concordance rate (Gen Conc), genotype correlation (Gen Corr), and allele dosage correlation (Dos Corr) between called sequence variants and imputed variants for animals in the validation set in scenarios S2 to S7
| Gen Conc | S2 | S3 | S4 | S5 | S6 | S7 |
|---|---|---|---|---|---|---|
| Beagle | 0.947 (0.024) | 0.967 (0.013) | 0.945 | 0.966 | 0.979 | 0.984 |
| FImpute | 0.938 (0.024) | 0.957 (0.012) | 0.935 | 0.960 | 0.974 | 0.982 |
| IMPUTE2 | 0.962 (0.02) | 0.973 (0.008) | 0.959 | 0.9712 | 0.981 | 0.985 |
| Minimac | 0.96 (0.02) | 0.972 (0.008) | 0.957 | 0.968 | 0.980 | 0.9852 |
| Gen Corr | | | | | | |
| Beagle | 0.863 (0.053) | 0.914 (0.031) | 0.857 | 0.914 | 0.932 | 0.932 |
| FImpute | 0.827 (0.052) | 0.879 (0.029) | 0.819 | 0.894 | 0.916 | 0.923 |
| IMPUTE2 | 0.905 (0.042) | 0.930 (0.02) | 0.897 | 0.9261 | 0.939 | 0.938 |
| Minimac | 0.899 (0.041) | 0.926 (0.022) | 0.8912 | 0.9162 | 0.9358 | 0.935 |
| Dos Corr | | | | | | |
| Beagle | 0.951 (0.028) | 0.973 (0.013) | 0.956 | 0.977 | 0.980 | 0.982 |
| FImpute | – | – | – | – | – | – |
| IMPUTE2 | 0.964 (0.023) | 0.977 (0.009) | 0.967 | 0.979 | 0.980 | 0.981 |
| Minimac | 0.964 (0.023) | 0.978 (0.009) | 0.967 | 0.9786 | 0.982 | 0.983 |
Fig. 2 Genotype correlation by MAF class. (a) Genotype correlation by program: mean genotype correlation (and range) obtained by imputation for each program (Beagle, FImpute, IMPUTE2, Minimac). (b) Mean genotype correlation (and range) per MAF class with Minimac for S5-S7. The symbols are placed at the maximum MAF of the corresponding MAF class
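Grouping per-variant genotype correlations into MAF classes, as in Fig. 2, might look like the sketch below. Each class is keyed by its upper MAF bound, which matches the caption's note that symbols sit at the maximum MAF of the class. The class boundaries and the example values are hypothetical, since this record does not list the ones used in the study.

```python
def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as allele counts 0/1/2."""
    p = sum(genotypes) / (2 * len(genotypes))
    return min(p, 1 - p)

def maf_class(maf, upper_bounds):
    """Return the smallest upper bound that is >= maf."""
    for bound in upper_bounds:
        if maf <= bound:
            return bound
    raise ValueError(f"MAF {maf} exceeds the largest class bound")

def mean_by_maf_class(mafs, correlations, upper_bounds):
    """Average the per-variant correlations within each MAF class."""
    groups = {}
    for maf, r in zip(mafs, correlations):
        groups.setdefault(maf_class(maf, upper_bounds), []).append(r)
    return {bound: sum(v) / len(v) for bound, v in sorted(groups.items())}

# Hypothetical class bounds and invented per-variant values.
upper_bounds = [0.01, 0.05, 0.1, 0.5]
variant_mafs = [0.005, 0.03, 0.04, 0.3]
variant_correlations = [0.6, 0.8, 0.9, 0.95]

mean_correlation_per_class = mean_by_maf_class(
    variant_mafs, variant_correlations, upper_bounds
)
```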
Fig. 3 Identification of QTL for milk fat percentage at different lactation stages: Manhattan plot representing the association of 13,036,370 imputed sequence variants with fat content in early (a) and late (b) lactation. Red color represents variants with p < 3.84 × 10⁻⁹
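The significance threshold in the Fig. 3 caption is numerically consistent with a Bonferroni correction for the number of tested variants. This record does not state how the threshold was derived, so the family-wise error rate of 0.05 used below is an assumption.

```python
# Assumed family-wise error rate; not stated in this record.
alpha = 0.05

# Number of imputed sequence variants tested (from the Fig. 3 caption).
n_variants = 13_036_370

# Bonferroni-corrected per-variant significance threshold.
bonferroni_threshold = alpha / n_variants

# Formatted to three significant digits for comparison with the caption.
threshold_text = f"{bonferroni_threshold:.2e}"
```

Under this assumption, the result rounds to the caption's value of 3.84 × 10⁻⁹.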
Fig. 4 Detailed view of two QTL for fat content: Detailed overview of two QTL on BTA5 (a) and BTA20 (b) that were associated with FC_late and FC_early, respectively. Grey and orange diamonds represent sequence and array-derived variants, respectively. Red diamonds represent candidate causal trait variants that were identified in breeds other than Brown Swiss