Chris Schrooten, Romain Dassonneville, Vincent Ducrocq, Rasmus F Brøndum, Mogens S Lund, Jun Chen, Zengting Liu, Oscar González-Recio, Juan Pena, Tom Druet.
Abstract
BACKGROUND: Imputation of genotypes from low-density to higher density chips is a cost-effective method to obtain high-density genotypes for many animals, based on genotypes of only a relatively small subset of animals (reference population) on the high-density chip. Several factors influence the accuracy of imputation and our objective was to investigate the effects of the size of the reference population used for imputation and of the imputation method used and its parameters. Imputation of genotypes was carried out from 50,000 (moderate-density) to 777,000 (high-density) SNPs (single nucleotide polymorphisms).
Year: 2014 PMID: 24495554 PMCID: PMC3929158 DOI: 10.1186/1297-9686-46-10
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Number and type of animals in each of the three analyzed datasets
| Dataset | | | | | Method A | Method B | Method C | Method D |
|---|---|---|---|---|---|---|---|---|
| Dataset 1 | 488 | 0 | 0 | 60 | Yes | Yes | Yes | Yes |
| Dataset 2 | 898 | 331 | 0 | 60 | No | No | Yes | No |
| Dataset 3 | 488 | 0 | 2200 | 60 | Yes | No | Yes | No |
Details on chromosomes in the analyses
| Chromosome | n_50k | n_HD | fr_unm | Length (cM) | n50k/cM |
|---|---|---|---|---|---|
| 1 | 2571 | 37791 | 0.0680 | 158.32 | 16.24 |
| 6 | 1989 | 29805 | 0.0667 | 119.45 | 16.65 |
| 11 | 1696 | 27398 | 0.0619 | 107.28 | 15.81 |
| 14 | 1376 | 17269 | 0.0797 | 84.63 | 16.26 |
| 20 | 1164 | 18381 | 0.0633 | 71.99 | 16.17 |
| 29 | 793 | 12690 | 0.0625 | 51.50 | 15.40 |
For each chromosome: number of markers that were not masked in validation animals (n_50k), total number of markers (n_HD), fraction unmasked (fr_unm), length of chromosome in centiMorgan, and average number of unmasked markers per cM (n50k/cM).
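The two derived columns follow directly from the definitions in the note above: fr_unm is n_50k divided by n_HD, and n50k/cM is n_50k divided by the chromosome length. The following is a minimal sketch that recomputes them from the table rows; the names `CHROMOSOMES` and `derived_columns` are illustrative and not part of the original record.

```python
# Rows copied from the chromosome table: (chromosome, n_50k, n_HD, length in cM).
CHROMOSOMES = [
    (1, 2571, 37791, 158.32),
    (6, 1989, 29805, 119.45),
    (11, 1696, 27398, 107.28),
    (14, 1376, 17269, 84.63),
    (20, 1164, 18381, 71.99),
    (29, 793, 12690, 51.50),
]


def derived_columns(n_50k, n_hd, length_cm):
    """Return (fr_unm, n50k_per_cM) for one chromosome.

    fr_unm      = unmasked markers / total HD markers
    n50k_per_cM = unmasked markers / chromosome length in cM
    """
    return n_50k / n_hd, n_50k / length_cm


# Recompute both derived columns for every chromosome in the table.
derived = {}
for chrom, n_50k, n_hd, length_cm in CHROMOSOMES:
    fr_unm, density = derived_columns(n_50k, n_hd, length_cm)
    derived[chrom] = (fr_unm, density)
    print(f"chr {chrom:>2}: fr_unm={fr_unm:.4f}  n50k/cM={density:.2f}")
```

Rounded to the precision used in the table, the recomputed values agree with the printed fr_unm and n50k/cM columns, so roughly 6 to 8% of the high-density markers on each chromosome were left unmasked in the validation animals.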
Average allelic imputation error rate (%) for each of six chromosomes and averaged across chromosomes
| Chromosome | | | | | | | |
|---|---|---|---|---|---|---|---|
| 1 | 1.82 | 0.89 | 0.72 | 0.76 | 0.47 | 1.80 | 0.70 |
| 6 | 1.73 | 0.81 | 0.61 | 0.65 | 0.38 | 1.71 | 0.58 |
| 11 | 2.05 | 0.89 | 0.66 | 0.71 | 0.40 | 1.98 | 0.61 |
| 14 | 1.82 | 0.76 | 0.56 | 0.61 | 0.36 | 1.76 | 0.54 |
| 20 | 1.94 | 0.86 | 0.68 | 0.71 | 0.35 | 1.88 | 0.64 |
| 29 | 2.41 | 1.05 | 0.82 | 0.84 | 0.52 | 2.38 | 0.78 |
| Average | 1.91 | 0.87 | 0.67 | 0.71 | 0.41 | 1.87 | 0.64 |
Results are presented for four methods and three datasets: method A is a combination of Beagle 2.1.3 and DAGPHASE with scale and shift parameters equal to 2.0 and 0.1; method B is the same as method A, but with scale and shift parameters equal to 1.0 and 0.0; method C is Beagle version 3.3.0; method D is DAGPHASE using DAG from method C.
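The "Average" row of the error-rate table is not a plain mean of the six chromosome rows. For the first results column, the unweighted mean of the rounded per-chromosome values is about 1.96, while the table reports 1.91. One reading that reproduces the reported figure is a mean weighted by the number of imputed (masked) markers per chromosome, n_HD minus n_50k. This weighting is an assumption made here for illustration and is not stated in this record. The sketch below compares the two calculations; all variable names are hypothetical.

```python
# (chromosome, n_50k, n_HD, error rate in % from the first results column
# of the error-rate table). Marker counts come from the chromosome table.
ROWS = [
    (1, 2571, 37791, 1.82),
    (6, 1989, 29805, 1.73),
    (11, 1696, 27398, 2.05),
    (14, 1376, 17269, 1.82),
    (20, 1164, 18381, 1.94),
    (29, 793, 12690, 2.41),
]

errors = [err for _, _, _, err in ROWS]

# ASSUMPTION: weight each chromosome by its number of masked markers,
# i.e. the markers that actually had to be imputed (n_HD - n_50k).
weights = [n_hd - n_50k for _, n_50k, n_hd, _ in ROWS]

# Unweighted mean: every chromosome counts equally.
simple_mean = sum(errors) / len(errors)

# Weighted mean: long chromosomes with many imputed markers count more.
weighted_mean = sum(e * w for e, w in zip(errors, weights)) / sum(weights)

print(f"unweighted mean: {simple_mean:.2f}")
print(f"weighted mean:   {weighted_mean:.2f}")
```

Because the inputs are already rounded to two decimals, agreement with the reported average is suggestive rather than conclusive. The short chromosome 29 has the highest error rate in every column, which is why down-weighting it lowers the average.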
Figure 1. Number of validation animals per imputation error rate class in dataset 1 (C_1) and dataset 2 (C_2), using method C (Beagle version 3.3.0).
Figure 2. Average allelic imputation error rate (%) for each of 20 classes of allele frequency. Results are presented for four combinations of method (A, B or C) and dataset (1 or 2); method A: combination of Beagle 2.1.3 and DAGPHASE with scale and shift parameters equal to 2.0 and 0.1; method B: same as method A, but with scale and shift parameters equal to 1.0 and 0.0; method C: Beagle version 3.3.0; dataset 1: 548 HD-genotyped animals; dataset 2: 1289 HD-genotyped animals.
Figure 3. Average locus imputation error rate (%) for each of 10 classes of SNPs according to their minor allele frequency. Results are presented for the combination of method C (Beagle version 3.3.0) and dataset 2 (1289 HD-genotyped animals).
Figure 4. Average allelic imputation error rate (%) for each of seven traceability classes. Results are presented for four combinations of method (A, B or C) and dataset (1 or 2). Method A: combination of Beagle 2.1.3 and DAGPHASE with scale and shift parameters equal to 2.0 and 0.1; method B: same as method A, but with scale and shift parameters equal to 1.0 and 0.0; method C: Beagle version 3.3.0; dataset 1: 548 HD-genotyped animals; dataset 2: 1289 HD-genotyped animals.
Average, minimum and maximum computation time (h) per chromosome for each of four methods applied to dataset 1
| | Method A | Method B | Method C | Method D |
|---|---|---|---|---|
| Average | 1.7 | 6.9 | 0.7 | 0.3 |
| Minimum | 0.8 | 3.0 | 0.4 | 0.2 |
| Maximum | 2.9 | 11.9 | 1.1 | 0.5 |
Method A is a combination of Beagle 2.1.3 and DAGPHASE with scale and shift parameters equal to 2.0 and 0.1; method B is the same as method A, but with scale and shift parameters equal to 1.0 and 0.0; method C is Beagle version 3.3.0; method D is DAGPHASE using DAG from method C.