Z Weng, Z Zhang, Q Zhang, W Fu, S He, X Ding.
Abstract
Imputation of high-density genotypes from low- or medium-density platforms is a promising way to enhance the efficiency of whole-genome selection programs at low cost. In this study, we compared the efficiency of three widely used imputation algorithms (fastPHASE, BEAGLE and findhap) using Chinese Holstein cattle with Illumina BovineSNP50 genotypes. A total of 2108 cattle were randomly divided into a reference population and a test population to evaluate the influence of the reference population size. Three bovine chromosomes, BTA1, 16 and 28, were used to represent large, medium and small chromosome sizes, respectively. We simulated different scenarios by randomly masking 20%, 40%, 80% and 95% of the single-nucleotide polymorphisms (SNPs) on each chromosome in the test population to mimic SNP panels of different densities. SNP information from the Illumina Bovine3K and Illumina BovineLD (6909 SNPs) chips was also used. We found that the three methods showed comparable accuracy when the proportion of masked SNPs was low, but the differences grew as more SNPs were masked. BEAGLE performed best and was the most robust, with imputation accuracies >90% in almost all situations. fastPHASE was sensitive to the proportion of masked SNPs, especially when the masked SNP rate was high. findhap ran the fastest, whereas its accuracies were lower than those of BEAGLE but higher than those of fastPHASE. In addition, enlarging the reference population improved the imputation accuracy of BEAGLE and findhap, but did not affect that of fastPHASE. Considering imputation accuracy and computational requirements, BEAGLE was found to be more reliable for imputing genotypes from low- to high-density genotyping platforms.
Year: 2012 PMID: 23228675 PMCID: PMC3608330 DOI: 10.1017/S1751731112002224
Source DB: PubMed Journal: Animal ISSN: 1751-7311 Impact factor: 3.240
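The masking experiment described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: genotypes are coded 0/1/2, whole SNPs are masked in the test population, accuracy is taken as the share of masked genotypes recovered exactly (the abstract does not define the measure), and `impute_by_reference_mode` is a hypothetical placeholder that stands in for BEAGLE, fastPHASE or findhap rather than reproducing any of them.

```python
import random
from collections import Counter


def mask_snps(n_snps, mask_rate, rng):
    """Pick the SNP indices to hide in the test population."""
    n_masked = round(mask_rate * n_snps)
    return set(rng.sample(range(n_snps), n_masked))


def apply_mask(genotypes, masked):
    """Return a copy of the genotypes with masked SNPs set to None."""
    return [[None if j in masked else g for j, g in enumerate(row)]
            for row in genotypes]


def impute_by_reference_mode(test_masked, reference):
    """Hypothetical placeholder imputer: fill each missing SNP with the
    most common genotype observed at that SNP in the reference population."""
    modes = [Counter(column).most_common(1)[0][0] for column in zip(*reference)]
    return [[modes[j] if g is None else g for j, g in enumerate(row)]
            for row in test_masked]


def imputation_accuracy(truth, imputed, masked):
    """Share of masked genotypes imputed to the true value (assumed measure)."""
    pairs = [(t[j], i[j]) for t, i in zip(truth, imputed) for j in masked]
    return sum(a == b for a, b in pairs) / len(pairs)


# Toy data: 40 animals x 100 SNPs, split evenly into reference and test.
rng = random.Random(0)
genotypes = [[rng.choice([0, 1, 2]) for _ in range(100)] for _ in range(40)]
reference, test = genotypes[:20], genotypes[20:]

masked = mask_snps(n_snps=100, mask_rate=0.80, rng=rng)
imputed = impute_by_reference_mode(apply_mask(test, masked), reference)
accuracy = imputation_accuracy(test, imputed, masked)
print(f"masked {len(masked)} SNPs, accuracy {accuracy:.3f}")
```

Repeating the masking step with different seeds would give replicate estimates whose mean and spread could be reported in the same style as the accuracy tables.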
Genomic information of three bovine chromosomes in Chinese Holstein
| Chromosome | No. SNP | Length (Mb) | Call rate | Average interval (kb) | LD |
|---|---|---|---|---|---|
| 1 | 2877 | 161.06 | 0.98 | 55.98 | 0.27 |
| 16 | 1367 | 77.82 | 0.98 | 56.92 | 0.28 |
| 28 | 823 | 46.00 | 0.98 | 55.89 | 0.21 |
SNP = single-nucleotide polymorphism; LD = linkage disequilibrium.
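As a quick consistency check on the table above, the reported average intervals appear to match chromosome length divided by the number of SNPs. That formula is an assumption inferred from the numbers, not something the source states, and the function name is hypothetical.

```python
# (chromosome, number of SNPs, length in Mb, reported average interval in kb),
# copied from the table above.
chromosomes = [
    ("BTA1", 2877, 161.06, 55.98),
    ("BTA16", 1367, 77.82, 56.92),
    ("BTA28", 823, 46.00, 55.89),
]


def average_interval_kb(n_snps, length_mb):
    """Assumed definition: chromosome length (converted to kb) per SNP."""
    return length_mb * 1000 / n_snps


for name, n_snps, length_mb, reported_kb in chromosomes:
    computed_kb = average_interval_kb(n_snps, length_mb)
    print(f"{name}: computed {computed_kb:.2f} kb, reported {reported_kb:.2f} kb")
```

Small differences in the last digit are expected because the lengths in the table are themselves rounded to two decimals.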
Average imputation accuracy of BEAGLE, fastPHASE and findhap at different masked-SNP rates, based on 10 replicates
| Chromosome | Proportion of masked SNP genotypes in the test population | BEAGLE 3.3 (20 iterations) | fastPHASE 1.4 (20 haplotype clusters) | fastPHASE 1.4 (30 haplotype clusters) | findhap 2.0 (20 iterations) |
|---|---|---|---|---|---|
| BTA1 | 0.20 | 0.986 ± 0.001 | 0.970 ± 0.001 | 0.981 ± 0.001 | 0.951 ± 0.001 |
| BTA1 | 0.40 | 0.984 ± 0.001 | 0.967 ± 0.002 | 0.979 ± 0.001 | 0.955 ± 0.001 |
| BTA1 | 0.80 | 0.972 ± 0.001 | 0.913 ± 0.003 | 0.931 ± 0.003 | 0.952 ± 0.002 |
| BTA1 | 0.95 | 0.918 ± 0.004 | 0.724 ± 0.005 | 0.721 ± 0.004 | 0.869 ± 0.004 |
| BTA16 | 0.20 | 0.984 ± 0.001 | 0.967 ± 0.002 | 0.979 ± 0.002 | 0.942 ± 0.003 |
| BTA16 | 0.40 | 0.983 ± 0.001 | 0.963 ± 0.001 | 0.975 ± 0.002 | 0.949 ± 0.002 |
| BTA16 | 0.80 | 0.968 ± 0.001 | 0.898 ± 0.005 | 0.919 ± 0.005 | 0.940 ± 0.006 |
| BTA16 | 0.95 | 0.901 ± 0.007 | 0.709 ± 0.008 | 0.715 ± 0.007 | 0.842 ± 0.009 |
| BTA28 | 0.20 | 0.981 ± 0.001 | 0.957 ± 0.002 | 0.971 ± 0.002 | 0.943 ± 0.003 |
| BTA28 | 0.40 | 0.978 ± 0.001 | 0.952 ± 0.002 | 0.966 ± 0.003 | 0.950 ± 0.003 |
| BTA28 | 0.80 | 0.960 ± 0.001 | 0.889 ± 0.005 | 0.915 ± 0.005 | 0.933 ± 0.008 |
| BTA28 | 0.95 | 0.913 ± 0.009 | 0.699 ± 0.006 | 0.700 ± 0.010 | 0.889 ± 0.029 |
SNP = single-nucleotide polymorphism.
Computing time of BEAGLE, fastPHASE and findhap at different masked-SNP rates, based on 10 replicates (BEAGLE and fastPHASE in hours, findhap in seconds)
| Chromosome | Proportion of masked SNP genotypes in the test population | BEAGLE 3.3 (20 iterations) (h) | fastPHASE 1.4 (20 haplotype clusters) (h) | fastPHASE 1.4 (30 haplotype clusters) (h) | findhap 2.0 (20 iterations) (s) |
|---|---|---|---|---|---|
| BTA1 | 0.20 | 0.893 ± 0.049 | 55.892 ± 0.364 | 115.196 ± 0.808 | 23.333 ± 2.739 |
| BTA1 | 0.40 | 1.244 ± 0.055 | 39.123 ± 0.620 | 81.920 ± 0.745 | 23.111 ± 2.088 |
| BTA1 | 0.80 | 4.681 ± 0.242 | 37.140 ± 0.204 | 78.893 ± 0.550 | 23.222 ± 1.787 |
| BTA1 | 0.95 | 18.531 ± 0.635 | 37.583 ± 0.306 | 79.090 ± 0.538 | 73.889 ± 3.586 |
| BTA16 | 0.20 | 0.434 ± 0.012 | 26.327 ± 0.721 | 54.788 ± 0.472 | 34.000 ± 0.707 |
| BTA16 | 0.40 | 0.662 ± 0.012 | 19.832 ± 0.267 | 41.899 ± 0.752 | 34.444 ± 0.527 |
| BTA16 | 0.80 | 2.990 ± 0.067 | 18.001 ± 0.818 | 37.449 ± 0.356 | 34.333 ± 0.866 |
| BTA16 | 0.95 | 10.024 ± 0.489 | 17.908 ± 0.176 | 37.738 ± 0.281 | 45.778 ± 1.716 |
| BTA28 | 0.20 | 0.207 ± 0.013 | 15.686 ± 0.148 | 33.420 ± 0.173 | 25.889 ± 2.369 |
| BTA28 | 0.40 | 0.333 ± 0.013 | 12.702 ± 0.300 | 26.570 ± 0.312 | 25.444 ± 1.130 |
| BTA28 | 0.80 | 1.632 ± 0.125 | 10.474 ± 0.035 | 22.640 ± 0.109 | 25.444 ± 0.726 |
| BTA28 | 0.95 | 5.830 ± 0.303 | 10.939 ± 0.224 | 22.550 ± 0.097 | 33.778 ± 1.093 |
SNP = single-nucleotide polymorphism.
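Because the computing-time table reports BEAGLE and fastPHASE in hours but findhap in seconds, the columns are easy to misread. The sketch below converts the mean run times at a masking rate of 0.95 to a common unit and expresses them relative to findhap. The helper name is hypothetical, and only the means from the table are used, not the reported spreads.

```python
SECONDS_PER_HOUR = 3600

# Mean run times at a masking rate of 0.95, copied from the table above.
# BEAGLE and fastPHASE (20 haplotype clusters) are in hours; findhap is in seconds.
runs = {
    "BTA1": {"BEAGLE": 18.531, "fastPHASE": 37.583, "findhap_s": 73.889},
    "BTA16": {"BEAGLE": 10.024, "fastPHASE": 17.908, "findhap_s": 45.778},
    "BTA28": {"BEAGLE": 5.830, "fastPHASE": 10.939, "findhap_s": 33.778},
}


def slowdown_vs_findhap(hours, findhap_seconds):
    """How many times longer a run measured in hours took than findhap."""
    return hours * SECONDS_PER_HOUR / findhap_seconds


ratios = {
    chrom: {
        method: slowdown_vs_findhap(times[method], times["findhap_s"])
        for method in ("BEAGLE", "fastPHASE")
    }
    for chrom, times in runs.items()
}

for chrom, by_method in ratios.items():
    for method, ratio in by_method.items():
        print(f"{chrom}: {method} took about {ratio:.0f} times as long as findhap")
```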
Imputation accuracy of BEAGLE, fastPHASE and findhap for the Illumina 3k (Bovine3K) and 7k (BovineLD) chips
| SNP chip | Chromosome | BEAGLE 3.3 (20 iterations) | fastPHASE 1.4 (20 haplotype clusters) | findhap 2.0 (20 iterations) |
|---|---|---|---|---|
| 3k | BTA1 | 0.943 | 0.775 | 0.910 |
| 3k | BTA16 | 0.930 | 0.745 | 0.883 |
| 3k | BTA28 | 0.914 | 0.731 | 0.876 |
| 7k | BTA1 | 0.968 | 0.906 | 0.946 |
| 7k | BTA16 | 0.967 | 0.891 | 0.938 |
| 7k | BTA28 | 0.961 | 0.879 | 0.932 |
SNP = single-nucleotide polymorphism.
Average imputation accuracy of BEAGLE, fastPHASE and findhap with different reference population sizes, in the scenario of randomly masking 95% of genotypes on chromosome 16
| Composition of the reference population | Composition of the test population | BEAGLE 3.3 (20 iterations) | fastPHASE 1.4 (20 haplotype clusters) | findhap 2.0 (20 iterations) |
|---|---|---|---|---|
| 87 bulls | 2021 cows | 0.687 ± 0.006 | 0.702 ± 0.005 | 0.775 ± 0.004 |
| 87 bulls and 101 cows | 1920 cows | 0.831 ± 0.005 | 0.704 ± 0.010 | 0.812 ± 0.008 |
| 87 bulls and 404 cows | 1617 cows | 0.878 ± 0.007 | 0.720 ± 0.033 | 0.829 ± 0.007 |
| 87 bulls and 1011 cows | 1010 cows | 0.901 ± 0.007 | 0.709 ± 0.008 | 0.842 ± 0.009 |
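
The effect of reference population size can be summarised directly from the table above. The sketch below copies the mean accuracies, checks that each split accounts for all 2108 animals, and computes the gain between the smallest and largest reference populations. Variable names are hypothetical and the reported spreads are ignored.

```python
# (bulls in reference, cows in reference, cows in test), copied from the table above.
splits = [(87, 0, 2021), (87, 101, 1920), (87, 404, 1617), (87, 1011, 1010)]

# Mean accuracies in the same row order as `splits`.
accuracy = {
    "BEAGLE": [0.687, 0.831, 0.878, 0.901],
    "fastPHASE": [0.702, 0.704, 0.720, 0.709],
    "findhap": [0.775, 0.812, 0.829, 0.842],
}

reference_sizes = [bulls + cows for bulls, cows, _ in splits]
totals = [bulls + ref_cows + test_cows for bulls, ref_cows, test_cows in splits]

# Gain in mean accuracy from the smallest to the largest reference population.
gains = {method: values[-1] - values[0] for method, values in accuracy.items()}

print("reference sizes:", reference_sizes)
for method, gain in gains.items():
    print(f"{method}: {gain:+.3f}")
```

On these means, BEAGLE gains the most from a larger reference population and fastPHASE is essentially flat, which matches the conclusion stated in the abstract.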