Eduardo C G Pimentel, Monika Wensch-Dorendorf, Sven König, Hermann H Swalve.
Abstract
BACKGROUND: The most common application of imputation is to infer genotypes of a high-density panel of markers on animals that are genotyped for a low-density panel. However, the increase in accuracy of genomic predictions resulting from an increase in the number of markers tends to reach a plateau beyond a certain density. Another application of imputation is to increase the size of the training set with un-genotyped animals. This strategy can be particularly successful when a set of closely related individuals is genotyped.
Year: 2013 PMID: 23621897 PMCID: PMC3652763 DOI: 10.1186/1297-9686-45-12
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Figure 1. Assumed family members with available genotypic information (black) used for imputing an un-genotyped dam (red).
Mean linkage disequilibrium (r) within different inter-marker distances in the simulated populations used for the comparison of imputation methods
| LowLD_NoSel | 0.15 | 0.13 | 0.12 | 0.10 | 0.08 | 0.05 |
| LowLD_Sel | 0.28 | 0.26 | 0.25 | 0.23 | 0.21 | 0.18 |
| HighLD_NoSel | 0.35 | 0.29 | 0.25 | 0.21 | 0.17 | 0.11 |
| HighLD_Sel | 0.48 | 0.43 | 0.37 | 0.34 | 0.30 | 0.24 |
Values are means across 10 replicates.
Proportion of correctly imputed genotypes of the Dams for two imputation methods
| Single_Step | 0.70 ± 0.003 | 0.77 ± 0.045 | 0.81 ± 0.005 | 0.85 ± 0.019 |
| Two_Step | 0.60 ± 0.004 | 0.71 ± 0.056 | 0.75 ± 0.005 | 0.80 ± 0.021 |
Values are the means ± standard deviations across 10 replicates.
Figure 2. Average proportion of correctly imputed genotypes within each class of Dams, from both imputation methods for each simulated population structure. Each data point is the mean proportion within Dam class in a given replicate. Classes correspond to number of genotypes unambiguously inferred: blue square (lower than 100), green circle (between 100 and 300) and red triangle (greater than 300).
Correlation between true and imputed genotypes from different imputation methods and programs
| Single_Step | 0.76 ± 0.003 | 0.83 ± 0.038 | 0.88 ± 0.004 | 0.90 ± 0.013 |
| Single_Step* | 0.81 ± 0.003 | 0.86 ± 0.028 | 0.90 ± 0.003 | 0.93 ± 0.009 |
| Two_Step | 0.57 ± 0.008 | 0.74 ± 0.066 | 0.80 ± 0.006 | 0.85 ± 0.021 |
| findhap.f90 | 0.52 ± 0.006 | 0.69 ± 0.065 | 0.74 ± 0.006 | 0.82 ± 0.030 |
| AlphaImpute | 0.83 ± 0.003 | 0.87 ± 0.024 | 0.86 ± 0.004 | 0.89 ± 0.010 |
Values are the means ± standard deviations across 10 replicates.
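The two accuracy measures reported in these tables, the share of correctly imputed genotypes and the correlation between true and imputed genotypes, can be illustrated with a minimal sketch. The record does not say how the authors computed them, so the following assumes genotypes coded as 0/1/2 allele counts and pooled over markers for one animal. The function names `concordance` and `pearson` and the example vectors are hypothetical, chosen only for illustration.

```python
from math import sqrt


def concordance(true_geno, imputed_geno):
    """Proportion of genotypes (assumed coded 0/1/2) imputed exactly right."""
    if len(true_geno) != len(imputed_geno) or not true_geno:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_geno, imputed_geno))
    return matches / len(true_geno)


def pearson(true_geno, imputed_geno):
    """Pearson correlation between true and imputed genotype codes."""
    if len(true_geno) != len(imputed_geno) or not true_geno:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_geno)
    mean_t = sum(true_geno) / n
    mean_i = sum(imputed_geno) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_geno, imputed_geno))
    var_t = sum((t - mean_t) ** 2 for t in true_geno)
    var_i = sum((i - mean_i) ** 2 for i in imputed_geno)
    if var_t == 0 or var_i == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_t * var_i)


# Hypothetical genotypes at eight markers for one dam (not data from the study).
true_dam = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_dam = [0, 1, 2, 2, 0, 1, 1, 1]

print(concordance(true_dam, imputed_dam))  # 0.75 (6 of 8 markers match)
print(pearson(true_dam, imputed_dam))      # 0.75
```

The concordance requires an exact match at each marker, whereas the correlation also accepts fractional imputed values, so the two measures need not rank methods identically.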
Mean linkage disequilibrium (r) within different inter-marker distances in the simulated populations used for the comparison of the accuracy of genomic predictions with and without imputation
| LowLD_NoSel | 0.14 | 0.12 | 0.10 | 0.08 | 0.06 | 0.03 |
| LowLD_Sel | 0.21 | 0.19 | 0.18 | 0.16 | 0.14 | 0.12 |
| HighLD_NoSel | 0.36 | 0.30 | 0.26 | 0.22 | 0.17 | 0.11 |
| HighLD_Sel | 0.43 | 0.37 | 0.32 | 0.29 | 0.24 | 0.18 |
Values are the means across 10 replicates.
Figure 3. Accuracies of genomic prediction for different values of trait heritability (h²), number of female progeny in the last generation and population structure. Red surfaces represent accuracies obtained with TS (90% of the progeny in the training set) and green surfaces represent accuracies obtained with TSA (90% of the progeny + the imputed Dams).