Marzieh Heidaritabar, Mario P L Calus, Addie Vereijken, Martien A M Groenen, John W M Bastiaansen.
Abstract
BACKGROUND: Genotype imputation has become a standard practice in modern genetic research to increase genome coverage and improve the accuracy of genomic selection (GS) and genome-wide association studies (GWAS). We assessed accuracies of imputing 60K genotype data from lower density single nucleotide polymorphism (SNP) panels using a small set of the most common sires in a population of 2140 white layer chickens. Several factors affecting imputation accuracy were investigated, including the size of the reference population, the level of the relationship between the reference and validation populations, and minor allele frequency (MAF) of the SNP being imputed.
Year: 2015 PMID: 26282557 PMCID: PMC4539854 DOI: 10.1186/s12863-015-0253-5
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Animal-specific imputation accuracy (r_corrected) on GGA1 for the 3K to 60K scenario
| Validation population | Ref22 | Ref62 |
|---|---|---|
| G0¹ | 0.50 | 0.63 |
| G1² | 0.46 | 0.58 |
| G2³ | 0.50 | 0.60 |
1 First generation of genomic selection experiment
2 Offspring of G0
3 Offspring of G1
Animal-specific imputation accuracy (r_corrected) and standard errors on GGA1 for different MAF classes in the G0, G1 and G2 validation populations (48K to 60K scenario)
| Validation population | MAF² class | Ref22 | Ref62 |
|---|---|---|---|
| G0¹ | 0.008-0.1 | 0.68 (0.005)ᵃ | 0.80 (0.006) |
| G0 | 0.1-0.2 | 0.82 (0.004) | 0.89 (0.004) |
| G0 | 0.2-0.3 | 0.86 (0.003) | 0.91 (0.003) |
| G0 | 0.3-0.4 | 0.88 (0.003) | 0.93 (0.003) |
| G0 | 0.4-0.5 | 0.86 (0.003) | 0.91 (0.003) |
| G1³ | 0.008-0.1 | 0.60 (0.005) | 0.73 (0.005) |
| G1 | 0.1-0.2 | 0.80 (0.004) | 0.86 (0.003) |
| G1 | 0.2-0.3 | 0.84 (0.002) | 0.89 (0.002) |
| G1 | 0.3-0.4 | 0.86 (0.002) | 0.91 (0.002) |
| G1 | 0.4-0.5 | 0.81 (0.003) | 0.87 (0.002) |
| G2⁴ | 0.008-0.1 | 0.72 (0.007) | 0.78 (0.007) |
| G2 | 0.1-0.2 | 0.85 (0.005) | 0.88 (0.005) |
| G2 | 0.2-0.3 | 0.87 (0.005) | 0.87 (0.006) |
| G2 | 0.3-0.4 | 0.89 (0.004) | 0.92 (0.005) |
| G2 | 0.4-0.5 | 0.85 (0.005) | 0.90 (0.005) |
1 First generation of genomic selection experiment
2 Minor allele frequency
3 Offspring of G0
4 Offspring of G1
a The values in parentheses are standard errors
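The MAF classes used in these tables can be illustrated with a minimal Python sketch. It assumes genotypes are coded as 0/1/2 allele counts and that each class includes its lower bound and excludes its upper bound (with 0.5 included in the last class); the source does not state how boundary values were handled, and the function names are hypothetical.

```python
# MAF class bounds as listed in the tables.
MAF_CLASSES = [(0.008, 0.1), (0.1, 0.2), (0.2, 0.3), (0.3, 0.4), (0.4, 0.5)]


def minor_allele_frequency(genotypes):
    """MAF of one SNP from per-animal allele counts (0, 1 or 2)."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def maf_class(maf):
    """Return the class label for a MAF, or None if it is below 0.008.

    Assumption: intervals are [low, high), except the last one,
    which also includes 0.5.
    """
    for i, (low, high) in enumerate(MAF_CLASSES):
        is_last = i == len(MAF_CLASSES) - 1
        if low <= maf < high or (is_last and maf == high):
            return f"{low}-{high}"
    return None


# Hypothetical genotypes for four animals at one SNP.
example = [0, 1, 2, 2]
print(maf_class(minor_allele_frequency(example)))
```

Per-class accuracies like those tabulated above would then be obtained by grouping SNPs on this label before averaging.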
Fig. 1 Imputation accuracies in G0, G1 and G2 for the 48K to 60K scenario. Imputation accuracies (r_corrected) for different MAF classes and different reference sizes for the G0, G1 and G2 validation populations. The x-axis represents the MAF classes and the y-axis shows the imputation accuracies. The black dots are the mean imputation accuracies across individuals in each MAF class
Animal-specific imputation accuracy (r_corrected) with SNPs masked across the different MAF classes when the G0 validation population was used for imputation
| MAF¹ class | Ref22 | Ref62 |
|---|---|---|
| 0.008–0.1 | 0.80 (193)ᵃ | 0.87 (186) |
| 0.1–0.2 | 0.91 (178) | 0.94 (177) |
| 0.2–0.3 | 0.92 (181) | 0.95 (180) |
| 0.3–0.4 | 0.93 (186) | 0.96 (189) |
| 0.4–0.5 | 0.93 (184) | 0.96 (194) |
1 Minor allele frequency
a The numbers in the parentheses are the number of masked SNPs
Animal-specific imputation accuracy (r_corrected) with 22 randomly selected animals (Ref22rand) in the reference population
| MAF¹ class | Ref22randᵃ |
|---|---|
| 0.008–0.1 | 0.61 (0.006)ᵇ |
| 0.1–0.2 | 0.82 (0.004) |
| 0.2–0.3 | 0.86 (0.003) |
| 0.3–0.4 | 0.88 (0.003) |
| 0.4–0.5 | 0.83 (0.003) |
1 Minor allele frequency
a Values are the average across 10 random subsets of animals
b The values in parentheses are standard errors
Animal-specific imputation accuracy (r_corrected) of G0 for three groups depending on their direct ancestors in the reference population Ref62
| MAF¹ class | GR_S² ( | GR_MGS⁴ ( | GR_SMGS⁵ ( |
|---|---|---|---|
| 0.008–0.1 | 0.80 | 0.79 | 0.80 |
| 0.1–0.2 | 0.89 | 0.90 | 0.89 |
| 0.2–0.3 | 0.90 | 0.92 | 0.91 |
| 0.3–0.4 | 0.93 | 0.93 | 0.92 |
| 0.4–0.5 | 0.91 | 0.91 | 0.89 |
| 3K to 60K scenario | 0.62 | 0.62 | 0.64 |
1 Minor allele frequency
2 Animals who had just their sire (S) in the reference population
3 N is the number of animals
4 Animals who had just their maternal grand sire (MGS) in the reference population
5 Animals who had both their sire and maternal grandsire (SMGS) in the reference population
Average allelic R² measure from Beagle and true imputation reliability on GGA1 for different MAF classes and different reference sizes (48K to 60K scenario)
| MAF¹ class | Ref22 reliabilityᵃ | Ref22 allelic R² | Ref62 reliability | Ref62 allelic R² |
|---|---|---|---|---|
| 0.008–0.1 | 0.59 | 0.64 | 0.68 | 0.75 |
| 0.1–0.2 | 0.73 | 0.77 | 0.79 | 0.85 |
| 0.2–0.3 | 0.78 | 0.80 | 0.83 | 0.88 |
| 0.3–0.4 | 0.81 | 0.82 | 0.85 | 0.90 |
| 0.4–0.5 | 0.79 | 0.81 | 0.83 | 0.87 |
1 Minor allele frequency
a Reliability is the square of the imputation accuracy per SNP across individuals (SNP-specific imputation accuracy), i.e. the imputation accuracy per SNP was squared and then summed across individuals. Note that the values in this table are averages across the three generations (G0, G1 and G2)
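The relationship between SNP-specific accuracy and reliability described in this footnote can be sketched as follows. This is an illustration only: it assumes the SNP-specific accuracy is the Pearson correlation between true and imputed genotypes across individuals, with genotypes coded as 0/1/2 allele counts, and the function names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # no variation: correlation is undefined
    return sxy / sqrt(sxx * syy)


def snp_reliability(true_geno, imputed_geno):
    """Squared accuracy (correlation) at one SNP, taken across individuals."""
    r = pearson(true_geno, imputed_geno)
    return r * r


# Hypothetical true and imputed genotypes of six animals at one SNP.
true_geno = [0, 1, 2, 1, 0, 2]
imputed_geno = [0, 1, 2, 2, 0, 2]
print(round(snp_reliability(true_geno, imputed_geno), 3))
```

Squaring puts the measure on the same scale as Beagle's allelic R², which is what makes the side-by-side comparison in the table meaningful.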
Correlation between allelic R² measure from Beagle and true imputation reliability on GGA1 for different MAF classes and different reference sizes in G0, G1 and G2 (48K to 60K scenario)
| MAF¹ class | Ref22 G0² | Ref22 G1³ | Ref22 G2⁴ | Ref62 G0 | Ref62 G1 | Ref62 G2 |
|---|---|---|---|---|---|---|
| 0.008–0.1 | 0.70 | 0.60 | 0.45 | 0.67 | 0.71 | 0.51 |
| 0.1–0.2 | 0.67 | 0.73 | 0.52 | 0.72 | 0.72 | 0.63 |
| 0.2–0.3 | 0.75 | 0.72 | 0.64 | 0.74 | 0.73 | 0.71 |
| 0.3–0.4 | 0.64 | 0.69 | 0.60 | 0.79 | 0.76 | 0.68 |
| 0.4–0.5 | 0.74 | 0.72 | 0.71 | 0.85 | 0.81 | 0.82 |
1 Minor allele frequency
2 First generation of genomic selection experiment
3 Offspring of G0
4 Offspring of G1
Fig. 2 Correlation between true imputation reliability and allelic R² measure from Beagle. True imputation reliability is plotted against the allelic R² when 96% of SNPs were masked (3K to 60K scenario) in G0, G1 and G2. The red line is the regression line
Fig. 3 Concordance of LD in G0 and G2. LD within each generation was measured as r (correlation) [51] between neighbouring SNPs
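The LD measure named in this caption, r between neighbouring SNPs, could be computed roughly as in the sketch below. It assumes unphased genotypes coded as 0/1/2 allele counts with SNP columns ordered by map position; the source does not say whether genotypes or phased haplotypes were used, and the function names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # monomorphic SNP: r is undefined
    return sxy / sqrt(sxx * syy)


def neighbour_ld(genotypes):
    """r between each pair of adjacent SNPs.

    genotypes: one list per animal, SNPs in map order (assumption).
    Returns a list with one r value per adjacent SNP pair.
    """
    n_snps = len(genotypes[0])
    columns = [[animal[j] for animal in genotypes] for j in range(n_snps)]
    return [pearson(columns[j], columns[j + 1]) for j in range(n_snps - 1)]


# Hypothetical genotypes: four animals, three SNPs.
g0 = [
    [0, 0, 2],
    [1, 1, 1],
    [2, 2, 0],
    [1, 2, 1],
]
print([round(r, 3) for r in neighbour_ld(g0)])
```

Running the same function on each generation gives two vectors of r values for the same SNP pairs, which is the kind of input a concordance plot of G0 against G2 would need.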