Laura J Corbin, Andreas Kranis, Sarah C Blott, June E Swinburne, Mark Vaudin, Stephen C Bishop, John A Woolliams.
Abstract
BACKGROUND: Despite the dramatic reduction in the cost of high-density genotyping over the last decade, it remains one of the limiting factors for obtaining the large datasets required for genomic studies of disease in the horse. In this study, we investigated the potential for low-density genotyping and subsequent imputation to address this problem.
Year: 2014 PMID: 24495673 PMCID: PMC3930001 DOI: 10.1186/1297-9686-46-9
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Figure 1. Data flow for analysis.
The mean proportion of correctly imputed genotypes, as calculated in the within-population analysis of the UK dataset
| Panel¹ | bpEQ | bpMAF | lduMAF |
| --- | --- | --- | --- |
| Per individual | | | |
| 384 | 0.66 (0.52,0.93) | 0.67 (0.55,0.94) | 0.69 (0.55,0.92) |
| 768 | 0.76 (0.59,0.94) | 0.77 (0.62,0.95) | 0.78 (0.59,0.96) |
| 1K | 0.79 (0.61,0.94) | 0.84 (0.66,0.97) | 0.83 (0.64,0.98) |
| 2K | 0.90 (0.70,0.99) | 0.91 (0.71,0.99) | 0.89 (0.68,0.99) |
| 3K | 0.94 (0.70,0.99) | 0.95 (0.73,1.00) | 0.92 (0.67,0.99) |
| 6K | 0.97 (0.79,1.00) | 0.98 (0.78,1.00) | 0.95 (0.75,1.00) |
| Per SNP | | | |
| 384 | 0.66 (0.30,1.00) | 0.67 (0.30,1.00) | 0.69 (0.36,1.00) |
| 768 | 0.76 (0.37,1.00) | 0.77 (0.44,1.00) | 0.78 (0.50,1.00) |
| 1K | 0.79 (0.44,1.00) | 0.84 (0.50,1.00) | 0.83 (0.53,1.00) |
| 2K | 0.90 (0.56,1.00) | 0.91 (0.63,1.00) | 0.89 (0.53,1.00) |
| 3K | 0.94 (0.68,1.00) | 0.95 (0.72,1.00) | 0.92 (0.66,1.00) |
| 6K | 0.97 (0.79,1.00) | 0.98 (0.83,1.00) | 0.95 (0.72,1.00) |
Mean proportion of correctly imputed genotypes per individual or per SNP for ECA1, with minimum and maximum values in brackets (tables for all chromosomes are in Additional file 4: Table S2). ¹Total number of SNPs that would be on a genome-wide LDP of equivalent density.
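The summary statistic in this table can be illustrated with a minimal sketch. It assumes genotypes are coded as allele counts (0, 1, 2) in an individuals × SNPs matrix with no missing calls; the function name `concordance` and the toy data are hypothetical and are not taken from the study.

```python
def concordance(true_g, imputed_g):
    """Proportion of correctly imputed genotypes, per individual and per SNP.

    Both arguments are lists of rows (one row per individual), each row
    holding one genotype per SNP, coded 0/1/2 (an assumed coding).
    """
    n_ind = len(true_g)
    n_snp = len(true_g[0])
    # Row-wise: share of SNPs imputed correctly for each individual.
    per_individual = [
        sum(t == p for t, p in zip(true_g[i], imputed_g[i])) / n_snp
        for i in range(n_ind)
    ]
    # Column-wise: share of individuals imputed correctly at each SNP.
    per_snp = [
        sum(true_g[i][j] == imputed_g[i][j] for i in range(n_ind)) / n_ind
        for j in range(n_snp)
    ]
    return per_individual, per_snp


# Toy data: 3 individuals x 4 SNPs (illustrative values only).
true_g = [[0, 1, 2, 1], [2, 1, 0, 0], [1, 1, 2, 0]]
imputed_g = [[0, 1, 2, 2], [2, 0, 0, 0], [1, 1, 2, 0]]

per_individual, per_snp = concordance(true_g, imputed_g)
print(per_individual)  # one value per individual
print(per_snp)         # one value per SNP
```

With a complete genotype matrix, the mean over individuals and the mean over SNPs are the same overall proportion, so only the minimum and maximum differ between the two views.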
Figure 2. The mean proportion of correctly imputed genotypes and its variance across SNPs for ECA1, as calculated in the within-population analysis of the UK dataset and plotted against the total number of SNPs on a genome-wide LDP of equivalent density (figures for all chromosomes are in Additional file 5: Figure S2).
Properties of low density panel SNPs, as calculated in the within-population analysis of the UK dataset
| 384 | bpEQ | 0.22 (0.13) | 6.40 (0.09) |
| | bpMAF | 0.25 (0.10) | 6.40 (0.66) |
| | lduMAF | 0.44 (0.06) | 6.40 (2.41) |
| 768 | bpEQ | 0.25 (0.15) | 3.14 (0.09) |
| | bpMAF | 0.31 (0.13) | 3.14 (0.57) |
| | lduMAF | 0.45 (0.04) | 3.14 (1.65) |
| 1K | bpEQ | 0.22 (0.14) | 2.41 (0.07) |
| | bpMAF | 0.39 (0.08) | 2.38 (0.61) |
| | lduMAF | 0.45 (0.04) | 2.38 (1.42) |
| 2K | bpEQ | 0.23 (0.14) | 1.19 (0.06) |
| | bpMAF | 0.28 (0.11) | 1.19 (0.30) |
| | lduMAF | 0.46 (0.04) | 1.19 (1.19) |
| 3K | bpEQ | 0.23 (0.14) | 0.79 (0.06) |
| | bpMAF | 0.30 (0.12) | 0.79 (0.23) |
| | lduMAF | 0.46 (0.03) | 0.79 (0.92) |
| 6K | bpEQ | 0.23 (0.14) | 0.39 (0.07) |
| | bpMAF | 0.29 (0.11) | 0.39 (0.17) |
| | lduMAF | 0.43 (0.05) | 0.39 (0.63) |
Properties of low density panel SNPs for ECA1 selected using three methods (tables for all chromosomes are in Additional file 6: Table S3). ¹Total number of SNPs that would be on a genome-wide LDP of equivalent density.
Figure 3. The proportion of correctly imputed genotypes plotted against the MAF of the SNPs being imputed (calculated in the reference population) for ECA1 (bpEQ), as calculated in the within-population analysis of the UK dataset. a) 384 panel; b) 1K panel; c) 6K panel.
Figure 4. The proportion of correctly imputed genotypes by SNP and the mean linkage disequilibrium plotted against SNP position for the 1K panel. The figure presents Lowess curves, as calculated in R [45-48]; green = bpEQ; blue = bpMAF; red = lduMAF; black = mean linkage disequilibrium () in sliding windows of 1 Mb (with 0.5 Mb overlap); yellow = hypothesised position of the centromere. a) ECA1; b) ECA10; c) ECA20; d) ECA26.
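The sliding-window averaging mentioned in this caption can be sketched as follows. This is an illustration only: it assumes each linkage disequilibrium value has already been assigned a single base-pair position (for example the midpoint of a SNP pair), which the caption does not specify, and the function name `sliding_window_means` is hypothetical.

```python
def sliding_window_means(positions, values, window=1_000_000, step=500_000):
    """Mean of `values` in windows of `window` bp advanced by `step` bp.

    With window = 1 Mb and step = 0.5 Mb, consecutive windows overlap
    by 0.5 Mb. Returns (start, end, mean) tuples; mean is None for
    windows that contain no data points.
    """
    if not positions:
        return []
    results = []
    last = max(positions)
    start = 0
    while start <= last:
        end = start + window
        # Half-open interval [start, end) so a point on a boundary is
        # not counted twice within a single window.
        in_window = [v for p, v in zip(positions, values) if start <= p < end]
        mean = sum(in_window) / len(in_window) if in_window else None
        results.append((start, end, mean))
        start += step
    return results


# Toy data: three LD values at assumed positions (bp).
windows = sliding_window_means([100_000, 600_000, 1_200_000], [0.2, 0.4, 0.6])
for start, end, mean in windows:
    print(start, end, mean)
```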
The mean correlation between true and predicted genotypes, as calculated in the within-population analysis of the UK dataset
| 384 | 0.46 (0.14,0.89) | 0.49 (0.20,0.91) | 0.53 (0.22,0.89) |
| 768 | 0.64 (0.36,0.93) | 0.66 (0.38,0.93) | 0.69 (0.37,0.94) |
| 1K | 0.70 (0.41,0.93) | 0.78 (0.51,0.96) | 0.75 (0.47,0.98) |
| 2K | 0.86 (0.53,0.99) | 0.88 (0.62,0.98) | 0.85 (0.52,0.99) |
| 3K | 0.92 (0.59,0.99) | 0.94 (0.60,1.00) | 0.88 (0.48,0.99) |
| 6K | 0.97 (0.73,1.00) | 0.97 (0.71,1.00) | 0.93 (0.61,1.00) |
| 384 | 0.30 (-0.17,1.00) | 0.32 (-0.14,1.00) | 0.36 (-0.08,1.00) |
| 768 | 0.52 (-0.08,1.00) | 0.53 (-0.06,1.00) | 0.55 (-0.05,1.00) |
| 1K | 0.60 (-0.04,1.00) | 0.67 (-0.05,1.00) | 0.64 (-0.05,1.00) |
| 2K | 0.81 (-0.04,1.00) | 0.83 (-0.02,1.00) | 0.79 (-0.02,1.00) |
| 3K | 0.89 (-0.01,1.00) | 0.90 (-0.03,1.00) | 0.83 (-0.02,1.00) |
| 6K | 0.95 (0.25,1.00) | 0.96 (0.49,1.00) | 0.90 (-0.01,1.00) |
Mean correlation between true and predicted genotypes per individual or per SNP for ECA1, with minimum and maximum values in brackets (tables for all chromosomes are in Additional file 7: Table S4). ¹Total number of SNPs that would be on a genome-wide LDP of equivalent density.
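A per-SNP correlation of this kind can be sketched as below, assuming a Pearson correlation on genotypes coded as allele counts (0, 1, 2); whether the study correlated discrete calls or dosages is not stated here, so this coding is an assumption. The names `pearson` and `per_snp_correlation` and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns None when either sequence has zero variance, in which
    case the correlation is undefined.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def per_snp_correlation(true_g, imputed_g):
    """Correlation between true and imputed genotypes for each SNP.

    Inputs are individuals x SNPs matrices (lists of rows); zip(*m)
    transposes them so that each SNP column is correlated in turn.
    """
    return [pearson(t, p) for t, p in zip(zip(*true_g), zip(*imputed_g))]


# Toy data: 4 individuals x 2 SNPs; the second SNP has a rare allele
# that the imputation misses entirely.
true_g = [[0, 0], [1, 0], [2, 0], [1, 1]]
imputed_g = [[0, 0], [1, 0], [2, 0], [2, 0]]
print(per_snp_correlation(true_g, imputed_g))
```

In the toy data the second SNP is imputed correctly in three of four individuals, yet its correlation is undefined because every imputed genotype is identical. This illustrates why the proportion correct and the correlation can rank low-MAF SNPs differently.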
Figure 5. The correlation between true and imputed genotypes by SNP. a) Plotted against the proportion of correctly imputed genotypes; b) Plotted against the proportion of correctly imputed genotypes, scaled by the proportion expected from random imputation. Black = SNPs with MAF ≥ 0.40; blue = SNPs with 0.30 ≤ MAF < 0.40; green = SNPs with 0.20 ≤ MAF < 0.30; yellow = SNPs with 0.10 ≤ MAF < 0.20; red = SNPs with MAF < 0.10; data for ECA1 and 1K panel.
The mean proportion of correctly imputed genotypes for ECA1 and ECA26, as calculated in the between-population analysis of the US dataset with the 2K panel
| Chromosome | Imputation | Selection method | Ref. B, test Cᵃ | Ref. B, test D + Eᵇ | Ref. C, test Dᶜ |
| --- | --- | --- | --- | --- | --- |
| ECA1 | Randomᵈ | bpEQ | 0.55 | 0.55 | 0.56 |
| | | bpMAF | 0.55 | 0.55 | 0.56 |
| | | lduMAF | 0.56 | 0.56 | 0.57 |
| | Beagleᵉ | bpEQ | 0.90 | 0.90 | 0.92 |
| | | bpMAF | 0.91 | 0.91 | 0.93 |
| | | lduMAF | 0.89 | 0.89 | 0.92 |
| ECA26 | Randomᶠ | bpEQ | 0.51 | 0.50 | 0.51 |
| | | bpMAF | 0.52 | 0.51 | 0.51 |
| | | lduMAF | 0.52 | 0.51 | 0.51 |
| | Beagleᵍ | bpEQ | 0.82 | 0.81 | 0.86 |
| | | bpMAF | 0.85 | 0.84 | 0.89 |
| | | lduMAF | 0.88 | 0.85 | 0.90 |
ᵃReference population B and test population C; ᵇreference population B and test population D + E; ᶜreference population C and test population D; ᵈSE across SNPs and samples was equal to 0.003 and 9×10⁻⁴ to 2×10⁻³, respectively; ᵉSE across SNPs and samples was equal to 0.001 and 2×10⁻³ to 5×10⁻³, respectively; ᶠSE across SNPs and samples was equal to 0.006 and 1×10⁻³ to 3×10⁻³, respectively; ᵍSE across SNPs and samples was equal to 2×10⁻³ to 4×10⁻³ and 4×10⁻³ to 1×10⁻², respectively.