Zhaoxia Yu, Chad Garner, Argyrios Ziogas, Hoda Anton-Culver, Daniel J Schaid.
Abstract
BACKGROUND: Genome-wide association studies with single nucleotide polymorphisms (SNPs) show great promise for identifying genetic determinants of complex human traits. In current analyses, genotype calling and imputation of missing genotypes are usually treated as two separate tasks. The genotypes of SNPs are first determined one at a time from allele signal intensities. The missing genotypes, i.e., no-calls caused by imperfectly separated signal clouds, are then imputed based on the linkage disequilibrium (LD) between multiple SNPs. Although many statistical methods have been developed to improve either genotype calling or imputation of missing genotypes, treating the two steps independently can lead to loss of genetic information.
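The second step of the conventional pipeline, imputing no-calls from LD with a neighboring SNP, can be illustrated with a toy sketch. This is not the authors' method or any published imputation algorithm: it is a deliberately simple stand-in that assumes genotypes are coded 0, 1, 2, that no-calls are represented by `None`, and that a missing genotype is filled with the most frequent genotype seen among samples sharing the same neighbor genotype. The function name `impute_from_neighbor` is hypothetical.

```python
from collections import Counter, defaultdict


def impute_from_neighbor(snp1, snp2):
    """Toy LD-based imputation (illustrative assumption, not the paper's method).

    Fills no-calls (None) in snp2 with the most common snp2 genotype observed
    among samples that carry the same snp1 genotype.
    """
    # Tabulate joint genotype counts from samples called at both SNPs.
    by_neighbor = defaultdict(Counter)
    for g1, g2 in zip(snp1, snp2):
        if g1 is not None and g2 is not None:
            by_neighbor[g1][g2] += 1

    imputed = []
    for g1, g2 in zip(snp1, snp2):
        # Impute only when snp2 is missing and the neighbor is informative.
        if g2 is None and g1 is not None and by_neighbor[g1]:
            g2 = by_neighbor[g1].most_common(1)[0][0]
        imputed.append(g2)
    return imputed


snp1 = [0, 0, 1, 1, 2, 2, 0, 1]
snp2 = [0, 0, 1, 1, 2, 2, None, None]
print(impute_from_neighbor(snp1, snp2))  # [0, 0, 1, 1, 2, 2, 0, 1]
```

Because the sketch works only from hard genotype calls, it discards the underlying signal intensities, which is the loss of information the abstract refers to when the two steps are treated independently.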
Year: 2009 PMID: 19228433 PMCID: PMC2753842 DOI: 10.1186/1471-2105-10-63
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The clustering results based on a one-marker-at-a-time method. Values on the X-axis and Y-axis are normalized signal intensities of two alternative alleles (A and C). Estimated genotypes "AA", "AC", and "CC" are indicated by symbols "0", "1", and "2", respectively. Question marks represent missing values (i.e. no calls).
Call rate and genotyping error for two SNPs of high quality
| | call rate (%) | | | error rate (%) | | |
| method | r2 = 0.802 | r2 = 0.500 | r2 = 0.304 | r2 = 0.802 | r2 = 0.500 | r2 = 0.304 |
| M1 | 99.75 | 99.75 | 99.75 | 0.05 | 0.05 | 0.05 |
| M2 | 99.87 | 99.75 | 99.75 | 0.06 | 0.05 | 0.05 |
| M3 | 99.87 | 99.82 | 99.80 | 0.04 | 0.05 | 0.05 |
M1: the one-SNP-at-a-time method; M2: imputation based on estimated genotypes from M1 and LD structure; M3: our method that jointly analyzes the signal of two SNPs.
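The two quantities reported in these tables can be computed as in the following minimal sketch. It assumes genotypes coded 0, 1, 2, no-calls represented by `None`, and an error rate taken over called genotypes only; the record above does not state the denominator, so that choice is an assumption. The function name `call_and_error_rate` is hypothetical.

```python
def call_and_error_rate(true_genotypes, called_genotypes):
    """Return (call rate %, error rate %) for one SNP.

    Assumption: the error rate is the share of *called* genotypes that differ
    from the truth; no-calls (None) count against the call rate only.
    """
    n = len(true_genotypes)
    # Keep only samples that received a genotype call.
    called = [(t, c) for t, c in zip(true_genotypes, called_genotypes)
              if c is not None]
    call_rate = 100.0 * len(called) / n
    errors = sum(1 for t, c in called if t != c)
    error_rate = 100.0 * errors / len(called) if called else 0.0
    return call_rate, error_rate


truth = [0, 1, 2, 1, 0, 2, 1, 0, 1, 2]
calls = [0, 1, 2, 1, None, 2, 1, 0, 2, 2]  # one no-call, one miscall
call_rate, error_rate = call_and_error_rate(truth, calls)
print(f"call rate {call_rate:.2f}%, error rate {error_rate:.2f}%")
```

Under this definition a method can raise the call rate while also raising the error rate, as M2 does at r2 = 0.802, because newly imputed genotypes enter both the numerator of the call rate and the pool in which errors are counted.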
Comparison of M2 and our method M3 on the estimation of r2
| SNP quality | method | r2 = 0.802 | r2 = 0.500 | r2 = 0.304 |
| high quality | M2 | .798 (4e-5) | .497 (3e-5) | .302 (2e-5) |
| | M3 | .799 (3e-5) | .498 (2e-5) | .303 (1e-5) |
| low quality | M2 | .757 (.002) | .469 (.001) | .283 (6e-4) |
| | M3 | .770 (.001) | .478 (6e-4) | .291 (3e-4) |
| mixed quality | M2 | .775 (8e-4) | .482 (4e-4) | .293 (2e-4) |
| | M3 | .784 (4e-4) | .488 (2e-4) | .297 (1e-4) |
The first row shows true values of the square of Pearson's correlation. Numbers in parentheses are mean square errors.
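As a rough illustration of the two summary statistics in this table, the sketch below computes the squared Pearson correlation between two genotype vectors and the mean square error of a set of estimates around a known true value. It assumes genotypes coded 0, 1, 2 with no missing entries and estimates collected over simulation replicates; the record does not say exactly how the authors computed r2 from the data, so both function names and the input layout are hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def mean_square_error(estimates, true_value):
    """Average squared deviation of the estimates from the true value."""
    return mean((e - true_value) ** 2 for e in estimates)


# Genotypes at two SNPs for four samples (toy data, not from the paper).
snp1 = [0, 0, 1, 2]
snp2 = [0, 1, 1, 2]
print(r_squared(snp1, snp2))

# Estimates of r2 from two hypothetical replicates with true r2 = 0.80.
print(mean_square_error([0.79, 0.81], 0.80))
```

The mean square error combines bias and variance, so an estimator that is systematically below the true value, as both methods are for low-quality SNPs in the table, is penalized even when its replicate-to-replicate spread is small.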
Call rate and genotyping error for two SNPs of low quality
| | call rate (%) | | | error rate (%) | | |
| method | r2 = 0.802 | r2 = 0.500 | r2 = 0.304 | r2 = 0.802 | r2 = 0.500 | r2 = 0.304 |
| M1 | 97.96 | 97.97 | 97.95 | 0.53 | 0.54 | 0.54 |
| M2 | 98.90 | 97.97 | 97.95 | 0.64 | 0.54 | 0.54 |
| M3 | 98.75 | 98.35 | 98.16 | 0.32 | 0.44 | 0.50 |
Call rate and genotyping error for two SNPs of mixed quality
| | | call rate (%) | | | error rate (%) | | |
| method | SNP | r2 = 0.802 | r2 = 0.500 | r2 = 0.304 | r2 = 0.802 | r2 = 0.500 | r2 = 0.304 |
| M1 | SNP1 | 99.75 | 99.75 | 99.74 | 0.05 | 0.05 | 0.05 |
| | SNP2 | 97.96 | 97.98 | 97.97 | 0.54 | 0.54 | 0.53 |
| M2 | SNP1 | 99.87 | 99.75 | 99.74 | 0.06 | 0.05 | 0.05 |
| | SNP2 | 98.89 | 98.02 | 97.97 | 0.63 | 0.56 | 0.53 |
| M3 | SNP1 | 99.87 | 99.82 | 99.79 | 0.04 | 0.05 | 0.05 |
| | SNP2 | 98.81 | 98.37 | 98.19 | 0.30 | 0.43 | 0.49 |
SNP1 was of high quality and SNP2 was of low quality.