Ally Rogers, Andrew Beck, Nathan L Tintle.
Abstract
Genotype errors are well known to increase type I errors and/or decrease power in related tests of genotype-phenotype association, depending on whether the genotype error mechanism is associated with the phenotype. These relationships hold for both single and multimarker tests of genotype-phenotype association. To assess the potential for genotype errors in Genetic Analysis Workshop 18 (GAW18) data, where no gold standard genotype calls are available, we explored concordance rates between sequencing, imputation, and microarray genotype calls. Our analysis shows that missing data rates for sequenced individuals are high and that there is a modest amount of called genotype discordance between the 2 platforms, with discordance most common for single-nucleotide polymorphisms (SNPs) with lower minor allele frequency (MAF). We observed some evidence that discordance rates differed between phenotypes, and we identified a number of cases where different technologies identified different bases at the variant site. Type I errors and power loss are possible in downstream analysis of GAW18 data as a result of missing genotypes and errors in called genotypes.
Year: 2014 PMID: 25519374 PMCID: PMC4143748 DOI: 10.1186/1753-6561-8-S1-S22
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Cross-classification of results summed over all SNPs and individuals
| Sequence genotype | Microarray AA | Microarray AB | Microarray BB | Microarray Missing (XX) | Total |
|---|---|---|---|---|---|
| AA | 117,284,236 | 58,271 | 1,309 | 2,554 | 117,346,370 |
| AB | 101,015 | 65,584,521 | 29,302 | 8,970 | 65,723,808 |
| BB | 6,844 | 339,856 | 41,656,995 | 24,361 | 42,028,056 |
| Missing (XX) | 3,009,304 | 1,506,621 | 977,234 | 5,911 | 5,499,070 |
| Total | 120,401,399 | 67,489,269 | 42,664,840 | 41,796 | 230,597,304 |
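The pooled counts in this table are enough to recompute a few summary rates by hand. The following is a minimal Python sketch, not the authors' code: the function name `summarize` and the dictionary keys are hypothetical, and the rates are pooled over all SNPs and individuals, so they need not match the conditional rates in the later tables, which carry standard errors and may have been averaged differently.

```python
# Counts copied from the cross-classification table above.
# Rows: sequence genotype; columns: microarray genotype.
# The last row/column is the missing category (XX).
COUNTS = [
    [117_284_236, 58_271, 1_309, 2_554],      # sequence AA
    [101_015, 65_584_521, 29_302, 8_970],     # sequence AB
    [6_844, 339_856, 41_656_995, 24_361],     # sequence BB
    [3_009_304, 1_506_621, 977_234, 5_911],   # sequence missing (XX)
]


def summarize(counts):
    """Pooled missingness and concordance rates (hypothetical helper)."""
    m = len(counts) - 1  # index of the missing category
    total = sum(sum(row) for row in counts)
    seq_missing = sum(counts[m])
    array_missing = sum(row[m] for row in counts)
    # Restrict to cells where both platforms produced a genotype call.
    both_called = sum(counts[i][j] for i in range(m) for j in range(m))
    concordant = sum(counts[i][i] for i in range(m))
    return {
        "total": total,
        "seq_missing_rate": seq_missing / total,
        "array_missing_rate": array_missing / total,
        "concordance_given_both_called": concordant / both_called,
        "discordance_given_both_called": 1 - concordant / both_called,
    }


summary = summarize(COUNTS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Restricting the concordance denominator to genotype pairs called on both platforms separates miscalls from missingness, which the abstract treats as two distinct problems.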
Conditional concordance rates (conditional on microarray genotype; SE in parentheses)
| Sequence genotype | Source | Microarray AA | Microarray AB | Microarray BB | Microarray Missing (XX) |
|---|---|---|---|---|---|
| AA | NGS | 0.998 (0.03) | 0.003 (0.03) | 0.004 (0.06) | 0.52 (0.36) |
| AA | Imp | 0.998 (0.02) | 0.02 (0.11) | 0.0006 (0.02) | 0.44 (0.46) |
| AB | NGS | 0.0009 (0.02) | 0.996 (0.04) | 0.02 (0.08) | 0.30 (0.24) |
| AB | Imp | 0.002 (0.02) | 0.980 (0.12) | 0.006 (0.06) | 0.38 (0.43) |
| BB | NGS | 0.0007 (0.02) | 0.0006 (0.02) | 0.98 (0.10) | 0.19 (0.28) |
| BB | Imp | 8 × 10⁻⁵ (0.008) | 0.003 (0.05) | 0.993 (0.07) | 0.18 (0.35) |
| Missing (XX) | NGS | 8 × 10⁻⁵ (0.005) | 0.0006 (0.02) | 0.0007 (0.01) | 0.001 (0.02) |
| Missing (XX) | Imp | 8 × 10⁻⁵ (0.005) | 0.0003 (0.009) | 0.0004 (0.01) | 0.007 (0.07) |
Conditional concordance rates of hypertensive vs. nonhypertensive individuals (conditional on microarray; SE in parentheses)
| Sequence genotype | Group | Microarray AA | Microarray AB | Microarray BB | Microarray Missing (XX) |
|---|---|---|---|---|---|
| AA | Hypertensive | 0.998 (0.03) | 0.009 (0.07) | 0.005 (0.07) | 0.51 (0.43) |
| AA | Nonhypertensive | 0.999 (0.02) | 0.010 (0.07) | 0.0004 (0.02) | 0.51 (0.37) |
| AB | Hypertensive | 0.001 (0.02) | 0.990 (0.08) | 0.013 (0.07) | 0.30 (0.36) |
| AB | Nonhypertensive | 0.001 (0.02) | 0.987 (0.08) | 0.011 (0.07) | 0.30 (0.26) |
| BB | Hypertensive | 0.0009 (0.03) | 0.002 (0.03) | 0.982 (0.10) | 0.19 (0.33) |
| BB | Nonhypertensive | 4 × 10⁻⁵ (0.005) | 0.002 (0.04) | 0.989 (0.07) | 0.18 (0.28) |
| Missing (XX) | Hypertensive | 7 × 10⁻⁵ (0.005) | 0.0004 (0.009) | 0.0004 (0.01) | 0.001 (0.02) |
| Missing (XX) | Nonhypertensive | 7 × 10⁻⁵ (0.004) | 0.0005 (0.01) | 0.0005 (0.01) | 0.002 (0.02) |