A J Agopian, Laura E Mitchell.
Abstract
BACKGROUND: Several platforms for the analysis of genome-wide association data are available. However, these platforms focus on the evaluation of the genotype inherited by affected (i.e. case) individuals, whereas for some conditions (e.g. birth defects) the genotype of the mothers of affected individuals may also contribute to risk. For such conditions, it is critical to evaluate associations with both the maternal and the inherited (i.e. case) genotype. When genotype data are available for case-parent triads, a likelihood-based approach using log-linear modeling can be used to assess both the maternal and inherited genotypes. However, available software packages for log-linear analyses are not well suited to the analysis of typical genome-wide association data (e.g. including missing data).
Year: 2011 PMID: 21513519 PMCID: PMC3110146 DOI: 10.1186/1471-2105-12-117
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Comparison of PLINK (A) and LEM (B) data format and example data for three hypothetical case-parent triads
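| (A) Family ID | Individual ID | Paternal ID | Maternal ID | Sex (a) | Affection status (b) | Allele (c) | Allele (c) | Allele (c) | Allele (c) | Allele (c) | Allele (c) | Allele (c) | Allele (c) | Allele (c) | Allele (c) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|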
| 1001 | 1001-A | 1001-C | 1001-B | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 2 |
| 1001 | 1001-B | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 |
| 1001 | 1001-C | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 0 | 0 |
| 1002 | 1002-A | 1002-C | 1002-B | 2 | 1 | 0 | 0 | 2 | 2 | 1 | 2 | 1 | 1 | 1 | 2 |
| 1002 | 1002-B | 0 | 0 | 2 | 0 | 2 | 2 | 1 | 2 | 0 | 0 | 1 | 1 | 1 | 1 |
| 1002 | 1002-C | 0 | 0 | 1 | 0 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 2 |
| 1003 | 1003-A | 0 | 1003-B | 2 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1003 | 1003-B | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 2 | 1 | 1 | 0 | 0 | 1 | 1 |

| (B) Genotype (d) | Genotype (d) | Genotype (d) |
|---|---|---|
| 1 | 1 | 1 |
| 3 | 1 | 0 |
| 1 | 0 | 2 |

a 1 = male, 2 = female
b 1 = unaffected, 2 = affected
c 0 = allele missing, 1 = allele 1, 2 = allele 2
d 0 = genotype missing, 1 = no high-risk alleles present, 2 = one high-risk allele present, 3 = two high-risk alleles present
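
The relationship between the allele coding in footnote c and the genotype coding in footnote d can be illustrated with a minimal Python sketch. It is not the MI-GWAS implementation: the function name `lem_genotype_code` and the `risk_allele` parameter are hypothetical, and treating allele 2 as the high-risk allele by default is an assumption made only for illustration.

```python
from typing import Tuple

MISSING_ALLELE = 0  # footnote c: 0 = allele missing


def lem_genotype_code(alleles: Tuple[int, int], risk_allele: int = 2) -> int:
    """Map one PLINK-style allele pair to the genotype coding of footnote d.

    Returns 0 if either allele is missing; otherwise returns
    1 + (number of copies of the high-risk allele), i.e. 1, 2 or 3.
    `risk_allele` is an assumption: the source does not state which
    allele is treated as high-risk.
    """
    first, second = alleles
    if first == MISSING_ALLELE or second == MISSING_ALLELE:
        return 0
    copies = int(first == risk_allele) + int(second == risk_allele)
    return 1 + copies


# Allele pairs of the kind shown in panel (A)
for pair in [(1, 1), (1, 2), (2, 2), (0, 0)]:
    print(pair, "->", lem_genotype_code(pair))
```

Under that assumption, the pair `1 1` maps to code 1, `1 2` to code 2, `2 2` to code 3, and `0 0` to code 0 (missing).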
Figure 1. Summary of MI-GWAS platform structure, displaying steps performed on subsets of 1,000 SNPs at a time.
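
Processing SNPs in subsets of 1,000, as the Figure 1 caption describes, can be sketched as a simple batching loop. This is a generic illustration, not the MI-GWAS code: the helper name `snp_batches` and the SNP identifiers are hypothetical.

```python
from typing import Iterator, List, Sequence


def snp_batches(snp_ids: Sequence[str], batch_size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive subsets of at most `batch_size` SNP identifiers."""
    for start in range(0, len(snp_ids), batch_size):
        yield list(snp_ids[start:start + batch_size])


# Hypothetical identifiers; a real analysis would read them from the marker file.
all_snps = [f"snp{i}" for i in range(2500)]
for batch_number, batch in enumerate(snp_batches(all_snps), start=1):
    # Each subset would be converted and analyzed before moving to the next.
    print(f"subset {batch_number}: {len(batch)} SNPs")
```

With 2,500 identifiers the loop produces two full subsets of 1,000 and a final subset of 500.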
Comparison of chi-square values from the PLINK TDT and MI-GWAS LRT for inherited genetic effects for a randomly selected set of SNPs on chromosome one and most significant autosomal SNPs.
| SNP | PLINK TDT chi-square value | MI-GWAS log-linear modeling chi-square value | Relative difference (a) |
|---|---|---|---|
| Chromosome 1 SNPs | |||
| SNP 1 | 0.02 | 0.02 | 0.00 |
| SNP 2 | 2.22 | 2.22 | 0.00 |
| SNP 3 | 0.91 | 0.91 | 0.00 |
| SNP 4 | 0.09 | 0.09 | 0.00 |
| SNP 5 | 0.24 | 0.24 | 0.00 |
| SNP 6 | 1.58 | 1.58 | 0.00 |
| SNP 7 | 2.97 | 2.98 | 0.00 |
| SNP 8 | 1.22 | 1.22 | 0.00 |
| Most significant autosomal SNPs | |||
| SNP 9 | 22.34 | 24.06 | 0.08 |
| SNP 10 | 22.29 | 22.57 | 0.01 |
| SNP 11 | 21.48 | 21.64 | 0.01 |
| SNP 12 | 20.78 | 21.02 | 0.01 |
| SNP 13 | 19.56 | 19.77 | 0.01 |
a Absolute difference between PLINK chi-square value and MI-GWAS chi-square value divided by PLINK chi-square value
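
The quantity defined in footnote a can be reproduced directly from the two chi-square columns. The short Python sketch below uses a hypothetical function name, `relative_difference`, and values copied from the table.

```python
def relative_difference(plink_chi2: float, migwas_chi2: float) -> float:
    """Absolute difference between the two chi-square values, divided by the PLINK value."""
    return abs(plink_chi2 - migwas_chi2) / plink_chi2


# (PLINK TDT, MI-GWAS LRT) pairs taken from the table
table_rows = {
    "SNP 7": (2.97, 2.98),
    "SNP 9": (22.34, 24.06),
    "SNP 10": (22.29, 22.57),
}
for snp, (plink_value, migwas_value) in table_rows.items():
    print(snp, round(relative_difference(plink_value, migwas_value), 2))
```

For SNP 9, for example, the difference of 1.72 divided by 22.34 is about 0.077, which rounds to the tabulated 0.08.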
Running times for the analysis of the same 60,000 SNPs using MI-GWAS on four computers with differing specifications
| Machine specifications | Running Time |
|---|---|
| Intel Core 2 Quad CPU Q9550, 2.83 GHz, 3.21 GB of RAM | 11 hours 35 minutes |
| Pentium 4 CPU, 3.00 GHz, 2.00 GB of RAM | 20 hours 52 minutes |
| Pentium D CPU, 3.20 GHz, 1.99 GB of RAM | 21 hours 24 minutes |
| Intel Xeon CPU E5540, 2.53 GHz, 6.00 GB of RAM | 22 hours 48 minutes |