Ryan J Urbanowicz, Jeff Kiralis, Jonathan M Fisher, Jason H Moore.
Abstract
BACKGROUND: Algorithms designed to detect complex genetic disease associations are initially evaluated using simulated datasets. Typical evaluations vary constraints that influence the correct detection of underlying models (i.e. number of loci, heritability, and minor allele frequency). Such studies neglect to account for model architecture (i.e. the unique specification and arrangement of penetrance values comprising the genetic model), which alone can influence the detectability of a model. To design a simulation study that efficiently takes architecture into account, a reliable metric is needed for model selection.
Year: 2012 PMID: 23014095 PMCID: PMC3549792 DOI: 10.1186/1756-0381-5-15
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
A 2-locus purely epistatic penetrance function
| SNP 1 genotype (frequency) | SNP 2, genotype 1 | SNP 2, genotype 2 | SNP 2, genotype 3 | Marginal penetrance |
|---|---|---|---|---|
| AA (.36) | .266 | .764 | .664 | .614 |
| Aa (.48) | .928 | .398 | .733 | .614 |
| aa (.16) | .456 | .927 | .147 | .614 |
| Marginal penetrance | .614 | .614 | .614 | K = .614 |
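The defining property of the table above, that every marginal penetrance equals the prevalence K, can be checked numerically. The following is a minimal sketch, not the authors' code. The SNP 1 genotype frequencies come from the table. The SNP 2 genotype frequencies (.25, .50, .25) are an assumption, since they do not appear in the table as given here; they were chosen because they reproduce the listed marginals. All function names are hypothetical.

```python
# Penetrance values from the table: rows are SNP 1 genotypes (AA, Aa, aa),
# columns are the three SNP 2 genotypes.
PENETRANCE = [
    [0.266, 0.764, 0.664],
    [0.928, 0.398, 0.733],
    [0.456, 0.927, 0.147],
]
SNP1_FREQ = [0.36, 0.48, 0.16]  # from the table
SNP2_FREQ = [0.25, 0.50, 0.25]  # ASSUMPTION: not present in the table as given


def row_marginals(pen, col_freq):
    """Marginal penetrance of each SNP 1 genotype, averaging over SNP 2."""
    return [sum(f * q for f, q in zip(row, col_freq)) for row in pen]


def col_marginals(pen, row_freq):
    """Marginal penetrance of each SNP 2 genotype, averaging over SNP 1."""
    n_cols = len(pen[0])
    return [
        sum(pen[i][j] * row_freq[i] for i in range(len(pen)))
        for j in range(n_cols)
    ]


def prevalence(pen, row_freq, col_freq):
    """Population prevalence K: penetrance averaged over all genotype pairs."""
    return sum(p * m for p, m in zip(row_freq, row_marginals(pen, col_freq)))


def is_purely_epistatic(pen, row_freq, col_freq, tol=1e-3):
    """True if every marginal penetrance equals K within tol.

    The tolerance absorbs the three-decimal rounding of the table entries.
    """
    k = prevalence(pen, row_freq, col_freq)
    margins = row_marginals(pen, col_freq) + col_marginals(pen, row_freq)
    return all(abs(m - k) <= tol for m in margins)


if __name__ == "__main__":
    print("row marginals:", row_marginals(PENETRANCE, SNP2_FREQ))
    print("col marginals:", col_marginals(PENETRANCE, SNP1_FREQ))
    print("K:", prevalence(PENETRANCE, SNP1_FREQ, SNP2_FREQ))
    print("purely epistatic:", is_purely_epistatic(PENETRANCE, SNP1_FREQ, SNP2_FREQ))
```

Because no single genotype shifts disease risk away from K, neither SNP shows a main effect on its own; only the joint genotype is informative.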
Figure 1: 2-Locus Model Detection. Each bar represents the model detection frequency (averaged between a MAF of 0.2 and 0.4) for the respective algorithm within 100 simulated datasets. Highest and lowest refer to the respective EDM of a given model within the model population generated by GAMETES. Each sub-plot corresponds to a specific combination of heritability and sample size. A similar figure for 3-locus models is included in Additional file 1.
MDR Analysis (K = 0.3) Spearman Rank Correlations: A summary of Spearman rank correlation coefficients (ρ) and respective p-values relating detection to the other variables given in the table
| Variable | 2-locus ρ | P | 3-locus ρ | P |
|---|---|---|---|---|
| Heritability | 0.7757 | ** | 0.8270 | ** |
| Sample Size | 0.3508 | ** | 0.3253 | ** |
| MAF | 0.1257 | - | 0.075 | - |
| EDM | 0.8621 | ** | 0.8564 | ** |
| COR | 0.8491 | ** | 0.8603 | ** |
| PTV | 0.1544 | - | 0.2707 | * |
| EDM vs. COR | 0.9722 | ** | 0.9652 | ** |
Each calculation is performed over all 9,600 datasets generated for either 2- or 3-locus models as previously described. Also given is the correlation between EDM and COR over all datasets with K = 0.3. (− Not Sig., * P < 0.05, ** P << 0.001).
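For readers unfamiliar with the statistic in these tables, Spearman's ρ is the Pearson correlation of the ranks of two variables. The sketch below is a plain-Python illustration with tie handling by average ranks. The EDM and detection values are invented for illustration and are not taken from the study, the function names are hypothetical, and no p-value is computed.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # HYPOTHETICAL data: one EDM value and one detection frequency per model.
    edm = [0.01, 0.02, 0.05, 0.08, 0.12]
    detection = [0.05, 0.20, 0.15, 0.70, 0.95]
    print(spearman_rho(edm, detection))
```

Because only ranks enter the calculation, ρ measures how consistently detection rises with a variable, not whether the relationship is linear.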
SURF Analysis (K = 0.3) Spearman Rank Correlations: A summary of Spearman rank correlation coefficients (ρ) and respective p-values relating detection to the other variables given in the table
| Variable | 2-locus ρ | P | 3-locus ρ | P |
|---|---|---|---|---|
| Heritability | 0.7690 | ** | -0.0300 | - |
| Sample Size | 0.3723 | ** | 0.0241 | - |
| MAF | 0.0864 | - | -0.0515 | - |
| EDM | 0.8798 | ** | -0.0515 | - |
| COR | 0.8602 | ** | -0.0635 | - |
| PTV | 0.1323 | - | -0.1403 | - |
Each calculation is performed over all 9,600 datasets generated for either 2- or 3-locus models as previously described. The correlations between EDM and COR for these datasets are the same as those given in Table 2 (the MDR analysis with K = 0.3). (− Not Sig., * P < 0.05, ** P << 0.001).
UCS Analysis (K = 0.3) Spearman Rank Correlations: A summary of Spearman rank correlation coefficients (ρ) and respective p-values relating detection to the other variables given in the table
| Variable | 2-locus ρ | P | 3-locus ρ | P |
|---|---|---|---|---|
| Heritability | 0.7984 | ** | 0.6926 | ** |
| Sample Size | 0.1567 | - | 0.0841 | - |
| MAF | -0.0413 | - | -0.1410 | - |
| EDM | 0.8990 | ** | 0.7512 | ** |
| COR | 0.8673 | ** | 0.7106 | ** |
| PTV | 0.1323 | - | -0.1403 | - |
Each calculation is performed over all 9,600 datasets generated for either 2- or 3-locus models as previously described. The correlations between EDM and COR for these datasets are the same as those given in Table 2 (the MDR analysis with K = 0.3). (− Not Sig., * P < 0.05, ** P << 0.001).
MDR Analysis (K = 0.1) Spearman Rank Correlations: A summary of Spearman rank correlation coefficients (ρ) and respective p-values relating detection to the other variables given in the table
| Variable | 2-locus ρ | P | 3-locus ρ | P |
|---|---|---|---|---|
| Heritability | 0.7663 | ** | 0.7722 | ** |
| Sample Size | 0.3081 | * | 0.4305 | ** |
| MAF | 0.1257 | - | 0.075 | - |
| EDM | 0.8237 | ** | 0.7786 | ** |
| COR | 0.8075 | ** | 0.8241 | ** |
| PTV | 0.1999 | - | -0.072 | - |
| EDM vs. COR | 0.9401 | ** | 0.9176 | ** |
Each calculation is performed over all 8,800 datasets successfully generated for 2-locus models or all 6,400 datasets successfully generated for 3-locus models as previously described. Also given are the correlations between EDM and COR over these respective datasets with K = 0.1. (− Not Sig., * P < 0.05, ** P << 0.001).
Figure 2: Plots illustrating the follow-up MDR detection analysis over 10 quantiles. For all datasets in this analysis, K = 0.3, the number of SNPs is 20, and the sample size is 800. MAF and heritability vary as before. (Left Panel) The solid regression line gives the best fit for all findings with an observed detection frequency below the significant detection threshold of 0.8 (the dotted line). Similar figures for COR and PTV are given in Additional file 1. (Right Panel) A different perspective on the data in the left panel. This plot illustrates the capacity of model architecture to impact model detection independent of any genetic or dataset constraints. The x-axis gives the ten models selected to cover the range of EDMs observed. Each line highlights the ability to find these ten models for a respective combination of heritability and MAF.
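The left-panel fit described in the Figure 2 caption, a regression line through only those findings whose detection frequency falls below the 0.8 threshold, can be sketched as follows. The caption does not state the regression method, so ordinary least squares is assumed here. The data points are invented for illustration, and the function names are hypothetical.

```python
DETECTION_THRESHOLD = 0.8  # significant detection threshold named in the caption


def below_threshold(points, threshold=DETECTION_THRESHOLD):
    """Keep (edm, detection) pairs with detection strictly below the threshold."""
    return [(x, y) for x, y in points if y < threshold]


def least_squares_line(points):
    """Ordinary least-squares fit y = slope * x + intercept.

    ASSUMPTION: the source does not name the fitting method.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # HYPOTHETICAL (EDM, detection frequency) pairs, not taken from the study.
    findings = [(0.1, 0.2), (0.2, 0.4), (0.3, 0.6), (0.4, 0.9), (0.5, 1.0)]
    slope, intercept = least_squares_line(below_threshold(findings))
    print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Excluding findings at or above the threshold keeps saturated detection frequencies, which cannot rise much further, from flattening the fitted slope.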