Sangseob Leem, Taesung Park.
Abstract
BACKGROUND: Detecting gene-gene interactions (GGIs) is a key challenge in addressing the problem of missing heritability in genetics. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGIs. MDR reduces the dimensionality of multiple factors by classifying each genotype combination into a high-risk (H) or low-risk (L) group. Unfortunately, this simple binary classification does not reflect the uncertainty of the H/L classification. We therefore proposed Fuzzy MDR, which overcomes this limitation by introducing degrees of membership in two fuzzy sets, H and L. While Fuzzy MDR demonstrated higher power than MDR, its performance depends heavily on several tuning parameters, and in real applications it is not easy to choose appropriate values for them.
RESULTS: In this work, we propose an empirical fuzzy MDR (EF-MDR) that does not require specifying tuning parameter values. The membership degree is estimated directly from the data, as the maximum likelihood estimator of the proportion of cases (or controls) in each genotype combination. We also show that the balanced accuracy measure derived from this new membership function is a linear function of the standard chi-square statistic. This relationship allows the standard significance test to be performed using p-values in the MDR framework without permutation. In two simulation studies, EF-MDR showed higher power than both MDR and Fuzzy MDR. We illustrate EF-MDR by analyzing Crohn's disease (CD) and bipolar disorder (BD) data from the Wellcome Trust Case Control Consortium (WTCCC).
Keywords: Fuzzy MDR; Fuzzy set theory; Gene-gene interaction; Multifactor dimensionality reduction
Year: 2017 PMID: 28361694 PMCID: PMC5374597 DOI: 10.1186/s12864-017-3496-x
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
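The abstract states that the EF-MDR membership degree is the proportion of cases (or controls) in each genotype combination, and that the resulting balanced accuracy is a linear function of the chi-square statistic. The sketch below illustrates one reading of that statement. It assumes, which the abstract does not spell out, that fuzzy sensitivity and specificity are membership-weighted sums of cases and controls. Under that assumption the relation works out to BA = 1/2 + χ²/(2N), where N is the total sample size. The function names and the example counts are hypothetical, and the closed-form p-value covers only even degrees of freedom.

```python
import math

def ef_balanced_accuracy(cases, controls):
    """Fuzzy balanced accuracy from per-genotype-combination counts.

    Assumption (not stated in the abstract): fuzzy sensitivity and
    specificity are membership-weighted sums of cases and controls.
    """
    n_cases, n_controls = sum(cases), sum(controls)
    fuzzy_tp = fuzzy_tn = 0.0
    for a, b in zip(cases, controls):
        if a + b == 0:
            continue  # an empty cell contributes nothing
        mu_high = a / (a + b)    # MLE of the case proportion: membership in H
        mu_low = 1.0 - mu_high   # membership in L
        fuzzy_tp += mu_high * a
        fuzzy_tn += mu_low * b
    sen = fuzzy_tp / n_cases
    spe = fuzzy_tn / n_controls
    return (sen + spe) / 2, sen, spe

def pearson_chi_square(cases, controls):
    """Standard Pearson chi-square statistic for a 2 x k contingency table."""
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    stat = 0.0
    for a, b in zip(cases, controls):
        cell = a + b
        if cell == 0:
            continue
        exp_a = cell * n_cases / n
        exp_b = cell * n_controls / n
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return stat

def chi_square_sf_even_df(stat, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    half = stat / 2
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Hypothetical counts for one SNP (three genotypes).
cases = [30, 50, 20]
controls = [50, 40, 10]
n_total = sum(cases) + sum(controls)

ba, sen, spe = ef_balanced_accuracy(cases, controls)
chi2 = pearson_chi_square(cases, controls)
p_value = chi_square_sf_even_df(chi2, df=2)

print(f"BA = {ba:.4f}, 1/2 + chi2/(2N) = {0.5 + chi2 / (2 * n_total):.4f}")
print(f"chi2 = {chi2:.4f}, p-value = {p_value:.4g}")
```

A table with no empty cells over k SNPs has 3^k - 1 degrees of freedom, which is always even, so the closed form applies to such tables. Tables with empty cells may have odd degrees of freedom and would need a general chi-square survival function instead.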
Fig. 1. Comparison between the original MDR and the Fuzzy MDR
Type I error rate of EF-MDR
| Threshold | 200 samples | 400 samples | 800 samples | 1600 samples |
|---|---|---|---|---|
| 0.010 | 0.004 | 0.006 | 0.009 | 0.008 |
| 0.050 | 0.032 | 0.039 | 0.044 | 0.050 |
| 0.100 | 0.072 | 0.090 | 0.093 | 0.102 |
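A type I error rate of this kind can be estimated by simulating data with no genetic effect and counting how often the p-value falls at or below the threshold. The sketch below shows the idea for a single SNP. The simulation design is entirely an assumption (balanced cases and controls, Hardy-Weinberg genotype frequencies, a minor allele frequency of 0.3, 1000 replicates) and is not taken from the source, so it is not expected to reproduce the table values.

```python
import math
import random

def null_p_value(n_samples, maf, rng):
    """Chi-square p-value for one simulated SNP unrelated to disease status."""
    # Assumed genotype frequencies under Hardy-Weinberg equilibrium.
    freqs = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]
    genotypes = rng.choices([0, 1, 2], weights=freqs, k=n_samples)
    half = n_samples // 2  # first half are cases, the rest are controls
    cases, controls = [0, 0, 0], [0, 0, 0]
    for i, g in enumerate(genotypes):
        (cases if i < half else controls)[g] += 1
    n_cases, n_controls = half, n_samples - half
    stat, used = 0.0, 0
    for a, b in zip(cases, controls):
        cell = a + b
        if cell == 0:
            continue  # unobserved genotype: drop the column
        used += 1
        exp_a = cell * n_cases / n_samples
        exp_b = cell * n_controls / n_samples
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    df = used - 1
    if df == 2:
        return math.exp(-stat / 2)             # chi-square tail, 2 df
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))  # chi-square tail, 1 df
    return 1.0                                 # a single genotype: no test

def type_one_error(n_samples, threshold, replicates=1000, maf=0.3, seed=0):
    """Fraction of null replicates whose p-value is at or below the threshold."""
    rng = random.Random(seed)
    hits = sum(null_p_value(n_samples, maf, rng) <= threshold
               for _ in range(replicates))
    return hits / replicates

rate = type_one_error(400, 0.05)
print(f"Estimated type I error at 0.05 with 400 samples: {rate:.3f}")
```

A rate close to the nominal threshold suggests the p-value is well calibrated. More replicates give a tighter estimate at the cost of run time.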
Fig. 2. Power comparison of experiments without marginal effects
Fig. 3. Power comparison of experiments with marginal effects
Basic characteristics of each SNP for Crohn’s disease (CD)
| Index | rs number | MAF | Chromosome (gene) | p-value (rank) | Index | rs number | MAF | Chromosome (gene) | p-value (rank) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | rs11805303 | 0.347 | 1 (IL23R) | 4.41E-13 (2) | 16 | rs1456893 | 0.304 | 7 | 4.02E-05 (19) |
| 2 | rs12035082 | 0.410 | 1 | 2.70E-07 (8) | 17 | rs4263839 | 0.313 | 9 (TNFSF15) | 1.64E-05 (17) |
| 3 | rs10801047 | 0.079 | 1 | 1.09E-05 (15) | 18 | rs17582416 | 0.363 | 10 (LOC105376492) | 1.11E-03 (23) |
| 4 | rs11584383 | 0.297 | 1 (MROH3P) | 4.62E-05 (20) | 19 | rs10995271 | 0.413 | 10 | 1.54E-05 (16) |
| 5 | rs3828309 | 0.453 | 2 (ATG16L1) | 1.29E-13 (1) | 20 | rs10883365 | 0.498 | 10 (LINC01475) | 1.60E-06 (11) |
| 6 | rs9858542 | 0.299 | 3 (BSN) | 3.20E-07 (9) | 21 | rs7927894 | 0.408 | 11 | 1.28E-02 (28) |
| 7 | rs17234657 | 0.146 | 5 | 1.71E-12 (3) | 22 | rs11175593 | 0.017 | 12 (LOC105369735) | 4.22E-02 (30) |
| 8 | rs9292777 | 0.367 | 5 | 1.04E-11 (4) | 23 | rs3764147 | 0.222 | 13 (LACC1) | 3.34E-06 (13) |
| 9 | rs10077785 | 0.220 | 5 (C5orf56) | 6.39E-05 (22) | 24 | rs17221417 | 0.310 | 16 (NOD2) | 2.81E-10 (5) |
| 10 | rs13361189 | 0.084 | 5 | 7.04E-08 (6) | 25 | rs2872507 | 0.491 | 17 | 1.24E-03 (24) |
| 11 | rs4958847 | 0.130 | 5 (IRGM) | 1.81E-06 (12) | 26 | rs744166 | 0.422 | 17 (STAT3) | 6.27E-05 (21) |
| 12 | rs11747270 | 0.099 | 5 (IRGM) | 3.13E-05 (18) | 27 | rs2542151 | 0.181 | 18 | 1.74E-07 (7) |
| 13 | rs6887695 | 0.329 | 5 | 4.69E-03 (27) | 28 | rs1736135 | 0.412 | 21 (LOC101927745) | 3.39E-02 (29) |
| 14 | rs6908425 | 0.214 | 6 (CDKAL1) | 1.02E-06 (10) | 29 | rs2836754 | 0.374 | 21 (LOC400867) | 5.67E-06 (14) |
| 15 | rs7746082 | 0.293 | 6 | 4.20E-03 (26) | 30 | rs762421 | 0.408 | 21 (LOC105377139) | 2.35E-03 (25) |
Results of Crohn’s disease (CD) data analysis
| order | SNP combination | EF-MDR BA | EF-MDR p-value | EF-MDR SEN | EF-MDR SPE | MDR BA | MDR SEN | MDR SPE |
|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 0.5060 | 1.292E-13 | 0.4002 | 0.6121 | 0.5494 | 0.3563 | 0.7425 |
| 2 | 1, 8 | 0.5121 | 6.211E-22 | 0.4069 | 0.6171 | 0.5664 | 0.5625 | 0.5702 |
| 3 | 1, 5, 8 | 0.5184 | 4.715E-25 | 0.4141 | 0.6224 | 0.5807 | 0.5203 | 0.6411 |
| 4 | 1, 5, 8, 23 | 0.5290 | 2.251E-24 | 0.4263 | 0.6319 | 0.5987 | 0.5557 | 0.6417 |
| 5 | 5, 8, 18, 24, 29 | 0.5518 | 2.480E-18 | 0.4585 | 0.6452 | 0.6219 | 0.5625 | 0.6814 |
Fig. 4. Representation of the interaction between SNP1 and SNP8 for CD
Fig. 5. Representation of the interaction among SNP1, SNP5 and SNP8 for CD
Results of the bipolar disorder (BD) data analysis
| order | SNP combination | EF-MDR BA | EF-MDR p-value | EF-MDR SEN | EF-MDR SPE | MDR BA | MDR SEN | MDR SPE |
|---|---|---|---|---|---|---|---|---|
| 1 | 16 | 0.5033 | 1.33e-07 | 0.3929 | 0.6140 | 0.5216 | 0.9540 | 0.0892 |
| 2 | 6, 16 | 0.5072 | 6.16e-12 | 0.3978 | 0.6171 | 0.5345 | 0.6467 | 0.4224 |
| 3 | 6, 15, 16 | 0.5118 | 3.21e-13 | 0.4031 | 0.6205 | 0.5568 | 0.6146 | 0.4990 |
| 4 | 5, 15, 17, 19 | 0.5203 | 4.87e-13 | 0.4133 | 0.6270 | 0.5850 | 0.6761 | 0.4939 |
| 5 | 5, 10, 15, 17, 19 | 0.5376 | 1.06e-13 | 0.4347 | 0.6406 | 0.6101 | 0.6376 | 0.5827 |
Fig. 6. Representation of the interaction between SNP6 and SNP16 for BD
Execution times of MDR and EF-MDR in seconds
| order | MDR | EF-MDR |
|---|---|---|
| 1 | 2.99 | 0.29 |
| 2 | 61.20 | 3.99 |
| 3 | 1.01E+03 | 39.79 |
| 4 | 1.62E+04 | 3.09E+02 |
| 5 | 2.78E+05 | 2.03E+03 |
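The gap between the two methods is easier to read as a ratio. The short sketch below divides the MDR time by the EF-MDR time for each interaction order, using only the values in the table above. The variable names are illustrative.

```python
# Execution times in seconds, copied from the table: order -> (MDR, EF-MDR).
timings = {
    1: (2.99, 0.29),
    2: (61.20, 3.99),
    3: (1.01e3, 39.79),
    4: (1.62e4, 3.09e2),
    5: (2.78e5, 2.03e3),
}

# Speedup of EF-MDR over MDR at each interaction order.
speedup = {order: mdr / ef for order, (mdr, ef) in timings.items()}

for order, ratio in speedup.items():
    print(f"order {order}: EF-MDR is about {ratio:.1f}x faster than MDR")
```

The ratio grows with the interaction order, from roughly tenfold at order 1 to more than a hundredfold at order 5.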