Yingjie Guo, Zhian Yuan, Zhen Liang, Yang Wang, Yanpeng Wang, Lei Xu.
Abstract
Interactions between genetic variants (epistasis) are ubiquitous in model systems and can significantly affect evolutionary adaptation, genetic mapping, and precision medicine efforts. In this paper, we proposed a method for epistasis detection called EpiMIC (epistasis detection through the maximal information coefficient, MIC). MIC is a bivariate dependence measure designed to explore a wide range of functional relationships equitably and to allow them to be interpreted and compared on the same scale. Most epistasis detection approaches make assumptions about the form of the association between genetic variants, which limits their statistical performance. EpiMIC is based on the notion that if two SNPs do not interact, their joint distribution in all samples and in cases only should not differ substantially. We therefore developed a statistic that uses the difference in MIC as a signal of epistasis and combined it with a permutation resampling strategy to estimate the empirical distribution of the statistic. Results on simulated and real-world data sets showed that EpiMIC outperformed previous approaches for identifying epistasis at varying levels of heritability.
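The idea in the abstract, comparing the dependence between two SNPs in cases with their dependence in all samples and calibrating the difference by permutation, can be illustrated with a minimal Python sketch. It makes several assumptions that are not taken from the paper: normalized mutual information over genotype counts is used as a simple stand-in for MIC, the statistic is the absolute difference, and the null distribution comes from shuffling case/control labels. All function names and the toy data generator are hypothetical.

```python
import math
import random
from collections import Counter


def normalized_mi(x, y):
    """Normalized mutual information in [0, 1] for two discrete sequences.

    Assumption: used here as a stand-in for MIC, not the MIC estimator itself.
    """
    n = len(x)
    if n == 0:
        return 0.0
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = sum(c / n * math.log(c * n / (px[a] * py[b]))
             for (a, b), c in pxy.items())
    denom = math.log(min(len(px), len(py)))
    # A variable with a single observed level carries no information.
    return min(1.0, max(0.0, mi / denom)) if denom > 0 else 0.0


def epistasis_statistic(snp1, snp2, is_case):
    """Absolute difference between dependence in cases and in all samples."""
    case1 = [g for g, c in zip(snp1, is_case) if c]
    case2 = [g for g, c in zip(snp2, is_case) if c]
    return abs(normalized_mi(case1, case2) - normalized_mi(snp1, snp2))


def permutation_p_value(snp1, snp2, is_case, n_perm=200, seed=0):
    """Empirical p-value obtained by shuffling case/control labels (assumed scheme)."""
    rng = random.Random(seed)
    observed = epistasis_statistic(snp1, snp2, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if epistasis_statistic(snp1, snp2, labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


def toy_data(n, interact, seed=1):
    """Hypothetical toy genotypes (0/1/2); not one of the paper's disease models."""
    rng = random.Random(seed)
    snp1 = [rng.choice([0, 1, 2]) for _ in range(n)]
    snp2 = [rng.choice([0, 1, 2]) for _ in range(n)]
    if interact:
        # Extreme interaction: a sample is a case only when both genotypes match.
        is_case = [a == b for a, b in zip(snp1, snp2)]
    else:
        is_case = [rng.random() < 0.5 for _ in range(n)]
    return snp1, snp2, is_case


snp1, snp2, is_case = toy_data(600, interact=True)
stat, p_value = permutation_p_value(snp1, snp2, is_case, n_perm=99)
print(f"statistic = {stat:.3f}, permutation p-value = {p_value:.3f}")
```

In the toy interaction model the two SNPs are independent in the full sample but perfectly dependent among cases, so the statistic is large and no permuted labeling should reach it. Replacing `normalized_mi` with a true MIC estimator would bring the sketch closer to the described method.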
Year: 2022 PMID: 35211187 PMCID: PMC8863443 DOI: 10.1155/2022/7843990
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Algorithm 1. EpiMIC.
Figure 1. Illustration of the EpiMIC framework for pairwise epistasis detection.
Details of the six disease models with marginal effects, including prevalence, MAF, and the penetrance of each genotype combination.
| Model | Prevalence | MAF | AABB | AABb | AAbb | AaBB | AaBb | Aabb | aaBB | aaBb | aabb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Model 1 | 0.050 | 0.1 | 0.061 | 0.017 | 0.017 | 0.017 | 0.136 | 0.136 | 0.017 | 0.136 | 0.136 |
| Model 2 | 0.050 | 0.1 | 0.060 | 0.021 | 0.021 | 0.021 | 0.116 | 0.116 | 0.021 | 0.116 | 0.116 |
| Model 3 | 0.046 | 0.1 | 0.030 | 0.080 | 0.090 | 0.090 | 0.010 | 0.010 | 0.070 | 0.040 | 0.000 |
| Model 4 | 0.026 | 0.1 | 0.030 | 0.010 | 0.020 | 0.010 | 0.090 | 0.050 | 0.020 | 0.050 | 0.070 |
| Model 5 | 0.017 | 0.1 | 0.020 | 0.005 | 0.020 | 0.007 | 0.070 | 0.001 | 0.003 | 0.080 | 0.090 |
| Model 6 | 0.052 | 0.2 | 0.044 | 0.066 | 0.073 | 0.069 | 0.021 | 0.007 | 0.042 | 0.073 | 0.054 |
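The prevalence column can be related to the penetrance values by weighting each genotype combination by its population frequency. The Python sketch below does this for Model 1 under assumptions that are not stated in the table: lowercase alleles are the minor alleles, both loci share the listed MAF, genotypes follow Hardy-Weinberg proportions, and the two loci are independent. The function names are hypothetical.

```python
from itertools import product


def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for (major homozygote, heterozygote, minor homozygote)."""
    major = 1.0 - maf
    return [major * major, 2 * major * maf, maf * maf]


def prevalence(penetrance, maf_a, maf_b):
    """Population prevalence as the frequency-weighted sum of penetrances.

    penetrance[i][j]: rows are AA, Aa, aa; columns are BB, Bb, bb.
    Assumes the two loci are independent (linkage equilibrium).
    """
    freq_a, freq_b = genotype_freqs(maf_a), genotype_freqs(maf_b)
    return sum(freq_a[i] * freq_b[j] * penetrance[i][j]
               for i, j in product(range(3), repeat=2))


# Model 1 penetrances copied from the table (AABB, AABb, AAbb / AaBB, ... / aaBB, ...).
model1 = [
    [0.061, 0.017, 0.017],
    [0.017, 0.136, 0.136],
    [0.017, 0.136, 0.136],
]

print(round(prevalence(model1, 0.1, 0.1), 3))
```

Under these assumptions the weighted sum for Model 1 lands within rounding of the tabulated prevalence of 0.050, which suggests the assumptions are at least compatible with how the table was built.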
(a) No effect disease model
| Method | 1,000 samples | 2,000 samples | 3,000 samples | 4,000 samples | 5,000 samples |
|---|---|---|---|---|---|
| BEAM | 0 | 0 | 0 | 0 | 0 |
| BOOST | 0 | 0 | 0 | 0 | 0 |
| MDR | 0.03 | 0.02 | 0 | 0.01 | 0.04 |
| Epi-GTBN | 0.01 | 0 | 0.02 | 0.01 | 0.05 |
| EpiMIC | 0.02 | 0.03 | 0.05 | 0.02 | 0.04 |
Statistical power in the simulation studies for BEAM, BOOST, MDR, Epi-GTBN, and EpiMIC with heritability h ∈ {0.005, 0.01, 0.025, 0.05, 0.1, 0.2} and MAF ∈ {0.2, 0.4}. There are five models for each heritability-MAF combination. The best-performing approach for each model is shown in bold. Some heritability-MAF combinations are not listed because every method reached a power of 1 under them: MAF = 0.2 with h ∈ {0.025, 0.05, 0.1, 0.2} and MAF = 0.4 with h ∈ {0.01, 0.025, 0.2}.
| MAF | Heritability | Method | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|---|---|
| 0.2 | 0.005 | BEAM | 0.53 | 0.95 | 0.95 | 0.98 | 0.97 |
| 0.2 | 0.005 | BOOST | 0.96 |  |  |  |  |
| 0.2 | 0.005 | MDR | 0.14 | 0.84 |  |  |  |
| 0.2 | 0.005 | Epi-GTBN | 0.94 |  |  |  |  |
| 0.2 | 0.005 | EpiMIC |  |  |  |  |  |
| 0.2 | 0.01 | BEAM | 1 | 1 | 1 | 1 | 1 |
| 0.2 | 0.01 | BOOST | 1 | 1 | 1 | 1 | 1 |
| 0.2 | 0.01 | MDR | 0.34 | 0.99 | 1 | 1 | 1 |
| 0.2 | 0.01 | Epi-GTBN | 1 | 1 | 1 | 1 | 1 |
| 0.2 | 0.01 | EpiMIC | 1 | 1 | 1 | 1 | 1 |
| 0.4 | 0.005 | BEAM | 0.87 | 0.93 | 0.9 | 0.93 | 0.93 |
| 0.4 | 0.005 | BOOST |  |  |  |  |  |
| 0.4 | 0.005 | MDR | 0.99 |  |  |  |  |
| 0.4 | 0.005 | Epi-GTBN |  |  |  |  |  |
| 0.4 | 0.005 | EpiMIC |  |  |  |  |  |
| 0.4 | 0.05 | BEAM | 0.76 | 1 | 1 | 0.98 | 1 |
| 0.4 | 0.05 | BOOST | 1 | 1 | 1 | 1 | 1 |
| 0.4 | 0.05 | MDR | 1 | 1 | 1 | 1 | 1 |
| 0.4 | 0.05 | Epi-GTBN | 1 | 1 | 1 | 1 | 1 |
| 0.4 | 0.05 | EpiMIC | 1 | 1 | 1 | 1 | 1 |
Average power for the methods BEAM, BOOST, MDR, Epi-GTBN, and EpiMIC to detect epistasis under 12 heritability-MAF combinations.
| MAF | Heritability | BEAM | BOOST | MDR | Epi-GTBN | EpiMIC |
|---|---|---|---|---|---|---|
| 0.2 | 0.005 | 0.876 | 0.992 | 0.796 | 0.988 | 0.998 |
| 0.2 | 0.01 | 1 | 1 | 0.866 | 1 | 1 |
| 0.2 | 0.025 | 1 | 1 | 1 | 1 | 1 |
| 0.2 | 0.05 | 1 | 1 | 1 | 1 | 1 |
| 0.2 | 0.1 | 1 | 1 | 1 | 1 | 1 |
| 0.2 | 0.2 | 1 | 1 | 1 | 1 | 1 |
| 0.4 | 0.005 | 0.912 | 1 | 0.998 | 1 | 1 |
| 0.4 | 0.01 | 1 | 1 | 1 | 1 | 1 |
| 0.4 | 0.025 | 1 | 1 | 1 | 1 | 1 |
| 0.4 | 0.05 | 0.948 | 1 | 1 | 1 | 1 |
| 0.4 | 0.1 | 1 | 1 | 1 | 1 | 1 |
| 0.4 | 0.2 | 1 | 1 | 1 | 1 | 1 |
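The averages in this table appear to be simple means over the five models of the per-model power values. A short Python check, using only the per-model rows that are fully legible in the power table, is sketched below; the dictionary layout and the function name are illustrative choices, not part of the paper.

```python
# Per-model power (M1..M5) for rows that are completely listed in the power table.
# Keys are (MAF, heritability, method).
per_model_power = {
    (0.2, 0.005, "BEAM"): [0.53, 0.95, 0.95, 0.98, 0.97],
    (0.2, 0.01, "MDR"): [0.34, 0.99, 1, 1, 1],
    (0.4, 0.005, "BEAM"): [0.87, 0.93, 0.9, 0.93, 0.93],
    (0.4, 0.05, "BEAM"): [0.76, 1, 1, 0.98, 1],
}


def average_power(values):
    """Mean power across models, rounded to three decimals as in the table."""
    return round(sum(values) / len(values), 3)


for key, values in per_model_power.items():
    print(key, average_power(values))
```

Each of the four means matches the corresponding entry above (0.876, 0.866, 0.912, and 0.948), which supports reading the table as an unweighted mean over models.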
Figure 2. The statistical power of simulation studies for BEAM (blue), BOOST (orange), MDR (green), Epi-GTBN (red), and EpiMIC (purple) under the disease model with heritability = 0.005, MAF = 0.2, population prevalence = 0.2, and sample sizes ranging from 1,000 to 5,000.
Figure 3. Variant network of the rheumatoid arthritis results from the EpiMIC model, showing the identified SNP pairs. Nodes are SNPs, and edges represent epistatic relationships. Node size and color reflect the number of epistatic interactions in which the node is involved. Edge thickness indicates the maximal information coefficient of the SNP pair in case samples. Highlighted node labels mark the top 15 SNPs ranked by node degree.
Details of the top 15 nodes, ranked by node degree, in the SNP epistasis network generated using EpiMIC with the RA data. The column “Corresponding gene” gives the gene in which the SNP is located, and the column “Gene interaction” lists the genes in which the interacting SNPs are located.
| rsID | Corresponding gene | Degree | Gene interaction |
|---|---|---|---|
| rs10805069 | IL15 | 12 | GM-CSF, Tie2, TLR4, MMP3, FLT1 |
| rs4796119 | CCL2 | 11 | M-CSF, CD28, CTLA4, Ang1 |
| rs684 | LFA1 | 10 | M-CSF, Ang1, CTSL, RANK, LFA1 |
| rs10490573 | CD28 | 9 | CD80, IL15, Ang1, Tie2, CCL2, CCL5, LFA1 |
| rs10505107 | Ang1 | 9 | TGF |
| rs11938228 | CXCL1 | 9 | IL1, IL6, Ang1, APRIL |
| rs3850890 | CD80 | 9 | TLR2, MMP3, IFN |
| rs1283659 | Ang1 | 9 | CD28, CXCL1, FLT1, CCL2 |
| rs534129 | Tie2 | 9 | IL15, Ang1, TLR4, MMP1, MMP3 |
| rs10519613 | IL15 | 8 | IL1, IL6, Ang1, APRIL |
| rs7855140 | TLR4 | 8 | IL15, IL17, Tie2, MMP1, IL18 |
| rs544354 | IL18 | 8 | IL15, Tie2 |
| rs6808536 | CD80 | 8 | M-CSF, Tie2, MMP3, FLT1 |
| rs17069845 | RANK | 8 | TLR2, Tie2 |
| rs2367291 | IL15 | 8 | IL8, Tie2, FLT1, RANK, LFA1 |
Details of the top 10 epistatic SNP pairs, ranked by MIC in case samples, and the genes in which the SNPs are located. The column “Ref” cites the literature showing a regulatory relationship between the two genes.
| rsID of SNP1 | rsID of SNP2 | Corresponding gene 1 | Corresponding gene 2 | Ref |
|---|---|---|---|---|
| rs4675363 | rs1427676 | CD28 | CTLA4 | [ |
| rs7537752 | rs6574222 | M-CSF | FOS | [ |
| rs4422395 | rs7037246 | TLR2 | TLR4 | |
| rs13285984 | rs1634507 | Tie2 | CCL4 | |
| rs12089727 | rs6808536 | M-CSF | CD80 | |
| rs2564594 | rs1800795 | TLR2 | IL6 | [ |
| rs246841 | rs266089 | GM-CSF | CXCL12 | [ |
| rs550982 | rs1569328 | Tie2 | AP1 | |
| rs951759 | rs266089 | Ang1 | CXCL12 | |
| rs2256849 | rs1474552 | FLT1 | ITGB2 | |
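Node degree, as used for the network figure and the top-15 table, is the number of identified pairs in which a SNP takes part. The Python sketch below counts degrees using only the ten pairs listed in the table above as edges, so the counts are not the Degree values of the full network; the variable names are illustrative.

```python
from collections import Counter

# The ten SNP pairs from the table above, treated as undirected edges.
top_pairs = [
    ("rs4675363", "rs1427676"),
    ("rs7537752", "rs6574222"),
    ("rs4422395", "rs7037246"),
    ("rs13285984", "rs1634507"),
    ("rs12089727", "rs6808536"),
    ("rs2564594", "rs1800795"),
    ("rs246841", "rs266089"),
    ("rs550982", "rs1569328"),
    ("rs951759", "rs266089"),
    ("rs2256849", "rs1474552"),
]

# Each edge contributes one to the degree of both of its endpoints.
degree = Counter(snp for pair in top_pairs for snp in pair)

for snp, count in degree.most_common(3):
    print(snp, count)
```

Within these ten pairs only rs266089 appears twice (paired with rs246841 and rs951759); every other SNP appears once.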
(a) No effect model
| | AA | Aa | aa |
|---|---|---|---|
| BB | | | |
| Bb | | | |
| bb | | | |
(b) One marginal recessive model
| | AA | Aa | aa |
|---|---|---|---|
| BB | | | |
| Bb | | | |
| bb | | | |
(b) Marginal disease model
| Method | 1,000 samples | 2,000 samples | 3,000 samples | 4,000 samples | 5,000 samples |
|---|---|---|---|---|---|
| BEAM | 0 | 0 | 0 | 0 | 0 |
| BOOST | 0 | 0 | 0 | 0 | 0 |
| MDR | 0.04 | 0.06 | 0.06 | 0.08 | 0.06 |
| Epi-GTBN | 0.06 | 0.07 | 0.06 | 0.08 | 0.11 |
| EpiMIC | 0.03 | 0.03 | 0.06 | 0.05 | 0.03 |