Boxin Guan, Yuhai Zhao.
Abstract
The epistatic interactions of single nucleotide polymorphisms (SNPs) are considered an important factor in determining the susceptibility of individuals to complex diseases. Although many methods have been proposed to detect such interactions, the development of detection algorithms is still ongoing because of the computational burden of large-scale association studies. In this paper, a self-adjusting ant colony optimization algorithm based on information entropy (IEACO) is proposed to address the intensive computation required to detect epistatic interactions in large-scale datasets. The algorithm automatically adjusts its path selection strategy according to the real-time information entropy. The performance of IEACO is compared with that of ant colony optimization (ACO), AntEpiSeeker, AntMiner, and epiACO on a set of simulated datasets and a real genome-wide dataset. The results of extensive experiments show that the proposed method is superior to the other methods.
Keywords: ant colony optimization; epistatic interactions; information entropy; self-adjusting algorithm; single nucleotide polymorphisms
Year: 2019 PMID: 30717303 PMCID: PMC6409693 DOI: 10.3390/genes10020114
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Pseudocode of ant colony optimization based on information entropy (IEACO) to solve epistatic interactions.
Figure 2. (a) The process of searching for interactions using ant colony optimization (ACO); (b) the process of searching for interactions using IEACO.
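The abstract states that IEACO adjusts its path selection strategy according to the real-time information entropy, but the pseudocode of Figure 1 is not reproduced in this record. The following is only a minimal sketch of how such a rule could look, not the authors' algorithm: it assumes the entropy is the Shannon entropy of the normalized pheromone values, and that a high entropy favours roulette-wheel exploration while a low entropy favours greedy exploitation. The names `normalized_entropy` and `select_snp` are hypothetical.

```python
import math
import random


def normalized_entropy(pheromone):
    """Shannon entropy of the pheromone distribution, scaled to [0, 1].

    Assumes at least two SNPs and a positive pheromone total.
    """
    total = sum(pheromone)
    probs = [t / total for t in pheromone]
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(pheromone))


def select_snp(pheromone, rng):
    """Pick one SNP index with an entropy-dependent rule.

    Assumption for illustration: the normalized entropy is used directly
    as the probability of exploring.
    """
    h = normalized_entropy(pheromone)
    if rng.random() < h:
        # Explore: roulette-wheel selection proportional to pheromone.
        return rng.choices(range(len(pheromone)), weights=pheromone, k=1)[0]
    # Exploit: take the SNP with the largest pheromone value.
    return max(range(len(pheromone)), key=pheromone.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    flat = [1.0, 1.0, 1.0, 1.0]      # early search: no SNP is preferred
    peaked = [10.0, 0.1, 0.1, 0.1]   # late search: one SNP dominates
    print(normalized_entropy(flat), normalized_entropy(peaked))
    print(select_snp(peaked, rng))
```

With a flat pheromone vector the normalized entropy is at its maximum, so this sketch almost always explores; as pheromone concentrates on a few SNPs the entropy falls and the greedy branch is taken more often.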
Details of eight commonly used two-locus epistasis models.
| Model | Prevalence | MAF(a) | MAF(b) | Genotype | AA | Aa | aa |
|---|---|---|---|---|---|---|---|
| Model 1 | 0.100 | 0.30 | 0.20 | BB | 0.087 | 0.087 | 0.087 |
| | | | | Bb | 0.087 | 0.146 | 0.190 |
| | | | | bb | 0.087 | 0.190 | 0.247 |
| Model 2 | 0.100 | 0.20 | 0.20 | BB | 0.092 | 0.092 | 0.092 |
| | | | | Bb | 0.092 | 0.145 | 0.181 |
| | | | | bb | 0.092 | 0.181 | 0.227 |
| Model 3 | 0.100 | 0.05 | 0.05 | BB | 0.096 | 0.096 | 0.096 |
| | | | | Bb | 0.096 | 0.533 | 0.533 |
| | | | | bb | 0.096 | 0.533 | 0.533 |
| Model 4 | 0.100 | 0.50 | 0.50 | BB | 0.052 | 0.052 | 0.052 |
| | | | | Bb | 0.052 | 0.137 | 0.137 |
| | | | | bb | 0.052 | 0.137 | 0.137 |
| Model 5 | 0.064 | 0.20 | 0.20 | BB | 0.486 | 0.960 | 0.538 |
| | | | | Bb | 0.947 | 0.004 | 0.811 |
| | | | | bb | 0.640 | 0.606 | 0.909 |
| Model 6 | 0.171 | 0.40 | 0.40 | BB | 0.068 | 0.299 | 0.017 |
| | | | | Bb | 0.289 | 0.044 | 0.285 |
| | | | | bb | 0.048 | 0.262 | 0.174 |
| Model 7 | 0.038 | 0.50 | 0.50 | BB | 0.000 | 0.000 | 0.100 |
| | | | | Bb | 0.000 | 0.050 | 0.000 |
| | | | | bb | 0.100 | 0.000 | 0.000 |
| Model 8 | 0.010 | 0.50 | 0.50 | BB | 0.000 | 0.020 | 0.000 |
| | | | | Bb | 0.020 | 0.000 | 0.020 |
| | | | | bb | 0.000 | 0.020 | 0.000 |
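As a worked check of how prevalence relates to the entries of a two-locus model, the sketch below recomputes the prevalence of Models 1 and 4 from their 3 × 3 tables. It rests on assumptions not stated in this record: that the entries are penetrances, that both loci are in Hardy-Weinberg equilibrium, that the loci are independent, and that `a` and `b` are the minor alleles. The function names are hypothetical. Under these assumptions both models come out close to the stated 0.100.

```python
def genotype_freqs(maf):
    """Hardy-Weinberg genotype frequencies, ordered as
    (major homozygote, heterozygote, minor homozygote)."""
    q = maf
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]


def prevalence(penetrance, maf_a, maf_b):
    """Population prevalence of a two-locus model.

    penetrance[i][j] is the disease probability for row genotype i
    (BB, Bb, bb) and column genotype j (AA, Aa, aa). The loci are
    assumed independent, so joint genotype frequencies are products.
    """
    freq_a = genotype_freqs(maf_a)
    freq_b = genotype_freqs(maf_b)
    return sum(
        freq_b[i] * freq_a[j] * penetrance[i][j]
        for i in range(3)
        for j in range(3)
    )


# Values copied from the table for Model 1 and Model 4.
MODEL_1 = [
    [0.087, 0.087, 0.087],
    [0.087, 0.146, 0.190],
    [0.087, 0.190, 0.247],
]
MODEL_4 = [
    [0.052, 0.052, 0.052],
    [0.052, 0.137, 0.137],
    [0.052, 0.137, 0.137],
]

if __name__ == "__main__":
    print(prevalence(MODEL_1, maf_a=0.30, maf_b=0.20))
    print(prevalence(MODEL_4, maf_a=0.50, maf_b=0.50))
```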
Figure 3. Power comparison on 500-SNP datasets.
Figure 4. Power comparison on 5000-SNP datasets.
Mean power of IEACO with its respective standard deviation.
| Dataset | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 |
|---|---|---|---|---|---|---|---|---|
| 500-SNP | 100 ± 0.00 | 92 ± 1.85 | 100 ± 0.00 | 100 ± 0.00 | 97 ± 1.06 | 100 ± 0.00 | 93 ± 2.59 | 94 ± 2.08 |
| 5000-SNP | 70 ± 4.03 | 68 ± 4.20 | 63 ± 2.89 | 61 ± 4.56 | 16 ± 3.03 | 17 ± 3.14 | 20 ± 1.99 | 17 ± 2.05 |
Results of recall (R) and precision (P) on 500-SNP datasets.
| Model | ACO (R) | ACO (P) | AntEpiSeeker (R) | AntEpiSeeker (P) | AntMiner (R) | AntMiner (P) | epiACO (R) | epiACO (P) | IEACO (R) | IEACO (P) |
|---|---|---|---|---|---|---|---|---|---|---|
| Model 1 | 0.74 | 0.74 | 0.89 | 0.76 | | 0.78 | | 0.82 | | |
| Model 2 | 0.77 | 0.83 | 0.81 | 0.84 | | 0.92 | 0.83 | 0.89 | 0.92 | |
| Model 3 | 0.88 | 0.78 | 0.95 | 0.88 | | 0.89 | 0.97 | 0.79 | | |
| Model 4 | 0.90 | 0.89 | 0.87 | 0.91 | 0.96 | 0.82 | | 0.84 | | |
| Model 5 | 0.64 | 0.72 | 0.72 | 0.74 | 0.81 | | 0.96 | 0.80 | | 0.85 |
| Model 6 | 0.78 | 0.82 | | 0.86 | 0.90 | 0.93 | 0.86 | 0.91 | | |
| Model 7 | 0.69 | 0.65 | 0.96 | 0.78 | | 0.81 | 0.95 | 0.77 | 0.93 | |
| Model 8 | 0.84 | 0.91 | 0.88 | 0.85 | 0.82 | 0.84 | | | 0.94 | 0.89 |
Results of recall (R) and precision (P) on 5000-SNP datasets.
| Model | ACO (R) | ACO (P) | AntEpiSeeker (R) | AntEpiSeeker (P) | AntMiner (R) | AntMiner (P) | epiACO (R) | epiACO (P) | IEACO (R) | IEACO (P) |
|---|---|---|---|---|---|---|---|---|---|---|
| Model 1 | 0.34 | 0.54 | 0.48 | 0.61 | 0.57 | 0.70 | 0.62 | 0.72 | | |
| Model 2 | 0.41 | 0.52 | 0.50 | 0.64 | 0.64 | 0.66 | 0.53 | 0.71 | | |
| Model 3 | 0.45 | 0.57 | 0.49 | 0.68 | 0.57 | 0.63 | 0.57 | 0.65 | | |
| Model 4 | 0.36 | 0.48 | 0.51 | 0.55 | 0.46 | 0.59 | 0.59 | 0.64 | | |
| Model 5 | 0.07 | 0.19 | 0.02 | 0.13 | 0.04 | 0.16 | 0.07 | 0.17 | | |
| Model 6 | 0.00 | 0.00 | 0.05 | 0.17 | 0.11 | 0.38 | 0.11 | 0.27 | | |
| Model 7 | 0.05 | 0.45 | 0.06 | 0.40 | 0.15 | 0.48 | 0.13 | 0.55 | | |
| Model 8 | 0.12 | 0.63 | 0.06 | 0.56 | 0.12 | 0.44 | 0.14 | 0.50 | | |
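For readers unfamiliar with the two metrics, recall is the fraction of true interactions that a method reports, and precision is the fraction of reported interactions that are true. This record does not say how the paper counts true positives across datasets, so the sketch below only illustrates the standard definitions on a single run; the function name and the SNP labels are hypothetical.

```python
def recall_precision(detected, truth):
    """Recall and precision for reported SNP pairs.

    Pairs are compared as unordered sets, so ("rs1", "rs2") and
    ("rs2", "rs1") count as the same interaction.
    """
    detected_set = {frozenset(pair) for pair in detected}
    truth_set = {frozenset(pair) for pair in truth}
    true_positives = len(detected_set & truth_set)
    recall = true_positives / len(truth_set) if truth_set else 0.0
    precision = true_positives / len(detected_set) if detected_set else 0.0
    return recall, precision


if __name__ == "__main__":
    # One simulated interaction; the method reports it plus one false pair.
    truth = [("rs2", "rs1")]
    detected = [("rs1", "rs2"), ("rs3", "rs4")]
    print(recall_precision(detected, truth))
```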
Adjusted p-values on 500-SNP datasets.
| Hypothesis | Unadjusted | | | |
|---|---|---|---|---|
| IEACO vs ACO | 3.743122471085636E-4 | 0.00149724898843425 | 0.00149724898843425 | 0.00149724898843425 |
| IEACO vs AntEpiSeeker | 0.0577795711235972 | 0.17333871337079185 | 0.17333871337079185 | 0.17333871337079185 |
| IEACO vs AntMiner | 0.5270892568655381 | 1.0541785137310762 | 0.5270892568655381 | 0.5270892568655381 |
| IEACO vs epiACO | 0.5270892568655381 | 1.0541785137310762 | 0.5270892568655381 | 0.5270892568655381 |
Adjusted p-values on 5000-SNP datasets.
| Hypothesis | Unadjusted | | | |
|---|---|---|---|---|
| IEACO vs ACO | 9.546919845278683E-6 | 3.818767938111473E-5 | 3.818767938111473E-5 | 3.818767938111473E-5 |
| IEACO vs AntEpiSeeker | 1.478023103344183E-4 | 4.434069310032551E-4 | 4.434069310032551E-4 | 4.434069310032551E-4 |
| IEACO vs AntMiner | 0.01770606580736659 | 0.03541213161473319 | 0.03541213161473319 | 0.03541213161473319 |
| IEACO vs epiACO | 0.03983261924474151 | 0.03983261924474151 | 0.03983261924474151 | 0.03983261924474151 |
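The names of the adjustment procedures are missing from the column headers in this record. In the 5000-SNP table, the first adjusted column equals the sorted unadjusted values multiplied by 4, 3, 2, and 1, which is what Holm's step-down procedure produces for four hypotheses. Treating that column as a Holm adjustment is therefore an inference, not something the record states. A minimal sketch of the standard Holm adjustment, with a hypothetical function name:

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order.

    The i-th smallest p-value (i starting at 0) is multiplied by (m - i),
    capped at 1, and made monotone non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Unadjusted p-values from the 5000-SNP table (IEACO vs ACO,
    # AntEpiSeeker, AntMiner, epiACO).
    unadjusted = [
        9.546919845278683e-6,
        1.478023103344183e-4,
        0.01770606580736659,
        0.03983261924474151,
    ]
    for p in holm_adjust(unadjusted):
        print(p)
```

The standard procedure caps adjusted values at 1, so it will not reproduce the value above 1 that appears in the 500-SNP table.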
Results of running time on 500-SNP datasets.
| Method | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 |
|---|---|---|---|---|---|---|---|---|
| ACO | 12.3 ± 0.4 | 14.1 ± 0.3 | 19.8 ± 4 | 10.3 ± 0.5 | 18.0 ± 0.4 | 19.1 ± 0.5 | 17.7 ± 0.5 | 11.7 ± 0.3 |
| AntEpiSeeker | 28.2 ± 0.5 | 27.3 ± 0.5 | 28.0 ± 0.6 | 30.3 ± 0.7 | 29.4 ± 0.6 | 36.8 ± 0.8 | 38.2 ± 0.8 | 34.5 ± 0.7 |
| AntMiner | 108.9 ± 4.1 | 123.4 ± 3.3 | 98.9 ± 3.4 | 100.6 ± 3.8 | 112.3 ± 4.4 | 109.9 ± 3.1 | 131.2 ± 4.6 | 132.0 ± 3.9 |
| epiACO | 25.6 ± 0.8 | 24.9 ± 0.5 | 22.2 ± 0.5 | 23.1 ± 0.6 | 20.4 ± 0.7 | 29.0 ± 0.7 | 28.8 ± 0.7 | 23.2 ± 0.6 |
| IEACO | 17.7 ± 0.6 | 18.1 ± 0.4 | 16.5 ± 0.4 | 16.8 ± 0.5 | 21.2 ± 0.5 | 22.5 ± 0.5 | 23.1 ± 0.6 | 17.0 ± 0.6 |
Results of running time on 5000-SNP datasets.
| Method | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 |
|---|---|---|---|---|---|---|---|---|
| ACO | 84.7 ± 4.5 | 82.3 ± 5.1 | 78.9 ± 3.4 | 80.5 ± 4.2 | 83.4 ± 3.9 | 85.1 ± 4.5 | 79.3 ± 3.6 | 80.1 ± 4.1 |
| AntEpiSeeker | 187.8 ± 8.9 | 182.3 ± 9.1 | 178.9 ± 7.8 | 179.9 ± 8.1 | 186.7 ± 8.4 | 179.0 ± 6.9 | 170.3 ± 6.3 | 177.4 ± 7.2 |
| AntMiner | 824.1 ± 67.4 | 865.2 ± 76.8 | 778.4 ± 57.3 | 894.3 ± 62.6 | 842.2 ± 63.1 | 811.5 ± 60.7 | 870.1 ± 79.2 | 888.3 ± 80.3 |
| epiACO | 126.5 ± 74 | 119.7 ± 8.1 | 116.7 ± 7.0 | 115.0 ± 6.8 | 133.6 ± 5.7 | 130.2 ± 6.2 | 123.1 ± 5.4 | 120.4 ± 6.0 |
| IEACO | 93.8 ± 5.2 | 91.4 ± 4.2 | 90.3 ± 3.7 | 94.1 ± 4.1 | 97.9 ± 5.1 | 96.5 ± 5.0 | 89.4 ± 4.4 | 92.3 ± 3.9 |
Figure 5. Performance comparison of multiple epistasis detection.
Experimental results of age-related macular degeneration (AMD) data identified by IEACO.
| SNP 1 | Chromosome | Gene | SNP 2 | Chromosome | Gene |
|---|---|---|---|---|---|
| rs380390 | 1 | CFH | rs1363688 | 5 | N/A |
| rs380390 | 1 | CFH | rs2402053 | 7 | N/A |
| rs380390 | 1 | CFH | rs2224762 | 9 | KDM4C |
| rs1329428 | 1 | CFH | rs2113379 | 2 | ADAM23 |
| rs1329428 | 1 | CFH | rs3922799 | 2 | N/A |
| rs1329428 | 1 | CFH | rs1822657 | 21 | NCAM2 |
| rs1394608 | 5 | SGCD | rs1740752 | 10 | N/A |
| rs994542 | 6 | N/A | rs9298846 | 9 | N/A |
| rs1740752 | 10 | N/A | rs1368863 | 11 | N/A |