Yijun Gu, Yan Sun, Junliang Shang, Feng Li, Boxin Guan, Jin-Xing Liu.
Abstract
In genome-wide association studies, epistasis detection is of great significance for understanding the occurrence and diagnosis of complex human diseases, but it faces challenges such as high dimensionality and small sample sizes. To cope with these challenges, several swarm intelligence methods have been introduced in recent years to identify epistasis. However, the existing methods still have limitations, such as high resource consumption and premature convergence. In this study, we propose a multi-objective artificial bee colony (ABC) algorithm based on the scale-free network (SFMOABC). The SFMOABC incorporates the scale-free network into the ABC algorithm to guide the update and selection of solutions. In addition, the SFMOABC uses mutual information and the K2-Score of the Bayesian network as objective functions, and an opposition-based learning strategy is used to improve the search ability. Experiments were performed on both simulation datasets and a real dataset of age-related macular degeneration (AMD). The simulation results showed that the SFMOABC has better detection power and efficiency than seven other epistasis detection methods. In the real AMD data experiment, most of the single nucleotide polymorphism combinations detected by the SFMOABC have been shown to be associated with AMD. Therefore, SFMOABC is a promising method for epistasis detection.
Keywords: artificial bee colony; complex disease; epistasis detection; scale-free network; single nucleotide polymorphism
Year: 2022 PMID: 35627256 PMCID: PMC9140669 DOI: 10.3390/genes13050871
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
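The abstract names mutual information and the K2-Score as the two objective functions but gives no formulas. The following is a minimal sketch of how such scores are commonly computed for one SNP combination, not the authors' implementation. It assumes genotypes coded as 0/1/2, a binary case/control phenotype, and the K2-Score expressed as the negative logarithm of the K2 metric (so lower is better); the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(genotypes, labels):
    """MI (in nats) between the joint genotype of a SNP combination and the phenotype."""
    n = len(labels)
    combos = [tuple(g) for g in genotypes]

    def entropy(items):
        # Shannon entropy of the empirical distribution of `items`.
        return -sum((c / n) * math.log(c / n) for c in Counter(items).values())

    # MI(X; Y) = H(X) + H(Y) - H(X, Y)
    return entropy(combos) + entropy(labels) - entropy(list(zip(combos, labels)))


def k2_score(genotypes, labels, n_states=2):
    """Negative log of the K2 metric (assumed form); lower suggests stronger association."""
    combos = [tuple(g) for g in genotypes]
    per_combo = Counter(combos)              # n_i: samples per genotype combination
    per_cell = Counter(zip(combos, labels))  # n_ij: samples per (combination, phenotype)
    score = sum(math.lgamma(n_i + n_states) - math.lgamma(n_states)
                for n_i in per_combo.values())
    score -= sum(math.lgamma(n_ij + 1) for n_ij in per_cell.values())
    return score


# Toy data (hypothetical): two SNPs, eight samples.
toy_genotypes = [(0, 0)] * 4 + [(1, 1)] * 4
associated_labels = [0, 0, 0, 0, 1, 1, 1, 1]   # phenotype follows genotype exactly
unrelated_labels = [0, 0, 1, 1, 0, 0, 1, 1]    # phenotype independent of genotype

mi_associated = mutual_information(toy_genotypes, associated_labels)
mi_unrelated = mutual_information(toy_genotypes, unrelated_labels)
k2_associated = k2_score(toy_genotypes, associated_labels)
k2_unrelated = k2_score(toy_genotypes, unrelated_labels)
```

Under these assumptions the two objectives point in opposite directions (mutual information is maximized, this form of the K2-Score is minimized), which is one reason a multi-objective search has to reconcile them rather than simply sum them.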
Figure 1. Comparison diagram of the scale-free network and the random network: (a) visualization of the scale-free network; (b) the degree distribution curve of nodes in the scale-free network; (c) visualization of the random network; (d) the degree distribution curve of nodes in the random network.
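The contrast in Figure 1 can be reproduced in spirit with a few lines of code. The sketch below grows a scale-free network by preferential attachment (the Barabási–Albert model, chosen here as an assumption; the record does not say which generator the authors used) and compares its degrees with those of a random graph having the same number of nodes and edges. All names and parameter values are illustrative.

```python
import random
from collections import Counter


def barabasi_albert(n, m, rng):
    """Grow a graph by preferential attachment; returns a set of undirected edges."""
    edges = set()
    # Seed: a small complete graph on m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.add((u, v))
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` picks a node with probability proportional to its degree.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.add((t, new))
            stubs.extend((t, new))
    return edges


def random_graph(n, n_edges, rng):
    """Random graph with a fixed number of uniformly chosen edges."""
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    return edges


def degrees(n, edges):
    """Degree of every node 0..n-1 (isolated nodes get degree 0)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[i] for i in range(n)]


rng = random.Random(0)
n_nodes, m_links = 1000, 2
sf_edges = barabasi_albert(n_nodes, m_links, rng)
rnd_edges = random_graph(n_nodes, len(sf_edges), rng)
sf_deg = degrees(n_nodes, sf_edges)
rnd_deg = degrees(n_nodes, rnd_edges)
print("max degree, scale-free:", max(sf_deg), "random:", max(rnd_deg))
```

The scale-free graph typically shows a few very high-degree hubs alongside many low-degree nodes, whereas degrees in the random graph cluster around the mean.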
Figure 2. Framework of SFMOABC. The numbers in the figure represent the sequence numbers corresponding to the SNP combinations sorted in descending order according to the fitness value.
Details of the epistatic models.
| Model | AABB | AABb | AAbb | AaBB | AaBb | Aabb | aaBB | aaBb | aabb |
|---|---|---|---|---|---|---|---|---|---|
| Model 1 | 0.087 | 0.087 | 0.087 | 0.087 | 0.146 | 0.190 | 0.087 | 0.190 | 0.247 |
| Model 2 | 0.078 | 0.078 | 0.078 | 0.078 | 0.105 | 0.122 | 0.078 | 0.122 | 0.142 |
| Model 3 | 0.009 | 0.009 | 0.009 | 0.013 | 0.006 | 0.006 | 0.013 | 0.006 | 0.006 |
| Model 4 | 0.092 | 0.092 | 0.092 | 0.092 | 0.319 | 0.319 | 0.092 | 0.319 | 0.319 |
| Model 5 | 0.084 | 0.084 | 0.084 | 0.084 | 0.210 | 0.210 | 0.084 | 0.210 | 0.210 |
| Model 6 | 0.052 | 0.052 | 0.052 | 0.052 | 0.137 | 0.137 | 0.052 | 0.137 | 0.137 |
| Model 7 | 0.072 | 0.164 | 0.164 | 0.164 | 0.072 | 0.072 | 0.164 | 0.072 | 0.072 |
| Model 8 | 0.067 | 0.155 | 0.155 | 0.155 | 0.067 | 0.067 | 0.155 | 0.067 | 0.067 |
| Model 9 | 0.486 | 0.960 | 0.538 | 0.947 | 0.004 | 0.811 | 0.640 | 0.606 | 0.909 |
| Model 10 | 0.103 | 0.063 | 0.124 | 0.098 | 0.086 | 0.069 | 0.021 | 0.147 | 0.059 |
| Model 11 | 0.000 | 0.000 | 0.000 | 0.000 | 0.050 | 0.000 | 0.100 | 0.000 | 0.000 |
| Model 12 | 0.000 | 0.020 | 0.000 | 0.020 | 0.000 | 0.020 | 0.000 | 0.020 | 0.000 |
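As an illustration of how a table like this is used, the sketch below treats the Model 1 entries as penetrances, that is, the probability of disease given a two-locus genotype. That reading is an assumption, since the record does not state what the values represent. It further assumes Hardy-Weinberg equilibrium, independent loci, lowercase letters as minor alleles, and a hypothetical minor allele frequency of 0.2.

```python
# Model 1 row from the table, keyed by (genotype at SNP A, genotype at SNP B).
MODEL_1 = {
    ("AA", "BB"): 0.087, ("AA", "Bb"): 0.087, ("AA", "bb"): 0.087,
    ("Aa", "BB"): 0.087, ("Aa", "Bb"): 0.146, ("Aa", "bb"): 0.190,
    ("aa", "BB"): 0.087, ("aa", "Bb"): 0.190, ("aa", "bb"): 0.247,
}


def hwe_frequencies(maf, major, minor):
    """Genotype frequencies under Hardy-Weinberg equilibrium."""
    q = maf
    p = 1.0 - q
    return {major * 2: p * p, major + minor: 2 * p * q, minor * 2: q * q}


def prevalence(penetrance, maf_a, maf_b):
    """Population disease prevalence implied by a penetrance table."""
    fa = hwe_frequencies(maf_a, "A", "a")
    fb = hwe_frequencies(maf_b, "B", "b")
    return sum(fa[ga] * fb[gb] * pen for (ga, gb), pen in penetrance.items())


def marginal_penetrance_a(penetrance, maf_b):
    """Penetrance of each SNP A genotype, averaged over SNP B genotypes."""
    fb = hwe_frequencies(maf_b, "B", "b")
    return {ga: sum(fb[gb] * penetrance[(ga, gb)] for gb in fb)
            for ga in ("AA", "Aa", "aa")}


ASSUMED_MAF = 0.2  # hypothetical value, not taken from the source
model_1_prevalence = prevalence(MODEL_1, ASSUMED_MAF, ASSUMED_MAF)
model_1_marginals = marginal_penetrance_a(MODEL_1, ASSUMED_MAF)
```

Comparing the marginal penetrances across genotypes of one SNP gives a quick indication of whether a model carries marginal effects or is detectable only through the joint genotype.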
Figure 3. Power of methods based on the swarm intelligence algorithm: (a) power on the small-scale datasets; (b) power on the large-scale datasets.
Figure 4. Power of the single-objective ABC algorithms: (a) power on the small-scale datasets; (b) power on the large-scale datasets.
Figure 5. F-measure of methods based on the swarm intelligence algorithm: (a) F-measure on the small-scale datasets; (b) F-measure on the large-scale datasets.
Figure 6. F-measure of single-objective ABC algorithms: (a) F-measure on the small-scale datasets; (b) F-measure on the large-scale datasets.
Figure 7. Running time: (a) running time on the small-scale datasets; (b) running time on the large-scale datasets.
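The record does not define the power and F-measure plotted in Figures 3 to 6. The sketch below uses definitions that are common in epistasis benchmarks and should be read as an assumption: power is the fraction of simulated datasets in which the true SNP pair is reported, and the F-measure is the harmonic mean of precision and recall over all reported pairs. The data and names are hypothetical.

```python
def detection_power(results, truth):
    """Fraction of datasets whose reported set contains the true SNP pair."""
    hits = sum(1 for detected in results if truth in detected)
    return hits / len(results)


def f_measure(results, truth):
    """Harmonic mean of precision and recall, pooled over all datasets."""
    tp = sum(1 for detected in results if truth in detected)
    fp = sum(len(detected) - (truth in detected) for detected in results)
    fn = len(results) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical output of a detector on four simulated datasets.
true_pair = frozenset({"snp_a", "snp_b"})
false_pair = frozenset({"snp_c", "snp_d"})
toy_results = [
    {true_pair},              # correct, nothing spurious
    {true_pair, false_pair},  # correct plus one false positive
    {false_pair},             # missed, one false positive
    set(),                    # nothing reported
]

toy_power = detection_power(toy_results, true_pair)
toy_f = f_measure(toy_results, true_pair)
```

Power alone rewards a method that reports many candidates; the F-measure penalizes the false positives that come with doing so, which is presumably why both are reported.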
Top 15 Captured Epistatic Interactions Associated with AMD.
| SNP1 Name | SNP1 Gene | SNP1 Chr | SNP2 Name | SNP2 Gene | SNP2 Chr | Fitness Value | Fitness Value |
|---|---|---|---|---|---|---|---|
| rs380390 | | 1 | rs1363688 | | 5 | 39.16 | 1.5453 × 10⁻⁹ |
| rs380390 | | 1 | rs2402053 | | 7 | 37.72 | 1.4679 × 10⁻⁸ |
| rs380390 | | 1 | rs1374431 | | 2 | 37.19 | 2.6240 × 10⁻⁸ |
| rs1329428 | | 1 | rs9328536 | | 9 | 36.81 | 3.0901 × 10⁻⁸ |
| rs380390 | | 1 | rs2380684 | | 2 | 36.00 | 3.9086 × 10⁻⁸ |
| rs380390 | | 1 | rs3009336 | | 1 | 34.59 | 5.3535 × 10⁻⁸ |
| rs380390 | | 1 | rs555174 | | 21 | 34.14 | 5.7995 × 10⁻⁸ |
| rs380390 | | 1 | rs2794520 | | 1 | 33.56 | 6.7417 × 10⁻⁸ |
| rs380390 | | 1 | rs10508731 | | 10 | 33.56 | 7.2188 × 10⁻⁸ |
| rs380390 | | 1 | rs1740752 | | 10 | 33.50 | 1.0917 × 10⁻⁷ |
| rs1329428 | | 1 | rs10489076 | | 4 | 33.43 | 1.9228 × 10⁻⁷ |
| rs1329428 | | 1 | rs3913094 | | 12 | 33.16 | 4.2460 × 10⁻⁷ |
| rs380390 | | 1 | rs724972 | | 3 | 33.02 | 4.4484 × 10⁻⁷ |
| rs1329428 | | 1 | rs724972 | | 3 | 33.02 | 1.3223 × 10⁻⁶ |
| rs1329428 | | 1 | rs2466215 | | 8 | 32.35 | 1.7865 × 10⁻⁶ |
Figure 8. Epistasis network of AMD.
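An epistasis network of this kind can be assembled directly from detected SNP pairs. The sketch below builds an adjacency map from the 15 pairs listed in the AMD interaction table and ranks SNPs by degree. It is only an illustration using the listed pairs; the published figure is not reproduced here and may include further interactions.

```python
from collections import defaultdict

# The 15 SNP pairs from the AMD interaction table, in table order.
AMD_PAIRS = [
    ("rs380390", "rs1363688"), ("rs380390", "rs2402053"),
    ("rs380390", "rs1374431"), ("rs1329428", "rs9328536"),
    ("rs380390", "rs2380684"), ("rs380390", "rs3009336"),
    ("rs380390", "rs555174"), ("rs380390", "rs2794520"),
    ("rs380390", "rs10508731"), ("rs380390", "rs1740752"),
    ("rs1329428", "rs10489076"), ("rs1329428", "rs3913094"),
    ("rs380390", "rs724972"), ("rs1329428", "rs724972"),
    ("rs1329428", "rs2466215"),
]


def build_network(pairs):
    """Undirected adjacency map: SNP -> set of interacting SNPs."""
    adjacency = defaultdict(set)
    for snp1, snp2 in pairs:
        adjacency[snp1].add(snp2)
        adjacency[snp2].add(snp1)
    return adjacency


amd_network = build_network(AMD_PAIRS)
amd_degree = {snp: len(neighbors) for snp, neighbors in amd_network.items()}

# List SNPs from most to least connected.
for snp, deg in sorted(amd_degree.items(), key=lambda item: -item[1]):
    print(snp, deg)
```

In this subset, rs380390 and rs1329428 act as hubs, and rs724972 is the only other SNP linked to both of them.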