Zhaomin Yao, Gancheng Zhu, Jingwei Too, Meiyu Duan, Zhiguo Wang.
Abstract
OMIC datasets are high-dimensional, and the relationships among OMIC features are complex, which makes it difficult to link these features to biological traits of interest. The proposed ensemble swarm intelligence-based approaches identify key biomarkers and reduce feature dimensionality efficiently. The method is end-to-end: it relies only on the rules of the algorithm itself and needs no presets such as the number of features to filter. It also achieves good classification accuracy without excessive consumption of computing resources.
Keywords: feature selection (FS); intersection and union combination; methylation data; swarm intelligence (SI); transcriptome data
Year: 2022 PMID: 35350819 PMCID: PMC8957794 DOI: 10.3389/fgene.2021.793629
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. Overview of the proposed methodology.
FIGURE 2. Feature subset combination. M1, M2, and M3 represent the feature subsets extracted by three different methods. The green and yellow regions represent the combination results obtained by intersection and union, respectively.
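The intersection and union combination described in Figure 2 can be sketched with Python sets. This is a minimal illustration and not the authors' implementation: the function name `combine_subsets` and the feature indices are hypothetical, and `m1`, `m2`, and `m3` stand in for the subsets M1, M2, and M3 selected by three methods.

```python
from functools import reduce


def combine_subsets(subsets):
    """Return (intersection, union) of an iterable of feature-index sets."""
    subsets = [set(s) for s in subsets]
    if not subsets:
        return set(), set()
    intersection = reduce(set.intersection, subsets)
    union = reduce(set.union, subsets)
    return intersection, union


# Hypothetical feature indices selected by three methods (M1, M2, M3).
m1 = {0, 2, 5, 7}
m2 = {2, 3, 5, 8}
m3 = {1, 2, 5, 9}

common, merged = combine_subsets([m1, m2, m3])
print(sorted(common))  # features chosen by all three methods
print(sorted(merged))  # features chosen by at least one method
```

The intersection keeps only features that every method agrees on, so it is the stricter of the two combinations; the union keeps any feature selected by at least one method.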
FIGURE 3. Convergence speed of the top three swarm intelligence algorithms on T1D.
FIGURE 4. Performance of three swarm intelligence algorithms on methylation datasets.
Descriptive statistics of the results on methylation datasets.
| Methods | Sample number | Average (%) | Standard deviation | Min (%) | Max (%) |
|---|---|---|---|---|---|
| SMA | 10 | 80.44 | 11.62 | 65.91 | 98.72 |
| PFA | 10 | 80.30 | 15.98 | 60.13 | 100.00 |
| HGSO | 10 | 81.55 | 14.17 | 56.73 | 98.90 |
Wilcoxon signed ranks test.
| Comparison | | | |
|---|---|---|---|
| PFA versus SMA | 4 | 6 | 0.721 |
| HGSO versus SMA | 5 | 5 | 0.959 |
| HGSO versus PFA | 5 | 5 | 0.878 |
FIGURE 5. Performance of feature intersection and union combination on methylation datasets.
Feature selection rates of all used feature subsets on methylation datasets.
| Data | Solo: SMA (%) | Solo: PFA (%) | Solo: HGSO (%) | Intersection: SMA and PFA (%) | Intersection: SMA and HGSO (%) | Intersection: PFA and HGSO (%) | Intersection: SMA and PFA and HGSO (%) | Union: SMA and PFA (%) | Union: SMA and HGSO (%) | Union: PFA and HGSO (%) | Union: SMA and PFA and HGSO (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GSE103186 | 0.0338 | 49.7048 | 0.6381 | 0.0154 | 0.0002 | 0.3184 | 0.0002 | 49.7232 | 0.6716 | 50.0245 | 50.0428 |
| GSE139032 | 0.0218 | 49.8948 | 0.0181 | 0.0145 | — | 0.0109 | — | 49.9021 | 0.0399 | 49.9021 | 49.9093 |
| GSE139404 | 0.0009 | 49.7509 | 0.0328 | 0.0004 | — | 0.0149 | — | 49.7513 | 0.0336 | 49.7688 | 49.7692 |
| GSE144910 | 0.0004 | 49.9814 | 0.0046 | 0.0001 | — | 0.0018 | — | 49.9816 | 0.0049 | 49.9841 | 49.9844 |
| GSE164269 | 0.0044 | 49.9655 | 0.7131 | 0.0022 | — | 0.3630 | — | 49.9677 | 0.7175 | 50.3156 | 50.3178 |
| GSE166787 | 0.0017 | 49.6841 | 0.0111 | 0.0009 | — | 0.0059 | — | 49.6849 | 0.0129 | 49.6893 | 49.6902 |
| GSE173330 | 0.0160 | 48.7964 | 0.3728 | 0.0107 | — | 0.1651 | — | 48.8017 | 0.3888 | 49.0041 | 49.0094 |
| GSE174613 | 0.0008 | 49.4005 | 0.0066 | — | — | 0.0049 | — | 49.4014 | 0.0074 | 49.4022 | 49.4030 |
| GSE74845 | 0.0023 | 49.9412 | 0.1849 | 0.0011 | — | 0.0933 | — | 49.9425 | 0.1873 | 50.0328 | 50.0341 |
| GSE80970 | 0.1564 | 49.9624 | 0.6070 | 0.0871 | 0.0007 | 0.3080 | 0.0005 | 50.0317 | 0.7626 | 50.2614 | 50.3304 |
| Average | 0.0238 | 49.7082 | 0.2589 | 0.0147 | 0.0005 | 0.1286 | 0.0004 | 49.7188 | 0.2827 | 49.8385 | 49.8491 |
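The solo, intersection, and union rates in this table are linked by inclusion-exclusion: for two subsets, the union rate equals the sum of the two solo rates minus the intersection rate. The sketch below checks this on three rows copied from the table, assuming each rate is the share of all features that falls in the given subset. The function name `union_rate` and the tolerance are illustrative choices, and small differences are expected because the published values are rounded to four decimals.

```python
def union_rate(rate_a, rate_b, rate_intersection):
    """Union selection rate (%) of two feature subsets via inclusion-exclusion."""
    return rate_a + rate_b - rate_intersection


# (SMA %, PFA %, SMA-and-PFA intersection %, reported SMA-and-PFA union %)
rows = {
    "GSE103186": (0.0338, 49.7048, 0.0154, 49.7232),
    "GSE139032": (0.0218, 49.8948, 0.0145, 49.9021),
    "GSE80970": (0.1564, 49.9624, 0.0871, 50.0317),
}

TOLERANCE = 2e-4  # assumed allowance for rounding in the published table

results = {}
for name, (sma, pfa, both, reported) in rows.items():
    computed = union_rate(sma, pfa, both)
    results[name] = abs(computed - reported) <= TOLERANCE
    print(f"{name}: computed={computed:.4f} reported={reported:.4f}")
```

Because PFA alone selects roughly half of all features in every dataset, any union that includes PFA stays close to 50%, while the intersections remain far smaller.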
FIGURE 6. Performance of feature intersection and union combination on methylation datasets.