Zejun Li, Li Ang, Wei Shi, Ning Xin, Min Chen, Hua Tang.
Abstract
A single-nucleotide polymorphism (SNP) is the replacement of a single nucleotide in a deoxyribonucleic acid (DNA) sequence and is often linked to the development of specific diseases. Although current genotyping methods can tag SNP loci within biological samples to provide genetic information associated with a disease, they have limited prediction accuracy. Furthermore, they are complex to perform and may predict an excessive number of tag SNP loci, not all of which are associated with the disease. Therefore, in this study, we evaluated the impact of a newly optimized fuzzy clustering and binary particle swarm optimization algorithm (FCBPSO) on the accuracy and running time of informative SNP selection. Fuzzy clustering and FCBPSO were first applied to identify the equivalence relation and the candidate tag SNP set, reducing the redundancy between loci. The FCBPSO algorithm was then optimized and used to obtain the final tag SNP set. The prediction performance and running time of the newly developed model were compared with those of other traditional methods, including NMC, SPSO, and MCMR. The prediction accuracy of FCBPSO was generally higher than that of the other algorithms, especially as the number of tag SNPs increased; when the number of tag SNPs was low, it was slightly lower than that of MCMR. The running time of FCBPSO, however, was always lower than that of MCMR. FCBPSO not only reduced the size and dimension of the optimization problem but also simplified the training of the prediction model. This improved the prediction accuracy of the model and reduced the running time compared with other traditional methods.
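The binary particle swarm step mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard sigmoid-based binary PSO update, not necessarily the optimized variant the authors describe, and it omits the fuzzy clustering stage entirely. The function names, parameter values, the toy genotype data, and the fitness function (mean best absolute correlation with a selected tag SNP, minus a size penalty) are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coverage_fitness(bits, corr, penalty=0.01):
    """Toy fitness (an assumption, not the paper's): average over all SNPs of the
    best absolute correlation with any selected tag SNP, minus a size penalty."""
    mask = bits.astype(bool)
    if not mask.any():
        return -1.0  # an empty tag set is treated as the worst case
    best = corr[:, mask].max(axis=1)
    return float(best.mean() - penalty * mask.sum())

def binary_pso(corr, n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5,
               v_max=4.0, seed=0):
    """Standard binary PSO: each bit marks whether a SNP is chosen as a tag."""
    rng = np.random.default_rng(seed)
    n = corr.shape[0]
    pos = (rng.random((n_particles, n)) < 0.5).astype(int)
    vel = rng.uniform(-1.0, 1.0, (n_particles, n))
    pbest = pos.copy()
    pbest_fit = np.array([coverage_fitness(p, corr) for p in pos])
    g = int(pbest_fit.argmax())
    gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    for _ in range(n_iter):
        r1 = rng.random((n_particles, n))
        r2 = rng.random((n_particles, n))
        # Velocity update pulls each particle toward its own and the global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -v_max, v_max)
        # Sigmoid of the velocity gives the probability that a bit is set to 1.
        pos = (rng.random((n_particles, n)) < sigmoid(vel)).astype(int)
        fit = np.array([coverage_fitness(p, corr) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(pbest_fit.argmax())
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    return gbest.astype(bool), gbest_fit

def toy_correlation(seed=1):
    """Hypothetical data: 60 samples, 3 independent SNPs each duplicated 4 times."""
    rng = np.random.default_rng(seed)
    base = rng.integers(0, 3, size=(60, 3))
    genotypes = np.repeat(base, 4, axis=1)
    return np.abs(np.corrcoef(genotypes.T))

corr = toy_correlation()
tags, score = binary_pso(corr)
print("selected tag SNP indices:", np.flatnonzero(tags), "fitness:", round(score, 3))
```

In this toy setting the redundancy is built in by duplicating columns, so a small tag set can represent all twelve SNPs. In the paper, the candidate set is narrowed by fuzzy clustering first, which is what reduces the dimension of the search that the swarm has to perform.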
Year: 2022 PMID: 35756402 PMCID: PMC9225903 DOI: 10.1155/2022/3837579
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.809
Figure 1. FCBPSO algorithm flow chart.
The size of experimental datasets.
| Name | Number of SNPs | Number of samples |
|---|---|---|
| ENm013 | 360 | 120 |
| ENr112 | 411 | 120 |
| ENr113 | 514 | 120 |
Figure 2. Comparison of prediction accuracy. (a) The prediction accuracy results of the NMC, SPSO, MCMR, and FCBPSO algorithms for the ENm013 dataset. (b) The prediction accuracy results of the NMC, SPSO, MCMR, and FCBPSO algorithms for the ENr112 dataset. (c) The prediction accuracy results of the SPSO, MCMR, and FCBPSO algorithms for the ENr113 dataset.
Running time comparison of the MCMR and FCBPSO algorithms for the ENm013, ENr112, and ENr113 datasets.
| No. | ENm013 (MCMR) | ENm013 (FCBPSO) | ENr112 (MCMR) | ENr112 (FCBPSO) | ENr113 (MCMR) | ENr113 (FCBPSO) |
|---|---|---|---|---|---|---|
| 2 | 11100 | 22.61 | 3100 | 20.81 | 19500 | 39.59 |
| 4 | 11000 | 17.51 | 3000 | 30.46 | 18000 | 53.34 |
| 6 | 10300 | 16.69 | 2700 | 30.75 | 17000 | 59.06 |
| 7 | 9600 | 18.64 | 1500 | 34.33 | 14800 | 57.22 |
| 10 | 8600 | 17.56 | 800 | 35.97 | 12000 | 58.16 |
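To make the size of the running-time gap in the table above concrete, the short sketch below computes the ratio of MCMR to FCBPSO running time for each row. The values are copied from the table as given; the time unit is not stated in this excerpt, so only the unitless ratios are computed, and the variable names are illustrative.

```python
# Running times copied from the table above (unit not stated in this excerpt).
row_labels = [2, 4, 6, 7, 10]  # the table's "No." column
mcmr = {
    "ENm013": [11100, 11000, 10300, 9600, 8600],
    "ENr112": [3100, 3000, 2700, 1500, 800],
    "ENr113": [19500, 18000, 17000, 14800, 12000],
}
fcbpso = {
    "ENm013": [22.61, 17.51, 16.69, 18.64, 17.56],
    "ENr112": [20.81, 30.46, 30.75, 34.33, 35.97],
    "ENr113": [39.59, 53.34, 59.06, 57.22, 58.16],
}

# Ratio of MCMR to FCBPSO running time for every dataset and row.
speedup = {
    name: [m / f for m, f in zip(mcmr[name], fcbpso[name])]
    for name in mcmr
}

for name, ratios in speedup.items():
    for label, ratio in zip(row_labels, ratios):
        print(f"{name}  No. {label:>2}: MCMR / FCBPSO = {ratio:.1f}")
```

Every ratio is well above 1, which is consistent with the abstract's statement that the running time of FCBPSO was always lower than that of MCMR.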