Jing Yu, Hang Li, Desheng Liu.
Abstract
Medical data are distinctive and complex, and big data clustering plays a significant role in medicine. Traditional clustering algorithms easily fall into local extrema, which produces clustering deviation and poor clustering results. To overcome these disadvantages, we propose a new medical big data clustering algorithm based on a modified immune evolutionary method in a cloud computing environment. First, we analyze the big data structure model in the cloud computing environment. Second, we describe the modified immune evolutionary method for clustering medical data in detail, including encoding, constructing the fitness function, and selecting genetic operators. Finally, experiments show that the new approach improves the accuracy of data classification, reduces the error rate, and improves the performance of data mining and feature extraction for medical data clustering.
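The three steps named in the abstract (encoding, fitness function, genetic operators) can be illustrated with a minimal sketch. This is not the authors' MIEA: the centroid-list encoding, the inverse-SSE fitness, the clone-and-mutate operator loosely modeled on clonal selection, and every function and parameter name below are assumptions chosen for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fitness(centroids, data):
    """Assumed fitness: inverse of the total within-cluster squared distance."""
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in data)
    return 1.0 / (1.0 + sse)


def mutate(centroids, rng, step):
    """Gaussian perturbation of every centroid coordinate."""
    return [[x + rng.gauss(0.0, step) for x in c] for c in centroids]


def evolve_clusters(data, k, pop_size=20, generations=60, clones=3,
                    scale=0.5, seed=0):
    """Evolve a set of k centroids; returns the fittest individual found."""
    rng = random.Random(seed)
    # Encoding (assumed): an individual is a list of k centroids,
    # initialised from randomly chosen data points.
    population = [[list(p) for p in rng.sample(data, k)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population.
        population.sort(key=lambda ind: fitness(ind, data), reverse=True)
        elite = population[: pop_size // 2]
        # Cloning and mutation: weaker individuals get larger mutation steps.
        offspring = []
        for rank, ind in enumerate(elite):
            step = scale * (rank + 1) / len(elite)
            for _ in range(clones):
                offspring.append(mutate(ind, rng, step))
        # Elitism: parents and mutated clones compete for survival,
        # so the best fitness never decreases between generations.
        population = sorted(elite + offspring,
                            key=lambda ind: fitness(ind, data),
                            reverse=True)[:pop_size]
    return population[0]


def assign(data, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in data]
```

Calling `evolve_clusters(data, k)` and then `assign(data, best)` yields one cluster label per sample. Keeping the parents in the survivor pool is a design choice that trades some diversity for a guarantee that a good solution is never lost.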
Year: 2020 PMID: 32399163 PMCID: PMC7201819 DOI: 10.1155/2020/1051394
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. Proposed clustering algorithm flow diagram.
Figure 2. Data clustering flow based on MIEA.
Attribute description.
| Attribute name | Value range | Feature index |
|---|---|---|
| Clump thickness | 1–10 | 1 |
| Cell size uniformity | 1–10 | 2 |
| Cell morphology uniformity | 1–10 | 3 |
| Marginal adhesion | 1–10 | 4 |
| Single epithelial cell size | 1–10 | 5 |
| Bare nucleus | 1–10 | 6 |
| Bland chromatin | 1–10 | 7 |
| Normal nucleoli | 1–10 | 8 |
| Mitosis | 1–10 | 9 |
Figure 3. Weight change in 20 times.
Performance comparison.
| Method | HGM | WPC | ACCH | Proposed |
|---|---|---|---|---|
| Accuracy (%) | 65 | 72 | 76 | |
| Optimal value | 12.54 | 10.31 | 8.75 | |
Attributes of experimental datasets.
| Number | Dataset | Sample number | Cluster number | Dimensionality |
|---|---|---|---|---|
| 1 | Iris | 10000050 | 3 | 4 |
| 2 | CMC | 10000197 | 3 | 9 |
| 3 | Wine | 10000040 | 3 | 13 |
| 4 | Vowel | 10000822 | 6 | 3 |
F comparison with different methods.
| Dataset number | HGM | WPC | ACCH | Proposed |
|---|---|---|---|---|
| 1 | 0.678 | 0.796 | 0.853 | |
| 2 | 0.312 | 0.336 | 0.398 | |
| 3 | 0.493 | 0.528 | 0.735 | |
| 4 | 0.597 | 0.654 | 0.678 | |
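The record does not define the F score reported in this table. Assuming it is the standard clustering F-measure (for each true class, the best harmonic mean of precision and recall over all clusters, weighted by class size), it could be computed as in the sketch below. The function name and signature are hypothetical, and the authors may have used a different definition.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_labels):
    """Class-size-weighted best F score of each true class over all clusters."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_labels)
    # overlap[(cls, clu)] = number of samples of class cls placed in cluster clu
    overlap = Counter(zip(true_labels, cluster_labels))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total
```

Under this definition a score of 1.0 means every class is recovered exactly by some cluster, regardless of how the cluster indices are numbered.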
Figure 4. Big data two-dimensional feature distribution in cloud computing.
Figure 5. Feature extraction result with the proposed method.
Figure 6. Comparison results.
Figure 7. HGM method.
Figure 8. WPC method.
Figure 9. ACCH method.
Figure 10. Proposed method.