Abolfazl Mehbodniya, Ihtiram Raza Khan, Sudeshna Chakraborty, M Karthik, Kamakshi Mehta, Liaqat Ali, Stephen Jeswinde Nuagah.
Abstract
BACKGROUND: Even today, when a plethora of information is accessible, it can be difficult to make appropriate choices for one's well-being. Data mining, machine learning, and computational statistics are among the most popular fields of study today, and all of them aim to support people in making good decisions that maximize outcomes in whatever area they work in. Because the growth in the number of patients is directly related to the rate of population growth and to lifestyle changes, the healthcare sector has a significant need for data processing services. In cancer, prognosis refers to the patient's overall likelihood of survival, but it may also describe how severe the disease is expected to become over the patient's future course.

METHODOLOGY: The proposed technique consists of three stages: input data acquisition, preprocessing, and classification. Data acquisition collects the raw input data, preprocessing then eliminates missing data, and classification is carried out with an ensemble classifier to analyze the stages of cancer. This study explored the combined influence of the prominent labels in conjunction with one another using a multilabel classifier approach, which proved successful. Finally, an ensemble classifier model was constructed and experimentally validated to increase the accuracy of the classifier model presented earlier. Overall, the recommended and tested models show a steady improvement of 2% to 6% over the baseline performance.
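The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which base learners are combined or how, so the majority vote over Naive Bayes, a depth-1 decision tree, and k-NN is an assumption (those three names are taken from the performance table). The toy records, the feature choice (age, tumor size), the stage labels, and all function names are hypothetical.

```python
import math
from collections import Counter

def drop_missing(rows):
    """Preprocessing: discard records that contain any missing (None) value."""
    return [r for r in rows if all(v is not None for v in r)]

def majority(labels):
    """Most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def knn_predict(train, x, k=3):
    """k-NN: majority label among the k closest training records."""
    nearest = sorted(train, key=lambda r: math.dist(r[:-1], x))[:k]
    return majority(r[-1] for r in nearest)

def nb_predict(train, x):
    """Gaussian Naive Bayes with a small variance floor for stability."""
    best, best_score = None, -math.inf
    for label in sorted({r[-1] for r in train}):
        rows = [r[:-1] for r in train if r[-1] == label]
        score = math.log(len(rows) / len(train))  # log prior
        for j, xj in enumerate(x):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
            score += -0.5 * math.log(2 * math.pi * var) - (xj - mean) ** 2 / (2 * var)
        if score > best_score:
            best, best_score = label, score
    return best

def stump_predict(train, x):
    """Depth-1 decision tree: use the split with the fewest training errors."""
    best = None
    for j in range(len(x)):
        for t in sorted({r[j] for r in train}):
            left = [r[-1] for r in train if r[j] <= t]
            right = [r[-1] for r in train if r[j] > t]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:  # no usable split: fall back to the overall majority
        return majority(r[-1] for r in train)
    _, j, t, l_lab, r_lab = best
    return l_lab if x[j] <= t else r_lab

def ensemble_predict(train, x):
    """Classification: majority vote over the three base classifiers (assumed scheme)."""
    return majority([nb_predict(train, x), stump_predict(train, x), knn_predict(train, x)])

# Data acquisition: hypothetical raw records (age, tumor size in cm, stage label).
raw = [
    (45, 1.0, "early"), (50, 1.5, "early"), (38, 0.8, "early"),
    (62, 4.2, "late"), (70, 5.0, "late"), (66, None, "late"),
    (58, 3.9, "late"), (None, 1.2, "early"),
]
train = drop_missing(raw)
print(len(train))
print(ensemble_predict(train, (48, 1.1)))
print(ensemble_predict(train, (65, 4.5)))
```

Dropping incomplete records is only one reading of "eliminates missing data". Imputation would also fit the abstract's wording, and the abstract does not say which the authors used.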
Year: 2022 PMID: 35047155 PMCID: PMC8763559 DOI: 10.1155/2022/6462657
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1: Data mining in healthcare applications.
Figure 2: Overall stage of preprocessing.
Raw dataset attributes.
| Input attributes | Indicators |
|---|---|
| Input age | 2 |
| Reason for death | 9 |
| Counts and date | 1 |
| External disease | 26 |
| Input disease | 27 |
| Geographic locations | 2 |
| Primary and secondary data | 5 |
| Other category | 3 |
| Input rides from the section | 12 |
| Total age and disease count | 9 |
| Sex, race, and health criteria | 10 |
| Input characteristic number | 18 |
| Input morphology characters (ICD-O-1-2) | 6 |
| Input sequence number | 23 |
| Input stage HSCC | 4 |
| Historic therapy | 7 |
| Input stage | 7 |
| Input therapy | 12 |
| Grand total | 183 |
Figure 3: Risk analysis using input factor.
Performance comparison of various datasets.
| Classifier | Breast | Colorectal | Respiratory | Mixed | Sample size |
|---|---|---|---|---|---|
| Naive Bayes | 93.12 | 94.351 | 92.936 | 99.276 | |
| DT | 92.16 | 99.252 | 97.038 | 99.805 | |
| DT | 99.80 | 99.405 | 97.276 | 92.397 | |
| K-NN | 93.14 | 99.059 | 95.207 | 99.504 | 4000 |
| K-NN | 92.125 | 99.477 | 99.187 | 92.136 | |
| Naive Bayes | 93.265 | 94.409 | 92.579 | 93.277 | 8000 |
Figure 4: Survival rate concerning performance analysis.