Tahir Alyas, Muhammad Hamid, Khalid Alissa, Tauqeer Faiz, Nadia Tabassum, Aqeel Ahmad.
Abstract
Many diseases affect the thyroid gland, including hypothyroidism, hyperthyroidism, and thyroid cancer, and they affect people all over the world. Thyroid dysfunction can cause severe symptoms in patients. Effective classification and machine learning play a significant role in the timely detection of thyroid diseases, and timely detection in turn enables timely treatment. Automatic and precise thyroid nodule detection in ultrasound images is critical for reducing radiologists' workload and error rate. Medical images have become one of the most valuable and consistent data sources for building machine learning models. In this paper, several machine learning algorithms, namely decision tree, random forest, KNN, and artificial neural networks, are applied to the dataset to produce a comparative analysis of how well each predicts the disease from the parameters in the dataset. The dataset was also manipulated to improve classification accuracy, and classification was performed on both the sampled and unsampled datasets for comparison. After dataset manipulation, the random forest algorithm achieved the highest accuracy, 94.8%, with 91% specificity.
Year: 2022; PMID: 35711517; PMCID: PMC9197629; DOI: 10.1155/2022/9809932
Source DB: PubMed; Journal: Biomed Res Int; Impact factor: 3.246
Figure 1. Proposed framework.
Figure 2. System diagram.
Machine specification for simulations.
| Specifications | Value |
|---|---|
| CPU | 1.5–2.7 GHz |
| GPU | NVIDIA 920M |
| RAM | 12 GB |
| Generation | 4th |
| Internet | 8 Mbps upload, 8 Mbps download |
Dataset description.
| Property | Value |
|---|---|
| Dataset characteristics | Multivariate, domain theory |
| Attribute characteristics | Categorical, real |
| Associated tasks | Classification |
| Number of instances | 7200 |
| Number of attributes | 25 |
| Missing values? | N/A |
| Area | Life |
| Date donated | 1987-01-01 |
| Number of web hits | 254314 |
Figure 3. Dataset distribution.
Algorithm 1. Thyroid disease classification algorithm.
Figure 4. Downsampled dataset.
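
The source does not say how the downsampled dataset was produced. The sketch below assumes plain random undersampling, where every class is reduced to the size of the smallest class. The function name `downsample`, the `cls` key, and the toy records are hypothetical and only illustrate the idea.

```python
import random
from collections import defaultdict


def downsample(rows, label_of, seed=0):
    """Randomly undersample every class to the size of the smallest class."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    # The smallest class sets the number of records kept per class.
    target = min(len(group) for group in by_class.values())

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Toy data (not from the paper): 90 negative (class 1) and 10 positive (class 0) records.
toy_rows = [{"id": i, "cls": 1} for i in range(90)]
toy_rows += [{"id": i, "cls": 0} for i in range(90, 100)]

balanced_rows = downsample(toy_rows, label_of=lambda r: r["cls"])
```

Undersampling discards majority-class records, which may explain why results on the sampled and unsampled datasets differ; other balancing strategies would be equally consistent with the text.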
Figure 5. Features contributing to classification.
Dataset attributes.
| Attribute | Value |
|---|---|
| Age | Integer |
| Sex | Male (M), female (F) |
| On thyroxine | False (f), true (t) |
| Query on thyroxine | False (f), true (t) |
| On antithyroid | False (f), true (t) |
| Sick | False (f), true (t) |
| Pregnant | False (f), true (t) |
| Thyroid surgery | False (f), true (t) |
| I131 treatment | False (f), true (t) |
| Query hypothyroid | False (f), true (t) |
| Query hyperthyroid | False (f), true (t) |
| Lithium | False (f), true (t) |
| Goiter | False (f), true (t) |
| Tumor | False (f), true (t) |
| Hypopituitary | False (f), true (t) |
| Psych | False (f), true (t) |
| TSH measured | False (f), true (t) |
| TSH | Real |
| T3 measured | False (f), true (t) |
| T3 | Real |
| TT4 measured | False (f), true (t) |
| TT4 | Real |
| T4U measured | False (f), true (t) |
| T4U | Real |
| FTI measured | False (f), true (t) |
| FTI | Real |
| TBG measured | False (f), true (t) |
| TBG | Real |
| Referral source | SVHC, other, SVI, STMW, SVHD |
| Class | Negative (1), positive (0) |
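
Most classifiers need the categorical codes in this table turned into numbers first. The sketch below shows one possible encoding, assuming the raw values arrive as strings. The helper `encode_record`, the 0/1 codes, and the key `Sex` for the M/F attribute are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical numeric codes; the paper does not state which encoding was used.
FLAG_CODES = {"f": 0, "t": 1}
SEX_CODES = {"M": 0, "F": 1}  # arbitrary choice of codes
REAL_FIELDS = {"TSH", "T3", "TT4", "T4U", "FTI", "TBG"}


def encode_record(record):
    """Convert one raw record (attribute name -> string) to numeric values."""
    encoded = {}
    for name, value in record.items():
        if name == "Age":
            encoded[name] = int(value)
        elif name in REAL_FIELDS:
            encoded[name] = float(value)
        elif value in FLAG_CODES:
            encoded[name] = FLAG_CODES[value]
        elif value in SEX_CODES:
            encoded[name] = SEX_CODES[value]
        else:
            # Multi-valued categories such as the referral source are left
            # as-is; they would need one-hot or ordinal encoding later.
            encoded[name] = value
    return encoded


# Made-up example record, not a row from the dataset.
example = encode_record({
    "Age": "41",
    "Sex": "F",  # assumed name for the M/F attribute
    "On thyroxine": "f",
    "TSH measured": "t",
    "TSH": "1.3",
    "Referral source": "SVHC",
})
```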
Comparison of all classifiers.
| Classifier | Sensitivity | Specificity |
|---|---|---|
| (1) KNN | 59% | 91% |
| (2) ANN | 94% | 81% |
| (3) Naïve Bayes | 93% | 78% |
| (4) Random forest | 94.8% | 91% |
Figure 6. Classifiers' performance.
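
Sensitivity and specificity, the two metrics reported for every classifier, can be computed from true and predicted labels as sketched below. The function name and the toy labels are hypothetical. The positive class is taken to be 0, following the class coding in the dataset attributes table.

```python
def sensitivity_specificity(y_true, y_pred, positive=0):
    """Return (sensitivity, specificity) for a binary classification."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive case correctly detected
            else:
                fn += 1  # positive case missed
        else:
            if pred == positive:
                fp += 1  # negative case flagged as positive
            else:
                tn += 1  # negative case correctly rejected

    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy labels (not from the paper): 4 positive (0) and 6 negative (1) cases.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

On an imbalanced dataset a classifier can reach high specificity while missing many positive cases, which is why both numbers are worth reporting side by side.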
KNN result at different K values.
| K | Train | Sensitivity | Specificity |
|---|---|---|---|
| 2 | 93.7% | 99.1% | 0.11 |
| 10 | 92.6% | 99.5% | 0.05 |
| 20 | 92% | 99.7% | 0.08 |
| 25 | 91.8% | 99.6% | 0.009 |
KNN result with a sampled dataset.
| K | Sensitivity | Specificity |
|---|---|---|
| 2 | 67% | 80% |
| 10 | 70% | 89% |
| 20 | 65% | 91% |
| 25 | 59% | 91% |
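
The two KNN tables vary the number of neighbours K. A minimal sketch of such a sweep is shown below, assuming Euclidean distance and majority voting, which the source does not confirm. The data points, the K values, and the function name are illustrative only; the toy set is too small for the K values used in the paper.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict a label by majority vote among the k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature data (not from the paper): two well-separated clusters.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(0.05, 0.1), (1.0, 0.95)]
test_y = [0, 1]

# Evaluate accuracy on the toy test points for several values of K.
accuracy_by_k = {}
for k in (1, 3, 5):
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    accuracy_by_k[k] = correct / len(test_y)
```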
ANN result with 1000 epochs.
| Epochs | Sensitivity | Specificity |
|---|---|---|
| 1000 | 77.4% | 99% |
Random forest result.
| Classifier | Sensitivity | Specificity |
|---|---|---|
| Random forest | 84.4% | 99.4% |
Random forest result.
| Classifier | Sensitivity | Specificity |
|---|---|---|
| Random forest | 94.8% | 91.2% |
Naïve Bayes result.
| Classifier | Sensitivity | Specificity |
|---|---|---|
| Naïve Bayes | 98% | 97% |
Naïve Bayes result.
| Classifier | Sensitivity | Specificity |
|---|---|---|
| Naïve Bayes | 93% | 78% |