Mario Fordellone, Paolo Chiodini.
Abstract
(1) Background: In recent years, much of the research on statistical methods has focused on the classification problem in the presence of imprecise data. A particular case of imprecise data is interval-valued data. Following this line of research, this work proposes a new hierarchical classification technique for multivariate interval-valued data for the diagnosis of breast cancer.
Keywords: cancer classification; cancer detection; hierarchical classification; imprecise data; interval-valued data; unsupervised classification
Year: 2022; PMID: 35885149; PMCID: PMC9316630; DOI: 10.3390/e24070926
Source DB: PubMed; Journal: Entropy (Basel); ISSN: 1099-4300; Impact factor: 2.738
Figure 1. Artificial data generated from three bivariate Normal distributions. Left: the dataset in ordinary form; right: the interval-valued dataset.
Figure 2. Distribution of the variables with respect to the observed diagnosis groups.
Figure 3. Dendrogram obtained by HC applied to the interval-valued data.
Figure 4. Dendrogram obtained by HC applied to the worst data.

Comparison of the performance of the interval-valued approach and the conventional approach.

| Metric | Interval-valued: Estimate | Interval-valued: Lower 95% | Interval-valued: Upper 95% | Conventional: Estimate | Conventional: Lower 95% | Conventional: Upper 95% |
|---|---|---|---|---|---|---|
| Sensitivity | 0.613 | 0.544 | 0.679 | 0.080 | 0.047 | 0.125 |
| Specificity | 0.905 | 0.869 | 0.933 | 1.000 | 0.990 | 1.000 |
| Pos.Pred.Val. | 0.793 | 0.723 | 0.852 | 1.000 | 0.805 | 1.000 |
| Neg.Pred.Val. | 0.798 | 0.755 | 0.836 | 0.647 | 0.605 | 0.687 |
| LR+ | 6.439 | 4.596 | 9.020 | 58.826 | 3.556 | 973.204 |
| LR− | 0.427 | 0.360 | 0.508 | 0.920 | 0.883 | 0.957 |
| Accuracy | 0.796 | 0.761 | 0.829 | 0.657 | 0.617 | 0.696 |

Comparison of the performance of the interval-valued approach and the conventional approach on a randomized sub-sample of 150 subjects.

| Metric | Interval-valued: Estimate | Interval-valued: Lower 95% | Interval-valued: Upper 95% | Conventional: Estimate | Conventional: Lower 95% | Conventional: Upper 95% |
|---|---|---|---|---|---|---|
| Sensitivity | 1.000 | 0.962 | 1.000 | 0.036 | 0.004 | 0.125 |
| Specificity | 0.709 | 0.571 | 0.824 | 1.000 | 0.962 | 1.000 |
| Pos.Pred.Val. | 0.856 | 0.776 | 0.915 | 1.000 | 0.158 | 1.000 |
| Neg.Pred.Val. | 1.000 | 0.910 | 1.000 | 0.642 | 0.559 | 0.719 |
| LR+ | 3.438 | 2.275 | 5.193 | 8.571 | 0.419 | 175.363 |
| LR− | 0.007 | 0.000 | 0.118 | 0.964 | 0.915 | 1.014 |
| Accuracy | 0.893 | 0.833 | 0.938 | 0.647 | 0.565 | 0.723 |

Comparison of the performance of the interval-valued approach and the conventional approach on a randomized sub-sample of 300 subjects.

| Metric | Interval-valued: Estimate | Interval-valued: Lower 95% | Interval-valued: Upper 95% | Conventional: Estimate | Conventional: Lower 95% | Conventional: Upper 95% |
|---|---|---|---|---|---|---|
| Sensitivity | 0.935 | 0.890 | 0.966 | 0.009 | 0.000 | 0.048 |
| Specificity | 0.623 | 0.527 | 0.712 | 1.000 | 0.980 | 1.000 |
| Pos.Pred.Val. | 0.802 | 0.743 | 0.853 | 1.000 | 0.025 | 1.000 |
| Neg.Pred.Val. | 0.855 | 0.761 | 0.923 | 0.622 | 0.564 | 0.677 |
| LR+ | 2.480 | 1.953 | 3.149 | 4.878 | 0.200 | 118.743 |
| LR− | 0.104 | 0.059 | 0.182 | 0.991 | 0.974 | 1.008 |
| Accuracy | 0.817 | 0.768 | 0.859 | 0.623 | 0.566 | 0.678 |
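
The measures reported in the tables above (sensitivity, specificity, predictive values, likelihood ratios, accuracy) all derive from a 2×2 confusion matrix. As a reading aid, the following is a minimal sketch of the standard textbook definitions. The function name `diagnostic_metrics`, the helper `_ratio`, and the counts in the example are hypothetical and not taken from the source. The sketch does not compute the 95% bounds. It returns an infinite LR+ when there are no false positives, whereas the tables report finite LR+ values at a specificity of 1.000, so the authors presumably applied some correction that is not reproduced here.

```python
import math
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning inf (or nan for 0/0) instead of raising on a zero denominator."""
    if denominator == 0:
        return math.nan if numerator == 0 else math.inf
    return numerator / denominator


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard diagnostic measures from confusion-matrix counts.

    tp/fn: diseased subjects classified as positive/negative.
    tn/fp: healthy subjects classified as negative/positive.
    """
    sensitivity = _ratio(tp, tp + fn)   # true positive rate
    specificity = _ratio(tn, tn + fp)   # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),                       # positive predictive value
        "npv": _ratio(tn, tn + fn),                       # negative predictive value
        "lr_pos": _ratio(sensitivity, 1 - specificity),   # LR+ = sens / (1 - spec)
        "lr_neg": _ratio(1 - sensitivity, specificity),   # LR- = (1 - sens) / spec
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
    }


# Hypothetical counts, for illustration only; they are not taken from the source.
metrics = diagnostic_metrics(tp=60, fp=10, fn=40, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Read this way, a classifier that labels almost nobody as positive can reach a specificity and positive predictive value of 1.000 while its sensitivity stays near zero, which is the pattern shown in the conventional-approach columns.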