Karina Gutiérrez-Fragoso1, Héctor Gabriel Acosta-Mesa2, Nicandro Cruz-Ramírez2, Rodolfo Hernández-Jiménez3.
Abstract
Efforts have been made to improve the diagnostic performance of colposcopy in order to better diagnose cervical cancer, particularly in developing countries. However, improvements are still needed in several areas, such as the time required to process the full digital image of the cervix, the performance of the computing systems used to identify different kinds of tissue, and biopsy sampling. In this paper, we explore three well-known automatic classification methods (k-Nearest Neighbors, Naïve Bayes, and C4.5), together with different data models that take full advantage of this information, to improve the diagnostic performance of colposcopy based on acetowhite temporal patterns. Based on the ROC and PRC area scores, k-Nearest Neighbors with the discrete PLA representation performed better than the other methods. The sensitivity, specificity, and accuracy reached using this method were 60% (95% CI 50-70), 79% (95% CI 71-86), and 70% (95% CI 60-80), respectively. The acetowhitening phenomenon is not exclusive to high-grade lesions, and we found acetowhite temporal patterns of epithelial changes that are not precancerous lesions but resemble positive ones. These findings need to be considered when developing more robust computing systems in the future.
Year: 2017 PMID: 28744318 PMCID: PMC5514345 DOI: 10.1155/2017/5989105
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Figure 1. Acetowhite temporal patterns of different regions in a colposcopic sequence.
Figure 2. Example of standardized data and data adjusted to a polynomial model.
Figure 3. Discretized representation: Piecewise Linear Approximation (PLA).
Figure 4. Discretized representation: Piecewise Slope Approximation (PSA).
Confusion matrix.
| | Predicted class = 1 | Predicted class = 0 |
|---|---|---|
| Actual class = 1 | TP | FN |
| Actual class = 0 | FP | TN |
Evaluation measures from the confusion matrix.
| Measure | Formula |
|---|---|
| Accuracy | (TP + TN)/(TP + TN + FN + FP) |
| Sensitivity | TP/(TP + FN) |
| Specificity | TN/(TN + FP) |
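The three formulas above can be sketched in Python as follows. The function name `evaluation_measures` and the counts in the usage line are illustrative assumptions, not values from the study, and the sketch assumes each class contains at least one case so that no denominator is zero.

```python
def evaluation_measures(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + tn + fn + fp
    return {
        "accuracy": (tp + tn) / total,       # (TP + TN)/(TP + TN + FN + FP)
        "sensitivity": tp / (tp + fn),       # TP/(TP + FN)
        "specificity": tn / (tn + fp),       # TN/(TN + FP)
    }


# Hypothetical counts chosen only to illustrate the call.
example = evaluation_measures(tp=8, fn=2, fp=1, tn=9)
print(example)
```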
Performance of automatic classification methods using acetowhite temporal patterns on a dataset of 200 cases.
| Method | Model | Correctly classified (%) | Incorrectly classified (%) | TP rate | FP rate | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IBk | Standardized | 70 | 30 | 0.700 | 0.309 | 0.700 | 0.700 | 0.699 | 0.395 | 0.721 | 0.683 |
| IBk | Adjusted | 69 | 31 | 0.690 | 0.319 | 0.689 | 0.690 | 0.689 | 0.375 | 0.717 | 0.679 |
| IBk | Parameters | 64 | 36 | 0.635 | 0.375 | 0.634 | 0.635 | 0.633 | 0.263 | 0.632 | 0.599 |
| IBk | PLA | 70 | 30 | 0.700 | 0.313 | 0.701 | 0.700 | 0.697 | 0.395 | 0.732 | 0.713 |
| IBk | PSA | 62 | 38 | 0.620 | 0.437 | 0.778 | 0.620 | 0.539 | 0.327 | 0.610 | 0.608 |
| Naïve Bayes | Standardized | 69 | 31 | 0.690 | 0.329 | 0.695 | 0.690 | 0.684 | 0.377 | 0.713 | 0.681 |
| Naïve Bayes | Adjusted | 69 | 31 | 0.685 | 0.333 | 0.689 | 0.685 | 0.679 | 0.366 | 0.708 | 0.673 |
| Naïve Bayes | Parameters | 53 | 47 | 0.525 | 0.459 | 0.537 | 0.525 | 0.520 | 0.067 | 0.540 | 0.543 |
| Naïve Bayes | PLA | 65 | 35 | 0.645 | 0.385 | 0.658 | 0.645 | 0.627 | 0.289 | 0.697 | 0.669 |
| Naïve Bayes | PSA | 61 | 39 | 0.605 | 0.409 | 0.603 | 0.605 | 0.601 | 0.200 | 0.624 | 0.617 |
| C4.5 | Standardized | 68 | 32 | 0.675 | 0.330 | 0.674 | 0.675 | 0.675 | 0.346 | 0.627 | 0.592 |
| C4.5 | Adjusted | 64 | 36 | 0.640 | 0.361 | 0.641 | 0.640 | 0.640 | 0.279 | 0.618 | 0.582 |
| C4.5 | Parameters | 55 | 45 | 0.545 | 0.471 | 0.541 | 0.545 | 0.539 | 0.076 | 0.526 | 0.520 |
| C4.5 | PLA | 65 | 35 | 0.645 | 0.361 | 0.644 | 0.645 | 0.645 | 0.285 | 0.652 | 0.611 |
| C4.5 | PSA | 64 | 36 | 0.635 | 0.379 | 0.634 | 0.635 | 0.631 | 0.262 | 0.643 | 0.611 |
Confusion matrix for the KNN method and discrete PLA representation.
| Actual class | Classified as a | Classified as b |
|---|---|---|
| a = positive | 56 | 37 |
| b = negative | 23 | 84 |
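As a cross-check, the sensitivity, specificity, and accuracy quoted in the abstract, and the MCC reported for IBk with the PLA model, can be recomputed from the four counts in this confusion matrix. The sketch below is only an arithmetic check with illustrative variable names; it does not reproduce the confidence intervals, since the source does not say how they were computed.

```python
import math

# Counts read from the confusion matrix above (rows = actual, columns = classified as).
tp, fn = 56, 37   # actual positive: classified as positive, classified as negative
fp, tn = 23, 84   # actual negative: classified as positive, classified as negative

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fn + fp + tn)

# Matthews correlation coefficient for a 2x2 confusion matrix.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} "
      f"accuracy={accuracy:.2f} mcc={mcc:.3f}")
# prints: sensitivity=0.60 specificity=0.79 accuracy=0.70 mcc=0.395
```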
Classes and subclasses.
| Class | Subclass | Cases | Biopsy | No biopsy |
|---|---|---|---|---|
| Negative | (1) Atrophy | 15 | 0 | 15 |
| Negative | (2) Inflammation | 24 | 6 | 18 |
| Negative | (3) Ectopy | 20 | 0 | 20 |
| Negative | (4) Normal | 48 | 1 | 47 |
| Positive | (5) Low-grade squamous intraepithelial lesion (LSIL) | 37 | 37 | - |
| Positive | (6) High-grade squamous intraepithelial lesion (HSIL) | 56 | 56 | - |
| Total | | 200 | 100 | 100 |
Confusion matrix for multivalue classification of classes.
| Classified as → | a | b | c | d | e | f |
|---|---|---|---|---|---|---|
| a = atrophy | 0 | 0 | 0 | 14 | 0 | 1 |
| b = inflammation | 0 | 0 | 0 | 13 | 1 | 10 |
| c = ectopy | 0 | 0 | 0 | 12 | 0 | 8 |
| d = normal | 0 | 0 | 0 | 44 | 0 | 4 |
| e = LSIL | 0 | 0 | 0 | 22 | 0 | 15 |
| f = HSIL | 0 | 0 | 0 | 16 | 0 | 40 |
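Per-class recall (TP rate) and precision can be read off this matrix directly: recall divides the diagonal entry by its row sum, and precision divides it by its column sum. The following is a minimal sketch with a hypothetical helper `per_class_scores`; it assumes that precision is reported as 0.0 for a class that is never predicted, which is a convention chosen here for illustration.

```python
labels = ["atrophy", "inflammation", "ectopy", "normal", "LSIL", "HSIL"]

# Rows are actual classes, columns are predicted classes (same order as labels).
matrix = [
    [0, 0, 0, 14, 0, 1],
    [0, 0, 0, 13, 1, 10],
    [0, 0, 0, 12, 0, 8],
    [0, 0, 0, 44, 0, 4],
    [0, 0, 0, 22, 0, 15],
    [0, 0, 0, 16, 0, 40],
]


def per_class_scores(matrix):
    """Return {class index: (recall, precision)} for a square confusion matrix."""
    scores = {}
    for k in range(len(matrix)):
        tp = matrix[k][k]
        actual = sum(matrix[k])                     # row sum: cases of class k
        predicted = sum(row[k] for row in matrix)   # column sum: predictions of class k
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0  # assumed convention
        scores[k] = (recall, precision)
    return scores


scores = per_class_scores(matrix)
for k, name in enumerate(labels):
    recall, precision = scores[k]
    print(f"{name:12s} recall={recall:.3f} precision={precision:.3f}")
```

With these counts, only the normal and HSIL classes obtain non-zero recall, because almost every case is assigned to one of those two columns.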
Performance of the KNN method and discrete PLA representation using multivalue classification of classes on a dataset of 200 cases.
| Class | Correctly classified (%) | Incorrectly classified (%) | TP rate | FP rate | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|---|---|
| Atrophy | 0 | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.626 | 0.108 |
| Inflammation | 0 | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.582 | 0.148 |
| Ectopy | 0 | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.736 | 0.225 |
| Normal | 92 | 8 | 0.917 | 0.507 | 0.364 | 0.917 | 0.521 | 0.358 | 0.784 | 0.490 |
| LSIL | 0 | 100 | 0.000 | 0.006 | 0.000 | 0.000 | 0.000 | −0.034 | 0.494 | 0.178 |
| HSIL | 71 | 29 | 0.714 | 0.264 | 0.513 | 0.714 | 0.597 | 0.415 | 0.738 | 0.448 |
| Weighted Average | 42 | 58 | 0.420 | 0.197 | 0.231 | 0.420 | 0.292 | 0.196 | 0.677 | 0.324 |
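The weighted-average row appears to be the mean of the per-class values weighted by the number of cases in each class. This is an inference from the numbers rather than something the source states, so the sketch below should be read as a consistency check. The helper `weighted_average` is hypothetical; class sizes come from the "Classes and subclasses" table and the per-class values from the table above.

```python
# (cases, TP rate, FP rate, precision) per class, copied from the tables.
per_class = {
    "Atrophy":      (15, 0.000, 0.000, 0.000),
    "Inflammation": (24, 0.000, 0.000, 0.000),
    "Ectopy":       (20, 0.000, 0.000, 0.000),
    "Normal":       (48, 0.917, 0.507, 0.364),
    "LSIL":         (37, 0.000, 0.006, 0.000),
    "HSIL":         (56, 0.714, 0.264, 0.513),
}


def weighted_average(column: int) -> float:
    """Average one metric column, weighting each class by its number of cases."""
    total_cases = sum(values[0] for values in per_class.values())
    weighted_sum = sum(values[0] * values[column] for values in per_class.values())
    return weighted_sum / total_cases


tp_rate = weighted_average(1)
fp_rate = weighted_average(2)
precision = weighted_average(3)
print(f"TP rate={tp_rate:.3f} FP rate={fp_rate:.3f} precision={precision:.3f}")
# prints: TP rate=0.420 FP rate=0.197 precision=0.231
```

Under this assumption the computed values agree with the reported weighted averages to three decimal places.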
Figure 5. Acetowhite temporal patterns of different kinds of epithelia, based on the mean of the data using the discrete PLA representation.