Jing-Hang Ma, Shang-Feng You, Ji-Sen Xue, Xiao-Lin Li, Yi-Yao Chen, Yan Hu, Zhen Feng.
Abstract
Background: Computer-aided diagnosis of medical images is becoming increasingly important in intelligent medicine. Colposcopy-guided biopsy with pathological diagnosis is the gold standard for diagnosing cervical intraepithelial neoplasia (CIN) and invasive cervical cancer. However, it suffers from low sensitivity in differentiating cancer/HSIL from LSIL/normal, particularly in areas that lack skilled colposcopists and adequate medical resources.
Keywords: Cervical dysplasia; colposcopy; computer-aided diagnosis; feature extraction - classification ensemble; multi-modal machine learning
Year: 2022 PMID: 35992807 PMCID: PMC9389460 DOI: 10.3389/fonc.2022.905623
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
The distribution of HPV tests and TCT tests in the 986 patients.
| HPV test | N% | TCT | N% |
|---|---|---|---|
| HPV 16/18 positive | 27.7% | NILM | 51.6% |
| High-risk (non-16/18) HPV positive | 65.8% | ASCUS | 24.1% |
| | | ASC-H | 3.8% |
| Low-risk HPV positive | 10.4% | LSIL | 15.2% |
| HPV negative | 10.2% | HSIL | 4.4% |
| | | AGC | 0.9% |
Figure 1. A schematic representation of the training procedure of our model. (A) Loss analysis of five segmentation algorithms is presented, with one sample visualized. Each color represents a different algorithm, and the white rectangular outline shows the ground-truth segmentation. (B) Color and texture features in the segmented VIA/VILI images were selected and further filtered using a t-test. (C) The H-group was augmented using the SMOTE algorithm and then fed into an RBF-SVM for training. (D) Six features went into the naïve Bayes classifier to perform the final classification, which was compared with the pathological result.
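Step (C) of the caption oversamples the minority H-group with SMOTE before SVM training. The core idea can be sketched in plain Python: each synthetic sample is a random interpolation between a minority sample and one of its nearest minority neighbors. The function name `smote_oversample`, the parameters `k` and `seed`, and the toy feature vectors are illustrative assumptions, not the authors' implementation or settings.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic samples interpolated between minority neighbors.

    Simplified SMOTE sketch; names and defaults are hypothetical.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest to the base first
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        # Random point on the segment between base and neighbor
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy H-group feature vectors (made-up values for illustration only)
h_group = [[12.7, 6.6], [13.1, 7.0], [11.9, 6.1], [12.2, 6.4]]
new_samples = smote_oversample(h_group, n_new=6)
print(len(new_samples))
```

Because every synthetic point lies on a segment between two real minority samples, the augmented H-group stays inside the region already occupied by the original H-group samples.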
Loss analysis of five deep learning algorithms for image segmentation.
| Algorithm | VIA | VILI |
|---|---|---|
| DenseNet-169 | 0.13 ± 0.05 | |
| ResNet-50 | | 0.20 ± 0.07 |
| ResNet-101 | | 0.15 ± 0.05 |
| VGG-16 | | |
| Xception | | |
Bold-type values indicated algorithms with relatively low loss.
The underlined had an optimal clinical performance.
Sample size: 120/30/50 (training/validation/test set)
Summary of features for the naïve Bayes classifiers.
| No. | Feature | Type | Range |
|---|---|---|---|
| 1 | SVM output | Numerical | [0, 1] |
| 2 | Age | Numerical | [16, 83] |
| 3 | HPV-1¹ | Categorical | 0, 1 |
| 4 | HPV-2² | Categorical | 0, 1 |
| 5 | HPV-3³ | Categorical | 0, 1 |
| 6 | TCT⁴ | Categorical | 1, 2, 3, 4, 5, 6 |
¹ HPV 16/18 positive; ² high-risk (non-16/18) HPV positive; ³ low-risk HPV positive.
⁴ 1: NILM; 2: ASCUS; 3: ASC-H; 4: LSIL; 5: HSIL; 6: AGC.
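The feature table mixes two numerical and four categorical inputs. One common way to handle such a mix in a naïve Bayes classifier is a Gaussian likelihood for numerical features and Laplace-smoothed frequencies for categorical ones. The sketch below assumes that setup; the source does not state which likelihoods the authors used, and the class name `MixedNaiveBayes` and all data rows are made up for illustration.

```python
import math
from collections import Counter


class MixedNaiveBayes:
    """Naive Bayes with Gaussian numeric and smoothed categorical features."""

    def __init__(self, numeric_idx, categorical_levels):
        self.numeric_idx = numeric_idx                # e.g. [0, 1]
        self.categorical_levels = categorical_levels  # {column: n_levels}

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.prior, self.gauss, self.cat = {}, {}, {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.prior[c] = len(rows) / len(X)
            for i in self.numeric_idx:
                vals = [r[i] for r in rows]
                mean = sum(vals) / len(vals)
                var = sum((v - mean) ** 2 for v in vals) / len(vals)
                self.gauss[c, i] = (mean, var + 1e-6)  # small variance floor
            for i, n_levels in self.categorical_levels.items():
                counts = Counter(r[i] for r in rows)
                self.cat[c, i] = (counts, len(rows) + n_levels)  # Laplace
        return self

    def log_posterior(self, x):
        """Unnormalized log posterior of each class for one sample."""
        scores = {}
        for c in self.classes:
            s = math.log(self.prior[c])
            for i in self.numeric_idx:
                mean, var = self.gauss[c, i]
                s -= 0.5 * math.log(2 * math.pi * var)
                s -= (x[i] - mean) ** 2 / (2 * var)
            for i in self.categorical_levels:
                counts, total = self.cat[c, i]
                s += math.log((counts[x[i]] + 1) / total)
            scores[c] = s
        return scores

    def predict(self, x):
        scores = self.log_posterior(x)
        return max(scores, key=scores.get)


# Columns: SVM output, age, HPV-1, HPV-2, HPV-3, TCT (all rows are made up)
X = [
    [0.90, 45, 1, 0, 0, 5],
    [0.80, 52, 1, 1, 0, 3],
    [0.70, 38, 1, 0, 0, 5],
    [0.20, 30, 0, 1, 0, 1],
    [0.10, 41, 0, 0, 1, 2],
    [0.30, 27, 0, 0, 0, 1],
]
y = ["H", "H", "H", "LN", "LN", "LN"]
model = MixedNaiveBayes([0, 1], {2: 2, 3: 2, 4: 2, 5: 6}).fit(X, y)
print(model.predict([0.85, 47, 1, 0, 0, 5]))
```

Smoothing keeps a TCT category that never appears in one class from forcing that class's probability to zero.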
Top-ranked VIA/VILI features between the H- and LN-groups.
| VIA feature | H (N = 96) | LN (N = 594) | P-value |
|---|---|---|---|
| Dissimilarity ( | 12.80 ± 3.54 | 11.36 ± 2.67 | 2.33 × 10⁻⁴ |
| Dissimilarity ( | 10.59 ± 2.94 | 9.46 ± 2.29 | 5.07 × 10⁻⁴ |
| Std (Lb) ζ | 6.85 ± 2.31 | 5.99 ± 1.63 | 7.44 × 10⁻⁴ |
| Std (S) ζ | 26.01 ± 10.83 | 22.12 ± 6.17 | 8.61 × 10⁻⁴ |
| Std (Cb) ζ | 6.31 ± 2.14 | 5.55 ± 1.48 | 1.01 × 10⁻³ |
| Contrast ( | 486.72 ± 265.95 | 392.07 ± 181.19 | 1.12 × 10⁻³ |
| Std (La) | 4.93 ± 1.77 | 4.33 ± 1.22 | 2.09 × 10⁻³ |
| Homogeneity ( | 0.11 ± 0.02 | 0.12 ± 0.02 | 2.31 × 10⁻³ |
| Contrast ( | 373.02 ± 198.80 | 307.16 ± 148.59 | 2.50 × 10⁻³ |
| Dissimilarity ( | 8.01 ± 2.22 | 7.28 ± 1.81 | 2.92 × 10⁻³ |

| VILI feature | H (N = 96) | LN (N = 594) | P-value |
|---|---|---|---|
| Std (Lb) ζ | 12.72 ± 3.53 | 9.89 ± 3.20 | 2.45 × 10⁻¹¹ |
| Std (Cb) ζ | 12.38 ± 4.13 | 9.21 ± 3.40 | 9.97 × 10⁻¹¹ |
| Otsu (Lb) ζ | 103.39 ± 8.10 | 109.13 ± 7.41 | 1.96 × 10⁻⁹ |
| Std (La) ζ | 6.61 ± 2.59 | 4.87 ± 2.06 | 7.22 × 10⁻⁹ |
| Std (Cr) ζ | 8.51 ± 2.79 | 6.64 ± 2.18 | 7.59 × 10⁻⁹ |
| Otsu (Cb) | 151.56 ± 8.73 | 145.74 ± 7.63 | 1.10 × 10⁻⁸ |
| Otsu (La) | 135.73 ± 5.06 | 132.34 ± 4.61 | 1.13 × 10⁻⁸ |
| Mean (Lb) | 105.88 ± 11.33 | 12.39 ± 8.76 | 4.42 × 10⁻⁷ |
| Mean (Cb) | 149.10 ± 11.29 | 142.67 ± 8.30 | 4.42 × 10⁻⁷ |
| Otsu (Cr) | 112.64 ± 6.12 | 115.84 ± 5.22 | 4.10 × 10⁻⁶ |
ζ: Features used in our model, i.e., the five top-ranked features.
The pixel-pair distance offset or the color channel is given in parentheses.
Std, standard deviation; Otsu, Otsu thresholding.
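Dissimilarity, contrast and homogeneity at a pixel-pair offset are standard statistics of a gray-level co-occurrence matrix (GLCM). Assuming that is how the texture features above were computed, the following sketch shows the calculation on a toy image. The function names, the symmetric counting and the homogeneity definition (squared difference in the denominator) are assumptions; the source does not specify them.

```python
def glcm(image, dx, dy, levels):
    """Normalized symmetric co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, dissimilarity and homogeneity of a normalized GLCM."""
    contrast = dissimilarity = homogeneity = 0.0
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            contrast += pij * (i - j) ** 2
            dissimilarity += pij * abs(i - j)
            homogeneity += pij / (1 + (i - j) ** 2)
    return contrast, dissimilarity, homogeneity


# Toy 4-level image (made-up values), horizontal offset of one pixel
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
contrast, dissimilarity, homogeneity = texture_features(
    glcm(image, dx=1, dy=0, levels=4)
)
print(contrast, dissimilarity, homogeneity)
```

Contrast and dissimilarity grow as neighboring pixels differ more, while homogeneity shrinks, which matches the direction of the H versus LN differences in the VIA rows.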
Figure 2. Feature selection and feature distribution visualization. (A) The combination of five VIA and five VILI features achieved the maximum macro-averaged F1 score. Statistical distribution of the selected features in VIA (B–F) and VILI (G–K) training samples.
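The macro-averaged F1 score named in panel (A) is the unweighted mean of the per-class F1 scores, so the small H-group counts as much as the much larger LN-group. A minimal sketch follows; the function name `macro_f1` and the toy labels are made up for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (made up): one H case missed, one LN case over-called
y_true = ["H", "H", "LN", "LN", "LN", "LN"]
y_pred = ["H", "LN", "LN", "LN", "LN", "H"]
print(macro_f1(y_true, y_pred))
```

Here the H class scores 0.5 and the LN class 0.75, giving a macro average of 0.625 even though plain accuracy is about 0.67.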
Figure 3. Visualization of the five top-ranked features extracted from VIA and VILI images. LN-group: L1; H-group: H1, H2. (A) Three color features and (B) two texture features for VIA images. (C) Five color features for VILI images.
Experimental results of different machine learning algorithms and performance of physician diagnoses.
| Model | Data used (VIA, VILI, Clin., Smo.) | Sensitivity | Accuracy | Specificity |
|---|---|---|---|---|
| | * * * * | | 81.8% | 86.9% |
| Random forest | * * | 5.4% | 84.0% | 97.0% |
| NN(0.5) | * * | 7.3% | 85.3% | 97.5% |
| NN(0.8) | * * | 12.2% | 82.1% | 94.7% |
| 1D-CNN | * * | 2.4% | 85.0% | 99.0% |
| RBF-SVM | * | 3.9% | 85.0% | 98.9% |
| RBF-SVM | * | 4.9% | 81.0% | 94.2% |
| RBF-SVM | * * | 13.2% | 85.0% | 97.0% |
| RBF-SVM | * * | 20.0% | 75.0% | 84.9% |
| RBF-SVM | * * | 8.3% | 78.0% | 89.3% |
| RBF-SVM | * * * | 25.9% | 81.0% | 80.8% |
| ResNet-50 | * | 7.30% | 80% | 92.20% |
| ResNet-50+NB | * * | 17.10% | 81.10% | 91.80% |
| VGG-16 | * | 7.32% | 86.30% | 99.60% |
| VGG-16+NB | * * | 24.40% | 86.00% | 96.30% |
| ResNet-50+VGG-16+NB | * * * | 29.30% | 80% | 88.50% |
| Chief physicians (88) | * * * | 60.0% | 85.2% | 88.5% |
| Attending physicians (558) | * * * | 59.0% | 85.8% | 90.2% |
| Resident physicians (340) | * * * | 55.1% | 86.2% | 91.4% |
| Physicians-test set (285) | * * * | 53.7% | 84.6% | 89.8% |
| | * * * * | 70.7% | 79.6% | 81.1% |
*: The model or physicians used this kind of training data. Clin, clinical information; Smo, SMOTE.
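The three metrics in the table follow directly from a binary confusion matrix, here taking the H-group as the positive class. The sketch below shows the arithmetic. The counts are hypothetical values chosen so that a 285-patient test set reproduces the percentages in the last row of the table; they are not reported in the source.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of positive cases that are detected
    specificity = tn / (tn + fp)   # share of negative cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (not from the source): 41 positives, 244 negatives
sens, spec, acc = diagnostic_metrics(tp=29, fn=12, tn=198, fp=46)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")  # prints 70.7% 81.1% 79.6%
```

With few positive cases, accuracy is dominated by specificity, which is why several rows pair accuracy near 85% with sensitivity below 10%.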
Figure 4. (A) Among the 285 patients in the test set, cases predicted wrongly only by physicians (red) or only by the model (yellow). (B) The confusion matrices of our model, the physician diagnosis, and the model-aided physician diagnosis.