Ping Li, Xiaoxia Wang, Peizhong Liu, Tianxiang Xu, Pengming Sun, Binhua Dong, Huifeng Xue.
Abstract
Objective: To better suit clinical applications, this paper proposes a cross-validation decision-making fusion method that combines Vision Transformer and DenseNet161.
Year: 2022 PMID: 35607393 PMCID: PMC9124126 DOI: 10.1155/2022/3241422
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 3.822
Figure 1. Category samples of clinical colposcopy images.
Figure 2. The overall scheme of the method.
Figure 3. Data preprocessing.
Figure 4. Prediction process of cervical images by the Vision Transformer model.
Figure 5. Prediction process of cervical images by the DenseNet161 model.
Data distribution.
| Items | Negative cases | Negative images | LSIL cases | LSIL images | HSIL cases | HSIL images | Cancer cases | Cancer images | Total cases | Total images |
|---|---|---|---|---|---|---|---|---|---|---|
| Train | 151 | 510 | 159 | 552 | 281 | 1180 | 41 | 170 | 632 | 2412 |
| Test | 24 | 24 | 23 | 23 | 43 | 43 | 10 | 10 | 100 | 100 |
| Total | 175 | 534 | 182 | 575 | 324 | 1223 | 51 | 180 | 732 | 2512 |
Definitions of evaluation indicators.
| Evaluation indicators | Definition |
|---|---|
| ACC | (TP + TN)/(TP + FN + FP + TN) |
| SEN | TP/(TP + FN) |
| SPEC | TN/(TN + FP) |
| PPV | TP/(TP + FP) |
| NPV | TN/(TN + FN) |
| F1 score | 2 × PPV × SEN/(PPV + SEN) |
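The definitions above can be applied to a four-class problem by treating each class in turn as the positive class (one-vs-rest). The following is a minimal sketch under that assumption; the function name `one_vs_rest_metrics` and the toy confusion matrix are hypothetical and are not taken from the paper.

```python
def one_vs_rest_metrics(confusion, k):
    """Compute ACC, SEN, SPEC, PPV, NPV and F1 for class index k.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j (assumed layout).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                        # true k, predicted other
    fp = sum(confusion[i][k] for i in range(n)) - tp   # true other, predicted k
    tn = total - tp - fn - fp

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    sen = ratio(tp, tp + fn)
    ppv = ratio(tp, tp + fp)
    return {
        "ACC": ratio(tp + tn, total),
        "SEN": sen,
        "SPEC": ratio(tn, tn + fp),
        "PPV": ppv,
        "NPV": ratio(tn, tn + fn),
        "F1": ratio(2 * ppv * sen, ppv + sen),
    }


# Hypothetical 3-class confusion matrix, for illustration only.
toy = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 2, 8],
]
for name, value in one_vs_rest_metrics(toy, 0).items():
    print(f"{name}: {value:.4f}")
```

Note that ACC computed this way is a per-class quantity, whereas the result tables report a single ACC per model.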
Comparison results between this fusion model and other models.
| Model | Classes | ACC | F1 | PPV | SEN | SPEC | NPV |
|---|---|---|---|---|---|---|---|
| Vision Transformer | Cancer | 0.6000 | 0.8350 | 0.8136 | 0.8600 | 0.9780 | 0.9846 |
| | HSIL | | 0.6184 | 0.5726 | 0.6744 | 0.6210 | 0.7190 |
| | LSIL | | 0.4366 | 0.4984 | 0.3912 | 0.8778 | 0.8282 |
| | Negative | | 0.6018 | 0.6556 | 0.5584 | 0.9080 | 0.8674 |
| ShuffleNetV2 | Cancer | 0.6040 | 0.7722 | 0.7892 | 0.7600 | 0.9780 | 0.9738 |
| | HSIL | | 0.6076 | 0.5744 | 0.6466 | 0.6384 | 0.7062 |
| | LSIL | | 0.4628 | 0.5486 | 0.4086 | 0.8986 | 0.8364 |
| | Negative | | 0.6386 | 0.6284 | 0.6498 | 0.8790 | 0.8884 |
| MobileNetV3 | Cancer | 0.6260 | 0.7910 | 0.7490 | 0.8400 | 0.9692 | 0.9824 |
| | HSIL | | 0.6402 | 0.5960 | 0.6930 | 0.6456 | 0.7372 |
| | LSIL | | 0.4686 | 0.5376 | 0.4176 | 0.8934 | 0.8370 |
| | Negative | | | | 0.6166 | | 0.8840 |
| EfficientNetV2 | Cancer | 0.6360 | 0.8160 | 0.8406 | 0.8000 | 0.9824 | 0.9780 |
| | HSIL | | 0.6346 | 0.6218 | 0.6512 | 0.6982 | 0.7268 |
| | LSIL | | 0.5446 | 0.6158 | 0.4956 | 0.9064 | 0.8584 |
| | Negative | | 0.6390 | 0.6096 | | 0.8632 | |
| DenseNet161 | Cancer | 0.6560 | 0.8672 | 0.8500 | | 0.9802 | |
| | HSIL | | 0.6514 | | 0.6420 | | 0.7388 |
| | LSIL | | 0.5666 | 0.5798 | | 0.8778 | 0.8724 |
| | Negative | | 0.6486 | 0.6370 | 0.6666 | 0.8790 | 0.8938 |
| Ours | Cancer | 0.6800 | 0.9000 | 0.9000 | 0.9000 | 0.9890 | 0.9890 |
| | HSIL | | | 0.6600 | | 0.7190 | |
| | LSIL | | 0.6050 | 0.6500 | 0.5650 | 0.9090 | 0.8750 |
| | Negative | | 0.6380 | 0.6520 | 0.6250 | 0.8950 | 0.8830 |
Comparison fusion results of different models.
| Model | Weight of first model | Weight of second model | ACC |
|---|---|---|---|
| MobileNetV3 + ShuffleNetV2 | 0.65 | 0.35 | 0.61 |
| Vision Transformer + ShuffleNetV2 | 0.9 | 0.1 | 0.61 |
| Vision Transformer + MobileNetV3 | 0.3 | 0.7 | 0.61 |
| ShuffleNetV2 + EfficientNetV2 | 0.1 | 0.9 | 0.65 |
| MobileNetV3 + DenseNet161 | 0.3 | 0.7 | 0.66 |
| Vision Transformer + EfficientNetV2 | 0.05 | 0.95 | 0.66 |
| MobileNetV3 + EfficientNetV2 | 0.35 | 0.65 | 0.67 |
| ShuffleNetV2 + DenseNet161 | 0.1 | 0.9 | 0.67 |
| EfficientNetV2 + DenseNet161 | 0.4 | 0.6 | 0.68 |
| Vision Transformer + DenseNet161 | 0.05 | 0.95 | 0.68 |
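Each row above pairs two models with two values that sum to 1, which suggests a weighted combination of the two models' outputs. The sketch below assumes the fusion is a weighted average of per-class probabilities followed by an argmax; this record does not spell out the exact fusion rule, so the functions `fuse_probabilities` and `predict_class`, the class order, and the example probabilities are all hypothetical.

```python
CLASSES = ["Negative", "LSIL", "HSIL", "Cancer"]  # assumed class order


def fuse_probabilities(probs_a, probs_b, weight_a):
    """Weighted average of two class-probability vectors.

    weight_a is the weight of the first model; the second model
    receives 1 - weight_a, so the two weights sum to 1.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(probs_a) != len(probs_b):
        raise ValueError("probability vectors must have equal length")
    weight_b = 1.0 - weight_a
    return [weight_a * pa + weight_b * pb for pa, pb in zip(probs_a, probs_b)]


def predict_class(fused, class_names=CLASSES):
    """Return the class name with the highest fused probability."""
    best = max(range(len(fused)), key=fused.__getitem__)
    return class_names[best]


# Made-up softmax outputs for a single image, for illustration only.
vit_probs = [0.10, 0.60, 0.20, 0.10]
densenet_probs = [0.05, 0.30, 0.55, 0.10]

# Weights 0.05 / 0.95, as in the Vision Transformer + DenseNet161 row.
fused = fuse_probabilities(vit_probs, densenet_probs, weight_a=0.05)
print(predict_class(fused))
```

With a small weight on the first model, the fused decision follows the second model unless the second model is nearly undecided between classes.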
Comparison fusion results of DenseNet161 with EfficientNetV2 and Vision Transformer.
| Model | Classes | F1 | PPV | SEN | SPEC | NPV |
|---|---|---|---|---|---|---|
| DenseNet161 + EfficientNetV2 | Cancer | 0.900 | 0.900 | 0.900 | 0.989 | 0.989 |
| | HSIL | 0.682 | | 0.698 | | 0.764 |
| | LSIL | | | | 0.909 | |
| | Negative | 0.625 | 0.625 | 0.625 | 0.882 | 0.882 |
| DenseNet161 + Vision Transformer | Cancer | 0.900 | 0.900 | 0.900 | 0.989 | 0.989 |
| | HSIL | | 0.660 | | 0.719 | |
| | LSIL | 0.605 | 0.650 | 0.565 | 0.909 | 0.875 |
| | Negative | 0.638 | 0.652 | 0.625 | 0.895 | 0.883 |
Figure 6. Confusion matrices of DenseNet161 with EfficientNetV2 and Vision Transformer. (a) DenseNet161 + EfficientNetV2. (b) DenseNet161 + Vision Transformer.
Clinical comparison results of 100 test images.
| Model | Classes | ACC | F1 | PPV | SEN | SPEC | NPV |
|---|---|---|---|---|---|---|---|
| Colposcopists | Cancer | 0.6200 | 0.6660 | 0.5710 | 0.8000 | 0.9330 | 0.9770 |
| | HSIL | | 0.6850 | | 0.5810 | | 0.7430 |
| | LSIL | | 0.5160 | 0.4100 | | 0.7010 | |
| | Negative | | 0.6340 | | 0.5420 | | 0.8670 |
| Ours | Cancer | 0.6800 | 0.9000 | 0.9000 | 0.9000 | 0.9890 | 0.9890 |
| | HSIL | | | 0.6600 | | 0.7190 | |
| | LSIL | | 0.6050 | 0.6500 | 0.5650 | 0.9090 | 0.8750 |
| | Negative | | 0.6380 | 0.6520 | 0.6250 | 0.8950 | 0.8830 |