| Literature DB >> 35506115 |
Jiana Meng, Zhiyong Tan, Yuhai Yu, Pengjie Wang, Shuang Liu.
Abstract
The recognition of medical images with deep learning techniques can assist physicians in clinical diagnosis, but the effectiveness of recognition models relies on massive amounts of labeled data. With the rapid spread of the novel coronavirus (COVID-19) worldwide, rapid COVID-19 diagnosis has become an effective measure to combat the outbreak. However, labeled COVID-19 data are scarce. Therefore, we propose a two-stage transfer learning recognition model for medical images of COVID-19 (TL-Med) based on the concept of "generic domain → target-related domain → target domain". First, we use the Vision Transformer (ViT) pretraining model to obtain generic features from massive heterogeneous data and then learn medical features from large-scale homogeneous data. Two-stage transfer learning uses the learned primary features and the underlying information for COVID-19 image recognition, addressing the problem that insufficient data prevent the model from learning the underlying information of the target dataset. Experiments on a COVID-19 dataset show that the TL-Med model achieves a recognition accuracy of 93.24%, which indicates that the proposed method is more effective in detecting COVID-19 images than other approaches and may greatly alleviate the problem of data scarcity in this field.
Keywords: COVID-19; Pretrained Model; Transfer Learning; ViT
Year: 2022 PMID: 35506115 PMCID: PMC9051950 DOI: 10.1016/j.bbe.2022.04.005
Source DB: PubMed Journal: Biocybern Biomed Eng ISSN: 0208-5216 Impact factor: 5.687
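The staged "generic domain → target-related domain → target domain" pipeline from the abstract can be sketched schematically. This is a minimal bookkeeping sketch, not the paper's implementation: the layer names, the choice to keep the encoder trainable in both stages, and the re-initialized task head are illustrative assumptions.

```python
# Schematic sketch of two-stage transfer learning
# (generic pretraining -> TB fine-tuning -> COVID-19 fine-tuning).
# Layer names and freezing choices are illustrative assumptions.

def build_vit():
    """Stand-in for a ViT pretrained on a large generic dataset."""
    return {
        "patch_embed":    {"trained_on": ["generic pretraining"], "trainable": True},
        "encoder_blocks": {"trained_on": ["generic pretraining"], "trainable": True},
        "head":           {"trained_on": [], "trainable": True},  # task-specific
    }

def finetune(model, dataset):
    """Stand-in for a fine-tuning run: record which data each trainable part saw."""
    for layer in model.values():
        if layer["trainable"]:
            layer["trained_on"].append(dataset)
    return model

model = build_vit()
# Stage 1: adapt the generic ViT to the target-related medical domain (TB images).
model = finetune(model, "TB (target-related domain)")
# Stage 2: replace the task head, then fine-tune on the scarce target domain.
model["head"] = {"trained_on": [], "trainable": True}
model = finetune(model, "COVID-19 (target domain)")

print(model["encoder_blocks"]["trained_on"])
# -> ['generic pretraining', 'TB (target-related domain)', 'COVID-19 (target domain)']
```

The point of the intermediate TB stage is that the encoder sees large-scale homogeneous medical data before it ever touches the small COVID-19 set, so stage 2 starts from medical rather than purely generic features.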
Fig. 1 A CT scan of the lungs of a patient infected with COVID-19 and a CT scan of a normal lung [7].
Fig. 2 Schematic diagram of transfer learning.
Fig. 3 Overall structure of the model.
Fig. 4 Self-attention (SA) structure.
Fig. 5 Multiheaded SA structure.
Fig. 6 Transformer block structure.
COVID-19 dataset.
| Categories | Quantities |
|---|---|
| COVID-19 | 349 |
| NonCOVID-19 | 397 |
TB dataset.
| Categories | Quantities |
|---|---|
| Infiltrative | 420 |
| Focal | 226 |
| Tuberculoma | 101 |
| Miliary | 100 |
| Fibro-cavernous | 70 |
Distributions before and after data augmentation (augmented counts in parentheses).
| Split | COVID-19 | NonCOVID-19 | Total |
|---|---|---|---|
| Train | 280 (2520) | 318 (2862) | 598 (5382) |
| Test | 69 | 79 | 148 |
| Total | 349 (2589) | 397 (2941) | 746 (5530) |
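The counts in the distribution table are internally consistent with a ninefold expansion of each training image, while the test split is left unaugmented. The factor of 9 is inferred from the table, not stated in the text; a quick arithmetic check:

```python
# Check the augmentation bookkeeping of the distribution table.
# Inferred assumption: each training image yields 9 images after
# augmentation, and the test split is not augmented.
train = {"COVID-19": 280, "NonCOVID-19": 318}
test = {"COVID-19": 69, "NonCOVID-19": 79}

factor = 2520 // 280  # augmented train count / original train count
assert factor == 9

for cls in train:
    augmented_train = train[cls] * factor
    total = augmented_train + test[cls]
    print(cls, augmented_train, total)
# COVID-19:    280 * 9 = 2520 train, 2520 + 69 = 2589 total (matches the table)
# NonCOVID-19: 318 * 9 = 2862 train, 2862 + 79 = 2941 total (matches the table)
```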
Comparison among the pretrained models.
| Model | Acc. | Precision | Recall | F1 | AUC | Time (min) |
|---|---|---|---|---|---|---|
| DenseNet169 | 0.8649 | 0.9538 | 0.7848 | 0.8611 | 0.9274 | 8 |
| ResNet101 | 0.8581 | 0.9028 | 0.8228 | 0.8609 | 0.9176 | 15 |
| ResNet34 | 0.8446 | 0.8500 | 0.8608 | 0.8553 | 0.9439 | 10 |
| ViT | 0.8986 | 0.9103 | 0.8987 | 0.9045 | 0.9545 | 9.4 |
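The reported metrics follow the standard confusion-matrix definitions. As a consistency check, one reconstruction of the ViT row (assuming the 79-image test class is treated as positive, with TP=71, FP=7, FN=8, TN=62) reproduces the reported values to four decimals; these counts are inferred for illustration, not taken from the paper.

```python
# Standard binary-classification metrics from confusion-matrix counts.
def metrics(tp, fp, fn, tn):
    acc = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return acc, precision, recall, f1

# Inferred reconstruction of the ViT row on the 148-image test set
# (69 + 79 images); the counts are an assumption that happens to
# reproduce the reported Acc./Precision/Recall/F1.
acc, precision, recall, f1 = metrics(tp=71, fp=7, fn=8, tn=62)
print(round(acc, 4), round(precision, 4), round(recall, 4), round(f1, 4))
# -> 0.8986 0.9103 0.8987 0.9045
```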
Fig. 7 Five types of TB slices.
Transfer learning comparison.
| Model | Acc. | Precision | Recall | F1 | AUC | Time (min) |
|---|---|---|---|---|---|---|
| ResNet34-TL | 0.777 | 0.8833 | 0.6709 | 0.7626 | 0.8556 | 9 |
| ResNet101-TL | 0.7635 | 0.7683 | 0.7975 | 0.7826 | 0.7666 | 10 |
| DenseNet169-TL | 0.8919 | 0.8795 | 0.9241 | 0.9012 | 0.9532 | 10 |
| ViT-HTL | 0.8986 | 0.9103 | 0.8987 | 0.9045 | 0.9545 | 9.4 |
| TL-Med | 0.9122 | 0.9459 | 0.8861 | 0.9150 | 0.9606 | 9.3 |
Ablation experiments.
| Model | Acc. | Precision | Recall | F1 | AUC | Time (min) |
|---|---|---|---|---|---|---|
| TL-Med + freeze | 0.8514 | 0.8800 | 0.8354 | 0.8571 | 0.9066 | 7.2 |
| TL-Med + no-freeze | 0.9122 | 0.9459 | 0.8861 | 0.9150 | 0.9606 | 9.3 |
| TL-Med + freeze + PL | 0.8716 | 0.9054 | 0.8481 | 0.8758 | 0.9197 | 7.1 |
| TL-Med + no-freeze + PL | 0.9122 | 0.9342 | 0.8987 | 0.9161 | 0.9576 | 9.3 |
| TL-Med + freeze + PL + dataAug | 0.8581 | 0.8919 | 0.8354 | 0.8627 | 0.9145 | 337 |
| TL-Med + no-freeze + PL + dataAug | – | – | – | – | – | – |
Fig. 8 Test set accuracies before and after data augmentation.
Comparison of our proposed method with the existing literature.
| Method | Dataset | Acc. | Precision | Recall | F1 | AUC |
|---|---|---|---|---|---|---|
| Xiao et al. | Test data 1: | 0.9557 | 0.99 | 0.99 | 0.99 | – |
| | Test data 2: | 0.9444 | 0.95 | 0.95 | 0.95 | – |
| Narendra et al. | 400 COVID-19, | 0.9912 | 0.99 | 0.99 | 0.99 | – |
| Nayeeb et al. | 408 COVID-19, 816 Non-COVID | 0.9939 | 0.9919 | 0.9939 | 0.9919 | – |
| Mahesh et al. | 2249 COVID-19, | 0.983 | – | 0.9831 | 0.98 | 0.999 |
| Bejoy et al. | Test data 1: | 0.9115 | 0.853 | 0.985 | 0.914 | 0.963 |
| | Test data 2: | 0.9743 | 0.986 | 0.986 | 0.986 | 0.911 |
| Govardhan et al. | 250 COVID-19, | 0.9777 | 0.9714 | 0.9714 | – | – |
| Shome et al. | 10819 COVID-19, | 0.9320 | – | 0.9609 | – | – |
| Ruochi et al. | 189 COVID-19, | 0.9108 | – | – | – | – |
| Mohammad et al. | 184 COVID-19, | 0.9911 | – | – | – | – |
| Panwar et al. | 142 COVID-19, | 0.881 | – | – | – | 0.881 |
| Proposed Method | 349 COVID-19, 397 NonCOVID-19 | 0.9324 | 0.96 | 0.9114 | 0.9351 | 0.9686 |
Fig. 9 The confusion matrix of the proposed TL-Med framework.