Sushovan Chaudhury¹, Nilesh Shelke², Kartik Sau¹, B. Prasanalakshmi³, Mohammad Shabaz⁴.
Abstract
Breast cancer is the most common invasive cancer in women and the second leading cause of cancer death in females; tumours can be classified as benign or malignant. Research on breast cancer detection and prevention has attracted increasing attention in recent years. At the same time, advances in data mining provide an effective way to extract useful information from complex databases, enabling prediction, classification, and clustering based on the extracted information. The generic notion of knowledge distillation is that a network of higher capacity acts as a teacher and a network of lower capacity acts as a student, and several knowledge distillation pipelines are known. However, previous work on knowledge distillation using label smoothing regularization reports experiments that break this general notion, showing that knowledge distillation also works in reverse, i.e., when a student model distils knowledge into a teacher model, and even that a poorly trained teacher model can train a student model to equivalent results. Building on the ideas from those works, we propose a novel bilateral knowledge distillation regime that enables multiple interactions between teacher and student models, i.e., teaching and distilling each other, eventually improving each other's performance, and we evaluate our results on the BACH histopathology image dataset for breast cancer. The pretrained ResNeXt29 and MobileNetV2 models, which were already trained on the ImageNet dataset, are used for transfer learning on our dataset, and we obtain a final accuracy of more than 96% using this novel bilateral KD approach.
Year: 2021 PMID: 34721657 PMCID: PMC8550839 DOI: 10.1155/2021/4019358
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
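The teacher–student objective the abstract refers to can be made concrete with the standard distillation loss: a weighted sum of hard-label cross-entropy and a temperature-scaled KL term between the softened teacher and student distributions. The sketch below is a minimal NumPy illustration under that standard formulation; the function names, the temperature `T=4.0`, and the weight `alpha=0.5` are illustrative choices, not values taken from the paper. Note that "reverse KD" as described in the abstract amounts to simply swapping which model supplies the soft targets.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Classic distillation objective: alpha * CE(student, hard labels)
    + (1 - alpha) * T^2 * KL(teacher_soft || student_soft).
    For reverse KD, swap the roles of the two logit arrays."""
    p_t = softmax(teacher_logits, T)            # softened teacher targets
    p_s = softmax(student_logits, T)            # softened student predictions
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12)
    return float(np.mean(alpha * hard + (1 - alpha) * (T ** 2) * kl))

# Toy batch: 2 samples, 4 classes (BACH distinguishes 4 tissue classes).
s = np.array([[2.0, 0.5, 0.1, 0.1], [0.2, 1.5, 0.3, 0.0]])
t = np.array([[3.0, 0.2, 0.1, 0.1], [0.1, 2.5, 0.2, 0.0]])
y = np.array([0, 1])
loss = kd_loss(s, t, y)
```

The `T**2` factor keeps the gradient magnitude of the soft term roughly independent of temperature, which is why it appears in the standard formulation.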
Figure 1. The labelled images of each type of cell used in classification.
Specifications of microscopy images.

| No. | Parameter | Details |
|---|---|---|
| 1 | Colour model | Red, green, blue |
| 2 | Size | 2048( |
| 3 | Pixel scale | 0.42 |
| 4 | Retention area | 10-20 MB (approx.) |
Figure 2. The proposed framework of knowledge distillation.
Algorithm 1. Bilateral knowledge distillation (base training: ResNeXt29 for e epochs).
Figure 3. Schematic flow diagram of bilateral KD as used in this paper.
Figure 4. Train and test set accuracy of MobileNetV2.
Figure 5. Train and test set accuracy of ResNeXt29.
Figure 6. Normal knowledge distillation results.
Figure 7. Reverse knowledge distillation results.
Figure 8. Results of training and validation using bilateral knowledge distillation.
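The three-phase bilateral scheme described in the abstract (base-train the teacher, distill teacher to student, then distill student back to the teacher) can be outlined as a training loop. The sketch below is a minimal runnable illustration using tiny linear-softmax classifiers as stand-ins for ResNeXt29 and MobileNetV2; the toy data, temperature, learning rate, and step counts are all assumptions for demonstration, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class LinearSoftmax:
    """Tiny stand-in for a CNN: one linear layer followed by softmax."""
    def __init__(self, d, k):
        self.W = rng.normal(scale=0.01, size=(d, k))
    def logits(self, X):
        return X @ self.W
    def step(self, X, target_probs, lr=0.5):
        # Gradient step on cross-entropy against (possibly soft) targets.
        p = softmax(self.logits(X))
        self.W -= lr * X.T @ (p - target_probs) / len(X)

def accuracy(m, X, y):
    return float((m.logits(X).argmax(1) == y).mean())

# Toy 4-class data standing in for BACH image features.
X = rng.normal(size=(200, 8))
W_true = rng.normal(size=(8, 4))
y = (X @ W_true).argmax(1)          # linearly recoverable labels
onehot = np.eye(4)[y]

teacher, student = LinearSoftmax(8, 4), LinearSoftmax(8, 4)
T = 2.0
for _ in range(200):                # phase 1: base-train the teacher
    teacher.step(X, onehot)
for _ in range(200):                # phase 2: normal KD, teacher -> student
    student.step(X, softmax(teacher.logits(X), T))
for _ in range(200):                # phase 3: reverse KD, student -> teacher
    teacher.step(X, softmax(student.logits(X), T))
```

Phases 2 and 3 differ only in which model supplies the soft targets, which is the essence of the bilateral regime: the same distillation step is applied in both directions so each model refines the other.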
Results of all experiments performed.

| Experiment performed | Model | Validation accuracy |
|---|---|---|
| Baseline training | ResNeXt29 | 75% |
| Baseline training | MobileNetV2 | 60.71% |
| Normal KD | ResNeXt29 teacher, MobileNetV2 student | 60.71% |
| Reverse KD | MobileNetV2 teacher, ResNeXt29 student | 65.58% |
| Bilateral KD | Phase 1 + phase 2 + phase 3 | 96.3% |