Mathew Jose Mammoottil, Lloyd J Kulangara, Anna Susan Cherian, Prabu Mohandas, Khairunnisa Hasikin, Mufti Mahmud.
Abstract
Breast cancer is one of the most common forms of cancer. Its aggressive nature, coupled with high mortality rates, makes this cancer life-threatening; hence, early detection gives the patient a greater chance of survival. Currently, the preferred diagnosis method is mammography. However, mammography is expensive and exposes the patient to radiation. A cost-effective and less invasive method known as thermography is gaining popularity. Bearing this in mind, this work first creates machine learning models based on convolutional neural networks that use multiple thermal views of the breast to detect breast cancer on the Visual DMR dataset. The performance of these models is then verified against the clinical data. Findings indicate that adding clinical data decisions to the model helped increase its performance. After building and testing two models with different architectures, the model that used the same architecture for all three views performed best. It achieved an accuracy of 85.4%, which increased to 93.8% after the clinical data decision was added. After the addition of clinical data decisions, the model was able to classify more patients correctly, with a specificity of 96.7% and a sensitivity of 88.9% when considering sick patients as the positive class. Currently, thermography is among the lesser-known diagnosis methods, with only one public dataset. We hope our work will draw more attention to this area.
Year: 2022 PMID: 35265301 PMCID: PMC8901325 DOI: 10.1155/2022/4295221
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Machine identification methods.
| Methods | Description |
|---|---|
| CT scan | It provides x-ray images that give a complete cross-sectional view of the internal organs. It has a significant advantage in evaluating the breast to identify cancerous cells. CT scans are used for more advanced stages of cancer, to check whether the cancer has spread to other parts of the body or to see the effects of medication on the cancerous cells [ |
| MRI scan | It uses magnetic fields and radio waves to produce scans of the body. It is used to determine the presence of cancerous cells in the body and to see the tumor's growth region. It does not expose a person to radiation, making it a safer alternative to mammograms. In certain cases, a gadolinium-based dye or contrast material may be injected into the arm to show the images more clearly [ |
| Mammograms | It produces an x-ray of the breast used for early detection of breast cancer. During this procedure, the breasts are compressed by compression plates. It is one of the most effective methods for detecting breast cancer when no lump is visible or when the patient would like to scan any specific area that shows certain symptoms associated with the disease. If a woman has a high risk of breast cancer, screening starts at a younger age [ |
| Ultrasound | It uses sound waves to produce images of the breast. It is not usually employed as a screening tool but is used along with a mammogram to check for a fluid-filled cyst or tumor. Like thermography, it involves no exposure to radiation, and it is effective for women with dense breasts, pregnant women, and young women (below age 25) [ |
| Thermography | It captures the image of the breasts using a device that essentially evaluates the surface temperature of the skin of the breasts. During this screening process there is no radiation, no contact between the patient and the device, and no breast compression, making it a desirable screening tool. It is based on the principle that cancer cells grow and spread rapidly because of their high metabolism. As metabolism increases, the temperature in that region also increases, which can be used to detect the presence of cancerous cells [ |
Papers outlining different approaches to breast cancer detection using thermal images.
| References | ML approach | Dataset used | Classification | Evaluation metrics |
|---|---|---|---|---|
| Schaefer et al. [ | If-then rules | 146 images (29 malignant, 117 benign) | Fuzzy rule-based classification | Classification 78.05%, sensitivity 74.14%, specificity 79.02% |
| Abdel-Nasser et al. [ | Learning to rank (LTR) and six texture analysis techniques | 37 images (sick) and 19 (healthy) | Multi-layer perceptron (MLP) classifier | AUC 98.9%, accuracy 95.8%, recall 97.1%, precision 94.6% (for the HOG texture analysis technique) |
| Sathish and Surekha [ | Decision tree | Dataset used not specified | Ensemble classifiers (ensemble bagged trees classifier and AdaBoost) | Accuracy 87%, sensitivity 83%, specificity 90.6% (for the ensemble bagged trees classifier) |
| Torres-Galvan et al. [ | AlexNet, GoogLeNet, ResNet-50, ResNet-101, Inception-v3, VGG-16, and VGG-19 | 173 images (32 abnormal, 141 healthy) | Pretrained networks loaded for accuracy and efficient training time | VGG-16 performed best, with sensitivity 100%, specificity 82.35%, balanced accuracy 91.18% |
| Mambou et al. [ | Deep neural network and SVM (in certain cases) | 67 patients (43 healthy, 24 sick) | Pretrained Inception-v3 model (modification at the last layer) | ROC area 1.00, precision 1.00, recall 1.00 |
| Milosevic et al. [ | SVM, k-NN, and Naive Bayes classifier | 40 images (26 normal, 14 abnormal) | 20 GLCM-based texture features extracted for classification, following image segmentation | Accuracy 92.5%, sensitivity 78.6%, NPV 89.7% |
| Tello-Mijares et al. [ | Gradient vector flow with a CNN | 63 images (35 normal, 28 abnormal) | Segmentation using GVF followed by a CNN | Accuracy 100%, sensitivity 100%, specificity 100% |
| Hossam et al. [ | Hough transform (HT) algorithm segmentation | 200 images (90 normal, 110 abnormal) | Locate and identify parabolic curves to obtain ROIs | Accuracy 96.667%, kappa statistic 0.9331 (using SVM classifier) |
| Lou et al. [ | Segmentation using MultiResUnet neural networks | 450 images from 14 patients and 16 volunteers | Segmentation using an encoder and a decoder | Average accuracy 91.47% (2 percent higher than an autoencoder) |
| Roslidar et al. [ | DenseNet, ResNet101, MobileNetV2, and ShuffleNetV2 | 3581 images (731 cancerous, 2850 healthy) | Tuning the parameters of the models to reduce training time while maintaining high accuracy | Best considering training time was MobileNetV2 (static): accuracy 100%, recall 100%, precision 100% |
| Ahmed et al. [ | Ant colony optimization and particle swarm optimization | 118 frontal views of patients (30 normal, 45 benign, 43 malignant) | Feature extraction (grey-level co-occurrence matrix), feature selection (ACO and PSO) | ACO: accuracy 94.29%, sensitivity 94.3%, specificity 97.3%; PSO: accuracy 97.14%, sensitivity 98%, specificity 98.6% |
| Nicandro et al. [ | Bayesian networks | 98 cases (77 patients with breast cancer, 21 healthy) | Uses a hill-climber algorithm, a repeated hill-climber algorithm, and a Naive Bayes classifier to classify a dataset with 14 features | Repeated hill-climber shows the best results: accuracy 76.12%, sensitivity 99% |
CNN models.
| CNN model | Description |
|---|---|
| ResNet | Residual networks make use of residual blocks to train very deep neural networks. Using these blocks helps prevent the complications that come with training deep networks, such as accuracy degradation. The key idea behind these blocks is to have the output of one layer sent as an input to a layer deeper in the network, where it is added to that layer's output on its normal path before the nonlinear function is applied. The variants of ResNet include ResNet-50, ResNet-1202, ResNet-110, ResNet-164, ResNet-101, and ResNet-152 [ |
| DenseNet | In the DenseNet architecture, the input to any layer in the convolutional neural network is the concatenated output of all the previous layers in the network. By combining information from the previous layers, the computational cost can be reduced. Through these connections across all layers, the model retains more information as it moves through the network, improving overall performance. The variants of DenseNet are DenseNet-B and DenseNet-BC [ |
| MobileNet | MobileNet is an architecture designed for mobile devices. It uses depth-wise separable convolutions, in which a filter is applied to each input channel individually, as opposed to general CNN models; the per-channel outputs are then stacked together and combined by a point-wise (1 × 1) convolution. The variants include MobileNetV1 and MobileNetV2 [ |
| ShuffleNet | These are efficient convolutional neural networks that use channel shuffling to reduce computational complexity. They can reduce the time required to train a model and are most commonly used in smartphones. Shuffling the channels allows more information to pass through the network, which increases its accuracy with a limited number of resources. The ShuffleNet models include ShuffleNetV1 and ShuffleNetV2 [ |
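To make the depth-wise separable convolution idea concrete, the parameter counts of a standard and a depth-wise separable convolution can be compared directly. The kernel size and channel counts below are arbitrary illustrative values, not taken from the paper.

```python
# Parameter counts (weights only, no biases) for two convolution types.
# Standard conv: every output channel has a k x k filter over ALL input channels.
# Depth-wise separable: one k x k filter per input channel, followed by a
# 1 x 1 (point-wise) convolution that mixes the channels.

def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    return k * k * c_in + c_in * c_out  # depth-wise part + point-wise part

k, c_in, c_out = 3, 64, 128  # assumed example sizes
print(standard_conv_params(k, c_in, c_out))        # 73728
print(depthwise_separable_params(k, c_in, c_out))  # 8768
```

For a 3 × 3 kernel with 64 input and 128 output channels, the separable version needs roughly 8.4 times fewer parameters, which is why MobileNet-style models suit resource-limited devices.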
Advantages and disadvantages of various classifiers.
| Approach | Advantages | Disadvantages | Ref |
|---|---|---|---|
| Fuzzy rule-based classifier | The logic leading to a prediction is usually transparent | Increasing the partitioning (i.e., number of divisions) makes the classifier computationally expensive | [ |
| LTR and texture analysis methods | Enable creation of a “compact descriptive representation” of the thermograms | Ranking order can be altered by small perturbations that could go unnoticed by an individual | [ |
| Ensemble classifiers | Produce better results where a single classifier would not give an accurate result | Computationally expensive, and the model can become complex | [ |
| AlexNet, GoogLeNet, ResNet-50, ResNet-101, Inception-v3, VGG-16, and VGG-19 | VGG-16 outperforms the other architectures, though a smaller architecture like GoogLeNet is also preferable | VGG-16 is slow to train and requires a large amount of disk space (528 MB) | [ |
| DNN | Powerful feature extraction, and the modification in the final layer helps classify with better confidence | Overfitting and computationally expensive | [ |
| KNN, SVM, Naive Bayes | Need less data to work with than neural networks. SVM works well with outliers. Less complicated to operate | KNN and SVM take longer to compute. Naive Bayes assumes features are independent | [ |
| Gradient vector flow with a CNN | Segmentation to obtain regions of interest | Edge-map function constants must be determined using domain knowledge | [ |
| Hough transform (HT) algorithm segmentation | Can distinctly identify breast boundaries | Objects with shapes similar to the target curve can give wrong values | [ |
| MultiResUnet neural networks | Requires less time and effort than manual segmentation | Better encoders and decoders exist, for example in GoogLeNet | [ |
| DenseNet, ResNet101, MobileNetV2, and ShuffleNetV2 | These models have fast training times while maintaining high accuracy | Thermal image identification accuracy could be higher | [ |
| Ant colony optimization and particle swarm optimization | Feature extraction and selection help eliminate irrelevant features and increase accuracy | The time of convergence is uncertain for ACO, which can be a setback | [ |
| Bayesian networks | Features are treated as dependent on each other, and this relation can be shown visually | Requires a complex probabilistic function for better metrics. Poor performance with a large number of features | [ |
Figure 1. Different methods employed for prediction.
CNN models for different views with their outputs.
| Model | Output |
|---|---|
| Frontal model | [P (Hfront), P (Sfront)] |
| Right 45° model | [P (Hright45), P (Sright45)] |
| Left 45° model | [P (Hleft45), P (Sleft45)] |
| Right 90° model | [P (Hright90), P (Sright90)] |
| Left 90° model | [P (Hleft90), P (Sleft90)] |
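As an illustration of how the per-view outputs [P(H), P(S)] above might be combined, the sketch below simply averages the probability vectors of the three views used by the two models (frontal, left 90°, right 90°) and picks the larger class probability. The numbers are made up, and this plain average is only a stand-in: the paper's multi-input CNN learns its own combination of the views.

```python
# Illustrative fusion of per-view outputs [P(healthy), P(sick)].
# The probabilities below are invented example values.
view_outputs = {
    "frontal": [0.30, 0.70],
    "left90":  [0.45, 0.55],
    "right90": [0.25, 0.75],
}

n_views = len(view_outputs)
p_healthy = sum(p[0] for p in view_outputs.values()) / n_views
p_sick    = sum(p[1] for p in view_outputs.values()) / n_views

# Final decision: the class with the larger averaged probability.
label = "sick" if p_sick > p_healthy else "healthy"
print(round(p_sick, 4), label)  # 0.6667 sick
```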
Figure 2. Design of the multi-input CNN.
Figure 3. The views: (a) Front. (b) Left 45°. (c) Left 90°. (d) Right 90°. (e) Right 45°.
Figure 4. (a) Fuzzy frontal view. (b) Protocol not followed. (c) Injury on the breast. (d) Clear frontal view.
Number of samples by class and set.
| Classes | Training set (80%) | Test set (20%) |
|---|---|---|
| Healthy | 126 | 31 |
| Sick | 67 | 17 |
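The per-class counts above correspond to an 80/20 split applied to each class separately (157 healthy and 84 sick patients in total). The stratified-split sketch below is a generic illustration, not the authors' code; it merely reproduces these counts.

```python
import random

def stratified_split(items_by_class, train_frac=0.8, seed=0):
    """Split each class separately so class proportions are preserved."""
    rng = random.Random(seed)
    train, test = [], []
    for label, items in items_by_class.items():
        items = list(items)
        rng.shuffle(items)
        n_train = round(train_frac * len(items))
        train += [(label, x) for x in items[:n_train]]
        test  += [(label, x) for x in items[n_train:]]
    return train, test

# Class sizes implied by the table: 157 healthy, 84 sick.
data = {"healthy": range(157), "sick": range(84)}
train, test = stratified_split(data)
print(sum(1 for lbl, _ in train if lbl == "healthy"))  # 126
print(sum(1 for lbl, _ in test if lbl == "sick"))      # 17
```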
Evaluation metrics.
| Metric | Definition | Formula |
|---|---|---|
| True positive (TP) | The number of samples predicted positive that are actually positive | The number of sick patients correctly classified out of the 17 positive samples |
| False positive (FP) | The number of samples predicted positive that are actually negative | The number of healthy patients incorrectly classified as sick out of the 31 negative samples |
| True negative (TN) | The number of samples predicted negative that are actually negative | The number of healthy patients correctly classified out of the 31 negative samples |
| False negative (FN) | The number of samples predicted negative that are actually positive | The number of sick patients incorrectly classified as healthy out of the 17 positive samples |
| Accuracy | The proportion of correct classifications | (TP + TN)/(TP + TN + FP + FN) |
| Sensitivity (recall) | The proportion of the positive class that is correctly classified | TP/(TP + FN) |
| Specificity | The proportion of the negative class that is correctly classified | TN/(TN + FP) |
| Precision | The proportion of positive predictions that are actually positive | TP/(TP + FP) |
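The four summary metrics in the table follow directly from the confusion-matrix counts. The sketch below uses an arbitrary illustrative confusion matrix over 17 positive and 31 negative samples; these counts are not the paper's reported results.

```python
# Metric definitions from the table, computed from confusion-matrix counts.
def metrics(tp, fp, tn, fn):
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # a.k.a. recall
        "specificity": tn / (tn + fp),
        "precision":   tp / (tp + fp),
    }

# Example: 15 of 17 sick and 29 of 31 healthy patients classified correctly.
m = metrics(tp=15, fp=2, tn=29, fn=2)
print({k: round(v, 3) for k, v in m.items()})
```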
Figure 5. Multi-input CNN model 1: (a) CNN model for front view, (b) CNN model for right 90° view, (c) CNN model for left 90° view.
Model 1 performance evaluation.
| Metric | Without CD (%) | With CD (%) |
|---|---|---|
| Accuracy | 85.4 | 93.8 |
| Sensitivity | 77.8 | 88.9 |
| Specificity | 90.0 | 96.7 |
| Precision | 82.4 | 94.1 |
| F1 score | 80.0 | 91.4 |
| ROC AUC | 89.4 | 98.7 |
| Precision-recall score | 84.1 | 97.7 |
Figure 6. Model 1 AUC plots: (a) ROC curve without CD, (b) precision-recall curve without CD, (c) ROC curve with CD, (d) precision-recall curve with CD.
Figure 7. Multi-input CNN model 2: (a) CNN model for front view, (b) CNN model for right 90° view, (c) CNN model for left 90° view.
Model 2 performance evaluation.
| Metric | Without CD (%) | With CD (%) |
|---|---|---|
| Accuracy | 81.2 | 89.6 |
| Sensitivity | 75.0 | 87.5 |
| Specificity | 84.4 | 90.6 |
| Precision | 70.6 | 82.4 |
| F1 score | 72.7 | 84.8 |
| ROC AUC | 80.2 | 90.0 |
| Precision-recall score | 77.1 | 92.3 |
Figure 8. Model 2 AUC plots: (a) ROC curve without CD, (b) precision-recall curve without CD, (c) ROC curve with CD, (d) precision-recall curve with CD.
Individual view performance measures.
| Model | View | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
| Model 1 | Frontal | 0.771 | 0.882 | 0.710 |
| Model 1 | Left 90° | 0.583 | 0.412 | 0.677 |
| Model 1 | Right 90° | 0.723 | 0.353 | 0.936 |
| Model 2 | Frontal | 0.771 | 0.824 | 0.742 |
| Model 2 | Left 90° | 0.604 | 0.530 | 0.645 |
| Model 2 | Right 90° | 0.708 | 0.824 | 0.645 |
Comparison of our work with other methods.
| Reference paper | Methodology | Accuracy (%) | Sensitivity (%) | Specificity (%) | ROC AUC (%) |
|---|---|---|---|---|---|
| [ | If-then rules | — | 74.17 | 79.02 | — |
| [ | Learning to rank (LTR) and six texture analysis techniques | 95.8 | 97.1 | — | 98.8 |
| [ | Decision tree | 87 | 83 | 90.6 | — |
| [ | Deep neural network and SVM | — | 100 | — | 100 |
| [ | Gradient vector flow with a CNN | 100 | 100 | 100 | — |
| [ | Hough transform algorithm segmentation | 96.67 | — | — | — |
| [ | Segmentation using MultiResUnet neural networks | 91.47 | — | — | — |
| [ | DenseNet, ResNet101, MobileNetV2, and ShuffleNetV2 | 100 | 100 | 100 | — |
| [ | AlexNet, GoogLeNet, ResNet-50, ResNet-101, Inception-v3, VGG-16, and VGG-19 | 91.8 | 100 | 82.35 | — |
| [ | SVM, KNN, and Naive Bayes classifier | 92.5 | 78.6 | 89.7 | — |
| [ | Bayesian networks | 76.12 | 99 | — | — |
| [ | Ant colony optimization (ACO) and particle swarm optimization (PSO) | 94.29 (for ACO) | 94.3 (for ACO) | 97.3 (for ACO) | — |
| Our work | Multi-view CNN classification model 1 | 93.8 | 88.9 | 96.7 | 98.7 |
| Our work | Multi-view CNN classification model 2 | 89.6 | 87.5 | 90.6 | 90.0 |