Nagwan Abdel Samee1, Amel A Alhussan2, Vidan Fathi Ghoneim3, Ghada Atteia1, Reem Alkanhel1, Mugahed A Al-Antari4, Yasser M Kadah5,6.
Abstract
One of the most promising research directions in the healthcare industry and the scientific community is the application of AI to real medical challenges, such as building computer-aided diagnosis (CAD) systems for breast cancer. Transfer learning is one of the recently emerging AI-based techniques that allows rapid learning progress and improves medical imaging diagnosis performance. Although deep learning classification for breast cancer has been widely studied, certain obstacles remain in investigating the independence of the extracted high-level deep features. This work tackles two challenges that still exist when designing effective CAD systems for breast lesion classification from mammograms. The first challenge is to enrich the input information of the deep learning models by generating pseudo-colored images instead of using only the original grayscale input images. To achieve this goal, two different image preprocessing techniques are applied in parallel: contrast-limited adaptive histogram equalization (CLAHE) and pixel-wise intensity adjustment. The original image is preserved in the first channel, while the other two channels receive the processed images, respectively. The generated three-channel pseudo-colored images are fed directly into the input layer of the backbone CNNs to generate more powerful high-level deep features. The second challenge is to overcome the multicollinearity problem that occurs among the highly correlated deep features generated by the deep learning models. A new hybrid processing technique based on Logistic Regression (LR) and Principal Component Analysis (PCA), called LR-PCA, is presented. This process helps to select the significant principal components (PCs) for subsequent use in classification. The proposed CAD system has been examined using two different public benchmark datasets, INbreast and mini-MIAS.
The proposed CAD system achieved the highest accuracies of 98.60% and 98.80% on the INbreast and mini-MIAS datasets, respectively. Such a CAD system appears useful and reliable for breast cancer diagnosis.
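The three-channel construction described in the abstract can be sketched in a few lines. This is a minimal numpy-only illustration under stated assumptions: plain global histogram equalization stands in for CLAHE (a real pipeline would use a proper CLAHE implementation such as OpenCV's `createCLAHE`), and the pixel-wise intensity adjustment is modeled as a simple linear contrast stretch; neither choice is necessarily the paper's exact configuration.

```python
import numpy as np

def to_pseudo_color(gray):
    """Stack a grayscale mammogram ROI into a three-channel image.

    Channel 0 keeps the original grayscale pixels (as in the paper);
    channel 1 holds a histogram-equalized view (stand-in for CLAHE);
    channel 2 holds a pixel-wise linear intensity stretch.
    """
    gray = np.asarray(gray, dtype=np.float64)

    # Global histogram equalization over 256 gray levels (CLAHE stand-in).
    levels = np.clip(gray, 0, 255).astype(np.uint8)
    hist = np.bincount(levels.ravel(), minlength=256)
    cdf = hist.cumsum() / levels.size
    equalized = 255.0 * cdf[levels]

    # Pixel-wise intensity adjustment: stretch to the full [0, 255] range.
    lo, hi = gray.min(), gray.max()
    stretched = (gray - lo) / (hi - lo + 1e-12) * 255.0

    return np.stack([gray, equalized, stretched], axis=-1).astype(np.uint8)
```

Stacking the three views produces an RGB-shaped array that a pretrained backbone CNN (e.g., AlexNet) can consume directly at its input layer.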
Keywords: CAD system; breast cancer; breast lesion classification; deep feature extraction and reduction; hybrid CNN-based LR-PCA
Year: 2022 PMID: 35808433 PMCID: PMC9269713 DOI: 10.3390/s22134938
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. The proposed CAD framework for breast lesion classification from X-ray mammograms.
Figure 2. The concept of generating a three-channel pseudo-color mapping image.
Figure 3. Example of the data preparation phase for generating the pseudo-colored image from the original grayscale X-ray mammogram. (a) Region of interest (ROI); (b) Pseudo-colored image; (c) Grayscale image.
Figure 4. The processing time of each pretrained CNN for each dataset.
Figure 5. A heatmap of the correlation coefficients between the extracted features and a histogram of the corresponding p-values. (a) Heatmap of the correlation coefficients; (b) Histogram of the p-values.
Figure 6. The energy retained within the retrieved PCs.
Statistics of the Logistic Regression (LR) model.
| Model Predictors | SE | t-Stat | p-Value |
|---|---|---|---|
| PC1 | 1.1638 | 7.8235 | 5.14 × 10−15 |
| PC2 | 0.0008 | 14.7641 | 0 |
| PC3 | 0.0013 | −11.1861 | 4.77 × 10−29 |
| PC4 | 0.0014 | −13.4629 | 2.59 × 10−41 |
| PC5 | 0.0019 | −12.9322 | 2.96 × 10−38 |
| PC6 | 0.0015 | −2.1592 | 0.030837 |
| PC7 | 0.0022 | 13.8375 | 1.51 × 10−43 |
| PC8 | 0.0019 | −0.0055 | 0.995623 |
| PC9 | 0.0023 | −9.0287 | 1.74 × 10−19 |
| PC10 | 0.0026 | 1.2018 | 0.22944 |
| PC11 | 0.0026 | −2.1862 | 0.028801 |
| PC12 | 0.0027 | 1.6188 | 0.1055 |
| PC13 | 0.0032 | −7.2077 | 5.69 × 10−13 |
| PC14 | 0.0033 | −8.3964 | 4.61 × 10−17 |
| PC15 | 0.0034 | −0.7858 | 0.431973 |
| PC16 | 0.0035 | 1.8389 | 0.065923 |
| PC17 | 0.0037 | 0.0854 | 0.931964 |
| PC18 | 0.0036 | 5.9441 | 2.78 × 10−9 |
| PC19 | 0.0035 | 1.8050 | 0.07108 |
| PC20 | 0.0036 | −6.9300 | 4.21 × 10−12 |
| PC21 | 0.0040 | 4.9057 | 9.31 × 10−7 |
| PC22 | 0.0040 | −4.4528 | 8.47 × 10−6 |
| PC23 | 0.0043 | 11.6463 | 2.40 × 10−31 |
| PC24 | 0.0045 | −4.8968 | 9.74 × 10−7 |
| PC25 | 0.0048 | −9.4881 | 2.35 × 10−21 |
| PC26 | 0.0047 | 0.7119 | 0.476501 |
| PC27 | 0.0047 | −0.6445 | 0.519277 |
| PC28 | 0.0050 | −2.6172 | 0.008865 |
| PC29 | 0.0053 | 0.9396 | 0.347413 |
| PC30 | 0.0054 | −6.5354 | 6.34 × 10−11 |
| PC31 | 0.0050 | −1.4328 | 0.151915 |
| PC32 | 0.0054 | −0.5211 | 0.602287 |
| PC33 | 0.0052 | 2.2693 | 0.023249 |
| PC34 | 0.0057 | −0.6697 | 0.503072 |
| PC35 | 0.0058 | −0.1642 | 0.869595 |
| PC36 | 0.0056 | 6.0392 | 1.55 × 10−9 |
| PC37 | 0.0059 | −1.5046 | 0.132437 |
| PC38 | 0.0061 | −2.5012 | 0.012377 |
| PC39 | 0.0060 | 0.5841 | 0.559178 |
| PC40 | 0.0062 | 2.9399 | 0.003284 |
| PC41 | 0.0065 | 3.0751 | 0.002104 |
| PC42 | 0.0065 | −6.5521 | 5.67 × 10−11 |
| PC43 | 0.0065 | −3.3909 | 0.000697 |
| PC44 | 0.0068 | −0.2496 | 0.802873 |
| PC45 | 0.0063 | 3.0721 | 0.002125 |
| PC46 | 0.0065 | −1.5325 | 0.125403 |
| PC47 | 0.0064 | −1.5169 | 0.1293 |
| PC48 | 0.0067 | −2.3565 | 0.01845 |
| PC49 | 0.0069 | 1.7933 | 0.072918 |
| PC50 | 0.0071 | −3.3454 | 0.000822 |
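The table above reports, for each PC, its standard error, test statistic, and p-value under the fitted LR model. The LR-PCA idea can be sketched as: project the deep features onto their principal components, fit a logistic regression, and keep the PCs whose Wald-test p-values are significant. The numpy-only sketch below assumes an IRLS (Newton) fit, a small ridge term for numerical stability, and a 0.05 threshold; these are illustration choices that need not match the paper's implementation.

```python
import numpy as np
from math import erf, sqrt

def pca(X, n_components):
    """Project mean-centered features onto the top principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def lr_pca_select(X, y, n_components=10, alpha=0.05, n_iter=15):
    """Fit a logistic regression on the PCs and keep the significant ones."""
    Z = pca(X, n_components)
    A = np.column_stack([np.ones(len(Z)), Z])          # intercept + PCs
    w = np.zeros(A.shape[1])
    ridge = 1e-3 * np.eye(A.shape[1])                  # numerical stability
    for _ in range(n_iter):                            # Newton / IRLS updates
        p = 1.0 / (1.0 + np.exp(-A @ w))
        H = A.T @ (A * (p * (1 - p))[:, None]) + ridge
        w += np.linalg.solve(H, A.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-A @ w))                   # refresh at the optimum
    H = A.T @ (A * (p * (1 - p))[:, None]) + ridge
    se = np.sqrt(np.diag(np.linalg.inv(H)))            # coefficient std. errors
    z = w / se                                         # Wald z-statistics
    pvals = np.array([1 - erf(abs(t) / sqrt(2)) for t in z])  # two-sided
    keep = np.flatnonzero(pvals[1:] < alpha)           # skip the intercept slot
    return Z[:, keep], keep
```

The selected columns of `Z` then feed the downstream classifier, so that only PCs with statistically significant LR coefficients survive.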
Classification evaluation performance (%) of the proposed CAD system with three different backbone deep learning classifiers. These results were derived using the INbreast dataset.
| Dataset | LR-PCA | Feature Extractor | Classification Model | Acc. | SE | SP | PRE | FNR | FPR | AUC | MCC | F1-Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Grayscale | Across all classes | AlexNet | Ensemble (subspace KNN) | 97.20 | 96.75 | 97.67 | 97.65 | 3.25 | 2.33 | 100 | 94.43 | 97.20 |
| | | VGG16 | Ensemble (subspace KNN) | 95.90 | 94.12 | 97.77 | 97.68 | 5.88 | 2.23 | 99.0 | 91.95 | 95.87 |
| | | GoogleNet | Ensemble (subspace KNN) | 93.90 | 92.29 | 95.54 | 95.39 | 7.71 | 4.46 | 98.0 | 87.88 | 93.81 |
| Pseudo-colored | | AlexNet | Ensemble (subspace KNN) | 98.00 | 96.96 | 98.88 | 98.86 | 3.04 | 1.12 | 95.22 | 95.86 | 97.90 |
| | | VGG16 | Ensemble (subspace KNN) | 97.90 | 96.65 | 99.09 | 99.06 | 3.35 | 0.91 | 100 | 95.77 | 97.84 |
| | | GoogleNet | Ensemble (subspace KNN) | 95.10 | 92.90 | 97.06 | 96.93 | 7.10 | 2.94 | 99.0 | 90.04 | 94.87 |
| Grayscale | For each class separately | AlexNet | Ensemble (subspace KNN) | 97.20 | 96.65 | 97.67 | 97.64 | 3.35 | 2.33 | 100 | 94.33 | 97.15 |
| | | VGG16 | Ensemble (subspace KNN) | 95.90 | 95.13 | 96.65 | 96.60 | 4.87 | 3.35 | 99.0 | 91.80 | 95.86 |
| | | GoogleNet | Ensemble (subspace KNN) | 94.60 | 92.90 | 97.06 | 96.93 | 7.10 | 2.94 | 99.0 | 90.04 | 94.87 |
| Pseudo-colored | | AlexNet | Ensemble (subspace KNN) | 98.60 | 98.28 | 98.99 | 98.98 | 1.72 | 1.01 | 100 | 97.26 | 98.63 |
| | | VGG16 | Ensemble (subspace KNN) | 98.10 | 97.87 | 98.28 | 98.27 | 2.13 | 1.72 | 99.0 | 96.15 | 98.07 |
| | | GoogleNet | Ensemble (subspace KNN) | 94.50 | 92.60 | 96.45 | 96.31 | 7.40 | 3.55 | 98.0 | 89.11 | 94.42 |
| Grayscale | Not applied | AlexNet | KNN | 96.00 | 94.61 | 97.67 | 97.59 | 5.39 | 2.33 | 96.0 | 92.33 | 96.07 |
| | | VGG16 | KNN | 95.80 | 94.53 | 97.16 | 97.09 | 5.47 | 2.84 | 99.0 | 91.72 | 95.79 |
| | | GoogleNet | KNN | 93.40 | 92.01 | 95.02 | 94.89 | 7.99 | 4.98 | 98.0 | 87.06 | 93.43 |
| Pseudo-colored | | AlexNet | KNN | 96.80 | 95.13 | 98.48 | 98.43 | 4.87 | 1.52 | 96.0 | 93.66 | 96.75 |
| | | VGG16 | KNN | 96.40 | 95.13 | 97.67 | 97.61 | 4.87 | 2.33 | 96.0 | 92.83 | 96.35 |
| | | GoogleNet | KNN | 93.90 | 92.49 | 95.23 | 95.10 | 7.51 | 4.77 | 94.0 | 87.76 | 93.78 |
Classification evaluation performance (%) of the proposed CAD system with three different backbone deep learning classifiers. These results were derived using the mini-MIAS dataset.
| Dataset | LR-PCA | Feature Extractor | Classification Model | Acc. | SE | SP | PRE | FNR | FPR | AUC | MCC | F1-Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Grayscale | Across all classes | AlexNet | Ensemble (subspace KNN) | 97.50 | 99.42 | 96.38 | 96.41 | 0.58 | 3.62 | 100 | 95.81 | 97.89 |
| | | VGG16 | Ensemble (subspace KNN) | 97.30 | 97.49 | 97.30 | 97.30 | 2.51 | 2.70 | 100 | 94.79 | 97.40 |
| | | GoogleNet | Ensemble (subspace KNN) | 97.50 | 97.47 | 95.98 | 95.98 | 2.53 | 4.02 | 100 | 93.45 | 96.72 |
| Pseudo-colored | | AlexNet | Ensemble (subspace KNN) | 98.60 | 99.41 | 97.49 | 97.54 | 0.39 | 2.51 | 100 | 97.13 | 98.57 |
| | | VGG16 | Ensemble (subspace KNN) | 98.20 | 98.26 | 98.07 | 98.07 | 1.74 | 1.93 | 100 | 96.33 | 98.17 |
| | | GoogleNet | Ensemble (subspace KNN) | 97.50 | 97.88 | 97.10 | 97.13 | 2.12 | 2.90 | 100 | 94.98 | 97.50 |
| Grayscale | For each class separately | AlexNet | Ensemble (subspace KNN) | 98.10 | 98.65 | 97.49 | 97.51 | 1.35 | 2.51 | 100 | 96.14 | 98.08 |
| | | VGG16 | Ensemble (subspace KNN) | 97.00 | 98.07 | 96.53 | 96.58 | 1.93 | 3.47 | 100 | 94.61 | 97.32 |
| | | GoogleNet | Ensemble (subspace KNN) | 96.30 | 96.35 | 96.71 | 96.72 | 3.65 | 3.29 | 99.0 | 93.05 | 96.53 |
| Pseudo-colored | | AlexNet | Ensemble (subspace KNN) | 98.80 | 99.62 | 98.26 | 98.28 | 0.58 | 1.74 | 100 | 97.69 | 98.85 |
| | | VGG16 | Ensemble (subspace KNN) | 97.70 | 98.46 | 96.91 | 96.96 | 1.54 | 3.09 | 99.0 | 95.38 | 97.70 |
| | | GoogleNet | Ensemble (subspace KNN) | 97.30 | 97.30 | 97.30 | 97.30 | 2.70 | 2.70 | 99.0 | 94.59 | 97.30 |
| Grayscale | Not applied | AlexNet | KNN | 97.00 | 98.65 | 96.91 | 96.97 | 1.35 | 3.09 | 96.0 | 95.57 | 97.80 |
| | | VGG16 | KNN | 97.80 | 97.88 | 97.68 | 97.69 | 2.12 | 2.32 | 100 | 95.56 | 97.78 |
| | | GoogleNet | KNN | 97.80 | 97.68 | 97.88 | 97.87 | 2.32 | 2.12 | 98.0 | 95.56 | 97.78 |
| Pseudo-colored | | AlexNet | KNN | 97.20 | 97.90 | 96.93 | 96.97 | 2.10 | 3.07 | 95.0 | 94.83 | 97.43 |
| | | VGG16 | KNN | 97.60 | 97.87 | 97.88 | 97.87 | 2.13 | 2.12 | 100 | 95.75 | 97.87 |
| | | GoogleNet | KNN | 97.90 | 97.67 | 97.50 | 97.49 | 2.33 | 2.50 | 99.0 | 95.17 | 97.58 |
Figure 7. The confusion matrices and corresponding ROC curves of the classification results based on the proposed CAD system with the deep extractor AlexNet and LR-PCA. (a) The confusion matrix when PCA is applied separately for each class; (b) The ROC curve when PCA is applied separately for each class; (c) The confusion matrix when PCA is applied across all classes; (d) The ROC curve when PCA is applied across all classes.
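The accuracy, sensitivity, specificity, precision, FNR, FPR, MCC, and F1 values reported in the tables all derive from confusion-matrix counts. A sketch of the standard binary-case formulas (the paper's multi-class figures are averaged over classes, which this sketch does not attempt to reproduce):

```python
import numpy as np

def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics, in percent, from the four
    confusion-matrix counts (true/false positives and negatives)."""
    se = tp / (tp + fn)                      # sensitivity / recall
    sp = tn / (tn + fp)                      # specificity
    pre = tp / (tp + fp)                     # precision
    acc = (tp + tn) / (tp + fn + fp + tn)    # accuracy
    f1 = 2 * pre * se / (pre + se)           # harmonic mean of PRE and SE
    mcc = (tp * tn - fp * fn) / np.sqrt(     # Matthews correlation coeff.
        float(tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"Acc": 100 * acc, "SE": 100 * se, "SP": 100 * sp,
            "PRE": 100 * pre, "FNR": 100 * (1 - se), "FPR": 100 * (1 - sp),
            "MCC": 100 * mcc, "F1": 100 * f1}
```

For example, a balanced test set with 45 true positives, 5 false negatives, 5 false positives, and 45 true negatives yields 90% accuracy and an MCC of 80%.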
Comparison of the proposed CAD system against state-of-the-art breast cancer classification systems.
| Reference | Feature Extraction Approach | Classifier | Dataset | SE (%) | Acc. (%) |
|---|---|---|---|---|---|
| Oliver et al. [ | Fuzzy C-Means for lesion segmentation combined with a number of textural and morphological features | SVM | MIAS | 87.33 | 91.51 |
| Phadke et al. [ | Breast cancer classification based on the fusion of local and global morphological and textural features | SVM | MIAS | 92.71 | 83.1 |
| Jian et al. [ | Utilizing the wavelet transform in order to retrieve the textural features of ROIs | Classification of ROIs using SVM | MIAS | 96.3 | 97.7 |
| Vijayarajeswari et al. [ | Breast lesions were classified using SVM after applying the Hough transform for feature extraction. | SVM | MIAS | - | 94.0 |
| Xie et al. [ | Classification of breast lesions using metaheuristic-based classifier | PSO-SVM | MIAS | 92.0 | 89.0 |
| Mina et al. [ | Classification of breast cancer using ANN and wavelet decomposition for feature extraction | ANN | MIAS | 68.0 | - |
| Liu et al. [ | Detection of microcalcification in digital mammograms | SVM | INbreast | 92.0 | - |
| Xu et al. [ | Deep CNN for feature extraction and classification of breast lesions | CNN | INbreast | - | 96.8 |
| Al-antari et al. [ | End-to-end CAD system for the segmentation and classification of breast masses | YOLO classifier | INbreast | 95.64 | 89.91 |
| Ragab et al. [ | Deep feature fusion of AlexNet, GoogleNet, ResNet-18, ResNet-50, and ResNet-101 | SVM | CBIS-DDSM, MIAS | 98.0 | 97.90 |
| Alhussan et al. [ | AlexNet | AlexNet | MIAS | 98.26 | 98.26 |
| | GoogLeNet | GoogLeNet | | 98.26 | 98.26 |
| | VGG-16 | VGG-16 | | 98.70 | 98.28 |
| Zhang et al. [ | Features were extracted by Gist, SIFT, HOG, LBP, VGG, ResNet, and DenseNet and fused together. | SVM, XGBoost, Naïve Bayes, k-NN, DT, AdaBoosting | CBIS-DDSM INbreast | 98.61 | 90.91 |
| Song et al. [ | GoogleNet | XGBoost | DDSM | 99.74 | 92.8 |
| Khan et al. [ | Fusion of deep features extracted by VGG-16, VGG-19, GoogleNet, and ResNet-50. | Transfer Learning | CBIS-DDSM MIAS | 98.07 | 96.6 |