Raffaella Massafra, Samantha Bove, Vito Lorusso, Albino Biafora, Maria Colomba Comes, Vittorio Didonna, Sergio Diotaiuti, Annarita Fanizzi, Annalisa Nardone, Angelo Nolasco, Cosmo Maurizio Ressa, Pasquale Tamborra, Antonella Terenzio, Daniele La Forgia.
Abstract
Contrast-enhanced spectral mammography (CESM) is an advanced tool for breast care that is still operator dependent. This paper proposes an automated system that discriminates benign from malignant breast lesions based on radiomic analysis. We selected a set of 58 regions of interest (ROIs) extracted from 53 patients referred to Istituto Tumori "Giovanni Paolo II" of Bari (Italy) for the breast cancer screening phase between March 2017 and June 2018. We extracted 464 features of different kinds, such as points and corners of interest and textural and statistical features, from both the original ROIs and the ones obtained by a Haar decomposition and a gradient image implementation. Because such high-dimensional feature data can affect both the process and the accuracy of cancer classification, a dimension reduction step was needed. Specifically, we used principal component analysis (PCA), selecting eigenvectors according to the proportion of variance they explain. For classification, we trained three different classifiers (a random forest, a naïve Bayes and a logistic regression) on each subset of principal components (PCs) selected by a sequential forward algorithm. Moreover, we examined the original features that contributed most to the PCs used by the best classification models. The random forest classifier gave the best prediction of benign/malignant ROIs, with median sensitivity and specificity of 88.37% and 100%, respectively, using only three PCs. The features that contributed most to these PCs were almost all extracted from the low-energy (LE) images. Our system could represent a valid support tool for radiologists interpreting CESM images.
Keywords: breast cancer; computer-automated diagnosis (CADx); contrast-enhanced spectral mammography (CESM); feature extraction; feature reduction; principal component analysis (PCA)
Year: 2021 PMID: 33920221 PMCID: PMC8070152 DOI: 10.3390/diagnostics11040684
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Figure 1. Images produced by contrast-enhanced spectral mammography (CESM) instrumentation. Typical example of low energy (a), high energy (b), and recombined (c) images [17]. The white arrow points out a suspicious lesion.
Figure 2. Schematic overview of the CESM image classification process. First, five sets of features were automatically extracted from each region of interest (ROI), then a principal component analysis was performed on each feature set. Finally, three binary classifiers were trained and their performance was evaluated on 100 ten-fold cross-validation rounds.
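The evaluation protocol named in this caption (100 rounds of ten-fold cross-validation, with medians reported in the abstract) can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `repeated_kfold` and `median_over_rounds` are hypothetical, the folds are not stratified, and the record does not say how the authors built their splits.

```python
import random
from statistics import median


def repeated_kfold(n_samples, n_folds=10, n_rounds=100, seed=0):
    """Yield (round, train_idx, test_idx) for repeated k-fold cross-validation.

    Each round reshuffles the sample indices and deals them into
    n_folds test folds of nearly equal size (assumption: no stratification).
    """
    rng = random.Random(seed)
    for rnd in range(n_rounds):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal shuffled indices round-robin into the folds.
        folds = [order[i::n_folds] for i in range(n_folds)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield rnd, train_idx, test_idx


def median_over_rounds(round_scores):
    """Summarise one score per round by its median."""
    return median(round_scores)


# 58 ROIs, as in the abstract; the seed is arbitrary.
splits = list(repeated_kfold(n_samples=58))
```

With 58 samples and ten folds, each round holds out every sample exactly once, in folds of five or six samples.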
Figure 3. Scheme of the feature extraction process.
Figure 4. Haar decomposition: (a) first level and (b) second level decomposition.
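As a rough illustration of what a first- and second-level Haar decomposition computes, the sketch below applies the orthonormal 2-D Haar step to 2×2 pixel blocks and then repeats it on the approximation sub-band. The function names and the sub-band names (`col_diff`, `row_diff`, `diag`) are hypothetical, naming conventions for the detail sub-bands vary between libraries, and nothing here is taken from the authors' implementation.

```python
def haar_level(image):
    """One level of a 2-D Haar decomposition on a list-of-lists image.

    Returns four half-size sub-bands: the approximation and three detail
    bands (left-vs-right, top-vs-bottom and diagonal differences).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    approx, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p00, p01 = image[r][c], image[r][c + 1]
            p10, p11 = image[r + 1][c], image[r + 1][c + 1]
            # Orthonormal scaling: dividing by 2 preserves total energy.
            a_row.append((p00 + p01 + p10 + p11) / 2)
            c_row.append((p00 - p01 + p10 - p11) / 2)
            r_row.append((p00 + p01 - p10 - p11) / 2)
            d_row.append((p00 - p01 - p10 + p11) / 2)
        approx.append(a_row)
        col_diff.append(c_row)
        row_diff.append(r_row)
        diag.append(d_row)
    return approx, col_diff, row_diff, diag


def haar_decompose(image, levels=2):
    """Apply haar_level repeatedly to the approximation sub-band.

    Returns the final approximation and a list with one
    (col_diff, row_diff, diag) tuple per level, finest level first.
    """
    details = []
    approx = image
    for _ in range(levels):
        approx, col_diff, row_diff, diag = haar_level(approx)
        details.append((col_diff, row_diff, diag))
    return approx, details
```

Features such as the statistical ones listed in the tables could then be computed on each sub-band, although which sub-bands the authors used is not stated in this record.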
Figure 5. Overview of the PCs’ cumulative variance for each feature set.
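The cumulative variance shown in such a plot is the running sum of the PCA eigenvalues divided by their total. A minimal sketch follows, assuming the eigenvalues of the covariance matrix are already available. The function names and the threshold argument are hypothetical, and the record does not state which variance proportion the authors used as a cut-off.

```python
from itertools import accumulate


def cumulative_variance_ratio(eigenvalues):
    """Cumulative proportion of variance explained, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [running / total for running in accumulate(ordered)]


def n_components_for(eigenvalues, threshold):
    """Smallest number of leading PCs whose cumulative ratio reaches threshold."""
    for k, ratio in enumerate(cumulative_variance_ratio(eigenvalues), start=1):
        if ratio >= threshold:
            return k
    return len(eigenvalues)
```

For example, with eigenvalues 4, 3, 2 and 1 the cumulative ratios are 0.4, 0.7, 0.9 and 1.0, so a (hypothetical) threshold of 0.8 would keep three components.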
Classification performance of the best models obtained from the individual PC sets, calculated on 100 ten-fold cross-validation rounds. For each set of features, the performance measures of the different classification algorithms are reported for the best PC combination. The third column shows the PCs selected by the sequential forward selection algorithm. The best results are highlighted in bold.
| PCs’ Set | Classifier | PCs Best Combination | AUC (%) | Acc (%) | Sens (%) | Spec (%) |
|---|---|---|---|---|---|---|
| STAT | RF | 1 + 2 | 78.29 | 77.59 | 81.40 | 73.33 |
| STAT | NB | 1 | 81.71 | 74.14 | 67.44 | 93.33 |
| STAT | GLM | | | | | |
| GRAD | RF | | | | | |
| GRAD | NB | 1 + 2 | 76.59 | 75.86 | 76.74 | 73.33 |
| GRAD | GLM | 1 + 5 + 2 + 9 | 83.10 | 74.14 | 65.12 | 100 |
| COUNT | RF | 1 + 3 | 66.82 | 62.07 | 58.14 | 80.00 |
| COUNT | NB | 2 + 1 | 64.88 | 60.34 | 50.00 | 86.67 |
| COUNT | GLM | | | | | |
| HAAR | RF | | | | | |
| HAAR | NB | 1 + 3 + 16 + 19 + 15 + 14 | 86.51 | 84.48 | 87.21 | 80.00 |
| HAAR | GLM | 1 + 3 + 9 + 19 + 16 + 8 + 12 | 83.72 | 77.59 | 74.42 | 93.33 |
| GLCM | RF | | | | | |
| GLCM | NB | 2 + 4 + 1 + 11 + 10 + 9 | 75.50 | 75.86 | 72.09 | 86.67 |
| GLCM | GLM | 2 + 4 + 1 + 9 + 11 + 10 | 82.33 | 87.93 | 93.02 | 73.33 |
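
The PC combinations in the third column come from a sequential forward selection. A generic greedy version of that idea is sketched below: at each step it adds the candidate that most improves a score and stops when no candidate helps. The function name, the `score_fn` callback and the stopping rule are assumptions made for illustration; the record does not describe the authors' exact criterion.

```python
def sequential_forward_selection(candidates, score_fn, max_size=None):
    """Greedy forward selection.

    candidates: iterable of items (for example PC indices).
    score_fn:   callable taking a list of items and returning a number
                to maximise (for example a cross-validated AUC).
    Returns the selected items in the order they were added, plus the
    best score reached.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    limit = len(remaining) if max_size is None else max_size
    while remaining and len(selected) < limit:
        # Score every one-item extension of the current subset.
        scored = [(score_fn(selected + [c]), c) for c in remaining]
        top_score, top_candidate = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no extension improves the score
        selected.append(top_candidate)
        remaining.remove(top_candidate)
        best_score = top_score
    return selected, best_score


def toy_score(subset):
    """Toy objective: reward items 2 and 5, penalise subset size."""
    return len({2, 5} & set(subset)) - 0.1 * len(subset)


chosen, score = sequential_forward_selection([1, 2, 3, 4, 5], toy_score)
```

In a real pipeline `score_fn` would train a classifier on the chosen PCs and return its cross-validated performance; the toy objective only demonstrates the control flow.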
Overview of the features that were important in the computation of the selected PCs on 100 ten-fold cross-validation rounds. The prefixes LE and RC identify the features extracted from the low-energy (LE) images and the recombined (RC) images, respectively.
| Set | PC | Important Features | | | | |
|---|---|---|---|---|---|---|
| STAT | 1 | RC_Entropy | RC_Std | RC_Max-Min | RC_Relative | RC_ |
| GRAD | 1 | RC_Mean_ | LE_Mean_ | RC_Entropy_ | RC_Relative | RC_Variance_ |
| GRAD | 5 | LE_Kurtosis_ | LE_Skewness_ | RC_Skewness_ | RC_Kurtosis_ | RC_Entropy_ |
| COUNT | 1 | RC_Fast | RC_Brisk | LE_Brisk | LE_Fast | |
| COUNT | 3 | LE_Sift | RC_Sift | LE_MSER | RC_Minimum | |
| HAAR | 2 | LE_Relative | LE_Relative | LE_Relative | LE_Entropy_ | LE_Relative |
| HAAR | 12 | LE_Skewness_ | RC_Skewness_ | LE_Kurtosis_ | LE_Skewness_ | |
| GLCM | 2 | RC_Sum | RC_Entropy_ | RC_Entropy_ | RC_Entropy_ | RC_Sum |
| GLCM | 1 | LE_Sum | LE_Sum | LE_Sum | LE_Sum | |
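
One common way to identify which original features matter most for a principal component is to rank them by the absolute value of their loading on that component. The sketch below does exactly that. The function name and the feature names in the example are hypothetical, and ranking by absolute loading is an assumption, since the record does not state how the authors measured each feature's contribution.

```python
def top_contributors(loadings, feature_names, k=5):
    """Return the k feature names with the largest absolute loading on one PC."""
    if len(loadings) != len(feature_names):
        raise ValueError("loadings and feature_names must have the same length")
    ranked = sorted(
        zip(feature_names, loadings),
        key=lambda pair: abs(pair[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical loadings of four made-up features on a single PC.
names = ["feat_a", "feat_b", "feat_c", "feat_d"]
pc_loadings = [0.1, -0.8, 0.5, 0.3]
top_two = top_contributors(pc_loadings, names, k=2)
```

The sign of a loading is ignored here because a large negative loading contributes as much to the component as a large positive one.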
Classification performance of the best models obtained from the complete set of PCs calculated on 100 ten-fold cross-validation rounds. The best result is highlighted in bold.
| Classifier | Best Model | AUC (%) | Acc (%) | Sens (%) | Spec (%) |
|---|---|---|---|---|---|
| RF | | | | | |
| NB | S1 + GL10 + G2 + GL11 + H16 + H3 + H19 | 88.99 | 89.66 | 93.02 | 80 |
| GLM | S1 + G2 + G9 + S3 + H8 + GL2 + GL1 | 90.08 | 84.48 | 81.40 | 100 |
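
The accuracy, sensitivity and specificity columns follow the standard confusion-matrix definitions. A minimal sketch is given below, assuming that label 1 marks a malignant ROI and label 0 a benign one; that encoding and the function name are illustrative choices, not something stated in this record.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity in percent for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "acc": 100.0 * (tp + tn) / len(pairs),
        "sens": 100.0 * tp / (tp + fn),  # true positive rate
        "spec": 100.0 * tn / (tn + fp),  # true negative rate
    }


metrics = binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Computing these three values per cross-validation round and taking the median over rounds gives summary figures of the kind reported in the tables.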
Benign vs. malignant breast lesion classification evaluated on CESM images: comparison of the performance results of the proposed models in the literature.
| Article | No. of ROIs | Features | Classifier | Performance (%) |
|---|---|---|---|---|
| Patel et al. [ | 50 | | SVM | AUC: 95 |
| Perek et al. [ | 129 | | Multimodal | AUC: 89 |
| Fanizzi et al. [ | 48 | 12 | Random | AUC: 93.1 |
| Losurdo et al. [ | 55 | 10 | SVM | Acc: 80.91 |
| Best proposed model | 58 | 2 | Random | AUC: 95.66 |