Ana M Mota, Matthew J Clarkson, Pedro Almeida, Nuno Matela.
Abstract
Microcalcification clusters (MCs) are among the most important biomarkers for breast cancer, especially in cases of nonpalpable lesions. The vast majority of deep learning studies on digital breast tomosynthesis (DBT) focus on detecting and classifying lesions, especially soft-tissue lesions, in small, previously selected regions of interest. Only about 25% of the studies are specific to MCs, and all of them are based on the classification of small preselected regions. Classifying the whole image according to the presence or absence of MCs is a difficult task due to the small size of MCs and the amount of information present in an entire image. A fully automatic, direct classification that receives the entire image, without prior identification of any regions, is crucial for the usefulness of these techniques in a real clinical and screening environment. The main purpose of this work is to implement and evaluate the performance of convolutional neural networks (CNNs) in the automatic classification of a complete DBT image for the presence or absence of MCs (without any prior identification of regions). In this work, four popular deep CNNs are trained and compared with a new architecture proposed by us. The training task was the classification of DBT cases by the absence or presence of MCs. A public database of realistic simulated data was used, and the whole DBT image was taken as input. DBT data were considered both without and with preprocessing (to study the impact of noise reduction and contrast enhancement methods on the evaluation of MCs with CNNs). The area under the receiver operating characteristic curve (AUC) was used to evaluate performance. Very promising results were achieved, with a maximum AUC of 94.19% for GoogLeNet. The second-best AUC value, 91.17%, was obtained with a newly implemented network, CNN-a.
This CNN was also the fastest, making it a very interesting model to consider in other studies. With this work, encouraging outcomes were achieved, with results similar to those of other studies on the detection of larger lesions such as masses. Moreover, given the difficulty of visualizing MCs, which are often spread over several slices, this work may have an important impact on the clinical analysis of DBT images.
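The AUC figures reported throughout this record can be computed from raw network scores with the standard rank-based (Mann-Whitney) estimator. The sketch below is ours, not the paper's code; the function name `auc_score` is illustrative.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) score pairs ranked correctly; ties count 0.5."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive score against every negative score.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```

Averaging this score over cross-validation folds gives the mean ± SD values reported in the results table.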
Keywords: convolutional neural network; deep-learning; digital breast tomosynthesis; microcalcifications; virtual clinical trial
Year: 2022 PMID: 36135397 PMCID: PMC9503015 DOI: 10.3390/jimaging8090231
Source DB: PubMed Journal: J Imaging ISSN: 2313-433X
Summary of deep learning DBT studies (ROI: region of interest, AUC: area under the curve, pAUC: partial AUC).
| Ref. | Classification Task | ROI/Patch/Image | Model | Best Metric |
|---|---|---|---|---|
| [ | True MCs vs. false positives | ROI (16 × 16) | Own | AUC: 0.93 |
| [ | Presence/absence of masses and architectural distortions | Patch (256 × 256) | Based on AlexNet | Accuracy: 0.8640 |
| [ | Presence/absence of masses | ROI (32 × 32 × 25) | Own | AUC: 0.847 |
| [ | True masses vs. false positives | ROI (128 × 128) | Own | AUC: 0.90 |
| [ | True masses vs. false positives | ROI (64 × 64) | Based on VGG16 | AUC: 0.919 |
| [ | Positive (malignant, benign masses) vs. negative images | Image (224 × 224) | Based on AlexNet | AUC: 0.6632 |
| [ | Malignant vs. benign masses | ROI (128 × 128) | Based on AlexNet | AUC: 0.90 |
| [ | Malignant vs. benign masses | Image (256 × 256) | Own | AUC: 0.87 |
| [ | Presence/absence of MCs | Patch (29 × 29 × 9) | Based on [ | pAUC: 0.880 |
| [ | Positive vs. negative volumes | Image (1024 × 1024) | Based on AlexNet, ResNet50, Xception | AUC: 0.854 (AlexNet) |
| [ | Positive vs. negative volumes | Image (832 × 832) | Based on AlexNet, ResNet, DenseNet and SqueezeNet | AUC: 0.91 (DenseNet) |
| [ | Benign vs. malignant lesions | ROI (224 × 224) | Based on VGG19 | AUC (MCs): 0.97 |
| [ | Positive vs. negative patches | Patch (512 × 512) | Based on ResNet | AUC: 0.847 |
| [ | Malignant vs. benign | ROI (256 × 256) | Based on VGG16 | AUC: 0.917, 0.951, 0.993 (malignant, benign, normal) |
| [ | Malignant vs. benign masses | ROI (224 × 224) | Based on DenseNet121 | AUC: 0.8703 |
| [ | BIRADS 0 vs. BIRADS 1 | Image (2200 × 1600) | Based on ResNet50 | AUC: 0.912 (BIRADS 0 vs. non-0) |
| [ | Predict breast density | Image | Based on ResNet34 | AUC: 0.952 |
| [ | True MCs vs. false positives | ROI (128 × 128) | Based on ResNet18 | AUC: 0.9765 |
| [ | Malignant vs. benign | Image (150 × 150) | Own | AUC: 0.89 |
| [ | Malignant vs. benign MCs | Patch (224 × 224) | Ensemble CNN (2D ResNet34 and anisotropic 3D Resnet) | AUC: 0.8837 |
| [ | Malignant vs. benign vs. normal slices based on masses and architectural distortions | Image (input size of each CNN: 224 × 224, 227 × 227) | ResNet18, AlexNet, GoogLeNet, VGG16, MobileNetV2, DenseNet201, Mod_AlexNet | Accuracy: 0.9161 (Mod_AlexNet) |
Detailed summary of the VICTRE data selected for this study.
| | Absent | | Present MCs | |
|---|---|---|---|---|
| Density | Number of Cases | Number of Slices | Number of Cases | Number of Slices |
| Fatty | 20 | 100 | 25 | 99 |
| Scattered | 80 | 400 | 100 | 386 |
| Heterogeneous | 80 | 400 | 100 | 371 |
| Dense | 20 | 100 | 25 | 93 |
| Total | 200 | 1000 | 250 | 949 |
Figure 1. The six preprocessing methodologies implemented to reduce noise and amplify the visibility of the MCs (BG: background; normData: data normalized between 0 and 1).
Figure 2. Illustration of CNN-a, which resulted from the modifications made (bold) to the AlexNet architecture. Conv and GroupConv: convolutional and grouped convolutional layers, respectively; pool: max pooling layers; fc: fully connected layer; relu: rectified linear unit layer; norm: batch normalization layer; drop: dropout layer.
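The GroupConv layers in the caption split the input and output channels into independent groups, each convolved with its own filter bank, as in the original AlexNet. A minimal numpy sketch of this idea (function names, shapes, and the unpadded "valid" convolution are our illustrative assumptions, not the CNN-a configuration):

```python
import numpy as np

def conv2d(x, w):
    """Plain 'valid' 2D convolution (cross-correlation): x is (C_in, H, W),
    w is (C_out, C_in, kH, kW); returns (C_out, H-kH+1, W-kW+1)."""
    c_out, c_in, kh, kw = w.shape
    h, wd = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(x[:, i:i+kh, j:j+kw] * w[o])
    return out

def grouped_conv2d(x, w, groups):
    """Grouped convolution: channels are split into `groups` independent
    sets, each convolved separately, and the outputs are concatenated.
    Weight shape is (C_out, C_in // groups, kH, kW)."""
    cs = x.shape[0] // groups    # input channels per group
    os_ = w.shape[0] // groups   # output channels per group
    return np.concatenate(
        [conv2d(x[g * cs:(g + 1) * cs], w[g * os_:(g + 1) * os_])
         for g in range(groups)], axis=0)
```

With `groups=1` this reduces to an ordinary convolution; larger `groups` cut the parameter count per layer by the same factor.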
Figure 3. Summary of the methodological pipeline followed in this work.
Figure 4. (a) Data with contaminated BG; (b) first binary image; (c) filled binary image; (d) largest object extracted from binary image; (e) result from region growing; (f) final image with BG corrected after the binary mask from (e) is applied to (a).
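Step (d) of this background-correction pipeline, extracting the largest object from the binary image, amounts to keeping the largest connected component of the mask. A self-contained sketch in plain numpy (the 4-connectivity and the function name are our assumptions, not details from the paper):

```python
import numpy as np
from collections import deque

def largest_component(binary):
    """Keep only the largest 4-connected foreground component of a
    binary mask, discarding smaller objects (e.g. BG contamination)."""
    binary = np.asarray(binary, dtype=bool)
    labels = np.zeros(binary.shape, dtype=int)
    sizes, current = {}, 0
    for si, sj in zip(*np.nonzero(binary)):
        if labels[si, sj]:
            continue                       # already part of a labeled object
        current += 1
        labels[si, sj] = current
        q, n = deque([(si, sj)]), 0
        while q:                           # breadth-first flood fill
            i, j = q.popleft()
            n += 1
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                a, b = i + di, j + dj
                if (0 <= a < binary.shape[0] and 0 <= b < binary.shape[1]
                        and binary[a, b] and not labels[a, b]):
                    labels[a, b] = current
                    q.append((a, b))
        sizes[current] = n
    if not sizes:
        return np.zeros_like(binary)
    return labels == max(sizes, key=sizes.get)
```

The resulting mask can then seed the region-growing step (e) before being applied to the original data.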
Figure 5. (a) Original data without preprocessing; (b) preprocessing 1 (minimization of TV); (c) preprocessing 2 (CLAHE); (d) preprocessing 3 (minTV + CLAHE); (e) preprocessing 4 (CLAHE + minTV); (f) preprocessing 5 (dataNorm2); (g) preprocessing 6 (dataNorm2 + minTV).
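The minTV step used in preprocessings 1, 3, 4, and 6 denoises by minimizing total variation. One simple way to do this is gradient descent on the ROF objective, 0.5·||u − f||² + w·TV(u), with the TV term smoothed so its gradient is defined everywhere. This is a generic sketch under our own parameter choices; the paper's actual minTV implementation is not specified in this record.

```python
import numpy as np

def tv_denoise(img, weight=0.1, step=0.2, iters=100, eps=1e-6):
    """Total-variation denoising by gradient descent on the smoothed
    ROF objective: 0.5*||u - img||^2 + weight * sum sqrt(|grad u|^2 + eps)."""
    u = img.astype(float).copy()
    for _ in range(iters):
        # Forward differences (zero at the far borders).
        gx = np.diff(u, axis=1, append=u[:, -1:])
        gy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(gx**2 + gy**2 + eps)
        px, py = gx / mag, gy / mag
        # Divergence of the normalized gradient field (backward differences).
        div = (np.diff(px, axis=1, prepend=px[:, :1])
               + np.diff(py, axis=0, prepend=py[:1, :]))
        # Objective gradient: data-fidelity term minus weight * divergence.
        u -= step * ((u - img) - weight * div)
    return u
```

Larger `weight` flattens more noise but risks blurring small structures such as MCs, which is exactly the trade-off the preprocessing comparison in this work probes.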
Performance results of CNNs trained with original data and with data resulting from the preprocessing methodologies, in terms of mean AUC.
| AUC (%): Mean ± SD | |||||
|---|---|---|---|---|---|
| AlexNet | GoogLeNet | ResNet18 | SqueezeNet | CNN-a | |
| Original data | 87.92 ± 2.01 | 90.14 ± 0.38 | 86.84 ± 2.62 | 87.43 ± 0.78 | 89.79 ± 1.23 |
| Preprocessing 1 | 87.35 ± 1.63 | 88.38 ± 1.12 | 87.96 ± 0.96 | 88.78 ± 0.99 | 90.66 ± 0.15 |
| Preprocessing 2 | 87.29 ± 0.78 | 93.02 ± 3.59 | 86.42 ± 3.26 | 86.84 ± 3.82 | 86.95 ± 0.97 |
| Preprocessing 3 | 88.61 ± 0.43 | 94.19 ± 1.12 | 86.33 ± 1.46 | 82.15 ± 1.51 | 85.80 ± 1.73 |
| Preprocessing 4 | 90.82 ± 1.29 | 94.15 ± 1.54 | 90.13 ± 0.32 | 86.33 ± 6.31 | 89.07 ± 1.62 |
| Preprocessing 5 | 87.62 ± 0.35 | 88.65 ± 4.27 | 90.44 ± 0.41 | 85.18 ± 2.78 | 89.54 ± 2.63 |
| Preprocessing 6 | 87.47 ± 1.13 | 89.76 ± 1.76 | 89.00 ± 1.33 | 84.09 ± 3.13 | 91.17 ± 0.07 |
Levels of significance (p-values) obtained from the statistical analysis of the difference between the best mean AUCs found.
| | GoogLeNet preProc3 (94.19 ± 1.12) | ResNet18 preProc5 (90.44 ± 0.41) | SqueezeNet preProc1 (88.78 ± 0.99) | CNN-a preProc6 (91.17 ± 0.07) |
|---|---|---|---|---|
| AlexNet preProc4 (90.82 ± 1.29) | <0.05 (AlexNet < GoogLeNet) | 0.654 | 0.095 | 0.662 |
| GoogLeNet preProc3 (94.19 ± 1.12) | | <0.05 (GoogLeNet > ResNet18) | <0.05 (GoogLeNet > SqueezeNet) | <0.05 (GoogLeNet > CNN-a) |
| ResNet18 preProc5 (90.44 ± 0.41) | | | 0.055 | <0.05 (ResNet18 < CNN-a) |
| SqueezeNet preProc1 (88.78 ± 0.99) | | | | <0.05 (SqueezeNet < CNN-a) |
p-Values < 0.05 indicate a significant difference; preProc: preprocessing.
Figure 6. Comparison of ROC curves for the CNNs and training data with the best AUC values; preProc: preprocessing.
Figure 7. Values of sensitivity, specificity, and accuracy obtained with the architectures trained with the preprocessed data that achieved the best mean AUC.
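The three metrics in this figure follow directly from the binary confusion matrix of whole-image decisions. A minimal numpy sketch (the function name and label convention, 1 = MCs present, are ours):

```python
import numpy as np

def binary_metrics(labels, preds):
    """Sensitivity, specificity, and accuracy from binary ground-truth
    labels and binary predictions (1 = MCs present)."""
    labels = np.asarray(labels, dtype=bool)
    preds = np.asarray(preds, dtype=bool)
    tp = np.sum(preds & labels)      # MC cases correctly flagged
    tn = np.sum(~preds & ~labels)    # MC-free cases correctly cleared
    fp = np.sum(preds & ~labels)
    fn = np.sum(~preds & labels)
    return {
        "sensitivity": tp / (tp + fn),       # recall on positive cases
        "specificity": tn / (tn + fp),       # recall on negative cases
        "accuracy": (tp + tn) / labels.size,
    }
```

Unlike AUC, these values depend on the decision threshold applied to the network scores, which is why they are reported alongside the ROC analysis.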
Figure 8. AUC values obtained with test datasets composed of the four different breast densities separately (* indicates a significant difference between groups).
Training times, in hours, needed for each CNN after threefold cross-validation and mean inference time (in seconds) needed to classify each image.
| | Training Time (h) | Inference Time/Slice (s) |
|---|---|---|
| CNN-a | 2.4 | 0.0057 |
| AlexNet | 4.1 | 0.0062 |
| SqueezeNet | 4.4 | 0.0083 |
| ResNet18 | 7.8 | 0.0143 |
| GoogLeNet | 8.9 | 0.0158 |
Figure 9. Some examples of MCs in the DBT data used. (a) True positive (case correctly classified as positive by all CNNs, even in the original image); (b) false negative (case incorrectly classified as negative by all CNNs, even when varying the preprocessing); (c) original case classified as negative that was only detected by GoogLeNet when preprocessed with method 3 (d); (e) original case classified as negative that was only detected by CNN-a when preprocessed with method 6 (f).