Zuzanna Anna Magnuska, Benjamin Theek, Milita Darguzyte, Moritz Palmowski, Elmar Stickeler, Volkmar Schulz, Fabian Kießling.
Abstract
Automation of medical data analysis is an important topic in modern cancer diagnostics, aiming at robust and reproducible workflows. We therefore used a dataset of breast ultrasound (US) images (252 malignant and 253 benign cases) to realize and compare different strategies for computer-aided diagnosis (CAD) support in lesion detection and classification. Eight different datasets (including pre-processed and spatially augmented images) were prepared, and machine learning algorithms (Viola–Jones; YOLOv3) were trained for lesion detection. The radiomics signature (RS) was derived from detection boxes and compared with the RS derived from manually obtained segments. Finally, the classification model was established and evaluated with respect to accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). After training on a dataset including logarithmic derivatives of US images, YOLOv3 achieved better results in breast lesion detection (IoU: 0.544 ± 0.081; LE: 0.171 ± 0.009) than the Viola–Jones framework (IoU: 0.399 ± 0.054; LE: 0.096 ± 0.016). Interestingly, the classification model trained with the RS derived from detection boxes and the model based on the RS derived from gold-standard manual segmentation performed comparably (p-value = 0.071). Deriving radiomics signatures from detection boxes is therefore a promising technique for building a breast lesion classification model, and it may reduce the need for a lesion segmentation step in the future design of CAD systems.
Keywords: breast cancer; deep learning; machine learning; medical image analysis; radiomics; ultrasound
Year: 2022 PMID: 35053441 PMCID: PMC8773857 DOI: 10.3390/cancers14020277
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Figure 1. Study overview. To develop the breast lesion detection model, various datasets combining the original and pre-processed US images were prepared and used to train and select the best YOLOv3 and Viola–Jones classifiers. The final model was selected by evaluating the performance of the trained detection models on the test dataset. To develop the breast lesion classification model, the breast lesions were outlined in US images by an expert radiologist (Manual Segmentation), by the best YOLOv3 breast lesion detection model, and by the best Viola–Jones breast lesion detection model. Three separate RS were obtained using the features extracted from the manually delineated and automatically outlined breast lesion segments. The best breast lesion classification model was selected by evaluating the performance of each RS on the test dataset.
Characteristics of the datasets included in the study.
| Characteristics | Study Collective | UIDAT | Rodtook |
|---|---|---|---|
| Total number of patients | 119 | 163 | 215 |
| Total number of lesions | 127 | 163 | 215 |
| Subtypes | | | |
| Malignant | 77 (60.6%) | 53 (32.5%) | 122 (56.7%) |
| Benign | 38 (29.9%) | 71 (43.6%) | 53 (24.7%) |
| Cyst | 7 (5.5%) | 0 (0.0%) | 21 (9.8%) |
| Fibroadenoma | 5 (4.0%) | 39 (23.9%) | 19 (8.8%) |
Augmentation scenarios.
| Name | Description | No. of Images |
|---|---|---|
| A1: Flips | The image was flipped about its origin, x-axis, and y-axis. | 3 |
| A2: Rotation | The image was rotated clockwise and counterclockwise by 45°, 90°, and 135°. New pixels were filled symmetrically at the edges. | 6 |
| A3: Shear | The original image and its flips about the origin, the x-axis, and the y-axis were sheared by 5°, 10°, 15°, 20°, 25°, and 30°. New pixels were filled symmetrically at the edges. | 96 |
| A4: Translation | The image was translated by 10% of its width and height in each diagonal direction: right–down, right–up, left–up, and left–down. | 4 |
| A5: UDWT | A single UDWT decomposition with the coif2 wavelet was applied to the original image. All resulting decomposition matrices were included in the dataset. | 4 |
| A6: EXP | The exponential derivative of the original image was computed. | 1 |
| A7: LoG | The Laplacian of Gaussian of the original image was computed. | 1 |
| A8: LN | The logarithmic derivative of the original image was computed. | 1 |
| A9: SQUARED | The square derivative of the original image was computed. | 1 |
| A10: SQRT | The square root derivative of the original image was computed. | 1 |
Abbreviations: A, Augmentation; UDWT, Undecimated Discrete Wavelet Transform; EXP, exponential; LN, logarithm; LoG, Laplacian of Gaussian; SQRT, square root; SQUARED, squared.
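The intensity-based scenarios (A6, A8–A10) are pixel-wise transforms and can be sketched as below; A7 (LoG) additionally requires a Gaussian–Laplace filter (e.g. `scipy.ndimage.gaussian_laplace`) and is omitted here. The rescaling to [0, 1] is an assumption, since the exact normalization used in the study is not reported:

```python
import numpy as np

def intensity_derivatives(img):
    """Pixel-wise intensity transforms akin to scenarios A6 and A8-A10.

    The input is rescaled to [0, 1]; the exact normalization used in the
    study is not reported, so this rescaling is an assumption.
    """
    x = img.astype(np.float64)
    x = (x - x.min()) / (x.max() - x.min() + 1e-12)  # rescale to [0, 1]
    return {
        "EXP": np.exp(x),         # A6: exponential derivative
        "LN": np.log1p(x),        # A8: logarithmic derivative (log(1+x) avoids log 0)
        "SQUARED": np.square(x),  # A9: square derivative
        "SQRT": np.sqrt(x),       # A10: square-root derivative
    }
```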
Assembled datasets.
| Dataset Name | Description |
|---|---|
| D1: All Augmentations | Includes all augmented images from each scenario |
| D2: UDWT + Spatial | Includes UDWT computed images and spatially augmented examples |
| D3: EXP + Spatial | Includes EXP computed images and spatially augmented examples |
| D4: LN + Spatial | Includes LN computed images and spatially augmented examples |
| D5: LoG + Spatial | Includes LoG computed images and spatially augmented examples |
| D6: Spatial Only | Includes only spatially augmented images |
| D7: SQRT + Spatial | Includes SQRT computed images and spatially augmented examples |
| D8: SQUARED + Spatial | Includes SQUARED computed images and spatially augmented examples |
Abbreviations: UDWT, Undecimated Discrete Wavelet Transform; EXP, exponential; LN, logarithm; LoG, Laplacian of Gaussian; SQRT, square root; SQUARED, squared.
Characteristics of the first data pool used for developing breast lesion detection functions.
| Characteristics | Train | Validation | Test |
|---|---|---|---|
| Total number of patients | 54 | 90 | 90 |
| Total number of lesions | 63 | 87 | 85 |
| Subtypes | | | |
| Malignant | 44 (70.0%) | 35 (40.0%) | 30 (35.0%) |
| Benign | 19 (30.0%) | 52 (60.0%) | 59 (65.0%) |
Characteristics of the second data pool used for developing the breast lesion classification models.
| Characteristics | Feature Selection Subset | Classification Subset | |
|---|---|---|---|
| Total | Train | Test | |
| Total number of patients | 130 | 77 | 56 |
| Total number of lesions | 139 | 80 | 60 |
| Subtypes | | | |
| Malignant | 75 (54.0%) | 40 (50.0%) | 36 (60.0%) |
| Benign | 64 (46.0%) | 40 (50.0%) | 24 (40.0%) |
Figure 2. Examples of detected lesions in US images obtained by (A) an expert radiologist, (B) the YOLOv3 detection model, and (C) the Viola–Jones detection model. The bottom row shows magnifications of the corresponding areas that were considered for further classification.
Figure 3. Evaluation of breast lesion detection using IoU and LE scores. The automatically obtained detection boxes (yellow) and ground-truth boxes (blue) were used to calculate the IoU and LE scores for three evaluation scenarios: (A) both IoU and LE satisfy the threshold conditions (IoU = 0.88 and LE = 0.0004); (B) only LE satisfies its threshold while IoU is below the threshold (IoU = 0.48 and LE = 0.01); and (C) IoU equals 0 while LE still satisfies the threshold condition (IoU = 0 and LE = 0.07).
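A minimal sketch of the two detection scores, with boxes given as (x, y, w, h). IoU is the standard intersection-over-union; the paper does not spell out the LE formula, so it is assumed here to be the Euclidean distance between box centers normalized by the image diagonal, which is consistent with lower values being better and with panel C, where IoU = 0 but LE stays small:

```python
def iou(a, b):
    """Intersection over union of two boxes (x, y, w, h)."""
    x1 = max(a[0], b[0])
    y1 = max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def localization_error(a, b, img_w, img_h):
    """Center distance normalized by the image diagonal (assumed LE definition)."""
    ca = (a[0] + a[2] / 2.0, a[1] + a[3] / 2.0)
    cb = (b[0] + b[2] / 2.0, b[1] + b[3] / 2.0)
    dist = ((ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2) ** 0.5
    return dist / (img_w ** 2 + img_h ** 2) ** 0.5
```

Note that two non-overlapping boxes always score IoU = 0, whereas this LE still grades how far apart they are, which is why the two metrics can disagree on the same detection.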
Figure 4. Comparison between automatically computed detections of (A(i–iii)) benign and (B(i–iii)) malignant breast lesions obtained with the YOLOv3 (green) and Viola–Jones (yellow) detection models.
The best breast lesion detection functions identified in the validation step. The evaluation metrics (recall, precision, and F1-score) were calculated with reference to the IoU score.
| Dataset | Algorithm | IoU (Mean ± SD) | Recall | Precision | F1-Score |
|---|---|---|---|---|---|
| D3: EXP + Spatial | Viola–Jones | 0.3992 ± 0.0544 | 0.958 | 0.495 | 0.652 |
| D4: LN + Spatial | YOLOv3 | 0.5362 ± 0.0640 | 0.824 | 0.805 | 0.814 |
The best breast lesion detection functions identified in the validation step. The evaluation metrics (recall, precision, and F1-score) were calculated with reference to the LE score.
| Dataset | Algorithm | LE (Mean ± SD) | Recall | Precision | F1-Score |
|---|---|---|---|---|---|
| D3: EXP + Spatial | Viola–Jones | 0.1208 ± 0.0146 | 0.970 | 0.699 | 0.813 |
| D4: LN + Spatial | YOLOv3 | 0.1823 ± 0.0058 | 0.835 | 0.874 | 0.854 |
The best breast lesion detection functions identified in the test step. The evaluation metrics (recall, precision, and F1-score) were calculated with reference to the IoU score.
| Dataset | Algorithm | IoU (Mean ± SD) | Recall | Precision | F1-Score |
|---|---|---|---|---|---|
| D1: All Augmentations | Viola–Jones | 0.3986 ± 0.0540 | 0.959 | 0.500 | 0.657 |
| D4: LN + Spatial | YOLOv3 | 0.5442 ± 0.0808 | 0.835 | 0.759 | 0.795 |
The best breast lesion detection functions identified in the test step. The evaluation metrics (recall, precision, and F1-score) were calculated with reference to the LE score.
| Dataset | Algorithm | LE (Mean ± SD) | Recall | Precision | F1-Score |
|---|---|---|---|---|---|
| D1: All Augmentations | Viola–Jones | 0.0959 ± 0.0162 | 0.972 | 0.734 | 0.836 |
| D4: LN + Spatial | YOLOv3 | 0.1706 ± 0.0094 | 0.856 | 0.885 | 0.830 |
Figure 5. Recall–IoU and recall–LE curves resulting from the evaluation of the best breast lesion detection algorithms on the test group. (A-i) YOLOv3 detection functions scored with different IoU thresholds; (A-ii) YOLOv3 detection functions scored with different LE thresholds; (B-i) Viola–Jones detection functions scored with different IoU thresholds; (B-ii) Viola–Jones detection functions scored with different LE thresholds.
Characteristics of Manual Segmentation, YOLOv3, and Viola–Jones breast lesion classification datasets.
| Dataset | Number of Samples | Lambda | Number of Selected Features |
|---|---|---|---|
| Manual Segmentation | 139 | 0.02 | 33 |
| YOLOv3 | 139 | 0.01 | 51 |
| Viola-Jones | 139 | 0.02 | 41 |
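The Lambda column above suggests an L1-regularized (LASSO) feature selection, where the number of features retaining non-zero coefficients depends on the regularization strength. A didactic sketch on synthetic data follows; the study's actual radiomics features, lambda grid, and solver are not reproduced, and the coordinate-descent implementation here is only illustrative:

```python
import numpy as np

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimal LASSO via coordinate descent:
    minimize (1/2n)||y - Xw||^2 + lam * ||w||_1.
    """
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n          # per-feature (1/n)||x_j||^2
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]     # residual excluding feature j
            rho = X[:, j] @ r / n
            # soft-thresholding update for coordinate j
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(139, 30))                 # 139 samples, as in the table
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=139)
w = lasso_coordinate_descent(X, y, lam=0.02)
selected = np.flatnonzero(np.abs(w) > 1e-8)    # features surviving selection
```

Raising `lam` shrinks more coefficients exactly to zero, which matches the pattern in the table where a larger lambda yields fewer selected features.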
Performance metrics of the 3 best breast lesion classification models selected on the test groups of the Manual Segmentation, YOLOv3, and Viola–Jones classification subsets.
| Method | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|
| Weighted KNN (trained on Manual Segmentation dataset) | 85.00 | 83.33 | 87.50 |
| Ensemble Subspace KNN (trained on YOLOv3 dataset) | 70.00 | 70.00 | 70.83 |
| Median KNN (trained on Viola-Jones dataset) | 61.67 | 61.11 | 62.50 |
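The three metrics above follow directly from confusion-matrix counts with malignant as the positive class. A minimal sketch; the counts used in the example for the Weighted KNN row are back-calculated from the reported percentages and the 60-lesion test set (36 malignant, 24 benign), i.e., they are inferred rather than stated in the paper:

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity in percent (malignant = positive)."""
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Counts inferred for the Weighted KNN row: 30/36 malignant and 21/24 benign correct.
acc, sen, spe = classification_metrics(tp=30, fn=6, tn=21, fp=3)
```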
Figure 6. ROC curves of the breast lesion classification models trained on the Manual Segmentation (Model 1), YOLOv3 (Model 2), and Viola–Jones (Model 3) derived radiomics signatures. The dotted line represents a model with no discriminative capacity (Random Model).
Statistical comparison of the 3 best breast lesion classification models, with the Wilson score interval for estimating the lesion-type discrimination probability.
| AUROC Comparison | p-Value | z-Score |
|---|---|---|
| Ensemble Subspace KNN (trained on YOLOv3 dataset) and Weighted KNN (trained on Manual Segmentation dataset) | 0.071 | 1.803 |
| Median KNN (trained on Viola-Jones dataset) and Weighted KNN (trained on Manual Segmentation dataset) | 0.002 | −3.160 |
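The reported z-scores pair with two-sided p-values (1.803 ↔ 0.071; −3.160 ↔ 0.002), and the Wilson score interval named in the caption can be sketched as below. This only illustrates the named statistics; it does not reproduce the paper's exact AUROC comparison procedure:

```python
import math

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (95% by default)."""
    p = k / n
    denom = 1.0 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2))
    return center - half, center + half

def two_sided_p(z):
    """Two-sided p-value for a standard-normal z-score: 2 * (1 - Phi(|z|))."""
    return 1.0 - math.erf(abs(z) / math.sqrt(2.0))
```

For example, `two_sided_p(1.803)` ≈ 0.071 and `two_sided_p(-3.160)` ≈ 0.002, matching the table; 51 correct of 60 test lesions (the 85% accuracy of the Weighted KNN model, back-calculated) gives a Wilson interval of roughly (0.74, 0.92).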