Anton S. Becker (1), Lukas Jendele (2), Ondrej Skopek (2), Nicole Berger (3), Soleen Ghafoor (4), Magda Marcon (3), Ender Konukoglu (5)

1. Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland; Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland. Electronic address: anton.becker@usz.ch.
2. Department of Computer Science, ETH Zurich, Zurich, Switzerland.
3. Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland.
4. Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland; Department of Radiology, Memorial Sloan Kettering Cancer Center, New York City, USA.
5. Computer Vision Laboratory, Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, Switzerland.
Abstract
PURPOSE: To train a CycleGAN on downscaled mammographic images to artificially inject or remove suspicious features, and to determine whether radiologists can detect these AI-mediated attacks.

MATERIAL AND METHODS: From two publicly available datasets, BCDR and INbreast, we selected 680 images with and without lesions as training data. An internal dataset (n = 302 cancers, n = 590 controls) served as test data. We ran two experiments (256 × 256 px and 512 × 408 px) and applied the trained model to the test data. Three radiologists read a mixed set of modified and original images, rating both the presence of suspicious lesions on a scale from 1 to 5 and the likelihood that the image had been manipulated. The readout was evaluated by multiple-reader multiple-case receiver operating characteristic (MRMC-ROC) analysis using the area under the curve (AUC).

RESULTS: At the lower resolution, overall performance was not affected by the CycleGAN modifications (AUC 0.70 vs. 0.76, p = 0.67), although one radiologist showed a lower cancer detection rate (0.85 vs. 0.63, p = 0.06). The radiologists could not discriminate between original and modified images (AUC 0.55, p = 0.45). At the higher resolution, all radiologists showed a significantly lower cancer detection rate on the modified images (0.80 vs. 0.37, p < 0.001); however, they were able to identify modified images because the artifacts were more visible (AUC 0.94, p < 0.0001).

CONCLUSION: Our proof-of-concept study shows that a CycleGAN can implicitly learn suspicious features and artificially inject them into, or remove them from, existing images. The applicability of the method is currently limited by the small image size and the introduction of artifacts.
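A CycleGAN learns two mappings between unpaired image domains (here: mammograms without and with suspicious lesions) and constrains them with a cycle-consistency term so that translating an image to the other domain and back reproduces the original. The abstract does not give the training code, so the following is only a minimal illustrative sketch of that cycle-consistency loss; the generator names `G` (healthy → lesion) and `F` (lesion → healthy) and the identity stand-ins are hypothetical placeholders, not the authors' networks.

```python
import numpy as np

def G(x):
    # Hypothetical "healthy -> lesion" generator; identity stand-in for illustration.
    return x

def F(y):
    # Hypothetical "lesion -> healthy" generator; identity stand-in for illustration.
    return y

def cycle_consistency_loss(x, y, lam=10.0):
    """L1 cycle loss: lam * (||F(G(x)) - x||_1 + ||G(F(y)) - y||_1),
    averaged over pixels. Encourages round-trip translation to preserve content."""
    return lam * (np.mean(np.abs(F(G(x)) - x)) + np.mean(np.abs(G(F(y)) - y)))

x = np.random.rand(256, 256)  # toy image from the "no lesion" domain
y = np.random.rand(256, 256)  # toy image from the "lesion" domain
print(cycle_consistency_loss(x, y))  # 0.0 for identity generators
```

In a real CycleGAN this term is added to the adversarial losses of two discriminators; the identity generators above exist only so the snippet runs standalone.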
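The MRMC-ROC readout reduces, per reader, to an AUC over the ordinal 1 to 5 suspicion ratings. As a sketch of what that per-reader AUC computes (the study itself used a full multi-reader multi-case analysis, not this toy version), the empirical AUC equals the Mann-Whitney probability that a randomly chosen cancer case is rated higher than a randomly chosen control, with ties counting one half; the rating values below are invented for illustration.

```python
def auc_from_ratings(case_ratings, control_ratings):
    """Empirical AUC via the Mann-Whitney statistic: the fraction of
    (case, control) pairs where the case is rated higher (ties count 0.5)."""
    wins = 0.0
    for c in case_ratings:
        for k in control_ratings:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_ratings) * len(control_ratings))

# Toy 1-5 suspicion ratings for one hypothetical reader
cases = [5, 4, 4, 3, 5]
controls = [1, 2, 3, 1, 2]
print(auc_from_ratings(cases, controls))  # 0.98
```

An AUC of 0.5 would mean the ratings carry no information, which is why the reported 0.55 (p = 0.45) for detecting manipulation at low resolution amounts to chance-level performance.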