Jun Akatsuka, Yoichiro Yamamoto, Tetsuro Sekine, Yasushi Numata, Hiromu Morikawa, Kotaro Tsutsumi, Masato Yanagi, Yuki Endo, Hayato Takeda, Tatsuro Hayashi, Masao Ueki, Gen Tamiya, Ichiro Maeda, Manabu Fukumoto, Akira Shimizu, Toyonori Tsuzuki, Go Kimura, Yukihiro Kondo.
Abstract
Deep learning algorithms have achieved great success in cancer image classification. However, it is imperative to understand the differences between the deep learning and human approaches. Using an explainable model, we aimed to compare the deep learning-focused regions of magnetic resonance (MR) images with cancerous locations identified by radiologists and pathologists. First, 307 prostate MR images were classified using a well-established deep neural network without locational information of cancers. Subsequently, we assessed whether the deep learning-focused regions overlapped the radiologist-identified targets. Furthermore, pathologists provided histopathological diagnoses on 896 pathological images, and we compared the deep learning-focused regions with the genuine cancer locations through 3D reconstruction of pathological images. The area under the curve (AUC) for MR image classification was high (AUC = 0.90, 95% confidence interval 0.87-0.94). Deep learning-focused regions overlapped radiologist-identified targets by 70.5% and pathologist-identified cancer locations by 72.1%. Lymphocyte aggregation and dilated prostatic ducts were observed in non-cancerous regions on which the deep learning model focused. Deep learning algorithms can achieve highly accurate image classification without necessarily identifying radiological targets or cancer locations. Deep learning may find clues that can help a clinical diagnosis even if the cancer is not visible.
Keywords: MRI; black box; deep learning; pathology; prostate cancer
Year: 2019 PMID: 31671711 PMCID: PMC6920905 DOI: 10.3390/biom9110673
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
Figure 1. Flowchart of our study. Step 1: We extracted a rectangular region of the prostate from the magnetic resonance (MR) images and resized it to 256 × 256 pixels. Step 2: To prepare an explainable model, we applied a well-established deep neural network to the MR images for cancer classification. Step 3: To evaluate the classification by the deep neural network, we constructed a receiver operating characteristic (ROC) curve and calculated the corresponding area under the curve (AUC). Step 4: Deep learning-focused regions were compared with both radiologist-identified targets on MR images and pathologist-identified locations through 3D reconstruction of pathological images.
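The comparison in Step 4 can be illustrated with a minimal sketch. The record does not state how overlap was measured, so everything here is an assumption: regions are represented as equally sized binary masks (nested lists of 0/1), a pair counts as overlapping if it shares at least one pixel, and the function names `overlap_pixels` and `fraction_of_pairs_overlapping` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: a mask is a 2D grid of 0/1 values.
Mask = List[List[int]]


def overlap_pixels(focus: Mask, target: Mask) -> int:
    """Count the pixels that are set in both masks."""
    same_shape = len(focus) == len(target) and all(
        len(f_row) == len(t_row) for f_row, t_row in zip(focus, target)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")
    return sum(
        1
        for f_row, t_row in zip(focus, target)
        for f, t in zip(f_row, t_row)
        if f and t
    )


def fraction_of_pairs_overlapping(pairs: Sequence[Tuple[Mask, Mask]]) -> float:
    """Fraction of (focus, target) pairs that share at least one pixel.

    Assumption: 'overlap' means any shared pixel; the study may have
    used a different criterion.
    """
    if not pairs:
        raise ValueError("need at least one pair of masks")
    hits = sum(1 for focus, target in pairs if overlap_pixels(focus, target) > 0)
    return hits / len(pairs)


if __name__ == "__main__":
    # Toy 2 x 2 masks, not study data.
    pair_a = ([[1, 0], [0, 0]], [[1, 1], [0, 0]])  # shares one pixel
    pair_b = ([[0, 0], [0, 1]], [[1, 0], [0, 0]])  # shares none
    print(fraction_of_pairs_overlapping([pair_a, pair_b]))
```

A stricter criterion, such as a minimum shared area, would only require changing the `> 0` threshold.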
Figure 2. Study profile. TP = true positive, FP = false positive, FN = false negative, TN = true negative.
Patient characteristics. PSA = Prostate-specific antigen, TPV = Total prostate volume, PSAD = PSA density, SD = Standard deviation.
| Total Cases: N = 105 | Cancer Cases | Non-Cancer Cases | p-value |
|---|---|---|---|
| Number of cases, n | 54 | 51 | - |
| Age, years, mean ± SD | 67.4 ± 6.9 | 65.2 ± 8.6 | 0.09 |
| PSA, ng/mL, mean ± SD | 14.7 ± 12.1 | 8.1 ± 5.4 | <0.001 |
| TPV, mL, mean ± SD | 27.5 ± 10.6 | 42.5 ± 19.3 | <0.001 |
| PSAD, ng/mL/cm³, mean ± SD | 0.63 ± 0.66 | 0.22 ± 0.16 | <0.001 |
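PSA density is conventionally the serum PSA divided by the prostate volume. The sketch below shows that arithmetic for one hypothetical patient; the function name `psa_density` and the input values are illustrative and are not taken from the study.

```python
def psa_density(psa_ng_per_ml: float, tpv_ml: float) -> float:
    """Return PSA density in ng/mL/cm^3 (1 mL of prostate volume = 1 cm^3)."""
    if tpv_ml <= 0:
        raise ValueError("total prostate volume must be positive")
    return psa_ng_per_ml / tpv_ml


if __name__ == "__main__":
    # Hypothetical patient: PSA 8.0 ng/mL, prostate volume 40.0 mL.
    print(psa_density(8.0, 40.0))
```

The mean PSAD in the table is presumably the mean of per-patient ratios, so it need not equal the mean PSA divided by the mean TPV.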
Figure 3. ROC analysis. The average AUC was 0.90 (95% CI 0.87–0.94). ROC = Receiver operating characteristic, AUC = Area under the curve, CI = Confidence interval, Black circle = Average, White circle = Out-of-range value, Black solid line = Median, Box = Interquartile range, Dashed line = Range, Upper black line = Maximum value, Bottom black line = Minimum value.
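For readers unfamiliar with the metric, the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. It is not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions, and the averaging and confidence interval reported in the figure are not reproduced.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for cancer, 0 for non-cancer (assumed encoding).
    scores: classifier outputs, where higher means more likely cancer.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # A correctly ordered pair scores 1, a tie scores 0.5, otherwise 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not study data: 3 of the 4 positive/negative pairs are ordered correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a few hundred images; a rank-based formulation would scale better.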
Univariate analysis of clinicopathological features: deep learning classified cancer cases versus misclassified cancer cases. PSA = Prostate-specific antigen, TPV = Total prostate volume, PSAD = PSA density, WBC = White blood cell, Hb = Hemoglobin, Plt = Platelet, LDH = Lactate dehydrogenase, ALP = Alkaline phosphatase, Ca = Calcium, SD = Standard deviation.
| Cancer Cases: N = 54 | Classified Cases | Misclassified Cases | Univariate p-value |
|---|---|---|---|
| Number of cases, % | 92.6 | 7.4 | |
| Age, years, mean ± SD | 67.4 ± 6.9 | 67.5 ± 7.3 | 0.96 |
| PSA, ng/mL, mean ± SD | 14.2 ± 11.9 | 21.6 ± 13.5 | 0.07 |
| TPV, mL, mean ± SD | 27.9 ± 10.7 | 23.0 ± 10.2 | 0.66 |
| PSAD, ng/mL/cm³, mean ± SD | 0.59 ± 0.64 | 1.17 ± 0.85 | 0.07 |
| Gleason score, % | | | 0.03 |
| <8 | 60.0 | 0.0 | |
| ≥8 | 40.0 | 100.0 | |
| Clinical stage, % | | | 0.21 |
| ≤T2 | 80.0 | 50.0 | |
| ≥T3 | 20.0 | 50.0 | |
| Pathological stage, % | | | 0.63 |
| ≤T2 | 44.0 | 25.0 | |
| ≥T3 | 56.0 | 75.0 | |
| WBC, 10³/μL, mean ± SD | 6074 ± 1248 | 5150 ± 656 | 0.12 |
| Hb, g/dL, mean ± SD | 14.5 ± 1.2 | 13.8 ± 0.7 | 0.08 |
| Plt, 10³/μL, mean ± SD | 21.8 ± 5.0 | 18.3 ± 2.5 | 0.14 |
| LDH, U/L, mean ± SD | 180 ± 34.9 | 179 ± 45.4 | 0.93 |
| ALP, U/L, mean ± SD | 208 ± 56 | 249 ± 161 | 0.75 |
| Ca, mg/dL, mean ± SD | 9.3 ± 0.43 | 9.1 ± 0.26 | 0.29 |
Figure 4. Representative cases of deep learning-focused regions and expert-identified cancers. Left: Deep learning-focused regions. Second left: MR image with PI-RADS score. Second right: Loupe image of pathology slides (red area indicates cancer locations). Right: Representative pathological image in the deep learning-focused region (×20). Details of each case are shown in Supplementary Table S1. MR image = Magnetic resonance image, PI-RADS = Prostate Imaging Reporting and Data System.
Figure 5. Representative images with overlapping areas between deep learning-focused regions and genuine cancer locations. Left image group: 25 images with overlapping areas between deep learning-focused regions and genuine cancer locations. The overlapped areas are colored in green. Right image group: corresponding 25 raw MR images. MR images = Magnetic resonance images.