Jose M Castillo T, Muhammad Arif, Martijn P A Starmans, Wiro J Niessen, Chris H Bangma, Ivo G Schoots, Jifke F Veenland.
Abstract
The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve the detection of significant prostate cancer (PCa). Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. Assessing how well these methods generalize requires evaluation on multiple external data sets. While deep-learning and radiomics approaches have been compared on a data set from a single center, a comparison of their performance on data sets from different centers and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with that of a radiomics model for significant-PCa diagnosis across various patient cohorts. We included data from two consecutive patient cohorts from our own center (n = 371 patients) and two external sets, of which one was a publicly available patient cohort (n = 195 patients) and the other contained data from patients from two hospitals (n = 79 patients). For all patients, the mpMRI scans, radiologist tumor delineations, and pathology reports were collected. One of our patient cohorts (n = 271 patients) was used for both deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. Model performance was assessed in terms of the area under the receiver-operating-characteristic curve (AUC). Whereas the internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets, compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model.
Our radiomics model, based on delineated regions, proved a more accurate tool for significant-PCa classification on the three unseen test sets than the fully automated deep-learning model.
Keywords: Gleason score; classification; clinically significant; comparison; deep learning; machine learning; model; mpMRI; prediction; prostate carcinoma; radiomics
Year: 2021 PMID: 35008177 PMCID: PMC8749796 DOI: 10.3390/cancers14010012
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Figure 1. Flow diagram of patient exclusion and inclusion for the four cohorts used in this study: (a) Active Surveillance, (b) Prodrome, (c) ProstateX, and (d) PCMM. ISUP: International Society of Urological Pathology.
Table 1. Clinical characteristics of the patients of the four cohorts (Active Surveillance, Prodrome, ProstateX and PCMM) included in this study. Tumor volume values are presented as median (interquartile range). PZ: peripheral zone. TZ: transition zone. ISUP: International Society of Urological Pathology. GS: Gleason Score. PSA: Prostate-Specific Antigen. NA: not available. (*) The ISUP grade per lesion was not available for the ProstateX challenge; the ground truth provided for this set indicated whether the lesion had ISUP grade ≥ 1.
| Patient Cohort | Active Surveillance (training) | Prodrome (testing) | ProstateX * (testing) | PCMM (testing) |
|---|---|---|---|---|
| Total number of patients | 271 | 100 | 195 | 78 |
| Patients with a lesion ISUP grade = 1 | 155 | 68 | 128 | 28 |
| Patients with a lesion ISUP grade ≥ 2 | 116 | 32 | 67 | 50 |
| Total number of lesions | 233 | 104 | 328 | 156 |
| Lesions ISUP grade 1 | 100 | 52 | 254 | 77 |
| Lesions ISUP grade ≥ 2 | 133 | 52 | 74 | 79 |
| Lesions ISUP grade 2 | 124 | 45 | NA | 68 |
| Lesions ISUP grade 3 | 3 | 6 | NA | 8 |
| Lesions ISUP grade 4 & 5 | 6 | 1 | NA | 3 |
| Lesions in PZ | 150 | 60 | 191 | 104 |
| Lesions in TZ | 33 | 41 | 82 | 49 |
| Lesions in other zones (central, anterior stroma) | 38 | 3 | 55 | 3 |
| Lesion volume (mL) | 0.3 (0.2–0.8) | 0.61 (0.3–1.0) | 1.42 (1.4–3.2) | 0.80 (0.2–1.1) |
| Prostate volume (mL) | 43.1 (30.5–76.2) | 50 (33–67) | NA | NA |
| Age (years, mean ± std) | 67 ± 7 | 68 ± 4 | NA | NA |
| PSA (ng/mL, mean ± std) | 10 ± 6 | 12 ± 4 | NA | 9 ± 7 |
Figure 2. ROC curves of the deep-learning (blue) and radiomics (orange) models for the internal validation on Active Surveillance.
Table 2. Deep-learning- and radiomics-model performances on the training set (Active Surveillance) and on the external sets (Prodrome, ProstateX and PCMM). DL: deep learning. AUC: area under the curve.
| Metric | Active Surveillance (DL) | Active Surveillance (Radiomics) | Prodrome (DL) | Prodrome (Radiomics) | ProstateX (DL) | ProstateX (Radiomics) | PCMM (DL) | PCMM (Radiomics) |
|---|---|---|---|---|---|---|---|---|
| AUC | 0.89 | 0.83 | 0.70 | 0.88 | 0.73 | 0.91 | 0.44 | 0.65 |
| Accuracy | 0.76 | 0.63 | 0.58 | 0.78 | 0.71 | 0.85 | 0.52 | 0.55 |
| Sensitivity | 0.85 | 1.00 | 0.72 | 1.00 | 0.70 | 0.72 | 0.70 | 0.44 |
| Specificity | 0.52 | 0.54 | 0.51 | 0.68 | 0.71 | 0.94 | 0.18 | 0.71 |
| F1-score | 0.74 | 0.66 | 0.52 | 0.78 | 0.65 | 0.85 | 0.66 | 0.55 |
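The metrics reported above can be reproduced from per-patient labels and model scores. The following is a minimal sketch using hypothetical labels and scores (not data from this study): AUC is computed via its rank-based (Mann-Whitney) formulation, and sensitivity/specificity from predictions thresholded at an assumed cut-off of 0.5.

```python
# AUC = probability that a randomly chosen positive case receives a
# higher score than a randomly chosen negative case (ties count 0.5).
def auc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP),
# computed from binary predictions after thresholding the scores.
def sensitivity_specificity(labels, scores, threshold=0.5):
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical example: 1 = significant PCa, 0 = not significant.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
sens, spec = sensitivity_specificity(labels, scores)
print(round(auc(labels, scores), 3))          # 0.889
print(round(sens, 3), round(spec, 3))         # 0.667 0.667
```

Because AUC is threshold-free while sensitivity and specificity depend on the chosen operating point, a model can rank cases well (high AUC) yet show a poor sensitivity/specificity trade-off at a fixed threshold, which is one reason the table reports all of these metrics side by side.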
Figure 3. ROC curves of the deep-learning (blue) and radiomics (orange) models when evaluated on the test sets: (a) Prodrome, (b) ProstateX and (c) PCMM.
Figure 4. All images show the same axial slice as a 2D view of mpMR images ((a,e) T2w images; (b,f) DWI b800; (c,g) ADC map) of the prostate with the reference ground truth (d) and the PCa lesion segmented by the deep-learning model (h). (A) Example of a true-positive case (PSA = 17.6; prostate volume = 46 cc; ISUP grade = 2 (top) and 3 (bottom)). The ground truth, delineated by the radiologist and proven by targeted biopsy to be significant PCa, is shown in overlay (red). The significant-PCa lesion segmented by the deep-learning model is shown in overlay (pink). (B) Example of a false-negative case (ISUP grade = 2). The ground truth, delineated by the radiologist and proven by targeted biopsy to be significant PCa, is shown in overlay (red). The deep-learning model did not segment any PCa lesion. (C) Example of a false-positive case (ISUP grade = 1). The images show no delineation due to the absence of significant PCa; the region delineated by the radiologist (not shown) was proven by targeted biopsy to be insignificant PCa. The lesion incorrectly segmented by the deep-learning model is shown in overlay (pink).