Michael Roberts, Leonardo Rundo, Nikita Sushentsev, Nadia Moreira Da Silva, Michael Yeung, Tristan Barrett, Evis Sala.
Abstract
OBJECTIVES: We systematically reviewed the current literature evaluating the ability of fully-automated deep learning (DL) and semi-automated traditional machine learning (TML) MRI-based artificial intelligence (AI) methods to differentiate clinically significant prostate cancer (csPCa) from indolent PCa (iPCa) and benign conditions.
Keywords: Artificial intelligence; Deep learning; MRI; Machine learning; Prostate cancer
Year: 2022 PMID: 35347462 PMCID: PMC8960511 DOI: 10.1186/s13244-022-01199-3
Source DB: PubMed Journal: Insights Imaging ISSN: 1869-4101
Fig. 1 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 flow diagram for literature search. csPCa, clinically significant prostate cancer; iPCa, indolent prostate cancer
QUADAS-2 risk of bias and applicability concerns
Fig. 2 Summary QUADAS-2 risk of bias and applicability concerns assessment
Summary demographic characteristics of patients included in the studies selected for narrative synthesis
| Study | Year | Country | No. of patients | Age, years | PSA, ng/mL | Patient population | Bx | MRI vs Bx | Time MRI to Bx | No. of centres /vendors | No. of readers | Reader experience, years |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wang [ | 2020 | Netherlands | 346 | 66 (48–83) | 13 (1–56) | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 1 | 20 |
| Fernandez-Quilez [ | 2021 | Netherlands | 200 | 66 (48–83) | 13 (1–56) | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 4 | NR |
| Schelb [ | 2019 | Germany | 312 | Training: 64 [58–71] Test: 64 [60–69] | Training: 7.0 [5.0–10.2] Test: 6.9 [5.1–8.9] | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 2 | 0.5, 10 |
| Deniffel [ | 2020 | Canada | 499 | Training: 63.8 ± 8.1 Test: 64.4 ± 8.4 | Training: 7.6 [5.0–10.8]ᵃ Test: 7.2 [5.2–11.2] | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 2 | 15, 3 |
| Seetharaman [ | 2021 | USA | 424 | Training: 63.8 (49–76) Test: 65 (38–82) | Training: 6.8 (3.3–28.6) Test: 7.1 (0.9–63.0) | Clinically suspected | RP or TB | Pre-Bx or Pre-Op | NR | 1/1 | Unclear | Unclear |
| Bonekamp [ | 2018 | Germany | 316 | 64 [58–71] | Training: 6.6 [4.9–9.5] Test: 7.5 [5.4–11.0] | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 2 | 0.5, 8 |
| Min [ | 2019 | China | 280 | Training, csPCa: 68.8 ± 8.3 Training, iPCa: 71.5 ± 8.4 Test, csPCa: 70.3 ± 7.8 Test, iPCa: 71.6 ± 5.7 | NRᵇ | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 2 | NR, 20 |
| Kwon [ | 2018 | Netherlands | 344 | 66 (48–83) | 13 (1–56) | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 2 | > 25 |
| Castillo [ | 2021 | Netherlands | 107 | C1: 64 ± 7 C2: N/A C3: N/A | C1: 12 ± 10 C2: 9 ± 5 C3: 10 ± 8 | Clinically suspected | RP | Pre-Op | NR | 3/3 | 1 | NR |
| Bleker [ | 2019 | Netherlands | 206 | 66 (48–83) | 13 (1–56) | Clinically suspected | TB | Pre-Bx | NR | 1/1 | Unclear | Unclear |
| Li [ | 2020 | China | 381 | csPCa: 75 [68–81] iPCa: 69 [63–75] | csPCa: 49.3 [21.1–83.4] iPCa: 9.9 [6.7–15.9] | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 2 | 3, 9 |
| Woźnicki [ | 2020 | Germany | 191 | Training: 68 [63–74] Test: 69 [63–72] | Training: 7.6 [5.7–11.0] Test: 8.2 [6.8–11.9] | Clinically suspected | TB | Pre-Bx | Bx 3 months before MRI | 1/2 | 2 | 7, 7 |
| Bevilacqua [ | 2021 | Italy | 76 | csPCa: 66 ± 6.8 iPCa: 65 ± 8.8 | csPCa: 7.8 ± 7.5 iPCa: 5.3 ± 3.0 | Biopsy-proven | TB | Post-Bx | Bx 6 weeks before MRI | 1/1 | 2 | 7, 25 |
| Toivonen [ | 2019 | Finland | 62 | 65 (45–73) | 9.3 (1.3–30) | Biopsy-proven | RP | Pre-Bx | NR | 1/1 | 2 | NR |
| Antonelli [ | 2019 | UK | 164 | 64 (43–83) | 7.4 (2.5–30.3) | Clinically suspected | TB | Pre-Bx | NR | 1/1 | 1 | 3 |
| Yoo [ | 2019 | Canada | 427 | NR | NR | Clinically suspected | NR | Pre-Bx | NR | 1/1 | NR | NR |
| Hiremath [ | 2021 | USA, Netherlands | 592 | C1: 65.5 (59–72) C2: 63 (59–68) C3: 62 (56–66) C4: 65.5 (62–73) | C1: 6.6 (0.25–88.2) C2: 6.7 (5–10) C3: 5.7 (4.54–9.58) C4: 7.7 (4.8–11.3) | Clinically suspected | RP or SB or TB | Pre-Bx | NR | 5/3 | 5 | > 15, > 15, > 15, > 10, > 10 |
Bx, biopsy; C, cohort; MRI, magnetic resonance imaging; NR, not reported; PSA, prostate-specific antigen; RP, radical prostatectomy; SB, systematic biopsy; TB, targeted biopsy
ᵃ Data missing for 110 cases
ᵇ PSA values were reported by subcategories (< 4 ng/mL, 4–10 ng/mL, > 10 ng/mL); see the original reference [26] for further details
Predictive modelling characteristics of studies using deep learning-based fully-automated AI methods
| Study | No. of patients | Training set | Validation set | Test set | Algorithm | MRI input | Image registration | Image segmentation | Outcome | Zone | Analysis | Evaluation strategy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wang [ | 346 | 204 | fivefold CV | 142 | CNN (MISN) | ADC, BVAL, DWI0, DWI1, DWI2, T2WI-Cor, T2WI-Sag, T2WI-Tra | NR | Open data | csPCa vs iPCa or benign lesions | PZ or TZ | Per lesion | Internal hold-out |
| Fernandez-Quilez [ | 200 | NRᵃ | NRᵃ | NRᵃ | CNN (VGG16) | T2WI, ADC | NR | Open data | csPCa vs iPCa or benign lesions | WP | Per lesion | Internal hold-out |
| Schelb [ | 312 | 250 | No | 62 | CNN (U-Net) | T2WI, DWI | SimpleITK, non-rigid Bspline with Mattes mutual information criterion | Automated (U-Net) | csPCa vs iPCa or benign lesions | WP | Per lesion, per patient | Internal hold-out |
| Deniffel [ | 499 | 324 | 75 | 50ᵇ | CNN (3D) | T2WI, ADC, DWI | Static, affine | Manual bounding boxes | csPCa vs iPCa or benign lesions | WP | Per patient | Internal hold-out |
| Seetharaman [ | 424 | 102 | fivefold CV | 322 | CNN (SPCNet) | T2WI, ADC | Manual | Registration from pathology images | csPCa vs iPCa or benign lesions | WP | Per pixel, per lesion | Internal hold-out |
ADC, apparent diffusion coefficient; CNN, convolutional neural networks; csPCa, clinically significant prostate cancer; CV, cross-validation; DWI, diffusion-weighted imaging; iPCa, indolent prostate cancer; MISN, multi-input selection network; MRI, magnetic resonance imaging; NR, not reported; PZ, peripheral zone; T2WI, T2-weighted imaging; TZ, transition zone; WP, whole prostate
ᵃ The study included 200 patients and 299 lesions, of which 70% were used to train, 20% to test, and 10% to fine-tune the models
ᵇ Describes the calibration cohort
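The 70/20/10 partition described in the footnote above can be illustrated with a minimal sketch. It assumes the split was made at lesion level with simple random shuffling and rounded subset sizes; the footnote does not state any of these details, and the function name `split_indices` and the seed are hypothetical.

```python
import random


def split_indices(n_items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle item indices and cut them into train / test / fine-tune subsets.

    Assumption: sizes are rounded for the first two subsets and the
    remainder goes to the last one, so that every item is used once.
    """
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)  # seeded for reproducibility
    n_train = round(fractions[0] * n_items)
    n_test = round(fractions[1] * n_items)
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    tune = idx[n_train + n_test:]
    return train, test, tune


# 299 lesions, as reported in the footnote
train, test, tune = split_indices(299)
print(len(train), len(test), len(tune))  # 209 60 30
```

With several lesions per patient, a lesion-level split like this one can place lesions from the same patient in both the training and test subsets; splitting by patient identifier instead would avoid that overlap.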
Predictive modelling characteristics of studies using traditional machine learning-based semi-automated AI methods
| Study | No. of patients | Training set | Validation set | Test set | Algorithm | MRI input | IR | IS | Discriminative features | No. of features used for training | Outcome | Zone | Analysis | Evaluation strategy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Bonekamp [ | 316 | 183 | NR | 133 | RF | T2WI, ADC, b = 1500 | No | Manual | First-order, volume, shape, texture | NR | csPCa vs iPCa or benign lesions | WP or PZ or TZ | Per lesion and per patient | Internal hold-out |
| Min [ | 280 | 187 | NR | 93 | LR | T2WI, ADC, b = 1500 | No | Manual | Intensity, shape, texture, wavelet | 9 | csPCa vs iPCa | WP | Per lesion | Internal hold-out |
| Kwon [ | 344 | 204 | tenfold CV | 140 | CART, RF, LASSO | T2WI, DWI, ADC, DCE | Rigid | No | Intensity | 54 | csPCa vs iPCa or benign lesions | PZ or TZ | Per lesion | Internal hold-out |
| Castillo [ | 107 | 80% | 20% of training (100 random repeats) | 20% | LR, SVM, RF, NB, LQDA | T2WI, DWI, ADC | HPᵃ | Manual | Shape, local binary patterns, GLCM | NR | csPCa vs iPCa | WP | Per lesion, per patient | Mixed hold-out |
| Bleker [ | 206 | 130 | NR | 76 | RF, XGBoost | T2WI, b = 50, b = 400, b = 800, b = 1400, ADC | No | Manual | Intensity, texture | NR | csPCa vs iPCa or benign lesions | PZ | Per lesion | Internal hold-out |
| Li [ | 381 | 229 | NR | 152 | LR | T2WI, ADC | No | Manual | Intensity, age, PSA, PSAd | 15 | csPCa vs iPCa or benign lesions | WP | Per lesion | Internal hold-out |
| Woźnicki [ | 191 | 151 | fivefold CV | 40 | LR, SVM, RF, XGBoost, CNN | T2WI, ADC | No | Manual | Intensity, shape, PI-RADS, PSAd, DRE | 15 | csPCa vs iPCa or benign lesions | WP | Per patient | Internal hold-out |
| Bevilacqua [ | 76 | 48 | threefold CV | 28 | SVM | ADC, b = 2000 | No | Manual | Intensity | 10 | csPCa vs iPCa | WP | Per lesion | Internal hold-out |
| Toivonen [ | 62 | 62 | LPOCV | N/A | LR | T2WI, ADC | No | Manual | Intensity, Sobel, texture | NR | csPCa vs iPCa | WP | Per lesion | LPOCV |
| Antonelli [ | 164 | 134 | NR | 30 | PZ: LinR TZ: NB | ADC, DCE | Rigid | Manual | Texture, PSAd | NR | csPCa vs iPCa | PZ or TZ | Per lesion | fivefold CV |
| Yoo [ | 427 | 271 | 48 | 108 | CNN, RF | ADC, DWI | No | No | First-order statistics of deep features | 90 | csPCa vs iPCa or benign lesions | WP | Per slice, Per patient | tenfold CV |
| Hiremath [ | 592 | 368 | threefold CV | 224 | AlexNet or DenseNet and Nomogram | T2WI, ADC | Rigid, affine | Manual | Deep learning imaging predictor, PI-RADS, PSA, gland volume, tumour volume | NR | csPCa vs iPCa or benign lesions | WP | Per patient | External hold-out |
ADC, apparent diffusion coefficient; CART, classification and regression trees; CNN, convolutional neural networks; GLCM, grey level co-occurrence matrix; HP, histopathology; IR, image registration; IS, image segmentation; LASSO, least absolute shrinkage and selection operator; LinR, linear regression; LQDA, linear and quadratic discriminant analysis; LR, logistic regression; NB, naïve Bayes; PI-RADS, prostate imaging-reporting and data system; PSA, prostate-specific antigen; PSAd, prostate-specific antigen density; RF, random forests; SVM, support-vector machines
ᵃ Histopathology images registered with T2-weighted images using specialised software
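One evaluation strategy in the table, LPOCV, is not spelled out in the abbreviation list. Assuming it denotes leave-pair-out cross-validation, the idea can be sketched as follows: every (positive, negative) pair is held out in turn, a model is fitted on the remaining cases, and the AUC estimate is the fraction of pairs in which the positive case receives the higher score. The one-feature nearest-centroid scorer below is a toy stand-in chosen for brevity, not a model from any reviewed study, and all names and data are hypothetical.

```python
from itertools import product


def centroid_score(train_x, train_y, x):
    """Toy scorer (assumption): higher when x is closer to the positive-class mean."""
    pos = [v for v, y in zip(train_x, train_y) if y == 1]
    neg = [v for v, y in zip(train_x, train_y) if y == 0]
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)
    return abs(x - mu_neg) - abs(x - mu_pos)


def leave_pair_out_auc(xs, ys):
    """Estimate AUC by holding out each positive/negative pair in turn.

    Requires at least two cases per class so that both class means
    remain defined after a pair is removed.
    """
    pos_idx = [i for i, y in enumerate(ys) if y == 1]
    neg_idx = [i for i, y in enumerate(ys) if y == 0]
    wins = 0.0
    for i, j in product(pos_idx, neg_idx):
        keep = [k for k in range(len(xs)) if k not in (i, j)]
        tx = [xs[k] for k in keep]
        ty = [ys[k] for k in keep]
        s_pos = centroid_score(tx, ty, xs[i])
        s_neg = centroid_score(tx, ty, xs[j])
        if s_pos > s_neg:
            wins += 1.0
        elif s_pos == s_neg:
            wins += 0.5  # ties count as half a win
    return wins / (len(pos_idx) * len(neg_idx))


# Hypothetical single-feature data: 1 = csPCa, 0 = iPCa
features = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
labels = [0, 0, 0, 1, 1, 1]
print(leave_pair_out_auc(features, labels))
```

Because every case is used for both fitting and testing across the pairs, this scheme leaves no separate test set, which is consistent with the "N/A" test-set entry in the row that reports LPOCV.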
Diagnostic performance of fully-automated and semi-automated AI methods for differentiating between csPCa and iPCa or benign disease
| Study | Threshold | AUC [95% CI] | Accuracy | Sensitivity | Specificity | NPV | PPV |
|---|---|---|---|---|---|---|---|
| Wang [ | NR | PZ: 0.89 [0.86–0.93] TZ: 0.97 [0.95–0.98] | PZ: 0.91 [0.86–0.95] TZ: 0.89 [0.87–0.91] | PZ: 0.60 [0.52–0.69] TZ: 1.0 [1.0–1.0] | PZ: 0.98 [0.95–1.0] TZ: 0.88 [0.82–0.93] | NR | NR |
| Fernandez-Quilez [ | 0.5 | 0.89 | NR | 0.85 | 0.94 | NR | NR |
| Schelb [ | Several for different PI-RADS cut-offs | NR | NR | PI-RADS ≥ 3: 0.96 PI-RADS ≥ 4: 0.92 | PI-RADS ≥ 3: 0.31 PI-RADS ≥ 4: 0.47 | PI-RADS ≥ 3: 0.84 PI-RADS ≥ 4: 0.83 | PI-RADS ≥ 3: 0.53 PI-RADS ≥ 4: 0.67 |
| Deniffel [ | Risk of csPCa ≥ 0.2 | 0.85 [0.76–0.97] | NR | 1.0 [1.0–1.0] | 0.52 [0.32–0.68] | 1.0 [1.0–1.0] | 0.56 [0.48–0.66] |
| Seetharamanᵃ [ | NR | 0.80 (per lesion) | NR | 0.70 (per lesion) | 0.77 (per lesion) | NR | NR |
| Bonekamp [ | 0.79 | WP: 0.88 PZ: 0.84 TZ: 0.89 (per lesion) | NR | WP: 0.97 (per lesion) | WP: 0.58 (per lesion) | NR | NR |
| Min [ | NR | 0.82 [0.67–0.98] | NR | 0.84 | 0.73 | NR | NR |
| Castillo [ | NR | 0.75 | NR | 0.88 | 0.63 | NR | NR |
| Bleker [ | NR | 0.87 [0.75–0.98] | NR | 0.86 | 0.73 | NR | NR |
| Woźnicki [ | 0.45 | 0.84 [0.6–1.0] | NR | 0.91 [0.81–0.98] | 0.57 [0.38–0.74] | NR | NR |
| Antonelli [ | Reader SP (training) | PZ: 0.83 TZ: 0.75 | NR | PZ: 0.90 TZ: 0.92 | PZ: 0.65 TZ: 0.56 | NR | NR |
| Hiremath [ | Maximising accuracy (0.361) | 0.81 [0.76–0.85] | 0.78 | 0.83 | 0.59 | NR | NR |
| Kwonᵃ [ | NR | WP: 0.82 | NR | NR | NR | NR | NR |
| Liᵃ [ | −0.42 | 0.98 [0.97–1.00] | 0.90 | 0.95 | 0.87 | 0.97 | 0.82 |
| Bevilacquaᵃ [ | 0.58 | 0.84 [0.63–0.90] | NR | 0.9 | 0.75 | NR | NR |
| Toivonenᵃ [ | NR | 0.88 [0.92–0.95] | NR | NR | NR | NR | NR |
| Yooᵃ [ | NR | 0.84 [0.76–0.91] | NR | NR | NR | NR | NR |
AUC, area under the receiver operating characteristic curve; NPV, negative predictive value; NR, not reported; PI-RADS, prostate imaging-reporting and data system; PPV, positive predictive value; PZ, peripheral zone; SP, specificity; TZ, transition zone; WP, whole prostate
ᵃ These papers had either high or unclear risk of bias on QUADAS-2 assessment (see Table 1; Fig. 2)
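For readers comparing the columns of the table above, the following minimal sketch shows how the threshold-dependent metrics follow from a 2×2 confusion matrix, and how AUC can be computed as a threshold-free rank statistic. The counts and scores are hypothetical illustrations, not data from any reviewed study, and the function names are assumptions.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Threshold-dependent metrics from confusion-matrix counts.

    tp/fn: csPCa cases called positive/negative;
    tn/fp: iPCa or benign cases called negative/positive.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts at one operating threshold
m = diagnostic_metrics(tp=45, fp=40, tn=60, fn=5)
print({k: round(v, 3) for k, v in m.items()})

# Hypothetical model scores for csPCa and non-csPCa lesions
print(round(auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]), 3))
```

Sensitivity and specificity move in opposite directions as the threshold changes, while PPV and NPV also depend on the prevalence of csPCa in the cohort, so values in the table are only directly comparable between studies with similar thresholds and case mix.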