Elena Bertelli1, Laura Mercatelli1, Chiara Marzi2, Eva Pachetti3,4, Michela Baccini5,6, Andrea Barucci2, Sara Colantonio3, Luca Gherardini5, Lorenzo Lattavo1, Maria Antonietta Pascali3, Simone Agostini1, Vittorio Miele1.
Abstract
Prostate cancer (PCa) is the most frequent male malignancy, and the assessment of PCa aggressiveness, for which a biopsy is required, is fundamental for patient management. Currently, multiparametric (mp) MRI is strongly recommended before biopsy. Quantitative assessment of mpMRI might provide the radiologist with an objective and noninvasive tool to support decision-making in clinical practice and decrease intra- and inter-reader variability. In this context, high-dimensional radiomics features and Machine Learning (ML) techniques, along with Deep Learning (DL) methods working directly on raw images, could assist the radiologist in the clinical workflow. The aim of this study was to develop and validate ML/DL frameworks on mpMRI data to characterize PCas according to their aggressiveness. We optimized several ML/DL frameworks on T2w, ADC and T2w+ADC data, using a patient-based nested validation scheme. The dataset was composed of 112 patients (132 peripheral lesions with Prostate Imaging Reporting and Data System (PI-RADS) score ≥ 3) acquired following both PI-RADS 2.0 and 2.1 guidelines. First, ML/DL frameworks trained and validated on PI-RADS 2.0 data were tested on both PI-RADS 2.0 and 2.1 data. Then, we trained, validated and tested ML/DL frameworks on a multi PI-RADS dataset. We reported performance in terms of Area Under the Receiver Operating Characteristic curve (AUROC), specificity and sensitivity. The ML/DL frameworks trained on T2w data achieved the best overall performance. Notably, ML and DL frameworks trained and validated on PI-RADS 2.0 data obtained median AUROC values of 0.750 and 0.875, respectively, on the unseen PI-RADS 2.0 test set. Similarly, ML/DL frameworks trained and validated on multi PI-RADS T2w data showed median AUROC values of 0.795 and 0.750, respectively, on the unseen multi PI-RADS test set.
Conversely, all the ML/DL frameworks trained and validated on PI-RADS 2.0 data achieved AUROC values no better than chance level when tested on PI-RADS 2.1 data. Both ML and DL techniques applied to mpMRI seem to be a valid aid in predicting PCa aggressiveness. In particular, ML/DL frameworks fed with T2w image data (objective, fast and non-invasive) show good performance and might support decision-making in patient diagnostic and therapeutic management, reducing intra- and inter-reader variability.
Keywords: deep learning; machine learning; mpMRI prostate cancer aggressiveness; prostate cancer; radiomics
Year: 2022 PMID: 35096605 PMCID: PMC8792745 DOI: 10.3389/fonc.2021.802964
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. MpMRI of a 76-year-old patient with indeterminate mpMRI result (PI-RADS=4), PSA=14 ng/ml, GS=4+4 (ISUP 4). Zoomed mpMRI images containing the target lesion: axial T2-weighted image (A), ADC map (C), and their respective lesion segmentations (B, D). The green (B) and red (D) arrows point to the segmented lesion in the T2 and ADC images, respectively.
Figure 3. MpMRI of a 69-year-old patient with positive mpMRI result (PI-RADS=5), PSA level=7 ng/ml, GS=4+4 (ISUP 4). Zoomed MRI images containing the target lesion: axial T2-weighted image (A), ADC map (C), and their respective lesion segmentations (B, D). The green (B) and red (D) arrows point to the segmented lesion in the T2 and ADC images, respectively.
Descriptive statistics of our cohorts. GS ≤ 3+4 is equivalent to ISUP GG ≤ 2, and GS≥4+3 corresponds to ISUP GG≥3.
| | PI-RADS 2.0 cohort | PI-RADS 2.1 cohort |
|---|---|---|
| # patients | 85 | 28 |
| Age (years) (mean (STD)) | 66.72 (7.58) | 68.64 (5.71) |
| # lesions | 103 (76 with GS ≤ 3+4, 27 with GS ≥ 4+3) | 29 (21 with GS ≤ 3+4, 8 with GS ≥ 4+3) |
| PI-RADS score (median ± IQR) | 4 ± 0 | 4 ± 0.625 |
| PSA (ng/ml) (mean (STD)) | 8.34 (8.20) | 5.43 (2.48) |
*Indicates significant differences (p-value < 0.05 at Mann-Whitney test) between PI-RADS 2.0 and PI-RADS 2.1 cohorts.
GS, Gleason score; IQR, interquartile range; ISUP GG, ISUP/WHO Grading Group; PI-RADS, Prostate Imaging – Reporting and Data System; PSA, Prostate-Specific Antigen; STD, standard deviation.
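The GS-to-ISUP correspondence used in the table caption can be sketched as a small lookup. This is a minimal illustration based on the standard ISUP grade group definitions; the function names `isup_grade_group` and `is_high_aggressiveness` are hypothetical and not taken from the study's code.

```python
def isup_grade_group(primary: int, secondary: int) -> int:
    """Map a Gleason score (primary + secondary pattern) to an ISUP grade group."""
    total = primary + secondary
    if total <= 6:
        return 1
    if total == 7:
        # 3+4 is grade group 2, 4+3 is grade group 3
        return 2 if primary == 3 else 3
    if total == 8:
        return 4
    return 5  # Gleason 9-10


def is_high_aggressiveness(primary: int, secondary: int) -> bool:
    """Binary split used in the table: GS >= 4+3, i.e. ISUP GG >= 3.

    The name and the boolean encoding are assumptions for illustration.
    """
    return isup_grade_group(primary, secondary) >= 3
```

For example, the GS=4+4 lesions shown in the figures map to grade group 4 and fall on the GS ≥ 4+3 side of the split.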
Figure 4. Nested validation scheme in our ML/DL analysis. (A) On development set 2.0 only, 5-fold CV was used to identify the best-performing framework and to optimize its hyperparameters. The best-performing ML/DL framework was then trained on the entire development set 2.0 to obtain the final framework, which was evaluated independently on the unseen test set 2.0 and test set 2.1. (B) The best-performing framework was also trained on the multi PI-RADS development set (i.e., the union of development set 2.0 and development set 2.1) and evaluated on the unseen multi PI-RADS test set (i.e., the union of test set 2.0 and test set 2.1).
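The patient-based 5-fold split on the development set can be sketched as follows. This is a minimal, dependency-free illustration of grouping lesions by patient so that no patient appears on both sides of a fold. The function `patient_kfold` and the toy patient identifiers are hypothetical. The sketch does not stratify by Gleason score or perform the hyperparameter search, and the source does not specify how the authors implemented either.

```python
import random
from collections import defaultdict


def patient_kfold(patient_ids, n_splits=5, seed=0):
    """Yield (train_idx, val_idx) pairs where no patient spans both sides."""
    # Collect the lesion indices belonging to each patient.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients (not lesions) and deal them into folds.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    folds = [patients[i::n_splits] for i in range(n_splits)]

    for k in range(n_splits):
        val_patients = set(folds[k])
        val_idx = [i for p in folds[k] for i in by_patient[p]]
        train_idx = [i for i, pid in enumerate(patient_ids)
                     if pid not in val_patients]
        yield train_idx, val_idx


# Hypothetical lesion-level data: "p2" and "p6" contribute two lesions each.
lesion_patients = ["p1", "p2", "p2", "p3", "p4", "p5", "p6", "p6", "p7", "p8"]
splits = list(patient_kfold(lesion_patients, n_splits=5))
```

Splitting at the patient level matters here because the cohort has more lesions than patients, so a lesion-level split could place two lesions from the same patient in both training and validation folds.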
AUROC values of ML and DL analyses for T2w/ADC/T2w+ADC.
| Framework | Test set | T2w | ADC | T2w+ADC |
|---|---|---|---|---|
| ML | 2.0 | 0.750 [0.500, 1] | 0.531 [0.250, 0.75] | 0.625 [0.167, 1] |
| ML | multi PI-RADS | 0.795 [0.615, 1] | 0.500 [0.300, 0.715] | 0.682 [0.455, 1] |
| AG-free DL on L-DS | 2.0 | 0.667 [0.385, 0.849] | 0.667 [0.355, 0.905] | 0.727 [0.231, 1] |
| AG-free DL on L-DS | multi PI-RADS | 0.750 [0.568, 0.945] | 0.714 [0.445, 0.883] | 0.752 [0.564, 0.872] |
| AG-free DL on C-DS | 2.0 | 0.775 [0.478, 1] | 0.667 [0.392, 0.903] | 0.700 [0.455, 0.858] |
| AG-free DL on C-DS | multi PI-RADS | 0.524 [0.200, 0.818] | 0.547 [0.393, 0.780] | 0.574 [0.286, 0.819] |
| AG DL on C-DS | 2.0 | 0.875 [0.639, 1] | 0.750 [0.455, 0.911] | 0.667 [0.301, 1] |
| AG DL on C-DS | multi PI-RADS | 0.500 [0.278, 0.717] | 0.463 [0.234, 0.817] | 0.288 [0.09, 0.529] |
The AUROC values are reported as median [5th percentile, 95th percentile]. AG, attention gate; C-DS, cropped dataset; DL, deep learning; L-DS, lesion dataset; ML, machine learning.
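A median [5th percentile, 95th percentile] summary of this kind can be sketched as below. The AUROC is computed from its rank interpretation (the probability that a positive case is scored above a negative one). The distribution of AUROC values is generated here by bootstrap resampling of the test set, which is an assumption: this excerpt does not state how the authors obtained the distribution. All function names and the toy labels and scores are hypothetical.

```python
import random
import statistics


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def summarize_auroc(labels, scores, n_boot=1000, seed=0):
    """Return (median, 5th percentile, 95th percentile) over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        values.append(auroc(ys, [scores[i] for i in idx]))
    return statistics.median(values), percentile(values, 5), percentile(values, 95)


# Toy example (not study data): 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)  # 0.75
toy_summary = summarize_auroc(toy_labels, toy_scores, n_boot=200)
```

With small test sets the percentile interval is wide, which is consistent with several intervals in the table reaching the upper bound of 1.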
Figure 5. (A) ROC curves of ML frameworks on the test set 2.0. (B) ROC curves of ML frameworks on the multi PI-RADS test set. (C) ROC curves of the AG-free DL CNN trained on L-DS, on the test set 2.0. (D) ROC curves of the AG-free DL CNN trained on L-DS, on the multi PI-RADS test set. (E) ROC curves of the AG-free DL CNN trained on C-DS, on the test set 2.0. (F) ROC curves of the AG-free DL CNN trained on C-DS, on the multi PI-RADS test set. (G) ROC curves of the AG DL CNN trained on C-DS, on the test set 2.0. (H) ROC curves of the AG DL CNN trained on C-DS, on the multi PI-RADS test set.
Figure 6. Axial T2w MR image of a PCa lesion with PI-RADS=5 and GS=4+5. (A) Original image together with its cropped version of size 64×64. We fed AG DL frameworks with the cropped images, i.e., the C-DS images. (B) Attention map obtained from the best-performing AG DL framework fed with T2w C-DS images, superimposed on the cropped version of the T2w image and its segmentation.
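Producing a fixed 64×64 crop can be sketched as follows. This is a hedged illustration only: the caption does not say where the crop is centered or how image borders are handled, so the choice of a caller-supplied center and clamping to the image bounds are assumptions, and `crop_window` and `crop` are hypothetical names. Plain Python lists are used to keep the sketch dependency-free.

```python
def crop_window(center_row, center_col, height, width, size=64):
    """Return (r0, r1, c0, c1) for a size x size window clamped inside the image."""
    if size > height or size > width:
        raise ValueError("image is smaller than the crop size")
    # Shift the window back inside the image when the center is near a border.
    r0 = min(max(center_row - size // 2, 0), height - size)
    c0 = min(max(center_col - size // 2, 0), width - size)
    return r0, r0 + size, c0, c0 + size


def crop(image, center_row, center_col, size=64):
    """Crop a 2D image given as a list of rows (assumed rectangular)."""
    r0, r1, c0, c1 = crop_window(center_row, center_col,
                                 len(image), len(image[0]), size)
    return [row[c0:c1] for row in image[r0:r1]]


# Hypothetical 80x100 image whose pixel value encodes its own position.
image = [[r * 100 + c for c in range(100)] for r in range(80)]
patch = crop(image, center_row=10, center_col=95)
```

Clamping keeps every patch the same size, which a CNN with a fixed input shape requires.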