Vipulkumar Dadhania1, Daniel Gonzalez2, Mustafa Yousif1,3, Jerome Cheng1, Todd M Morgan4, Daniel E Spratt5, Zachery R Reichert6, Rahul Mannan7, Xiaoming Wang7, Anya Chinnaiyan7, Xuhong Cao7, Saravana M Dhanasekaran7, Arul M Chinnaiyan1,4,7,8,9, Liron Pantanowitz1,8, Rohit Mehra10,11,12.
Abstract
BACKGROUND: TMPRSS2-ERG gene rearrangement, the most common E26 transformation-specific (ETS) gene fusion in prostate cancer, is known to contribute to the pathogenesis of this disease and carries diagnostic annotations for prostate cancer patients clinically. The ERG rearrangement status in prostatic adenocarcinoma currently cannot be reliably identified from histologic features on H&E-stained slides alone and hence requires ancillary studies such as immunohistochemistry (IHC), fluorescent in situ hybridization (FISH) or next-generation sequencing (NGS) for identification.
Keywords: Adenocarcinoma; Artificial intelligence; Deep learning; ERG; Gene fusion; Prostate cancer; Whole slide imaging
Year: 2022 PMID: 35513774 PMCID: PMC9069768 DOI: 10.1186/s12885-022-09559-4
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.638
Distribution of training and hold-out test datasets utilized for algorithm development
| Dataset | Subset | Patients | ERG-positive | ERG-negative | Grade group 1 | Grade group 2 | Grade group 3 | Grade group 4 | Grade group 5 |
|---|---|---|---|---|---|---|---|---|---|
| TCGA cohort | Training set | 235 | 123 (52%) | 112 (48%) | 41 | 68 | 60 | 32 | 34 |
| Internal cohort | Training set | 26 | 11 (42%) | 15 (58%) | 1 | 17 | 6 | 0 | 2 |
| Internal cohort | Hold-out test set | 131 | 60 (46%) | 71 (54%) | 0 | 67 | 31 | 4 | 29 |
Training subset includes initial training, cross-validation and testing sets. Hold-out test set refers to a separate subset of cases not included as part of the training subset. TCGA The Cancer Genome Atlas
Fig. 1 Workflow schematic summarizing our algorithm development. a (Top panel) Whole slide images of H&E-stained prostate adenocarcinoma resections were split using QuPath into many 224 × 224 pixel patches for input into a convolutional neural network (CNN). Unknown yellow box indicates a separate subset of cases not included as part of the training subset. (Bottom panel) Patches labeled with ERG status were used for CNN training utilizing MobileNetV2. Final prediction of patches into ERG-negative or ERG-positive was based on highest probability. b MobileNetV2 convolutional block structure (adapted from Sandler et al.)
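The patch-to-slide logic described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `patch_label` and `slide_call`, the example probabilities, and the choice to call a slide positive when the fraction of ERG-positive patches meets or exceeds the cut-off (rather than strictly exceeds it) are all assumptions.

```python
from typing import List, Sequence, Tuple


def patch_label(p_negative: float, p_positive: float) -> int:
    """Assign a patch to the class with the higher probability (1 = ERG-positive)."""
    return 1 if p_positive > p_negative else 0


def slide_call(
    patch_probs: Sequence[Tuple[float, float]], cutoff: float
) -> Tuple[float, bool]:
    """Return (fraction of ERG-positive patches, slide-level ERG-positive call).

    patch_probs holds one (p_negative, p_positive) pair per patch.
    """
    if not patch_probs:
        raise ValueError("slide has no patches")
    labels: List[int] = [patch_label(p_neg, p_pos) for p_neg, p_pos in patch_probs]
    fraction_positive = sum(labels) / len(labels)
    # Assumption: the slide is positive when the fraction meets or exceeds the cut-off.
    return fraction_positive, fraction_positive >= cutoff


# Made-up probabilities for four patches of one hypothetical slide.
example = [(0.9, 0.1), (0.2, 0.8), (0.4, 0.6), (0.7, 0.3)]
fraction, is_positive = slide_call(example, cutoff=0.4)
print(fraction, is_positive)
```

Keeping the patch-level decision separate from the slide-level threshold makes it straightforward to sweep the cut-off on a validation set without re-running the network.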
Fig. 2 Patches as classified by AI algorithm. a ERG-negative low grade (100 ×). b ERG-positive low grade (100 ×). c ERG-negative high grade (100 ×). d ERG-positive high grade (100 ×)
Fig. 3 Receiver operator characteristics (ROC) and area under curve for models at different magnifications (10 ×, 20 × and 40 ×)
Performance metrics of AI-based models at different magnifications
| Magnification | AUC | TP | FP | TN | FN | Sensitivity | Specificity | PPV | NPV | Accuracy | F1 score | Cut-off |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 × | 0.82 | 45 | 13 | 58 | 15 | 75.0% | 81.7% | 77.6% | 79.5% | 78.6% | 0.78 | 0.4 |
| 20 × | 0.84 | 45 | 12 | 59 | 15 | 75.0% | 83.1% | 78.9% | 79.7% | 79.4% | 0.79 | 0.5 |
| 40 × | 0.85 | 45 | 13 | 58 | 15 | 75.0% | 81.7% | 77.6% | 79.5% | 78.6% | 0.78 | 0.35 |
AUC Area under curve in receiver operator characteristics curve, TP True positive (correctly classified as ERG-positive), FP False positive (incorrectly classified as ERG-positive), TN True negative (correctly classified as ERG-negative), FN False negative (incorrectly classified as ERG-negative), PPV Positive predictive value, NPV Negative predictive value, Cut-off indicates cut-off value of proportion of positive patches that gives best accuracy
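As a worked check on the definitions above, the following sketch recomputes the count-based metrics from the TP, FP, TN and FN values of the 20 × row. The function name `diagnostic_metrics` is hypothetical. F1 is left out because the source does not state which averaging was used.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),  # share of ERG-positive cases detected
        "specificity": tn / (tn + fp),  # share of ERG-negative cases detected
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / total,
    }


# Counts taken from the 20 × row of the table above.
row_20x = diagnostic_metrics(tp=45, fp=12, tn=59, fn=15)
for name, value in row_20x.items():
    print(f"{name}: {value:.1%}")
```

The same function applies to the other magnifications by substituting the counts from the corresponding row.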
Algorithm performance metrics based on tumor grade
| Magnification | Grade Group | TP | FP | TN | FN | Sensitivity | Specificity | PPV | NPV | Accuracy | F1 score | Cut-off |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 × | 1–2 | 33 | 8 | 25 | 1 | 97.1% | 75.8% | 80.5% | 96.2% | 86.6% | 0.88 | 0.4 |
| 10 × | 3–5 | 12 | 5 | 33 | 14 | 46.2% | 86.8% | 70.6% | 70.2% | 70.3% | 0.56 | 0.4 |
| 20 × | 1–2 | 28 | 3 | 30 | 6 | 82.4% | 90.9% | 90.3% | 83.3% | 86.6% | 0.86 | 0.5 |
| 20 × | 3–5 | 17 | 9 | 29 | 9 | 65.4% | 76.3% | 65.4% | 76.3% | 71.9% | 0.65 | 0.5 |
| 40 × | 1–2 | 28 | 5 | 28 | 6 | 82.4% | 84.8% | 84.8% | 82.4% | 83.6% | 0.84 | 0.35 |
| 40 × | 3–5 | 17 | 8 | 30 | 9 | 65.4% | 78.9% | 68.0% | 76.9% | 73.4% | 0.67 | 0.35 |
TP True positive (correctly classified as ERG-positive), FP False positive (incorrectly classified as ERG-positive), TN True negative (correctly classified as ERG-negative), FN False negative (incorrectly classified as ERG-negative), PPV Positive predictive value, NPV Negative predictive value, Cut-off indicates cut-off value of proportion of positive patches that gives best accuracy