| Literature DB >> 32060715 |
Stephan Ursprung, Lucian Beer, Annemarie Bruining, Ramona Woitek, Grant D Stewart, Ferdia A Gallagher, Evis Sala.
Abstract
OBJECTIVES: (1) To assess the methodological quality of radiomics studies investigating histological subtypes, therapy response, and survival in patients with renal cell carcinoma (RCC) and (2) to determine the risk of bias in these radiomics studies.
Keywords: Angiomyolipoma; Carcinoma, renal cell; Machine learning; Quality improvement; Systematic review
Year: 2020 PMID: 32060715 PMCID: PMC7248043 DOI: 10.1007/s00330-020-06666-3
Source DB: PubMed Journal: Eur Radiol ISSN: 0938-7994 Impact factor: 5.315
Fig. 1 Pathway for the development of radiomics algorithms and challenges in clinical translation. In addition to image acquisition and image registration, non-quantitative MRI sequences may undergo intensity normalization to reduce intra- and inter-patient heterogeneity. Subsequently, either classical machine learning algorithms or deep learning are employed to define diagnostic, prognostic, or predictive models. These models require external validation to ensure transferability of results between sites and MR scanners before prospective validation and demonstration of cost-effectiveness can enable these diagnostic support systems to enter clinical practice. Continuous monitoring is required to detect deteriorating model performance and to trigger re-training and model updates as image acquisition evolves. ANN: artificial neural network
Elements of the RQS as described by Lambin et al [15] and average rating achieved by the studies included in this systematic review
| RQS scoring item | Interpretation | Average |
|---|---|---|
| Image Protocol | + 1 for well documented protocols, + 1 for publicly available protocols | 0.48 |
| Multiple Segmentations | + 1 if segmented multiple times (different physicians, algorithms, or perturbation of regions of interest) | 0.38 |
| Phantom Study | + 1 if texture phantoms were used for feature robustness assessment | 0.00 |
| Multiple Time Points | + 1 if imaging at multiple time points was used for feature robustness assessment | 0.01 |
| Feature Reduction | − 3 if nothing, + 3 if either feature reduction or correction for multiple testing | 0.23 |
| Non Radiomics | + 1 if multivariable analysis with non-radiomics features | 0.15 |
| Biological Correlates | + 1 if present | 0.98 |
| Cut-off | + 1 if cutoff either pre-defined or at median or continuous risk variable reported | 0.11 |
| Discrimination and Resampling | + 1 for discrimination statistic and statistical significance, + 1 if resampling applied | 0.92 |
| Calibration | + 1 for calibration statistic and statistical significance, +1 if resampling applied | 0.04 |
| Prospective | + 7 for prospective validation within a registered study | 0.98 |
| Validation | − 5 if no validation/+ 2 for internal validation/+ 3 for external validation/+ 4 two external validation datasets or validation of previously published signature/+ 5 validation on ≥ 3 datasets from > 1 institute | −4.61 |
| Gold Standard | + 2 for comparison to gold standard | 1.73 |
| Clinical Utility | + 2 for reporting potential clinical utility | 1.91 |
| Cost-effectiveness | + 1 for cost-effectiveness analysis | 0.00 |
| Open Science | + 1 for open-source scans, + 1 for open-source segmentations, + 1 for open-source code, + 1 for open-source representative segmentations and features | 0.02 |
RQS: Radiomics Quality Score
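The item-level points above sum to a study's total RQS, conventionally reported as a fraction of the 36-point maximum defined by Lambin et al. A minimal sketch of that tally (the item scores below are hypothetical, not data from this review):

```python
# Sketch: summing per-item RQS points into a total and normalizing to a
# percentage of the 36-point maximum (Lambin et al.). Item names mirror the
# table above; the example scores are illustrative only.

RQS_MAX = 36  # maximum achievable Radiomics Quality Score

def total_rqs(item_scores):
    """Sum per-item RQS points and express the total as a percentage of 36."""
    total = sum(item_scores.values())
    return total, 100.0 * total / RQS_MAX

# Hypothetical scoring of a single study:
example = {
    "image_protocol": 1, "multiple_segmentations": 1, "phantom_study": 0,
    "multiple_time_points": 0, "feature_reduction": 3, "non_radiomics": 1,
    "biological_correlates": 1, "cut_off": 0, "discrimination_resampling": 2,
    "calibration": 0, "prospective": 0, "validation": 2, "gold_standard": 2,
    "clinical_utility": 2, "cost_effectiveness": 0, "open_science": 1,
}

total, pct = total_rqs(example)  # 16 points, about 44% of the maximum
```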
Fig. 2 Study selection flowchart
Fig. 3 Risk factors for bias, colored according to the four dimensions of the QUADAS tool. The length of each bar corresponds to the frequency with which that risk factor was identified among the included studies
Inter-rater agreement in the assessment of the RQS
| RQS scoring item | S* [95% CI] |
|---|---|
| Image Protocol | 0.45 [0.20–0.67] |
| Multiple Segmentations | 0.93 [0.82–1.00] |
| Phantom Study | 1.00 [1.00–1.00] |
| Multiple Time Points | 0.93 [0.82–1.00] |
| Feature Reduction | 0.93 [0.82–1.00] |
| Non Radiomics | 0.67 [0.49–0.85] |
| Biological Correlates | 0.93 [0.82–1.00] |
| Cut-off | 0.93 [0.82–1.00] |
| Discrimination and Resampling | 0.82 [0.71–0.92] |
| Calibration | 0.96 [0.89–1.00] |
| Prospective | 1.00 [1.00–1.00] |
| Validation | 1.00 [1.00–1.00] |
| Gold Standard | 0.76 [0.63–0.88] |
| Clinical Utility | 0.60 [0.38–0.82] |
| Cost-effectiveness | 1.00 [1.00–1.00] |
| Open Science | 1.00 [1.00–1.00] |
CI: confidence interval, RQS: Radiomics Quality Score
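If S* denotes Bennett's S (a chance-corrected agreement coefficient that assumes equally likely categories; this interpretation is an assumption, as the table does not define the statistic), two-rater agreement on a single RQS item can be sketched as:

```python
def bennett_s(ratings_a, ratings_b, n_categories):
    """Bennett's S: chance-corrected agreement assuming q equally likely
    categories, S = (Po - 1/q) / (1 - 1/q), where Po is the observed
    proportion of agreement between the two raters."""
    assert len(ratings_a) == len(ratings_b)
    agree = sum(a == b for a, b in zip(ratings_a, ratings_b))
    po = agree / len(ratings_a)
    q = n_categories
    return (po - 1 / q) / (1 - 1 / q)

# Hypothetical ratings of 8 studies on a binary (scored / not scored) item:
a = [1, 1, 0, 1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 1, 1, 1, 0]
s = bennett_s(a, b, n_categories=2)  # agreement on 7 of 8 studies -> S = 0.75
```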
Fig. 4 Forest plot of the effect size calculated as log odds ratio for 10 of 13 studies investigating the diagnostic accuracy of radiomics in the differentiation of AMLwvf from RCC. TP: number of AMLwvf patients correctly diagnosed, FN: number of AMLwvf patients diagnosed as having RCC, FP: number of RCC patients diagnosed as having AML, TN: number of RCC patients correctly diagnosed. X-axis: log-transformed odds ratios, RE: random effects
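The per-study log odds ratios plotted in the forest plot follow from the 2×2 counts (TP, FN, FP, TN). A minimal sketch of that computation with the standard error used for confidence intervals (the counts below are hypothetical, and the 0.5 continuity correction is a common convention, not necessarily the review's exact handling):

```python
import math

def log_odds_ratio(tp, fn, fp, tn, correction=0.5):
    """Log odds ratio and its standard error from a 2x2 diagnostic table.
    A continuity correction is added to every cell when any cell is zero
    (a common convention for sparse tables)."""
    cells = [tp, fn, fp, tn]
    if 0 in cells:
        cells = [c + correction for c in cells]
    tp, fn, fp, tn = cells
    log_or = math.log((tp * tn) / (fn * fp))
    se = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    return log_or, se

# Hypothetical counts for one AMLwvf-vs-RCC study:
log_or, se = log_odds_ratio(tp=18, fn=4, fp=6, tn=52)
ci_low, ci_high = log_or - 1.96 * se, log_or + 1.96 * se  # 95% CI on log scale
```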
Fig. 5 Funnel plot of studies included in the meta-analysis (black) and missing studies imputed by trim-and-fill analysis (white). The funnel plot was asymmetric, with more studies than expected reporting higher odds ratios for the ability of radiomics to differentiate between malignant renal tumors and benign AMLwvf; this can indicate the presence of publication bias
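The funnel-plot asymmetry described above can also be quantified with Egger's regression test, which regresses each study's standard normal deviate (effect / SE) on its precision (1 / SE); an intercept far from zero suggests asymmetry. This is a related check, not the trim-and-fill method the review used, and the effect sizes below are hypothetical:

```python
# Sketch of Egger's regression intercept for funnel-plot asymmetry.
# Intercepts far from zero indicate that small (imprecise) studies report
# systematically different effects than large ones.

def egger_intercept(effects, ses):
    """Ordinary least-squares intercept of (effect/SE) regressed on (1/SE)."""
    y = [e / s for e, s in zip(effects, ses)]   # standard normal deviates
    x = [1 / s for s in ses]                    # precisions
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - slope * mx  # the intercept is the asymmetry statistic

effects = [2.1, 2.8, 1.9, 3.5, 2.4]  # hypothetical log odds ratios
ses = [0.9, 0.7, 0.5, 1.1, 0.6]      # hypothetical standard errors
b0 = egger_intercept(effects, ses)   # clearly nonzero: suggests asymmetry
```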