Jingyu Zhong1, Yangfan Hu1, Guangcheng Zhang2, Yue Xing1, Defang Ding1, Xiang Ge1, Zhen Pan3, Qingcheng Yang3, Qian Yin4, Huizhen Zhang4, Huan Zhang5, Weiwu Yao6.
Abstract
OBJECTIVE: To update the systematic review of radiomics in osteosarcoma.
Keywords: Machine learning; Osteosarcoma; Quality improvement; Radiomics; Systematic review
Year: 2022 PMID: 35986808 PMCID: PMC9392674 DOI: 10.1186/s13244-022-01277-6
Source DB: PubMed Journal: Insights Imaging ISSN: 1869-4101
Fig. 1 Flow diagram of study inclusion
Study characteristics
| Study Characteristics | Data |
|---|---|
| Sample size, mean ± standard deviation, median (range) | 86.6 ± 45.8, 81 (17–191) |
| Journal type, n (%) | |
| Imaging | 13 (44.8) |
| Non-imaging | 16 (55.2) |
| First authorship, n (%) | |
| Radiologist | 19 (65.5) |
| Non-radiologist | 10 (34.5) |
| Imaging modality, n (%) | |
| CT | 9 (31.0) |
| MRI | 14 (48.3) |
| PET | 6 (20.7) |
| Biomarker, n (%) | |
| Diagnostic | 3 (9.1) |
| Predictive | 18 (54.5) |
| Prognostic | 12 (36.4) |
| Model type, n (%) | |
| Type 1a: Developed model validated with exactly the same data | 8 (24.2) |
| Type 1b: Developed model validated with resampling data | 8 (24.2) |
| Type 2a: Developed model validated with randomly splitting data | 12 (36.4) |
| Type 2b: Developed model validated with non-randomly splitting data | 1 (3.0) |
| Type 3: Developed model validated with separate data | 4 (12.1) |
| Type 4: Validation only | 0 (0.0) |
There were 33 radiomics models identified in the 29 included studies. The model type was determined according to the criteria in the TRIPOD statement. TRIPOD Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis
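The TRIPOD model-type assignment used in the table above can be sketched as a simple lookup. This is an illustrative sketch only: the function name and the validation-strategy labels are assumptions, not terminology from the TRIPOD statement itself.

```python
# Illustrative mapping from a study's validation strategy to the TRIPOD
# model type used in the table above. The input vocabulary is hypothetical.

def tripod_model_type(validation: str) -> str:
    """Map a validation strategy to a TRIPOD model type (1a-4)."""
    mapping = {
        "same_data": "1a",        # validated with exactly the development data
        "resampling": "1b",       # bootstrap or cross-validation resampling
        "random_split": "2a",     # random train/test split of one dataset
        "nonrandom_split": "2b",  # non-random split (e.g., by time or site)
        "separate_data": "3",     # validated with a separate dataset
        "validation_only": "4",   # no development; validation of an existing model
    }
    return mapping[validation]

# Counts taken from the table above; they sum to the 33 identified models.
counts = {"same_data": 8, "resampling": 8, "random_split": 12,
          "nonrandom_split": 1, "separate_data": 4, "validation_only": 0}
total_models = sum(counts.values())
```

Note that types 1a/1b provide the weakest evidence of generalizability, while type 3 (separate data) is the strongest represented here.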
Fig. 2 Imaging in osteosarcoma and radiomics study topics. Imaging examination is routine in diagnosis and treatment decision-making in osteosarcoma, and radiomics has shown potential for personalized precision medicine in this process. The study topics and the number of radiomics studies in osteosarcoma are summarized by imaging modality. Note that four studies each built both a prediction model for response to neoadjuvant chemotherapy (NAC) and a prognostic model for survival, which resulted in 33 radiomics models in 29 included studies. OS osteosarcoma, ES Ewing sarcoma, CS chondrosarcoma
Fig. 3 Quality assessment of included studies. a Ideal percentage of RQS; b TRIPOD adherence rate; c CLAIM adherence rate; d QUADAS-2 assessment result
RQS rating of included studies
| 16 items according to 6 key domains | Range | Median (range) | Percentage of ideal score, n/N (%) | Adherence rate, n/N (%) |
|---|---|---|---|---|
| Total 16 items | − 8 to 36 | 10 (3–18) | 305/1044 (29.2) | 207/464 (44.6) |
| Domain 1: protocol quality and stability in image and segmentation | 0–5 | 2 (0–3) | 50/145 (34.5) | 50/116 (43.1) |
| Protocol quality | 0–2 | 1 (0–1) | 22/58 (37.9) | 22/29 (75.9) |
| Multiple segmentations | 0–1 | 1 (0–1) | 20/29 (69.0) | 20/29 (69.0) |
| Test–retest | 0–1 | 0 (0–1) | 8/29 (27.6) | 8/29 (27.6) |
| Phantom study | 0–1 | 0 (0–0) | 0/29 (0.0) | 0/29 (0.0) |
| Domain 2: feature selection and validation | − 8 to 8 | 5 (− 8 to 8) | 94/232 (40.5) | 49/58 (84.5) |
| Feature reduction or adjustment of multiple testing | − 3 to 3 | 3 (3–3) | 69/87 (79.3) | 26/29 (89.7) |
| Validation | − 5 to 5 | 2 (− 5 to 5) | 25/145 (17.2) | 23/29 (79.3) |
| Domain 3: biologic/clinical validation and utility | 0–6 | 2 (0–5) | 69/174 (39.7) | 61/116 (52.6) |
| Non-radiomics features | 0–1 | 1 (0–1) | 18/29 (62.1) | 18/29 (62.1) |
| Biologic correlations | 0–1 | 1 (0–1) | 27/29 (93.1) | 27/29 (93.1) |
| Comparison to “gold standard” | 0–2 | 0 (0–2) | 16/58 (27.6) | 8/29 (27.6) |
| Potential clinical utility | 0–2 | 0 (0–1) | 8/58 (13.8) | 8/29 (27.6) |
| Domain 4: model performance index | 0–5 | 2 (1–4) | 61/145 (42.1) | 35/87 (40.2) |
| Cut-off analysis | 0–1 | 0 (0–0) | 0/29 (0.0) | 0/29 (0.0) |
| Discrimination statistics | 0–2 | 2 (1–2) | 49/58 (84.5) | 29/29 (100.0) |
| Calibration statistics | 0–2 | 0 (0–2) | 12/58 (20.7) | 6/29 (20.7) |
| Domain 5: high level of evidence | 0–8 | 0 (0–7) | 21/232 (9.1) | 3/58 (5.2) |
| Prospective study | 0–7 | 0 (0–7) | 21/203 (10.3) | 3/29 (10.3) |
| Cost-effectiveness analysis | 0–1 | 0 (0–0) | 0/29 (0.0) | 0/29 (0.0) |
| Domain 6: open science and data | 0–4 | 0 (0–2) | 10/116 (8.6) | 9/29 (31.0) |
For each item, results are reported as the achieved score and its percentage of the ideal score. A study scoring at least one point on an item was considered to have basic adherence to that item. The adherence rate was calculated as the proportion of articles with basic adherence out of the total number of articles
RQS Radiomics Quality Score
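The two summary statistics reported in the RQS table above (percentage of ideal score and adherence rate) can be sketched as follows. The function names and the per-study score lists are illustrative assumptions; only the definitions come from the footnote above.

```python
# Sketch of the two RQS summary statistics defined in the table footnote.
# `scores` holds each study's score on one item; `ideal` is that item's
# maximum (ideal) score. Both names are illustrative.

def ideal_percentage(scores, ideal):
    """Sum of achieved scores as a percentage of the summed ideal score."""
    return 100.0 * sum(scores) / (ideal * len(scores))

def adherence_rate(scores):
    """Percentage of studies scoring >= 1 point (basic adherence) on the item."""
    adherent = sum(1 for s in scores if s >= 1)
    return 100.0 * adherent / len(scores)

# Example consistent with the "Multiple segmentations" row (0-1 item):
# 20 of 29 studies scored 1, giving 20/29 = 69.0% on both statistics.
seg_scores = [1] * 20 + [0] * 9
```

For a 0–1 item the two statistics coincide; they diverge on multi-point items such as "Discrimination statistics" (0–2), where partial credit lowers the ideal percentage but not the adherence rate.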
TRIPOD adherence of included studies
| 37 selected items in 22 criteria according to 7 sections | Studies with basic adherence, n/N (%) |
|---|---|
| Overall (excluding items 5c, 11, 14b, 10c, 10e, 12, 13c, 17, and 19a) | 481/812 (59.2) |
| Section 1: Title and abstract | 18/58 (31.0) |
| 1. Title—identify developing/validating a model, target population, and the outcome | 2/29 (6.9) |
| 2. Abstract—provide a summary of objectives, study design, setting, participants, sample size, predictors, outcome, statistical analysis, results, and conclusions | 16/29 (55.2) |
| Section 2: Introduction | 36/58 (62.1) |
| 3a. Background—Explain the medical context and rationale for developing/validating the model | 29/29 (100.0) |
| 3b. Objective—Specify the objectives, including whether the study describes the development/validation of the model or both | 7/29 (24.1) |
| Section 3: Methods | 218/377 (57.8) |
| 4a. Source of data—describe the study design or source of data (randomized trial, cohort, or registry data) | 29/29 (100.0) |
| 4b. Source of data—specify the key dates | 29/29 (100.0) |
| 5a. Participants—specify key elements of the study setting including number and location of centers | 29/29 (100.0) |
| 5b. Participants—describe eligibility criteria for participants (inclusion and exclusion criteria) | 22/29 (75.9) |
| 5c. Participants—give details of treatment received, if relevant (N = 25) | 16/25 (64.0) |
| 6a. Outcome—clearly define the outcome, including how and when assessed | 27/29 (93.1) |
| 6b. Outcome—report any actions to blind assessment of the outcome | 3/29 (10.3) |
| 7a. Predictors—clearly define all predictors, including how and when assessed | 10/29 (34.5) |
| 7b. Predictors—report any actions to blind assessment of predictors for the outcome and other predictors | 4/29 (13.8) |
| 8. Sample size—explain how the study size was arrived at | 3/29 (10.3) |
| 9. Missing data—describe how missing data were handled with details of any imputation method | 6/29 (20.7) |
| 10a. Statistical analysis methods—describe how predictors were handled | 29/29 (100.0) |
| 10b. Statistical analysis methods—specify type of model, all model-building procedures (any predictor selection), and method for internal validation | 21/29 (72.4) |
| 10d. Statistical analysis methods—specify all measures used to assess model performance and if relevant, to compare multiple models (discrimination and calibration) | 6/29 (20.7) |
| 11. Risk groups—provide details on how risk groups were created, if done (N = 0) | n/a |
| Section 4: Results | 117/174 (67.2) |
| 13a. Participants—describe the flow of participants, including the number of participants with and without the outcome. A diagram may be helpful | 16/29 (55.2) |
| 13b. Participants—describe the characteristics of the participants, including the number of participants with missing data for predictors and outcome | 26/29 (89.7) |
| 14a. Model development—specify the number of participants and outcome events in each analysis | 23/29 (79.3) |
| 14b. Model development—report the unadjusted association between each candidate predictor and outcome, if done (N = 5) | 4/5 (80.0) |
| 15a. Model specification—present the full prediction model to allow predictions for individuals (regression coefficients, intercept) | 21/29 (72.4) |
| 15b. Model specification—explain how to use the prediction model (nomogram, calculator, etc.) | 11/29 (37.9) |
| 16. Model performance—report performance measures (with confidence intervals) for the prediction model | 20/29 (69.0) |
| Section 5: Discussion | 86/87 (98.9) |
| 18. Limitations—Discuss any limitations of the study | 28/29 (96.6) |
| 19b. Interpretation—Give an overall interpretation of the results | 29/29 (100.0) |
| 20. Implications—Discuss the potential clinical use of the model and implications for future research | 29/29 (100.0) |
| Section 6: Other information | 6/58 (10.3) |
| 21. Supplementary information—provide information about the availability of supplementary resources, such as study protocol, Web calculator, and data sets | 0/29 (0.0) |
| 22. Funding—give the source of funding and the role of the funders for the present study | 6/29 (20.7) |
| Section 7: Validation for Model type 2a, 2b, 3, and 4 (N = 16) | 32/64 (50.0) |
| 10c. Statistical analysis methods—describe how the predictions were calculated | 15/16 (93.8) |
| 10e. Statistical analysis methods—describe any model updating (recalibration), if done (N = 0) | n/a |
| 12. Development versus validation—Identify any differences from the development data in setting, eligibility criteria, outcome, and predictors | 10/16 (62.5) |
| 13c. Participants (for validation)—show a comparison with the development data of the distribution of important variables | 2/16 (12.5) |
| 17. Model updating—report the results from any model updating, if done (N = 0) | n/a |
| 19a. Interpretation (for validation)—discuss the results with reference to performance in the development data and any other validation data | 5/16 (31.3) |
A study scoring one point on an item was considered to have basic adherence to that item. The adherence rate was calculated as the proportion of articles with basic adherence out of the total number of articles. The “if done” or “if relevant” items (5c, 11, and 14b) and the validation items (10c, 10e, 12, 13c, 17, and 19a) were excluded from both the denominator and the numerator
TRIPOD Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis, n/a not applicable
CLAIM adherence of included studies
| 42 CLAIM items according to 6 sections | Studies with basic adherence, n/N (%) |
|---|---|
| Overall (excluding item 27) | 961/1508 (63.7) |
| Section 1: Title and abstract | 53/58 (91.4) |
| 1. Title or abstract—Identification as a study of AI methodology | 29/29 (100.0) |
| 2. Abstract—Structured summary of study design, methods, results, and conclusions | 24/29 (82.8) |
| Section 2: Introduction | 55/87 (63.2) |
| 3. Background—scientific and clinical background, including the intended use and clinical role of the AI approach | 29/29 (100.0) |
| 4a. Study objective | 22/29 (75.9) |
| 4b. Study hypothesis | 4/29 (13.8) |
| Section 3: Methods | 700/1044 (67.0) |
| 5. Study design—Prospective or retrospective study | 29/29 (100.0) |
| 6. Study design—Study goal, such as model creation, exploratory study, feasibility study, non-inferiority trial | 29/29 (100.0) |
| 7a. Data—Data source | 29/29 (100.0) |
| 7b. Data—Data collection institutions | 29/29 (100.0) |
| 7c. Data—Imaging equipment vendors | 25/29 (86.2) |
| 7d. Data—Image acquisition parameters | 22/29 (75.9) |
| 7e. Data—Institutional review board approval | 28/29 (96.6) |
| 7f. Data—Participant consent | 24/29 (82.8) |
| 8. Data—Eligibility criteria | 22/29 (75.9) |
| 9. Data—Data pre-processing steps | 20/29 (69.0) |
| 10. Data—Selection of data subsets (segmentation of ROI in radiomics studies) | 26/29 (89.7) |
| 11. Data—Definitions of data elements, with references to Common Data Elements | 29/29 (100.0) |
| 12. Data—De-identification methods | 3/29 (10.3) |
| 13. Data—How missing data were handled | 6/29 (20.7) |
| 14. Ground truth—Definition of ground truth reference standard, in sufficient detail to allow replication | 27/29 (93.1) |
| 15a. Ground truth—Rationale for choosing the reference standard (if alternatives exist) | 0/29 (0.0) |
| 15b. Ground truth—Definitive ground truth | 29/29 (100.0) |
| 16. Ground truth—Manual image annotation | 17/29 (58.6) |
| 17. Ground truth—Image annotation tools and software | 10/29 (34.5) |
| 18. Ground truth—Measurement of inter- and intra-rater variability; methods to mitigate variability and/or resolve discrepancies | 9/29 (31.0) |
| 19a. Data Partitions—Intended sample size and how it was determined | 29/29 (100.0) |
| 19b. Data Partitions—Provided power calculation | 4/29 (13.8) |
| 19c. Data Partitions—Distinct study participants | 23/29 (79.3) |
| 20. Data Partitions—How data were assigned to partitions; specify proportions | 22/29 (75.9) |
| 21. Data Partitions—Level at which partitions are disjoint (e.g., image, study, patient, institution) | 22/29 (75.9) |
| 22a. Model—Provided reproducible model description | 21/29 (72.4) |
| 22b. Model—Provided source code | 0/29 (0.0) |
| 23. Model—Software libraries, frameworks, and packages | 20/29 (69.0) |
| 24. Model—Initialization of model parameters (e.g., randomization, transfer learning) | 23/29 (79.3) |
| 25. Training—Details of training approach, including data augmentation, hyperparameters, number of models trained | 16/29 (55.2) |
| 26. Training—Method of selecting the final model | 21/29 (72.4) |
| 27. Training—Ensembling techniques, if applicable (N = 14) | 8/14 (57.1) |
| 28. Evaluation—Metrics of model performance | 29/29 (100.0) |
| 29. Evaluation—Statistical measures of significance and uncertainty (e.g., confidence intervals) | 20/29 (69.0) |
| 30. Evaluation—Robustness or sensitivity analysis | 10/29 (34.5) |
| 31. Evaluation—Methods for explainability or interpretability (e.g., saliency maps), and how they were validated | 11/29 (37.9) |
| 32. Evaluation—Validation or testing on external data | 16/29 (55.2) |
| Section 4: Results | 90/174 (51.7) |
| 33. Data—Flow of participants or cases, using a diagram to indicate inclusion and exclusion | 16/29 (55.2) |
| 34. Data—Demographic and clinical characteristics of cases in each partition | 25/29 (86.2) |
| 35a. Model performance—Test performance | 16/29 (55.2) |
| 35b. Model performance—Benchmark of performance | 8/29 (27.6) |
| 36. Model performance—Estimates of diagnostic accuracy and their precision (such as 95% confidence intervals) | 20/29 (69.0) |
| 37. Model performance—Failure analysis of incorrectly classified cases | 5/29 (17.2) |
| Section 5: Discussion | 57/58 (98.3) |
| 38. Study limitations, including potential bias, statistical uncertainty, and generalizability | 28/29 (96.6) |
| 39. Implications for practice, including the intended use and/or clinical role | 29/29 (100.0) |
| Section 6: Other information | 6/87 (6.9) |
| 40. Registration number and name of registry | 0/29 (0.0) |
| 41. Where the full study protocol can be accessed | 0/29 (0.0) |
| 42. Sources of funding and other support; role of funders | 6/29 (20.7) |
CLAIM Checklist for Artificial Intelligence in Medical Imaging. A study scoring one point on an item was considered to have basic adherence to that item. The adherence rate was calculated as the proportion of articles with basic adherence out of the total number of articles. The “if applicable” item (27) was excluded from both the denominator and the numerator
Fig. 4 Quality evaluation with impact factor, sample size, and publication year. Swarm plots of (a) ideal percentage of RQS, (b) TRIPOD adherence rate, and (c) CLAIM adherence rate against impact factor and sample size. The diameter of each bubble indicates the study's sample size. Lighter colors indicate studies published after the previous review; darker colors indicate those published before it. Note that one study published in a journal without an impact factor was excluded. d Bar plot depicting the number of studies, and line plots presenting the ideal percentage of RQS, TRIPOD adherence rate, and CLAIM adherence rate of radiomics studies on osteosarcoma over the years
Fig. 5 Forest plots of diagnostic odds ratios. The performance of radiomics in predicting NAC response in osteosarcoma patients based on testing datasets. TP pathological good responders predicted as good responders, FP pathological poor responders predicted as good responders, FN pathological good responders predicted as poor responders, TN pathological poor responders predicted as poor responders
The prediction performance of radiomics for NAC response in osteosarcoma patients
| Clinical question | MRI-driven radiomics prediction model for NAC response in osteosarcoma patients |
|---|---|
| Number of studies | 4 |
| Good responder/sample size | 44/115 |
| Pooled analysis | |
| DOR (95% CI) | 28.83 (10.27–80.95) |
| Sensitivity (95% CI) | 0.84 (0.70–0.92) |
| Specificity (95% CI) | 0.85 (0.74–0.91) |
| PLR (95% CI) | 5.43 (3.11–9.49) |
| NLR (95% CI) | 0.19 (0.09–0.37) |
| AUC (95% CI) | 0.91 (0.88–0.93) |
| Heterogeneity | |
| Higgins’ I² | |
| Cochran’s Q | |
| Publication bias | |
| Egger’s test | |
| Begg’s test | |
| Deeks’ test | |
| Trim and fill method | |
| Number of missing studies | 2 |
| Adjusted DOR (95% CI) | 20.53 (7.80–54.06) |
| Level of Evidence | Weak |
AUC area under the curve, CI confidence interval, DOR diagnostic odds ratio, NAC neoadjuvant chemotherapy, NLR negative likelihood ratio, n/a not applicable, PLR positive likelihood ratio
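As a rough illustration of how the metrics in the table relate to one another: the pooled point estimates above happen to be mutually consistent with a single 2×2 table of 44 good responders among 115 patients (TP = 37, FP = 11, FN = 7, TN = 60). These counts are an illustrative reconstruction, not data from the review; an actual meta-analysis pools study-level 2×2 tables with a bivariate random-effects model (e.g., the R package `mada`) rather than collapsing them. The sketch below only shows how each metric is derived from a 2×2 table.

```python
# Sketch: diagnostic accuracy metrics from a single 2x2 table.
# TP/FP/FN/TN follow the definitions in the Fig. 5 caption above;
# the specific counts are illustrative.

def diagnostic_metrics(tp, fp, fn, tn):
    sens = tp / (tp + fn)        # sensitivity (true positive rate)
    spec = tn / (tn + fp)        # specificity (true negative rate)
    plr = sens / (1 - spec)      # positive likelihood ratio
    nlr = (1 - sens) / spec      # negative likelihood ratio
    dor = plr / nlr              # diagnostic odds ratio = (tp*tn)/(fp*fn)
    return {"sens": sens, "spec": spec, "plr": plr, "nlr": nlr, "dor": dor}

# Illustrative counts consistent with 44/115 good responders.
m = diagnostic_metrics(tp=37, fp=11, fn=7, tn=60)
```

The DOR conveniently combines sensitivity and specificity into one number (PLR/NLR), which is why it is the summary statistic shown in the forest plots of Fig. 5.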