Robert J O'Shea, Amy Rose Sharkey, Gary J R Cook, Vicky Goh.
Abstract
OBJECTIVES: To perform a systematic review of the design and reporting of imaging studies applying convolutional neural network models for radiological cancer diagnosis.
Keywords: Artificial intelligence; Deep learning; Diagnosis, computer-assisted; Neoplasms; Research design
Year: 2021 PMID: 33860829 PMCID: PMC8452579 DOI: 10.1007/s00330-021-07881-2
Source DB: PubMed Journal: Eur Radiol ISSN: 0938-7994 Impact factor: 5.315
List of the data items evaluated. Items are derived from the CLAIM guidance. CLAIM items with multiple conditions are divided into sub-items, denoted by alphabetical suffixes. Compliant values are all values considered satisfactory for that item. Exemptions indicate types of study that are not required to satisfy an item
| Item | Criterion | Values | Compliant values | Exemptions |
|---|---|---|---|---|
| 1 | Title or abstract specified application of convolutional neural network model | 0. Not specified 1. Specified | 1 | None |
| 2 | Abstract included summary of study design, methods, results and conclusions | 0. Not included 1. Included | 1 | None |
| 3 | Introduction provided scientific and clinical background with role for model | 0. Not provided 1. Provided | 1 | None |
| 4a | Study objectives | 0. Not provided 1. Provided | 1 | None |
| 4b | Study hypotheses | 0. Not documented 1. Documented | 1 | None |
| 5 | Indicated prospective or retrospective study timeframe | 0. Not documented R. Retrospective P. Prospective RP. Both retrospective and prospective | R, P, RP | None |
| 6 | Study goal | 0. Not documented 1. Documented | 1 | None |
| 7a | Data source | 0. Not documented L. Local data collection P. Public data LP. Both local and public data | L, P, LP | None |
| 7b | Data collection institutions | 0. Not documented SC. Single-centre data MC. Multi-centre data | SC, MC | None |
| 7c | Imaging equipment vendors | 0. Not documented SV. Single vendor MV. Multiple vendors | SV, MV | None |
| 7d | Image acquisition parameters | 0. Not documented 1. Documented | 1 | None |
| 7e | Institutional review board approval | 0. Not documented 1. Documented | 1 | None |
| 7f | Participant consent | 0. Not documented 1. Documented | 1 | None |
| 8 | Eligibility criteria | 0. Not documented 1. Documented | 1 | None |
| 9 | Image pre-processing | 0. Not documented P. Pre-processing documented PM. Reproducible pre-processing method documented NP. Documented that pre-processing not employed | PM, NP | None |
| 10 | Data subsetting | 0. Not documented C. Image cropping documented CM. Reproducible image cropping method documented NC. Documented that cropping not employed | CM, NC | None |
| 11 | Model predictors and outcomes | 0. Not defined 1. Defined | 1 | None |
| 12 | Data de-identification | 0. Not documented A. Anonymisation documented AM. Reproducible anonymisation method documented | AM | None |
| 13 | Missing data handling strategy | 0. Not documented E. Missing data excluded from analysis I. Missing data included in analysis | E, I | None |
| 14 | Reference standard definition | 0. Not defined 1. Defined either explicitly or by reference to a Common Data Element such as the American College of Radiology Image Reporting and Data Systems. | 1 | None |
| 15a | Reference standard rationale | 0. Not documented 1. Documented | 1 | None |
| 15b | Definitive ground truth | 0. No definitive ground truth P. Histopathology DI. Definitive imaging modality FU. Case follow-up PFU. Histopathology and case follow-up PDI. Histopathology and definitive imaging modality | P, DI, FU, PFU, PDI | None |
| 16a | Manual image annotation | 0. Not documented UR. Radiologist with unspecified expertise SR. Radiologist with relevant subspecialist expertise OC. Other clinician | SR | None |
| 16b | Histopathology annotation | 0. Not documented SP. Pathologist with relevant subspecialist expertise | SP | Histopathology not employed |
| 17 | Image annotation tools and software | 0. Not documented 1. Documented | 1 | None |
| 18 | Annotator variability | 0. Not documented V. Variability statistics documented M. Aggregation method documented VM. Variability statistics and aggregation method documented | VM | None |
| 19a | Sample size | 0. Not documented 1. Documented number of images in dataset | 1 | None |
| 19b | Provided power calculation | 0. Not documented 1. Documented | 1 | None |
| 19c | Distinct study participants | 0. Not documented {N}. N = number of study participants | {N} | None |
| 20 | Data partitions and their proportions | 0. Not documented 1. Documented | 1 | None |
| 21 | Partition disjunction | 0. Not documented 1. Documented partition disjunction at patient level | 1 | Validation studies |
| 22a | Provided reproducible model description | 0. Not documented 1. Documented | 1 | Validation studies |
| 22b | Provided source code | 0. Not documented 1. Documented | 1 | Validation studies |
| 23 | Modelling software | 0. Not documented S. Documented software SV. Documented software and version | SV | Validation studies |
| 24 | Parameter initialisation method | 0. Not documented R. Random initialisation T. Transfer learning RT. Both random initialisation and transfer learning employed | R | Validation studies |
| 25a | Provided reproducible data augmentation strategy or specified use of unaugmented data | 0. Not documented A. Documented data augmentation AM. Reproducible data augmentation method NA. No data augmentation | AM, NA | Validation studies |
| 25b | Loss function | 0. Not documented 1. Documented | 1 | Validation studies |
| 25c | Optimisation method | 0. Not documented 1. Documented | 1 | Validation studies |
| 25d | Learning rate settings | 0. Not documented 1. Documented | 1 | Validation studies |
| 25e | Stopping protocol for model training | 0. Not documented 1. Documented | 1 | Validation studies |
| 25f | Batch size | 0. Not documented 1. Documented | 1 | Validation studies |
| 26 | Model selection | 0. Not documented 1. Documented model selection criterion | 1 | Validation studies |
| 27 | If model ensembling applied, provided ensembling method | 0. Not documented E. Ensembling documented EM. Documented reproducible ensembling method | EM | Ensembling not employed |
| 28 | Metrics | 0. Not documented M. Defined performance metrics MR. Defined performance metrics and provided rationale | MR | None |
| 29 | Significance | 0. Not documented S. Model significance documented SM. Model significance documented with reproducible methodology | SM | None |
| 30 | Robustness | 0. Not documented 1. Documented model robustness to variation in experimental conditions such as sample size, noise and imaging equipment | 1 | None |
| 31 | Model interpretation | 0. Not documented I. Interpreted model IM. Interpreted model with validated methodology | IM | None |
| 32 | Test data description | 0. Not described I. Employed internal test data E. Described test data from different institution | I, E | None |
| 33 | Case-flow diagram | 0. Not documented 1. Documented | 1 | None |
| 34 | Demographics and clinical characteristics | 0. Not documented D. Documented aggregate statistics DP. Documented statistics for each data partition | DP | None |
| 35a | Test performance | 0. Model performance assessed on data observed during training V. Model performance assessed on data observed during model selection T. Model performance assessed on data which was unobserved during training and model selection | T | None |
| 35b | Human diagnostic performance benchmarking | 0. No human performance benchmark UR. Benchmarked against radiologist with unspecified expertise SR. Benchmarked against radiologist with relevant subspecialist expertise OC. Benchmarked against other clinicians | SR | None |
| 35c | Computational diagnostic performance benchmarking | 0. No computational benchmark 1. Benchmarked against other computational methods | 1 | None |
| 36 | Diagnostic performance with measure of precision | 0. Diagnostic performance reported without measure of precision 1. Diagnostic performance reported with confidence interval or standard error | 1 | None |
| 37 | Failure analysis | 0. Not discussed 1. Discussed misclassified cases or model errors | 1 | None |
| 38 | Study limitations | 0. Not discussed 1. Discussed | 1 | None |
| 39 | Clinical implications of study findings | 0. Not discussed 1. Discussed | 1 | None |
| 40 | Study registration number | 0. Not documented 1. Documented | 1 | None |
| 41 | Study protocol | 0. Not documented 1. Provided access to the full study protocol | 1 | None |
| 42 | Funding | 0. Not documented F. Funding source documented FR. Funding source and role documented NF. Stated no funding received | FR, NF | None |
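The rubric above lends itself to programmatic scoring. The following Python sketch shows one hypothetical way to encode items, their compliant values and exemptions, and to compute per-article compliance as the proportion of applicable items satisfied (the definition used in Fig. 6); the class and function names are illustrative assumptions, not the authors' analysis code.

```python
# Hypothetical encoding of rubric items; structure and names are
# illustrative, not taken from the paper's actual analysis.
from dataclasses import dataclass


@dataclass
class ClaimItem:
    item_id: str          # e.g. "7a", "25c"
    compliant_values: set  # recorded values counted as satisfying the item
    exempt: bool = False   # True if this study type is exempt from the item


def article_compliance(scores: dict, rubric: dict) -> float:
    """Proportion of applicable (non-exempt) items an article satisfies."""
    applicable = {i: r for i, r in rubric.items() if not r.exempt}
    satisfied = sum(
        1 for i, r in applicable.items()
        if scores.get(i, "0") in r.compliant_values
    )
    return satisfied / len(applicable)


# Example: a retrospective modelling study scored against three items.
rubric = {
    "5":   ClaimItem("5", {"R", "P", "RP"}),
    "11":  ClaimItem("11", {"1"}),
    "16b": ClaimItem("16b", {"SP"}, exempt=True),  # histopathology not used
}
scores = {"5": "R", "11": "0"}
print(article_compliance(scores, rubric))  # 0.5: 1 of 2 applicable items
```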
Fig. 1 Flow diagram of the literature search process
Fig. 2 Distribution of included articles. Left: study publication year. Middle: body system imaged. Right: imaging modality employed
Fig. 3 Compliance with CLAIM items 1–13. Compliance rate is defined as the proportion of articles subject to that item which satisfy it. Exemptions are provided in Table 1. Point estimates and 95% confidence intervals are reported
Fig. 4 Compliance with CLAIM items 14–27. Compliance rate is defined as the proportion of articles subject to that item which satisfy it. Exemptions are provided in Table 1. Point estimates and 95% confidence intervals are reported
Fig. 5 Compliance with CLAIM items 28–42. Compliance rate is defined as the proportion of articles subject to that item which satisfy it. Exemptions are provided in Table 1. Point estimates and 95% confidence intervals are reported
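Figs. 3–5 report each item's compliance rate with a 95% confidence interval. The interval method is not stated in these captions, so the sketch below uses a Wilson score interval purely as one common, illustrative choice; the counts in the example are hypothetical.

```python
# Minimal sketch: per-item compliance rate with a 95% CI.
# A Wilson score interval is assumed here for illustration only.
from math import sqrt


def compliance_rate_ci(satisfied: int, subject: int, z: float = 1.96):
    """Point estimate and 95% Wilson score interval for a compliance rate.

    satisfied: articles satisfying the item
    subject:   articles subject to the item (i.e. not exempt)
    """
    p = satisfied / subject
    denom = 1 + z**2 / subject
    centre = (p + z**2 / (2 * subject)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / subject + z**2 / (4 * subject**2))
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: 12 of 53 non-exempt articles satisfy an item.
print(compliance_rate_ci(12, 53))
```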
Fig. 6 Left: CLAIM compliance over time. Compliance was defined per article by the proportion of applicable items satisfied. Boxplot centrelines indicate median annual compliance. Hinges indicate first and third quartiles. Whiskers indicate maxima and minima. Middle: CLAIM compliance and journal H-index for each article. Right: CLAIM compliance in clinical journals and technical journals. Journals were categorised as either “clinical” or “technical” according to the journal name: names containing any term related to computer science, artificial intelligence or machine learning were assigned the “technical” category; the remaining journals were assigned the “clinical” category
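The journal categorisation in Fig. 6 (right panel) is a simple keyword match on the journal name. A minimal sketch follows; the term list is an assumed example, since the paper's exact list of terms is not given in this caption.

```python
# Sketch of the journal categorisation described for Fig. 6 (right).
# TECHNICAL_TERMS is an assumed, illustrative list of name fragments
# relating to computer science, AI and machine learning.
TECHNICAL_TERMS = (
    "comput", "artificial intelligence", "machine learning",
    "informatic", "neural",
)


def categorise_journal(name: str) -> str:
    """Label a journal "technical" if its name contains any listed term."""
    lowered = name.lower()
    return "technical" if any(t in lowered for t in TECHNICAL_TERMS) else "clinical"


print(categorise_journal("European Radiology"))                 # clinical
print(categorise_journal("Computers in Biology and Medicine"))  # technical
```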