Ivan Zhovannik1,2,3, Dennis Bontempi2, Alessio Romita2, Elisabeth Pfaehler2,4, Sergey Primakov5, Andre Dekker2, Johan Bussink1, Alberto Traverso2, René Monshouwer1.
Abstract
Problem. Image biomarker analysis, also known as radiomics, is a tool for tissue characterization and treatment prognosis that relies on routinely acquired clinical images and delineations. Due to the uncertainty in image acquisition, processing, and segmentation (delineation) protocols, radiomics often lacks reproducibility. Radiomics harmonization techniques have been proposed as a solution to reduce these sources of uncertainty and/or their influence on the prognostic model performance. A relevant question is how to estimate the protocol-induced uncertainty of a specific image biomarker, what its effect is on the model performance, and how to optimize the model given the uncertainty. Methods. Two non-small cell lung cancer (NSCLC) cohorts, composed of 421 and 240 patients, respectively, were used for training and testing. Per patient, a Monte Carlo algorithm was used to generate three hundred synthetic contours with a surface Dice tolerance measure of less than 1.18 mm with respect to the original gross tumor volume (GTV). These contours were subsequently used to derive 104 radiomic features, which were ranked by their relative sensitivity to contour perturbation, expressed by the parameter η. The top four (low η) and the bottom four (high η) features were selected for two models based on the Cox proportional hazards model. To investigate the influence of segmentation uncertainty on the prognostic model, we trained and tested the setup in 5000 augmented realizations (using a Monte Carlo sampling method); the log-rank test was used to assess the stratification performance and its stability under segmentation uncertainty. Results. Although both the low and high η setups showed significant testing set log-rank p-values (p = 0.01) on the original GTV delineations (without segmentation uncertainty introduced), in the model with a high uncertainty-to-effect ratio, only around 30% of the augmented realizations resulted in model performance with p < 0.05 in the test set.
In contrast, the low η setup performed with a log-rank p < 0.05 in 90% of the augmented realizations. Moreover, the high η setup classification was uncertain in its predictions for 50% of the subjects in the testing set (at an 80% agreement rate), whereas the low η setup was uncertain in only 10% of the cases. Discussion. Estimating image biomarker model performance based only on the original GTV segmentation, without considering segmentation uncertainty, may be deceiving. The model might show significant stratification performance yet be unstable under delineation variations, which are inherent to manual segmentation. Simulating segmentation uncertainty using the method described allows for more stable image biomarker estimation, selection, and model development. The segmentation uncertainty estimation method described here is universal and can be extended to estimate other protocol uncertainties (such as image acquisition and pre-processing).
Keywords: image biomarkers; prognostic modeling; radiomics; radiomics harmonization; uncertainty
Year: 2022 PMID: 35267597 PMCID: PMC8909427 DOI: 10.3390/cancers14051288
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Figure 1. The workflow of image biomarker analysis, from image acquisition to modeling: each procedure can be performed using a different protocol (set of parameters), thus inducing uncertainty in the image biomarker model performance. Partially adapted from: https://med.stanford.edu/bmrgroup/Research/AcqRecon.html (accessed on 14 January 2022), https://www.nature.com/articles/srep03529 (accessed on 14 January 2022) [6].
Figure 2. Image biomarker value distributions in populations A and B given relatively low (low opacity curves) and high (full opacity curves) intra-population variance.
Two sets of radiomic features selected for analysis.
| Setup A (low η): feature | η | Setup B (high η): feature | η |
|---|---|---|---|
| first-order Maximum | 0.0 | glszm GrayLevelVariance | 0.2944 |
| gldm GrayLevelNonUniformity | 0.0108 | glrlm RunEntropy | 0.3066 |
| glrlm GrayLevelNonUniformity | 0.0115 | glcm MCC | 0.3216 |
| ngtdm Coarseness | 0.0129 | glszm GrayLevelNonUniformityNormalized | 0.4005 |
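The selection step behind the table above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the exact definition of η is not given in this record, so the `eta` function below assumes a simple uncertainty-to-effect ratio (mean within-patient spread across perturbed contours divided by the between-patient spread of patient means). The function names `eta` and `select_features` are hypothetical; the η values in the usage example are copied from the table.

```python
import statistics

def eta(values_per_patient):
    """Assumed uncertainty-to-effect ratio for one feature.

    values_per_patient[i] holds the feature values computed on the
    perturbed contours of patient i. This ratio is an illustrative
    assumption, not the formula used in the paper.
    """
    # Uncertainty: average spread of the feature across contours of one patient.
    within = statistics.mean(statistics.pstdev(v) for v in values_per_patient)
    # Effect: spread of the per-patient mean values across the cohort.
    between = statistics.pstdev([statistics.mean(v) for v in values_per_patient])
    return within / between  # undefined if all patient means coincide

def select_features(eta_by_feature, k=4):
    """Return the k features with the lowest eta and the k with the highest."""
    ranked = sorted(eta_by_feature, key=eta_by_feature.get)
    return ranked[:k], ranked[-k:]

# eta values as reported in the table
table_eta = {
    "glszm GrayLevelNonUniformityNormalized": 0.4005,
    "first-order Maximum": 0.0,
    "glcm MCC": 0.3216,
    "gldm GrayLevelNonUniformity": 0.0108,
    "glrlm RunEntropy": 0.3066,
    "glrlm GrayLevelNonUniformity": 0.0115,
    "glszm GrayLevelVariance": 0.2944,
    "ngtdm Coarseness": 0.0129,
}
setup_a, setup_b = select_features(table_eta, k=4)
```

Under this assumed definition, a feature whose value does not change when the contour is perturbed has η = 0, which is consistent with the value reported for first-order Maximum.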
Figure 3. Monte Carlo sampling.
Figure 4. The general outline of the experiment.
Figure 5. Log-rank p-value distribution in the test set for setups A (a, low η) and B (b, high η).
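The quantity summarized in Figure 5 (one log-rank p-value per augmented realization, and the share of realizations with p < 0.05) can be sketched in plain Python. This is an illustrative two-group log-rank test using the standard chi-square approximation with one degree of freedom; it is not the authors' code, and the names `logrank_p` and `fraction_significant` are hypothetical.

```python
import math

def logrank_p(times, events, groups):
    """Two-group log-rank test; returns the p-value (chi-square, 1 d.o.f.).

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    groups: 1 for the high-risk group, 0 for the low-risk group
    """
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [(ti, e, g) for ti, e, g in zip(times, events, groups) if ti >= t]
        n = len(at_risk)                                    # patients at risk at t
        n1 = sum(1 for _, _, g in at_risk if g == 1)        # of which in group 1
        d = sum(1 for ti, e, _ in at_risk if ti == t and e) # events at t
        d1 = sum(1 for ti, e, g in at_risk if ti == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def fraction_significant(p_values, alpha=0.05):
    """Share of realizations whose log-rank p-value falls below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)
```

In the experiment described in the abstract, `fraction_significant` would be applied to the 5000 test-set p-values of each setup; the abstract reports roughly 0.9 for the low η setup and roughly 0.3 for the high η setup.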
Figure 6. Influence of segmentation uncertainty in the high η setup (setup B): two sample testing set realizations result in different performances.
Figure 7. Segmentation uncertainty and patient stratification agreement δ (Equation (5)) in setups A (a, low η) and B (b, high η).
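A per-patient agreement rate of the kind shown in Figure 7 can be sketched as below. Equation (5) is not reproduced in this record, so the definition here is an assumption: δ for a patient is taken as the fraction of augmented realizations that assign the patient to their majority risk group, and a patient counts as uncertain when δ falls below the 80% agreement rate mentioned in the abstract. The names `agreement_rates` and `uncertain_fraction` are hypothetical.

```python
def agreement_rates(assignments):
    """Assumed agreement rate per patient.

    assignments[r][i] is the risk group (0 or 1) assigned to patient i
    in realization r. Returns, for each patient, the share of
    realizations that agree with the majority assignment.
    """
    n_real = len(assignments)
    n_patients = len(assignments[0])
    rates = []
    for i in range(n_patients):
        high = sum(assignments[r][i] for r in range(n_real))
        rates.append(max(high, n_real - high) / n_real)
    return rates

def uncertain_fraction(rates, threshold=0.8):
    """Share of patients whose agreement rate is below the threshold."""
    return sum(rate < threshold for rate in rates) / len(rates)

# Toy example: 5 realizations, 3 patients
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
]
toy_rates = agreement_rates(toy)
```

In the toy example, the first patient is always placed in the same group, the second in four of five realizations, and the third in three of five, so only the third falls below the 0.8 threshold.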