Compute Tomography Radiomics Analysis on Whole Pancreas Between Healthy Individual and Pancreatic Ductal Adenocarcinoma Patients: Uncertainty Analysis and Predictive Modeling.

Shuo Wang¹, Chi Lin¹, Alexander Kolomaya², Garett P Ostdiek-Wille², Jeffrey Wong¹, Xiaoyue Cheng³, Yu Lei⁴, Chang Liu⁵.

Abstract

Radiomics is a rapidly growing field that quantitatively extracts image features in a high-throughput manner from medical imaging. In this study, we analyzed the radiomics features of the whole pancreas between healthy individuals and pancreatic cancer patients, and we established a predictive model that can distinguish cancer patients from healthy individuals based on these radiomics features.
Methods: We retrospectively collected venous-phase scans of contrast-enhanced computed tomography (CT) images from 181 control subjects and 85 cancer case subjects for radiomics analysis and predictive modeling. An attending radiation oncologist delineated the pancreas for all the subjects in the Varian Eclipse system, and we extracted 924 radiomics features using PyRadiomics. We established a feature selection pipeline to exclude redundant or unstable features. We randomly selected 189 cases (60 cancer and 129 control) as the training set; the remaining 77 subjects (25 cancer and 52 control) served as the test set. We trained a Random Forest model utilizing the stable features to distinguish the cancer patients from the healthy individuals on the training dataset. We analyzed the performance of our best model by running 5-fold cross-validation on the training dataset and applied our best model to the test set.
Results: We identified that 91 radiomics features are stable against various uncertainty sources, including bin width, resampling, image transformation, image noise, and segmentation uncertainty. Eight of the 91 features are nonredundant. Our final predictive model, using these 8 features, has achieved a mean area under the receiver operating characteristic curve (AUC) of 0.99 ± 0.01 on the training dataset (189 subjects) by cross-validation. The model achieved an AUC of 0.910 on the independent test set (77 subjects) and an accuracy of 0.935.
Conclusion: CT-based radiomics analysis based on the whole pancreas can distinguish cancer patients from healthy individuals, and it could potentially become an early detection tool for pancreatic cancer.

Keywords: computed tomography; pancreatic cancer; radiomics; random forest; uncertainty analysis

Year: 2022 PMID: 36184987 PMCID: PMC9530578 DOI: 10.1177/15330338221126869

Source DB: PubMed Journal: Technol Cancer Res Treat ISSN: 1533-0338

Introduction

Pancreatic cancer is the 4th leading cause of all cancer-related deaths in the US, with an estimated 5-year overall survival of about 9%. Furthermore, more than 50% of cases were diagnosed at an advanced stage with metastatic disease and 5-year survival of <3%. This disheartening fact underscores the importance of innovative and robust diagnostic tools for the early detection of pancreatic cancer, which may lead to a more favorable prognosis. Medical imaging has been an essential component in clinical cancer care. Among different imaging modalities, computed tomography (CT) is the most widely used for diagnosis and treatment response monitoring of a variety of cancer types including pancreatic cancer.[3-5] However, the current practice of medical imaging analysis is still mainly intended for visual interpretation and lesion size determination.[6,7] Recent success in quantitative imaging analysis is redefining the role of medical imaging as a new data source of novel biomarkers, in the form of imaging-based phenotyping. Radiomics is a method of high-throughput extraction of hundreds of features encoded in medical images based on segmentation (delineation of a boundary around a region of interest). These radiomics features typically include the shape, first-, second- (or textural), and higher-order statistics of a volume of interest, and can provide a far more comprehensive, quantitative, and nuanced representation of the radiographic phenotype of a tumor or an organ than semantic or qualitative descriptors from human experts.[7-13] Due to its distinct advantages for biomarker development, radiomics has become an active area of research focusing on risk assessment and treatment response prediction of cancer[8,14-17] as well as the relationship between image features and genomics.[18-23] Currently, the majority of radiomics studies analyze the radiomics features of an existing lesion.
Our present study, on the other hand, aimed to analyze the radiomics features of the entire pancreas to differentiate normal individuals from patients with pancreatic ductal adenocarcinoma (PDAC).

Ethics Statement

The study was approved by the Institutional Review Board at the University of Nebraska Medical Center (IRB#: 789-18-EP). This study does not involve any animal subjects. Informed consent of the subjects was waived for this retrospective study by IRB at UNMC, and the waiver would not affect the rights and welfare of the study subjects.

Patients

The Institutional Review Board at our institution approved this retrospective study (IRB 789-18-EP) and waived the informed consent of the subjects. The waiver would not affect the rights and welfare of the study subjects. A total of 80 healthy individuals who do not have major gastrointestinal pathologies from The Cancer Imaging Archive (TCIA) were identified as the control subjects.[24-26] We combined these 80 subjects from TCIA with 101 healthy individuals without gastrointestinal pathologies at our institution from 2008 to 2018 to form the control group. We also identified 85 patients with non-metastatic, borderline resectable PDAC at our institution from 2008 to 2018 as cancer case subjects (Figure 1). The control group included 99 female subjects and 82 male subjects, with an average age of 46.7 ± 16.2. The cancer group included 35 female subjects and 50 male subjects, with an average age of 62.5 ± 10.4.

Figure 1.

Patient selection and randomization flowchart.

CT Acquisition

The 80 control subjects from the TCIA database underwent abdominal contrast-enhanced CT scans with intravenous contrast. The CT images were acquired on Philips and Siemens MDCT scanners (120 kVp tube voltage). The slice thickness was between 1.5 and 2.5 mm.[24-26] The 101 control subjects from our institution also underwent abdominal CT scans with intravenous contrast on GE, Philips, and Siemens MDCT scanners. We also collected the preoperative contrast-enhanced abdominal CT scans of the 85 cancer cases from our institution, which were acquired on GE, Philips, and Siemens MDCT scanners. The CT scanners at our institution included the following models: GE-LightSpeed, GE-Revolution, Siemens-Sensation, Siemens-Definition, Philips-Ingenuity, and Philips-Brilliance. All the scans were acquired based on our standardized abdomen scanning protocol, with 120 kV, 350 mAs (effective), and 0.6 to 1.4 pitch. Each individual scan was optimized to minimize the dose to the patient. The field of view of each image was in the range of 360-450 mm. We collected the venous-phase images, which were acquired 60-90 s following the intravenous injection of the contrast material (100 mL of ISOVUE®) based on our standard protocol. The slice thickness was in the range of 1.25-3 mm. All the CT images of the cancer group were acquired at the time of diagnosis, before any treatment, ruling out any treatment effect on image features.

Volume-of-Interest Segmentation

The whole pancreas of all subjects was manually contoured by two trained researchers (MD students) using the Varian Eclipse treatment planning system (Varian Medical Systems, Palo Alto, CA) under the supervision of an attending radiation oncologist. The tumor was included as part of the whole pancreas for all the cancer subjects. The attending radiation oncologist, who has more than 18 years of experience specializing in gastrointestinal cancer, finalized all segmentations. All pancreas segmentations were delineated on the CT with the original in-plane resolution and slice thickness. CT images with associated segmentations were saved and exported in DICOM format for processing and analysis.

Feature Extraction

PyRadiomics, an open-source package, was used to extract radiomics features. Briefly, we converted the DICOM images and target delineations to NRRD format using a batch process in 3D Slicer (https://www.slicer.org/). A total of 924 radiomics features were extracted to represent the whole pancreas from both healthy individuals and cancer patients. The radiomics features included first-order statistics, 3D shape-based features, gray level co-occurrence matrix, gray level run length matrix, gray level size zone matrix, neighboring gray-tone difference matrix, and gray level dependence matrix features computed from the original images, images derived from Laplacian of Gaussian (LoG) filters, and 8 derived images from wavelet decompositions. Extraction parameters such as bin width and resampling were studied in the following uncertainty analysis.

Feature Stability Evaluation

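We used the intraclass correlation coefficient (ICC) to quantitatively assess the stability of radiomics features against various uncertainty sources. We chose ICC(2,1), which assesses absolute agreement under a 2-way random-effects model, since this model is the appropriate one for generalizing our reliability results.[29,30] The ICC was calculated as follows:

$$\mathrm{ICC}(2,1) = \frac{MS_R - MS_E}{MS_R + (k - 1)\,MS_E + \frac{k}{n}\,(MS_C - MS_E)}$$

where $MS_R$ is the mean square for rows (subjects), $MS_C$ is the mean square for columns (repeated measurements), $MS_E$ is the mean square error, $n$ is the number of subjects, and $k$ is the number of repeated measurements.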
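As an illustration of the stability metric, the following minimal sketch computes ICC(2,1) (2-way random effects, absolute agreement, single measurement) from the ANOVA mean squares. It assumes a table with one row per subject and one column per perturbation setting for a single feature; the function names and the toy values are hypothetical and are not taken from the study's code.

```python
def icc_2_1(table):
    """ICC(2,1) for a table of n rows (subjects) by k columns (repeated measurements)."""
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(k)]

    # Sums of squares of the 2-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def is_stable(table, threshold=0.75):
    """Apply the ICC > 0.75 stability criterion used in the text."""
    return icc_2_1(table) > threshold


# Toy values: 6 subjects, 4 repeated measurements of one feature
toy = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
       [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
print(round(icc_2_1(toy), 2), is_stable(toy))  # 0.29 False
```

In this toy table the columns differ systematically, so the absolute-agreement ICC is low and the feature would be flagged as unstable.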

Uncertainty Analysis of Radiomics Feature

Feature selection is an indispensable component of the radiomics workflow because the number of available radiomics features usually exceeds the number of cases in the study cohort. A model without feature selection is prone to over-fitting, which impairs its ability to generalize to independent datasets with previously unseen data. To evaluate the stability and reproducibility of the extracted radiomics features, we investigated the effect of image preprocessing (bin width and resampling) and several image perturbations, including image rotation and translation, image noise, and contour dilation/erosion, on the stability of the extracted features. Briefly, the DICOM CT images were converted into a 3D volume image, and the DICOM RTSTRUCT (Radiotherapy Structure Set) polygons were converted into a binary segmentation mask. Both the volume image and the segmentation mask were stored in the NRRD format for the image perturbation studies. All the image processing and transformations were performed using the SimpleITK toolkit (https://simpleitk.org/) with Python 3.6 (Python Software Foundation, https://www.python.org/).

Bin Width and Resampling

As previously reported, the slice thickness and bin width are 2 important extraction parameters that affect radiomics features. We investigated the effect of various bin widths and resampling (ie, with or without resampling) on the stability of the extracted features. Briefly, we chose 5 different bin widths for gray-level discretization (5, 10, 25, 50, and 75 HU) and applied them to the volume images and masks at the original resolution and at a resampled resolution (1 × 1 × 1 mm³). In other words, we repeated the feature extraction process with 10 different parameter sets. We then assessed the stability of the features using the ICC,[14,29,30,32] and features that did not reach ICC >0.75 were considered unstable and excluded from predictive modeling.
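The 10 parameter sets described above (5 bin widths, each with and without resampling) could be generated along the following lines. This is a sketch under stated assumptions rather than the authors' script: the helper names are hypothetical, the B-spline interpolator and the LoG sigma of 1.0 mm are assumptions (the latter inferred from the feature prefix `log-sigma-1-0-mm-3D` in Table 1), and `extract_features` needs the `pyradiomics` package, which is imported lazily so that the settings logic runs without it.

```python
from itertools import product

BIN_WIDTHS_HU = (5, 10, 25, 50, 75)   # bin widths listed in the text
RESAMPLE_OPTIONS = (False, True)      # original resolution vs 1 x 1 x 1 mm


def build_settings(bin_width, resample):
    """Return a PyRadiomics settings dict for one parameter set."""
    settings = {"binWidth": bin_width}
    if resample:
        settings["resampledPixelSpacing"] = [1, 1, 1]
        settings["interpolator"] = "sitkBSpline"  # assumption: not stated in the text
    return settings


PARAMETER_SETS = [
    build_settings(bw, rs) for bw, rs in product(BIN_WIDTHS_HU, RESAMPLE_OPTIONS)
]


def extract_features(image_path, mask_path, settings):
    """Extract features for one image/mask pair (requires pyradiomics)."""
    from radiomics import featureextractor  # lazy import

    extractor = featureextractor.RadiomicsFeatureExtractor(**settings)
    extractor.enableAllFeatures()
    # The "Original" image type is enabled by default; add LoG and wavelet images.
    extractor.enableImageTypeByName("LoG", customArgs={"sigma": [1.0]})  # assumed sigma
    extractor.enableImageTypeByName("Wavelet")
    return extractor.execute(image_path, mask_path)


print(len(PARAMETER_SETS))  # 10
```

Each subject would then be run through `extract_features` once per entry in `PARAMETER_SETS`, giving the repeated measurements that the ICC compares.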

Image Transformation

We also performed an affine transformation that shifts both volume images and masks for a total of 10 combinations of specified fractions of the isotropic voxel spacing (0.25, 0.5, 0.75) in the in-plane directions as shown in Figure 2B. The feature extraction process was repeated for all the transformed images (10 combinations) with resampling (1 × 1 × 1 mm³) and without resampling, that is, we extracted the 924 features on each of the transformed images. The radiomics features with an ICC[14,29,30,32] >0.75 are considered stable features.

Figure 2.

Image perturbations (translation, rotation, noise addition, and expansion/shrinkage).

Image Rotation

Similarly, we performed an affine transformation that rotates both the volume image and mask over a set of angles in the axial plane (−9°, −6°, −3°, 3°, 6°, and 9°) as shown in Figure 2C. The feature extraction process was repeated for all rotated images (6 sets) with resampling (1 × 1 × 1 mm³) and without resampling, that is, we extracted the 924 features on each of the rotated images. The radiomics features with an ICC[14,29,30,32] >0.75 are considered stable features.
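A rotation of this kind can be sketched with SimpleITK as shown here. The function name is hypothetical, and several choices are assumptions not stated in the text: rotation about the image center, linear interpolation for the CT volume, nearest-neighbor interpolation for the mask (so that it stays binary), a fill value of −1000 HU outside the field of view, and image axes aligned with the patient axes so that a rotation about z is an in-plane (axial) rotation. SimpleITK is imported lazily inside the function.

```python
import math

ANGLES_DEG = (-9, -6, -3, 3, 6, 9)  # rotation angles listed in the text


def rotate_in_axial_plane(image, angle_deg, is_mask=False):
    """Rotate a SimpleITK image about its center around the z axis."""
    import SimpleITK as sitk  # lazy import; requires the SimpleITK package

    # Physical coordinates of the image center
    center_index = [(s - 1) / 2.0 for s in image.GetSize()]
    center = image.TransformContinuousIndexToPhysicalPoint(center_index)

    transform = sitk.Euler3DTransform()
    transform.SetCenter(center)
    transform.SetRotation(0.0, 0.0, math.radians(angle_deg))

    interpolator = sitk.sitkNearestNeighbor if is_mask else sitk.sitkLinear
    fill_value = 0.0 if is_mask else -1000.0  # assumed background values
    # Resample onto the original grid so image and mask stay aligned
    return sitk.Resample(image, image, transform, interpolator, fill_value)
```

Applying the same transform to the volume and to its mask, once per angle in `ANGLES_DEG`, yields the 6 rotated image/mask pairs on which feature extraction is repeated.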

Noise Addition

Furthermore, we applied the additive Gaussian image filter and the shot noise (Poisson noise) image filter to the volume image, and we repeated the feature extractions on both the original image and the images with noise additions. Examples are shown in Figure 2D. The radiomics features with an ICC[14,29,30,32] >0.75 are considered stable features.

Segmentation Growth/Shrinkage

We also investigated the impact of volume adaptation on the stability of extracted features by applying dilation and erosion to the segmentation mask. The dilation/erosion mimics the variations in the manual delineation by human experts. Briefly, we used the 2D binary dilation and erosion functions in the SciPy library to create dilated or shrunk masks (Figure 2E), and we repeated feature extraction on the original images with the original or dilated/eroded segmentation masks. The radiomics features with an ICC[14,29,30,32] >0.75 are considered stable features.

Exploratory Analysis of Radiomics Features

Besides the stability/reproducibility of the radiomics features against various image perturbations, exclusion of highly correlated, or redundant, radiomics features is also critical in building robust and generalizable predictive models. In our study, we utilized a combination of statistical techniques to evaluate the correlation among features and their relative importance. First, the whole data set was randomly divided into a training set and a test set. The training set includes 189 subjects (∼70%), comprising 129 control subjects and 60 cancer patients. The test set includes the remaining 77 subjects (52 control and 25 cancer). Then, we developed a radiomics feature importance ranking workflow based on a stacked ensemble model inspired by the work by Zhai et al, as shown in Figure 3. Briefly, we first used the Stability Feature Selection to exclude irrelevant features. Stability Feature Selection runs a selected algorithm (e.g., logistic regression) on a subset of samples repeatedly. By analyzing how many times a feature gets selected from the aggregated results of repeated runs, the model can determine the importance of each feature. The features are ranked by a score based on how many times they get selected. Irrelevant features would result in a score close to 0, whereas important features are expected to have a score close to 100%. We carried out the Stability Feature Selection using the randomized logistic regression (scikit-learn toolkit) to rank the extracted radiomics features using the training dataset. The features with a score of zero are excluded from the following analysis. Pearson coefficient analysis is then applied to the nonzero features to evaluate correlations among these features, and we retained only features with a Pearson coefficient <0.4, excluding those that are correlated with each other.
Then, we ranked the importance of each feature by utilizing repeated 5-fold cross-validation on the training data with 4 classifiers that are widely applied to data mining: XGBoost, Random Forest (RF), AdaBoost, and Extra Tree. The feature importance rankings from the classifiers are averaged with equal weights, and the average importance ranking is plotted.
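The selection-frequency idea behind Stability Feature Selection can be sketched as follows. This is a dependency-free illustration, not the randomized logistic regression used in the study: `mean_gap_selector` is a hypothetical toy stand-in for the L1-penalized model, and the number of rounds, the subsample fraction, and the toy data are assumptions.

```python
import random


def stability_scores(samples, labels, select_fn, n_rounds=100,
                     subsample_frac=0.75, seed=0):
    """Fraction of subsampling rounds in which each feature is selected."""
    rng = random.Random(seed)
    n_samples, n_features = len(samples), len(samples[0])
    subset_size = max(2, int(subsample_frac * n_samples))
    counts = [0] * n_features
    for _ in range(n_rounds):
        idx = rng.sample(range(n_samples), subset_size)
        chosen = select_fn([samples[i] for i in idx], [labels[i] for i in idx])
        for j in chosen:
            counts[j] += 1
    return [c / n_rounds for c in counts]


def mean_gap_selector(samples, labels, min_gap=1.0):
    """Toy selector: keep features whose class means differ by more than min_gap."""
    selected = []
    for j in range(len(samples[0])):
        pos = [row[j] for row, y in zip(samples, labels) if y == 1]
        neg = [row[j] for row, y in zip(samples, labels) if y == 0]
        if pos and neg and abs(sum(pos) / len(pos) - sum(neg) / len(neg)) > min_gap:
            selected.append(j)
    return selected


# Toy data: feature 0 separates the classes, feature 1 is constant
toy_X = [[0.0, 5.0], [0.2, 5.0], [0.1, 5.0], [3.0, 5.0], [3.1, 5.0], [2.9, 5.0]]
toy_y = [0, 0, 0, 1, 1, 1]
scores = stability_scores(toy_X, toy_y, mean_gap_selector, n_rounds=50)
kept = [j for j, s in enumerate(scores) if s > 0]  # drop zero-score features
print(scores, kept)  # [1.0, 0.0] [0]
```

Features with a score of zero are discarded, mirroring the exclusion rule described in the text; in practice `select_fn` would wrap a sparse classifier fitted on each subsample.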

Figure 3.

Feature selection and ranking workflow.
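The redundancy-removal step of this workflow (Pearson coefficient <0.4) can be illustrated with a greedy filter. The exact procedure used in the study is not described, so this is one plausible reading under stated assumptions: features are visited in a given ranking order, the absolute value of the coefficient is compared with the threshold, and a feature is kept only if it is weakly correlated with every feature already kept. The function and feature names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return cov / (sd_x * sd_y)


def drop_redundant(columns, ranked_names, threshold=0.4):
    """Greedily keep features whose |r| with all kept features is below threshold."""
    kept = []
    for name in ranked_names:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy feature columns (hypothetical names and values)
columns = {
    "feature_a": [1, 2, 3, 4, 5],
    "feature_b": [2, 4, 6, 8, 11],   # nearly proportional to feature_a
    "feature_c": [1, -1, 0, 1, -1],  # weakly correlated with feature_a
}
print(drop_redundant(columns, ["feature_a", "feature_b", "feature_c"]))
# ['feature_a', 'feature_c']
```

Because the filter is order-dependent, the ranking passed in (for example, the stability scores) determines which member of a correlated group survives.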

Predictive Analytics

We built an RF classifier on the training set by using the scikit-learn library in Python. Briefly, the same training data was used to build the RF classifier (ie, 189 randomly selected cases including 60 cancer subjects and 129 control subjects). The most important and nonredundant features selected from the exploratory analysis were used as the input for the predictive model. We further divided the training set into a training subset of 173 subjects and a validation subset of 16 subjects. We trained the model on the 173 subjects in the training subset and searched the hyperparameters n_estimators and max_depth by minimizing the difference in the area under the receiver operating characteristic curve (AUC) between the training subset (173 subjects) and the validation subset (16 subjects). We evaluated the performance of the predictive model by running 5-fold cross-validation on the training dataset (189 subjects) and by analyzing the sensitivity, specificity, and AUC. In the end, we applied the selected model, the one with the minimum difference in AUC between the training subset (173 subjects) and the validation subset (16 subjects), to the test set (77 subjects) to evaluate its performance (accuracy and AUC). The 95% confidence interval was estimated based on the method proposed by Hanley et al. We implemented the training, validation, and analysis of the predictive model by using Python 3.6 (Python Software Foundation, https://www.python.org/).
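For the confidence interval of the AUC, a minimal sketch is given here, assuming that "the method proposed by Hanley et al" refers to the widely used Hanley and McNeil standard-error approximation. The function names, the normal-approximation interval with z = 1.96, and the assignment of cancer cases as the positive class are assumptions; the sketch is not claimed to reproduce the interval reported in the Results.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def auc_confidence_interval(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation interval, clipped to the valid AUC range [0, 1]."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical example: AUC of 0.90 with 30 positive and 60 negative cases
low, high = auc_confidence_interval(0.90, n_pos=30, n_neg=60)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

The interval narrows as either class grows, which is why a small test set produces a comparatively wide interval even for a high AUC.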

Results

Stable Features Against Extraction Parameters

With 5 different bin widths on images with or without resampling, we had 10 different sets of extraction parameters. Figure 4 shows the ICC of all the features, categorized by feature groups, against extraction parameters. Our analysis revealed that a substantial portion of the extracted features was not stable against various extraction parameters. Only 119 out of 924 (12.9%) extracted features are stable when bin width or image resolution changes.

Figure 4.

Effect of bin width and resampling on the stability of the radiomics features. The figure shows the intraclass correlation coefficient (ICC) distribution for each group of the 924 extracted radiomics features.

Stable Features Against Image Transformations

We evaluated the feature robustness against image rotation by using the ICC to compare the features extracted from the original images with those extracted from the rotated images, with various degrees of rotation. Figure 5A and B shows the ICC of all the features, categorized by feature type groups, against rotational image perturbations on the images. Figure 5A shows the results with the image resampled to 1 × 1 × 1 mm³, and Figure 5B shows the results without image resampling. Similarly, we also evaluated the robustness of the features by comparing the original images to the shifted images, with and without resampling to 1 × 1 × 1 mm³ resolution. Figure 5C and D shows the ICC of all the features, categorized by feature type groups, against translational image perturbations on the images, with or without image resampling. Our results on the robustness of features against image perturbation have shown that most of the extracted features (895 out of 924) remain stable and robust against these image transformations.

Figure 5.

Effect of image transformation (rotation and translation) on the stability of the radiomics features. (A) Shows the intraclass correlation coefficient (ICC) distribution for each group of the radiomics features after the images were rotated (−9°, −6°, −3°, 3°, 6°, 9°) and resampled to 1 × 1 × 1 mm³. (B) Shows the ICC distribution for each group of the radiomics features after the images were rotated (−9°, −6°, −3°, 3°, 6°, 9°) with the original image resolution. (C) Shows the ICC distribution for each group of the radiomics features after the images were translated, as described in the Materials and Methods section, and resampled to 1 × 1 × 1 mm³. (D) Shows the ICC distribution for each group of the radiomics features after the images were translated, as described in the Materials and Methods section, with the original image resolution.

Stable Features Against Noise Addition

We evaluated the feature robustness against image noise by using the ICC to compare the features extracted from the original images with those extracted from images with added Gaussian or Poisson noise. Figure 6 shows the stability of each feature type against the noise addition. Our results showed that the original and wavelet-LLL features were the least stable against image noise addition. In contrast, most of the other types of features were stable. In total, 756 out of 924 extracted features (81.8%) are robust against noise addition based on the analysis.

Figure 6.

Effect of image noise on the stability of the radiomics features. The figure shows the intraclass correlation coefficient (ICC) distribution for each group of the radiomics features after the Gaussian and Poisson noises were applied to the original images.

Stable Features Against Segmentation Growth/Shrinkage

Similarly, we assessed the feature stability against segmentation variation by comparing the extracted features from the original segmentation to those with segmentation dilated or eroded. Figure 7 shows the stability of each feature type against segmentation variation. Most of the extracted features (776 out of 924) are stable if segmentation dilation or erosion occurs.

Figure 7.

Effect of segmentation growth or shrinkage on the stability of the radiomics features. The figure shows the intraclass correlation coefficient (ICC) distribution for each group of the radiomics features after the original segmentation was dilated or eroded as described in the Materials and Methods section.

After all the uncertainty analyses, we have identified 91 shared features that are stable against extraction parameters (bin width and resampling), image transformations, image noise addition, and segmentation growth/shrinkage, as shown in Table 1. Among the remaining 91 stable features against image perturbations, our exploratory analysis showed that 84 features have a nonzero ranking by the randomized logistic regression analysis (data not shown), also known as stability feature selection. Furthermore, our Pearson coefficient analysis showed that only 8 out of the 84 features, selected by randomized logistic regression, or stability feature selection, had a Pearson coefficient <0.4 (Figure 8), indicating that most of the extracted features are highly correlated. We ranked the importance of the eight nonredundant features by using a combination of four widely used classifiers: XGBoost, RF, AdaBoost, and Extra Tree, and Figure 9 shows the final ranking.
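The shared stable features are those that pass the ICC criterion under every uncertainty source, that is, the intersection of the per-source stable sets. A minimal sketch follows; the function name is hypothetical and the feature names are toy placeholders chosen for illustration, not the study's actual per-source results.

```python
def shared_stable_features(stable_by_source):
    """Return the features that are stable under every uncertainty source."""
    sets = [set(names) for names in stable_by_source.values()]
    return set.intersection(*sets) if sets else set()


# Toy placeholder sets, one per uncertainty source
stable_by_source = {
    "extraction_parameters": {"shape_Sphericity", "firstorder_Mean"},
    "image_transformation": {"shape_Sphericity", "firstorder_Mean", "glcm_Idn"},
    "noise_addition": {"shape_Sphericity", "glcm_Idn", "firstorder_Mean"},
    "segmentation_variation": {"shape_Sphericity", "firstorder_Mean"},
}
print(sorted(shared_stable_features(stable_by_source)))
# ['firstorder_Mean', 'shape_Sphericity']
```

The most restrictive source bounds the size of the intersection, which is consistent with the extraction parameters (119 stable features) limiting the final set to 91.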

Table 1.

Stable Features Against all Uncertainties.

Feature type	Features
Shape	Elongation
	Flatness
	LeastAxisLength
	MajorAxisLength
	Maximum2DDiameterRow
	Maximum2DDiameterSlice
	Maximum3DDiameter
	MeshVolume
	MinorAxisLength
	Sphericity
	SurfaceArea
	SurfaceVolumeRatio
	VoxelVolume
Original	firstorder_10Percentile
	firstorder_90Percentile
	firstorder_Energy
	firstorder_InterquartileRange
	firstorder_Kurtosis
	firstorder_Mean
	firstorder_MeanAbsoluteDeviation
	firstorder_Median
	firstorder_RobustMeanAbsoluteDeviation
	firstorder_RootMeanSquared
	firstorder_Skewness
	firstorder_TotalEnergy
	firstorder_Variance
log-sigma-1-0-mm-3D	firstorder_10Percentile
	firstorder_90Percentile
	firstorder_Energy
	firstorder_InterquartileRange
	firstorder_Maximum
	firstorder_MeanAbsoluteDeviation
	firstorder_Range
	firstorder_RobustMeanAbsoluteDeviation
	firstorder_RootMeanSquared
	firstorder_TotalEnergy
	firstorder_Variance
Wavelet-HHH	firstorder_Maximum
	firstorder_Minimum
	firstorder_Range
Wavelet-HHL	firstorder_Kurtosis
	firstorder_Maximum
	firstorder_Minimum
	firstorder_Range
	firstorder_TotalEnergy
Wavelet-HLH
Wavelet-HLL	firstorder_10Percentile
	firstorder_90Percentile
	firstorder_Energy
	firstorder_InterquartileRange
	firstorder_Kurtosis
	firstorder_Maximum
	firstorder_MeanAbsoluteDeviation
	firstorder_Range
	firstorder_RobustMeanAbsoluteDeviation
	firstorder_RootMeanSquared
	firstorder_TotalEnergy
	firstorder_Variance
	glcm_Correlation
	glcm_Idn
Wavelet-LHH	firstorder_Kurtosis
Wavelet-LHL	firstorder_10Percentile
	firstorder_90Percentile
	firstorder_Energy
	firstorder_InterquartileRange
	firstorder_Kurtosis
	firstorder_Maximum
	firstorder_Mean
	firstorder_MeanAbsoluteDeviation
	firstorder_Range
	firstorder_RobustMeanAbsoluteDeviation
	firstorder_RootMeanSquared
	firstorder_TotalEnergy
	firstorder_Variance
	glcm_Correlation
	glcm_Idn
Wavelet-LLH	firstorder_Kurtosis
Wavelet-LLH	firstorder_Skewness
Wavelet-LLL	firstorder_10Percentile
	firstorder_90Percentile
	firstorder_Energy
	firstorder_InterquartileRange
	firstorder_Kurtosis
	firstorder_Mean
	firstorder_MeanAbsoluteDeviation
	firstorder_Median
	firstorder_RobustMeanAbsoluteDeviation
	firstorder_RootMeanSquared
	firstorder_Skewness
	firstorder_TotalEnergy
	firstorder_Variance
	ngtdm_Coarseness

Figure 8.

Pearson coefficient of the stable features selected by stability feature selection. The figure shows the Pearson coefficient matrix for all the stable features that have the Pearson coefficient <0.4.

Figure 9.

Importance ranking of the selected features. The figure shows the importance ranking of the final selected features, which are important and nonredundant, calculated by our feature selection and ranking workflow.

Model Performance

The final model was selected with a minimum difference in the AUC between the training (173 subjects) and the validation set (16 subjects). As shown in Figure 10, the repeated cross-validation showed that the RF model achieved a mean AUC of 0.99 ± 0.01 on the training set (189 subjects) and an AUC of 0.910, with a 95% confidence interval in the range between 0.85 and 0.97, on the independent test set (77 subjects). The accuracy of predicting normal versus cancer subjects on the test set (77 subjects) is 93.5%, with a 95% confidence interval between 88% and 99%. The sensitivity and specificity of the predictive model on the test set were 84% and 98%, respectively, as summarized in the confusion matrix (Figure 11).
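The reported test-set metrics can be related to confusion-matrix counts as in the sketch below. The counts (21 true positives, 4 false negatives, 51 true negatives, 1 false positive) are inferred from the reported 84% sensitivity, 98% specificity, and the 25 cancer and 52 control test subjects; they are an assumption, since the confusion matrix figure is not reproduced here. The Wald (normal-approximation) interval is also an assumption, as the text does not state how the accuracy interval was obtained.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


def wald_interval(proportion, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * math.sqrt(proportion * (1.0 - proportion) / n)
    return proportion - half_width, proportion + half_width


# Counts inferred from the reported percentages (assumption)
metrics = classification_metrics(tp=21, fn=4, tn=51, fp=1)
low, high = wald_interval(metrics["accuracy"], n=77)
print({name: round(value, 3) for name, value in metrics.items()})
# {'sensitivity': 0.84, 'specificity': 0.981, 'accuracy': 0.935}
print(round(low, 2), round(high, 2))  # 0.88 0.99
```

Under these assumptions the accuracy and its interval agree with the reported 93.5% and 88% to 99%.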

Figure 10.

Receiver operating characteristic (ROC) analysis for the predictive model. The figure shows the ROC analysis for the Random Forest model we built to distinguish a healthy pancreas from a cancerous pancreas. (A) Shows the ROC curve of repeated 5-fold cross-validation results for the training dataset, including 189 randomly selected subjects. (B) Shows the ROC curve when the model is applied to the test dataset, including the remaining 77 subjects that the model has not seen during training.

Figure 11.

Confusion matrix of the predictive model on the test dataset. The figure shows the confusion matrix summarizing the performance of the predictive model on the test dataset (77 subjects).

Discussion

With the rapid developments and advancements in medical imaging, clinicians are producing and analyzing an ever-expanding amount of imaging data in their daily practice. Quantitative imaging analysis, with its ability to decipher the large amount of information digitally encoded in medical images, has turned medical imaging into a new data source to discover diagnosis- or prognosis-related phenotypic signatures or biomarkers. Radiomics is a high-throughput method of image feature extraction, which is compelling for novel biomarker development because it phenotypically characterizes the whole volume of interest (i.e., lesion or tissue) at a macroscopic level. This unique feature allows radiomics to reveal the spatial heterogeneity within that volume of interest (e.g., intratumoral heterogeneity) more fully than the standard biopsy procedure, the analysis of which is based on a small portion of the tissue. Furthermore, because medical imaging is already deeply integrated into standard care, radiomics analysis can potentially reveal additional diagnostic/prognostic signatures repeatedly in a less burdensome and less invasive manner to patients. Previous studies have demonstrated the prognostic values of radiomics feature-based biomarkers in cancer[8,14-17] and their prominent role in bridging medical imaging and genomics.[18-23] Notwithstanding these achievements, the vast majority of radiomics studies analyzed radiomics features of the existing tumor and utilized the features for risk assessment of a tumor or treatment response prediction. Radiomics analysis that focuses on the existing tumor offers limited insight into how radiomics can contribute to the early diagnosis of cancer. Our study, on the other hand, aimed at evaluating the utilization of radiomics analysis in the tumor-harboring organ (i.e., pancreas), with the long-term goal of detecting radiomics features reflecting cancerous changes of the hosting organ at an early stage of the disease.
This would be especially important for battling pancreatic cancer since over 80% of pancreatic cancer cases are diagnosed when regional spread or distant metastasis has occurred. Our study is a proof-of-concept study that demonstrates that radiomics analysis, solely using robust features, can potentially identify the cancerous changes in the pancreas that distinguish cancer patients from normal individuals. Our next step is to include patients at precursor stages of pancreatic cancer, such as patients with chronic pancreatitis or pancreatic cystic lesions, to explore imaging features that identify the “cancerized” pancreas at an early stage. Our study aligns with current efforts investigating the application of quantitative imaging analysis (radiomics or deep learning) in detecting various chronic pancreatic conditions including pancreatitis, cystic lesion, and cancer.[40-43] It is foreseeable that radiomics or deep learning would be ideal candidates to detect “cancerized” changes in the pancreas owing to their ability to identify hidden patterns in the images, which may lead to the development of new early detection tools for pancreatic cancer.

Although radiomics holds great promise in the era of personalized medicine, a hurdle the scientific community must overcome before its clinical implementation is the consistency and reproducibility of feature extraction and subsequent modeling. First, our study investigated the robustness of radiomics features against multiple uncertainty sources, including bin width, resampling, image transformation, image noise, and segmentation growth/shrinkage. To our knowledge, this is the first study assessing the uncertainty of radiomics features of the whole pancreas against these perturbations. In our study, we found that 91 of the extracted 924 features are stable against these perturbations. Furthermore, we identified that bin width and resampling are the factors that most strongly affect the stability of radiomics features.
Only 12.9% of the features were stable against changes in bin width and resampling. Second, our exploratory analysis further reduced the highly correlated or redundant features and selected the most important and archetypal features via a combination of widely used machine learning algorithms. Our results showed that most radiomics features are highly correlated with each other, and these redundant features were excluded from the subsequent modeling process as they do not provide additional meaningful information. Eight out of 91 features were determined to be nonredundant (Pearson coefficient <0.4). Through our comprehensive analysis, we removed unstable and redundant features from the predictive modeling process. Our results of the feature stability study showed that most extracted features are not stable against various image perturbations, which agrees with previous studies on the robustness of radiomics features.[44-47] In addition, our analysis indicated that many radiomics features are highly correlated with each other since most features are generated by repeatedly calculating first- and second-order statistics on derived images. Our findings further stress the importance of performing feature robustness and redundancy analysis before predictive modeling to avoid spurious results in future radiomics studies. The fact that most of the radiomics features are unstable also poses challenges for the clinical applications of radiomics or radiomics-based biomarkers for personalized patient care and disease management. Addressing this major challenge of bringing radiomics to the clinic requires continued efforts to validate and evolve the existing models on new datasets with ever-evolving technical parameters of image acquisition and reconstruction.

Our work has a few limitations.
First, with a total of 266 patients (181 controls and 85 cancer cases), our sample size is limited compared to previously published studies.[50,51] Limited sample size is a common issue in machine learning-related projects in medical fields,[52-54] and we implemented uncertainty analysis and feature selection to reduce the impact of this limitation. We believe our strategy of assessing the uncertainty introduced by such image perturbations can potentially become a standard workflow in future radiomics studies with a limited sample size. In addition, we designed an exploratory analysis and feature selection pipeline that excludes redundant or unimportant features, which is essential in a standardized radiomics analysis workflow. Although we did not perform a power analysis for this retrospective study, our final predictive model used only 8 selected features, which complies with recommendations from previous studies regarding sample size and the number of features used in a predictive model.[9,55,56] Second, our study is a single-institution study without external validation, which is one of the essential criteria for assessing the robustness of a study.[57-59] Although we sought to improve the robustness of our study by randomizing the public TCIA dataset into our radiomics analysis and predictive modeling, the generalizability of our results remains to be validated on new datasets. Third, although all the CT images we collected are venous-phase contrast-enhanced CT scans, contrast enhancement variation is difficult to evaluate because it depends on patient-specific physiology (eg, blood flow rate). Therefore, we did not study feature stability against contrast enhancement variation among patients.
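
The redundancy criterion mentioned in the Discussion (Pearson coefficient <0.4) can be illustrated with a minimal sketch. This is not the authors' pipeline, which combined several machine learning algorithms. It is a simple greedy filter written under stated assumptions: the function name `drop_redundant_features`, the use of the absolute value of the coefficient, the order-dependent greedy rule, and the synthetic signals are all hypothetical and chosen only for illustration.

```python
import numpy as np


def drop_redundant_features(X, names, threshold=0.4):
    """Greedy redundancy filter (illustrative assumption, not the study's method).

    A feature is kept only if its absolute Pearson correlation with every
    previously kept feature is below `threshold`. The result depends on
    the column order of X.
    """
    # Pairwise Pearson correlations between columns (features).
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]


# Synthetic, deterministic demonstration data (not study data).
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
feature_a = np.sin(t)
feature_a_scaled = 2.0 * feature_a + 0.01 * np.cos(3.0 * t)  # near-duplicate of feature_a
feature_b = np.cos(t)  # uncorrelated with feature_a over a full period

X = np.column_stack([feature_a, feature_a_scaled, feature_b])
names = ["feature_a", "feature_a_scaled", "feature_b"]

selected = drop_redundant_features(X, names)
print(selected)
```

In this toy example the near-duplicate column is discarded and the two uncorrelated columns are retained. In a real analysis, the order in which features are visited (for example, ranked by importance or stability) would determine which member of a correlated group survives.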

Conclusion

Our study showed that CT-based radiomics analysis and modeling can distinguish healthy individuals from pancreatic cancer patients and could potentially become an effective tool for detecting cancerous pancreatic tissue at an early stage.

48 in total

1. Imaging patterns predict patient survival and molecular subtype in glioblastoma via machine learning techniques.

Authors: Luke Macyszyn; Hamed Akbari; Jared M Pisapia; Xiao Da; Mark Attiah; Vadim Pigrish; Yingtao Bi; Sharmistha Pal; Ramana V Davuluri; Laura Roccograndi; Nadia Dahmane; Maria Martinez-Lage; George Biros; Ronald L Wolf; Michel Bilello; Donald M O'Rourke; Christos Davatzikos
Journal: Neuro Oncol Date: 2015-07-16 Impact factor: 12.300

2. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research.

Authors: Terry K Koo; Mae Y Li
Journal: J Chiropr Med Date: 2016-03-31

3. Influence of gray level discretization on radiomic feature stability for different CT scanners, tube currents and slice thicknesses: a comprehensive phantom study.

Authors: Ruben T H M Larue; Janna E van Timmeren; Evelyn E C de Jong; Giacomo Feliciani; Ralph T H Leijenaar; Wendy M J Schreurs; Meindert N Sosef; Frank H P J Raat; Frans H R van der Zande; Marco Das; Wouter van Elmpt; Philippe Lambin
Journal: Acta Oncol Date: 2017-09-08 Impact factor: 4.089

Review 4. Radiomics for precision medicine: Current challenges, future prospects, and the proposal of a new framework.

Authors: A Ibrahim; S Primakov; M Beuque; H C Woodruff; I Halilaj; G Wu; T Refaee; R Granzier; Y Widaatalla; R Hustinx; F M Mottaghy; P Lambin
Journal: Methods Date: 2020-06-03 Impact factor: 3.608

5. The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository.

Authors: Kenneth Clark; Bruce Vendt; Kirk Smith; John Freymann; Justin Kirby; Paul Koppel; Stephen Moore; Stanley Phillips; David Maffitt; Michael Pringle; Lawrence Tarbox; Fred Prior
Journal: J Digit Imaging Date: 2013-12 Impact factor: 4.056

Review 6. Using Quantitative Imaging for Personalized Medicine in Pancreatic Cancer: A Review of Radiomics and Deep Learning Applications.

Authors: Kiersten Preuss; Nate Thach; Xiaoying Liang; Michael Baine; Justin Chen; Chi Zhang; Huijing Du; Hongfeng Yu; Chi Lin; Michael A Hollingsworth; Dandan Zheng
Journal: Cancers (Basel) Date: 2022-03-24 Impact factor: 6.639

Review 7. The Potential of Radiomic-Based Phenotyping in Precision Medicine: A Review.

Authors: Hugo J W L Aerts
Journal: JAMA Oncol Date: 2016-12-01 Impact factor: 31.777

8. Peritumoral radiomics features predict distant metastasis in locally advanced NSCLC.

Authors: Tai H Dou; Thibaud P Coroller; Joost J M van Griethuysen; Raymond H Mak; Hugo J W L Aerts
Journal: PLoS One Date: 2018-11-02 Impact factor: 3.240

9. Radiomics features on radiotherapy treatment planning CT can predict patient survival in locally advanced rectal cancer patients.

Authors: Jiazhou Wang; Lijun Shen; Haoyu Zhong; Zhen Zhou; Panpan Hu; Jiayu Gan; Ruiyan Luo; Weigang Hu; Zhen Zhang
Journal: Sci Rep Date: 2019-10-25 Impact factor: 4.379

10. Trends and Focus of Machine Learning Applications for Health Research.

Authors: Brett Beaulieu-Jones; Samuel G Finlayson; Corey Chivers; Irene Chen; Matthew McDermott; Jaz Kandola; Adrian V Dalca; Andrew Beam; Madalina Fiterau; Tristan Naumann
Journal: JAMA Netw Open Date: 2019-10-02