Variability of Image Features Computed from Conventional and Respiratory-Gated PET/CT Images of Lung Cancer.

Jasmine A Oliver¹, Mikalai Budzevich², Geoffrey G Zhang¹, Thomas J Dilling², Kujtim Latifi¹, Eduardo G Moros³.

Abstract

Radiomics is being explored for potential applications in radiation therapy. How various imaging protocols affect quantitative image features is currently a highly active area of research. To assess the variability of image features derived from conventional [three-dimensional (3D)] and respiratory-gated (RG) positron emission tomography (PET)/computed tomography (CT) images of lung cancer patients, image features were computed from 23 lung cancer patients. Both protocols for each patient were acquired during the same imaging session. PET tumor volumes were segmented using an adaptive technique which accounted for background. CT tumor volumes were delineated with a commercial segmentation tool. Using RG PET images, the tumor center of mass motion, length, and rotation were calculated. Fifty-six image features were extracted from all images consisting of shape descriptors, first-order features, and second-order texture features. Overall, 26.6% and 26.2% of total features demonstrated less than 5% difference between 3D and RG protocols for CT and PET, respectively. Between 10 RG phases in PET, 53.4% of features demonstrated percent differences less than 5%. The features with least variability for PET were sphericity, spherical disproportion, entropy (first and second order), sum entropy, information measure of correlation 2, Short Run Emphasis (SRE), Long Run Emphasis (LRE), and Run Percentage (RPC); and those for CT were minimum intensity, mean intensity, Root Mean Square (RMS), Short Run Emphasis (SRE), and RPC. Quantitative analysis using a 3D acquisition versus RG acquisition (to reduce the effects of motion) provided notably different image feature values. This study suggests that the variability between 3D and RG features is mainly due to the impact of respiratory motion.

Year: 2015 PMID: 26692535 PMCID: PMC4700295 DOI: 10.1016/j.tranon.2015.11.013

Source DB: PubMed Journal: Transl Oncol ISSN: 1936-5233 Impact factor: 4.243

Introduction

Positron emission tomography (PET) is a beneficial technology in the process of cancer diagnosis and staging [1], [2], monitoring tumor response to treatment [3], detecting necrosis and tumor heterogeneity, identifying the primary tumor location [4], and delineating tumors from atelectasis [5], particularly in lung cancers. In fact, studies have shown that the use of PET/computed tomography (CT) improves confidence in diagnosis, increases the number of “definite” lesions by 41% in patients with non–small cell lung cancer (NSCLC) [6], and improves delineation accuracy for gross tumor volumes (GTVs) in radiation therapy [4], [5]. Standardized uptake value (SUV) is the standard PET metric in image analysis. However, it is clear that SUV is dependent on many technical as well as physiologic factors [7]. It is proportionately dependent upon fluorodeoxyglucose (FDG) uptake, which in turn is affected by dose calibration, clock (decay) synchronization, patient weight and blood sugar level, documentation of unused tracer remains, and other setup specifics [7]. Moreover, SUV indirectly depends on the method of obtaining raw data, radionuclide uptake time, hardware platform, and the applied reconstruction algorithm [7], [8]. Studies have shown SUV’s predictive ability of therapy response and survival [9], [10], although, particularly in NSCLC, discrepancy remains regarding whether maximum or minimum SUV is a predictor based on treatment modality [11]. The prognostic ability of SUV parameters has also been shown, but conflict emerges when defining the best cutoff value [9]. Nonetheless, SUV can be unreliable. A clinical study done on the test-retest reproducibility of SUV demonstrated greater than expected SUV variability within a single institution [7], and 10% to 25% SUV variability was detected in a multicenter consortium before biological effects or protocol influences [12]. 
Based on these results and the SUV’s reliance on various nonstandardized factors, it is clear that there is a need for additional indicators that are more robust than SUV or complementary to SUV-based findings. Quantifiable and robust (radiomic) image features, such as texture features, may be candidates for such indicators. Texture is a global pattern resulting from repetition of local subpatterns [13]. Features associated with image texture describe the relationships between the gray-scale intensity of pixels [or voxels in three-dimensional (3D) image] on a local or global image scale. These features have been used for classification and segmentation purposes, identifying regions of interest in an image and estimating tumor heterogeneity [14]. Image features are quite vast in number and can be subdivided into shape features and first-, second-, and higher-order features. First-order features provide information about gray-scale intensities and are derived from intensity distributions and histograms, whereas second-order texture features are derived from gray-tone spatial dependency matrices which are constructed from the intensity values of an image as described below [14], [15]. In this paper, we refer to all of these as image features. In medical imaging, CT image texture analysis has been studied extensively, dating back to the early 1980s [16]. More recent studies in CT image analysis have uncovered feature correspondence with lung tumor aggressiveness and tumor heterogeneity; demonstrated potential as a marker for survival in NSCLC; and revealed relationships between features, tumor stage, and metabolism [17], [18], [19]. The reproducibility and robustness of specific identifiers in NSCLC CT images have also been studied [20]. The application of image feature analysis to PET images has been explored more recently. 
Prior studies in PET/CT image texture analysis have demonstrated its potential as a predictor of tumor and normal tissue response to therapy, a quantifier of tumor heterogeneity and radiosensitivity, and an indicator for adaptive therapy schemes [9], [21], [22]. Conclusive and beneficial results regarding tumor response to treatment using multimodality imaging [23] have been drawn, and PET texture analysis is currently being explored for application to predictive/prognostic models of treatment outcome, partnered with genomics and proteomics patterns [21]. In addition, various features have been investigated using test-retest and interobserver stability in FDG-PET [24]. PET image analysis has also been shown to predict response to radiochemotherapy in esophageal cancer [25] and to quantify tumor heterogeneity as a response predictor [26]. Partnered with a multimodality modeling system, PET texture analysis could lead to more individualized treatment planning in lung radiotherapy [23]. Moreover, certain FDG-PET–based texture features have demonstrated association with nonresponse to chemoradiotherapy for NSCLC tumors [27]. These and many other studies are part of a more general systematic approach, namely radiomics, which is an emerging framework relating image features to molecular medicine where large amounts of quantitative features (400 +) are extracted for diagnostic, prognostic, and predictive information [21], [28], [29], [30]. In other words, PET/CT image feature analysis is an emerging and promising quantitative imaging field. Although radiomics shows much promise, there have been several studies showing image features’ dependency on various factors in the production of images. For example, in a recent study, 45 of 50 texture features showed 10% to 200% variability across acquisition protocols and reconstruction algorithms [31]. Therefore, several investigators have pointed to the need for standardization in texture analysis [32], [33], [34]. 
The usefulness of radiomics depends on the reliability of feature values; thus, it is important to characterize feature behavior under many potential clinical conditions. Our goal in this study was to explore image feature value variability between respiratory-gated (RG) and conventional (3D) PET images acquired on the same patient during a single PET scan session. Conventional (3D) PET images are influenced by motion because of their relatively long acquisition times. The acquired coincidence counts measured and used to form the images are spatiotemporally averaged over multiple breathing cycles [35]; consequently, for a point inside a mobile tumor, the signal is convoluted along its trajectory of motion. Respiratory-gated PET/CT aims to account for respiratory motion and thereby respiration-induced image blurring. One way to discern the effect of motion on feature values is by comparing image feature values between conventional and RG acquisition protocols, although other important factors stemming from the differences in the imaging protocols such as image noise may also be at play. Thus far, only one study accounting for motion in PET images has been reported by Yip et al. [36], which was limited to only five features. This report represents the first study that evaluates how 3D and RG acquisitions affect a large number of image features currently being used and tested in several medical applications. Robust features that emerge from this study may be suitable candidates for future quantitative imaging applications involving mobile tumors.

Materials and Methods

Preliminaries

Twenty-three lung cancer patients were retrospectively selected for a study of image feature variation between 3D and RG [RG is alternatively known as four-dimensional (4D)] PET/CT images and feature value variation among the 10 phases of an RG scan. The main selection criterion was for these patients to have both 3D and RG PET/CT scans performed during the same imaging session. The 3D images were acquired in free-breathing conditions for patients with regular breathing as required for radiotherapy planning. There were 13 female and 10 male patients ranging from age 47 to 83. All lung cancer patients were diagnosed with NSCLC. For each patient, 18F-FDG was the administered radiotracer, and the RG PET scan was acquired during the same session as the 3D PET scan in the same position per protocol for radiation treatment planning. A radiologist-approved 4D PET/CT protocol was applied for image acquisition [37]. The 3D PET/CT protocol was adapted for our institution from the Netherlands Protocol [38]. In routine clinical practice following these protocols, the average scan start times after the tracer administration were 118 ± 17.3 minutes (standard deviation) for 3D PET and 117 ± 36.0 minutes for RG PET, with average administered activity of 11.9 ± 2.0 mCi. A study on SUV variance in clinical FDG-PET/CT found that SUVmax and SUVmean were independent of variations in the uptake period [7]. PET/CT data were obtained using a GE Discovery STE PET/CT Scanner (for 21 cases) and a GE Discovery 600 PET/CT Scanner (for 2 cases). The 3D CT was a standard step and shoot CT (not helical), and the RG PET counts were binned into 10 phases with 3D CT attenuation correction applied to 4D PET data (standard protocol at our institution). The standard reconstruction protocol was the ordered subset expectation maximization algorithm with 20 or 28 subsets and 2 iterations. 
Full width at half maximum and field of view varied between patients: full width at half maximum of 4.29, 7, or 10 mm, and field of view of 50, 60, or 70 cm were standard PET settings. Standard of practice procedures at our institution were followed, and this study was approved with waived informed consent by the University of South Florida Institutional Review Board.

ROI Delineation and Tumor Segmentation

Following patient selection, the tumor region was identified using advanced image viewing software (Mirada RTx, Mirada Medical, Oxford, UK), allowing identification of the primary tumor location and exportation of Digital Imaging and Communications in Medicine image data for 3D and RG PET/CT images. The image viewing software provided tumor visualization; easy access to X, Y, or Z slices of 3D PET data; and region-of-interest (ROI) delineation. A background-adapted thresholding method of segmentation defined by Dholakia et al., which accounted for background uptake, was applied to eliminate subjective errors and interobserver variability [39]. This method involved placing a 3-cm spherical contour inside the liver and extracting the mean SUV and standard deviation to calculate a threshold value for the lung tumor [Equation (1)], where SUVmean is the mean SUV value of the 3-cm sphere and SUVSD is the standard deviation of the 3-cm sphere's SUV values. We are aware of the many segmentation methods in the literature and that no one method is generally regarded as optimal for general medical applications [40]. Because we were working with lung tumors, major structures and surrounding tissues were minimal. Consequently, there was little uptake outside the tumor volume. If, however, a tumor was close to the diaphragm or pleura, nearby metabolic structures were also segmented. Because of the adaptive segmentation method, six PET lesions were rendered too small and were not evaluated. In CT images, tumors were contoured with CT threshold, a proprietary algorithm using Mirada RTx (RTx, Mirada Medical, Oxford, UK) (Figure 1). The 3D contours were drawn separately on the 3D CT image and on one phase (phase 1 or phase 10) of the corresponding RG CT image for each patient. In our clinic, CT contours are used for treatment planning purposes, whereas PET ensures that the entire metabolic tumor volume is included in the GTV.
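As an illustration of background-adapted thresholding, the sketch below derives a threshold from the SUV statistics of a liver reference sphere and keeps tumor-region voxels at or above it. Equation (1) is not reproduced in this text, so the additive form (mean plus a multiple of the standard deviation), the multiplier `k`, the use of the sample standard deviation, and all function names and SUV values are assumptions made only for illustration; they should not be read as the published method.

```python
from statistics import mean, stdev


def background_threshold(liver_suvs, k):
    """Return a background-adapted SUV threshold from a liver reference region.

    ASSUMPTION: the additive form mean + k * SD and the multiplier k are
    hypothetical; the source equation is not available in this text.
    """
    return mean(liver_suvs) + k * stdev(liver_suvs)


def segment(voxel_suvs, threshold):
    """Return indices of voxels whose SUV meets or exceeds the threshold."""
    return [i for i, suv in enumerate(voxel_suvs) if suv >= threshold]


# Hypothetical SUVs sampled inside the 3-cm liver sphere.
liver = [2.0, 2.2, 1.8, 2.1, 1.9]
threshold = background_threshold(liver, k=2.0)  # k chosen only for illustration

# Hypothetical SUVs in the region around the lung tumor.
tumor_region = [1.0, 2.5, 3.0, 2.3]
selected = segment(tumor_region, threshold)
```

Because the threshold is tied to each patient's own background uptake, small or faintly avid lesions can fall entirely below it, which is consistent with some PET lesions being excluded after segmentation.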

Figure 1

CT segmentation of one patient viewed in two dimensions (the ROI extends in 3D). This CT image is viewed in the window preset for the lung in Mirada DBx (RTx, Mirada Medical, Oxford, UK).

Feature Extraction

An internally developed application imported the ROI data file and extracted image intensity statistics, shape descriptors, co-occurrence matrices, run length matrices, and other second-order features from each ROI for a total of 56 image features (Tables 1 and 2). Although some authors have shown the instability of certain features from 3D images [25], [26], [31], we decided to include them here to analyze their stability for RG images. Moreover, some groups have used a large number of features (> 200 [21]). Nevertheless, we deemed 56 features sufficient to assess the variability between 3D and RG feature values.

Table 1

Intensity and First-Order Image Features

Feature	Description
Imin	Minimum intensity value in the 3D ROI
Imax	Maximum intensity value in the ROI
Imean	Mean intensity value in the ROI
SD	Variation from the average intensity in the ROI
Skewness	Measure of the symmetry of the intensity distribution
Kurtosis	Measure of the shape of the peak of the intensity distribution
Coeff Var	Normalized measure of the dispersion of the ROI (coefficient of variation)
TGV	Total summed intensity
I30⁎	Intensity ranging from lowest to 30% highest intensity volume
I10-I90⁎	Intensity ranging from lowest to 10% highest intensity volume minus intensity ranging from lowest to 90% highest intensity volume
V40⁎	Percentage volume with at least 40% intensity
V70⁎	Percentage volume with at least 70% intensity
V80⁎	Percentage volume with at least 80% intensity
V10-V90⁎	Percentage volume with at least 10% intensity minus percentage volume with at least 90% intensity
Sphericity	Measure of the spherical shape (roundness) of the ROI
Convexity	Measure of the spiculation of the ROI (ratio of true ROI volume to convex ROI volume)

⁎ Denotes features that were derived from the intensity volume histogram [10].
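Several of the first-order entries in Table 1 can be sketched in a few lines. The version below assumes population (divide-by-n) moments, non-excess kurtosis, and that "at least x% intensity" is measured relative to the ROI maximum; these conventions and the function names are assumptions, since the table gives only verbal descriptions.

```python
from math import sqrt


def first_order_features(values):
    """Compute basic intensity statistics for the voxel values of one ROI."""
    n = len(values)
    mu = sum(values) / n
    # Central moments (ASSUMPTION: population moments, divided by n).
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    sd = sqrt(m2)
    return {
        "Imin": min(values),
        "Imax": max(values),
        "Imean": mu,
        "SD": sd,
        "Skewness": m3 / sd ** 3,
        "Kurtosis": m4 / m2 ** 2,  # non-excess kurtosis (normal = 3)
        "CoeffVar": sd / mu,
        "TGV": sum(values),
    }


def volume_fraction_above(values, percent):
    """V_x: percentage of voxels with intensity >= x% of the ROI maximum.

    ASSUMPTION: the percentage is taken relative to Imax.
    """
    cut = max(values) * percent / 100.0
    return 100.0 * sum(1 for v in values if v >= cut) / len(values)


roi = [1.0, 2.0, 3.0, 4.0, 10.0]  # hypothetical voxel intensities
features = first_order_features(roi)
v40 = volume_fraction_above(roi, 40)
```

Kurtosis depends on the fourth power of deviations from the mean, so a handful of extreme voxels can change it sharply.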

Table 2

Selected Second-Order Features

Name	Description	Mathematical Description
Energy⁎	Also defined as Angular Second Moment. This feature describes the homogeneity of an image. 0 represents complete heterogeneity. 1 represents complete homogeneity [42].	$\sum_{i=1}^{N}\sum_{j=1}^{N} P_{ij}^{2}$
Local homogeneity⁎	Measures the relation of GLCM intensities to the diagonal GLCM matrix. A value of 1 represents total homogeneity. A value of 0 represents nonhomogeneity [42].	$\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{P_{ij}}{1+(i-j)^{2}}$
Entropy⁎	Measures the pair contributions and information content.	$\sum_{i=1}^{N}\sum_{j=1}^{N} P_{ij}\log P_{ij}$
Correlation⁎	Measures correlation between co-occurrence matrix values.	$\frac{\sum_{i}\sum_{j} ij\,P_{ij}-\mu_x\mu_y}{\sigma_x\sigma_y}$
SRE†	Measures short run distribution (short run emphasis).	$\frac{1}{n}\sum_{i=1}^{M}\sum_{j=1}^{N} \frac{R_{ij}}{j^{2}}$
LRE†	Measures long run distribution (long run emphasis).	$\frac{1}{n}\sum_{i=1}^{M}\sum_{j=1}^{N} R_{ij}\,j^{2}$
RPC†	Ratio of total number of runs to total number of pixels in the image. Measures homogeneity and run distribution (run percentage).	$\frac{n}{n_p}$
LGRE†	Measures low gray-level distribution (low gray-level run emphasis).	$\frac{1}{n}\sum_{i=1}^{M}\sum_{j=1}^{N} \frac{R_{ij}}{i^{2}}$
SRLGE†	Measures short runs and low gray-level distribution (short run low gray-level emphasis).	$\frac{1}{n}\sum_{i=1}^{M}\sum_{j=1}^{N} \frac{R_{ij}}{i^{2}j^{2}}$
LRLGE†	Measures long runs and low gray-level distribution (long run low gray-level emphasis).	$\frac{1}{n}\sum_{i=1}^{M}\sum_{j=1}^{N} \frac{R_{ij}\,j^{2}}{i^{2}}$
RLNU†	Measures the nonuniformity of the run lengths (run length nonuniformity).	$\frac{1}{n}\sum_{j=1}^{N}\left(\sum_{i=1}^{M} R_{ij}\right)^{2}$
GLNU†	Measures the nonuniformity of the gray levels (gray-level nonuniformity).	$\frac{1}{n}\sum_{i=1}^{M}\left(\sum_{j=1}^{N} R_{ij}\right)^{2}$

⁎ Where P(i,j) is an element of the gray-level co-occurrence matrix. GLCM features were originally developed by Haralick et al. [14], [41].

† Where R(i,j) is an element of the RLM, n is the total number of runs, $n_p$ is the number of pixels in the image, N is the longest run, and M is the number of gray levels. RLM features were originally developed by Galloway et al. [44], Chu et al. [45], and Dasarathy and Holder [46].
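To make the co-occurrence features of Table 2 concrete, the following sketch bins a small 3D volume into equal-width gray levels, accumulates co-occurrence counts at unit distance over 13 directions, averages them, normalizes to a probability matrix, and evaluates energy, local homogeneity, and entropy. It is a minimal pure-Python illustration, not the internally developed application: symmetric pair counting, the handling of a flat image, the conventional negative sign on entropy, the use of the whole array rather than an ROI mask, and all names are assumptions.

```python
from itertools import product
from math import log

# 13 unique offsets: one from each opposite pair of the 26 neighbours.
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]


def quantize(volume, levels):
    """Bin intensities of volume[z][y][x] into equal-width gray levels."""
    flat = [v for plane in volume for row in plane for v in row]
    lo, hi = min(flat), max(flat)
    width = (hi - lo) / levels or 1.0  # avoid zero width for a flat image
    return [[[min(int((v - lo) / width), levels - 1) for v in row]
             for row in plane] for plane in volume]


def glcm_probability(q, levels):
    """Average the co-occurrence matrices of 13 directions, then normalize."""
    nz, ny, nx = len(q), len(q[0]), len(q[0][0])
    avg = [[0.0] * levels for _ in range(levels)]
    for dz, dy, dx in DIRECTIONS:
        for z, y, x in product(range(nz), range(ny), range(nx)):
            z2, y2, x2 = z + dz, y + dy, x + dx
            if 0 <= z2 < nz and 0 <= y2 < ny and 0 <= x2 < nx:
                a, b = q[z][y][x], q[z2][y2][x2]
                # ASSUMPTION: symmetric counting (each pair in both orders).
                avg[a][b] += 1 / len(DIRECTIONS)
                avg[b][a] += 1 / len(DIRECTIONS)
    total = sum(map(sum, avg))
    return [[v / total for v in row] for row in avg]


def glcm_features(p):
    """Evaluate three Table 2 features from a probability matrix."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "local_homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        # Conventional sign; zero-probability cells are skipped.
        "entropy": -sum(v * log(v) for _, _, v in cells if v > 0),
    }


# Hypothetical alternating pattern along x: maximally heterogeneous.
stripes = quantize([[[0, 1, 0, 1]]], levels=2)
stripe_features = glcm_features(glcm_probability(stripes, levels=2))
```

A uniform volume gives energy and local homogeneity of 1 and entropy of 0, matching the interpretation of those features in Table 2, whereas the alternating pattern above pushes energy and local homogeneity down to 0.5.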

Shape descriptors were calculated directly from the segmented ROI. First-order features (extracted from image intensity statistics) were calculated from volume intensity histograms. Second-order gray-level co-occurrence matrix (GLCM) features, originally described by Haralick et al. [14], [41], were implemented with feature descriptions provided by Liang [42]. The Haralick definition of second-order statistics (based on gray-level matrix metric), nearest neighbor spatial dependence matrices, provided texture information from the spatial relationship of image voxels [14]. The GLCM feature calculations were implemented as follows: the intensities of image voxels were binned into 256 gray-scale levels for PET (128 gray-scale levels for CT) with equal intervals. The resulting two-dimensional co-occurrence matrix was 256 × 256 (128 × 128) with unit (1) pixel distance. Co-occurrence matrices were calculated in 13 directions across a 3D image, and the resultant matrix was the average of the matrices in the 13 directions. Given a set of cubical voxels, the 13 directions were 3 axial directions, 2 diagonal directions per axial plane × 3 axial planes, and 4 diagonal directions cross cube [43]. These 13 directions were chosen so that the resulting matrix would represent the entire tumor texture without bias. The elements of the matrix were integers. Next, a probability matrix was calculated by dividing each element by the total sum of the matrix so that the sum of the probability matrix was 1. The features were then calculated using the probability matrix. Galloway’s original run length features were also implemented [44]. Feature definitions were acquired from Galloway, Chu et al., and Dasarathy and Holder [44], [45], [46]. The run length matrix (RLM) had dimensions of L × R, where L was the number of gray-scale levels (256 for PET; 128 for CT) and R was the possible runs (determined case by case). The elements of the matrix were integers which represented runs. 
A run was defined as a set of pixels that possessed the same gray level in a specified direction [47]. The RLM was calculated in 13 directions across an image (similar to the co-occurrence matrix) [48]. The feature values were the summed values of all 13 directions normalized by the total runs in each direction. No probability matrix was involved for the run length features. In PET, the image intensity was the number of registered counts per voxel. For CT, intensity represented the Hounsfield units in each voxel. All intensity levels were used. Normalization was applied only to the co-occurrence and run length matrices (in the form of binning; 128 bins). In addition, intensity values for PET images were not converted to SUV. Instead, stored image intensity values were analyzed directly. For each patient, image features were extracted from the 3D PET ROI, 3D CT ROI, all phases (bins) of the RG PET ROI, and one phase (bin) of the RG CT ROI. Following feature extraction, 3D and RG PET/CT image feature differences were calculated using Equation (2), where RG_TV is the RG image feature value for feature i and 3D_TV is the 3D image feature value for feature i. This method was chosen because it accounts for features that changed sign between 3D and RG cases. The maximum possible percent difference using Equation (2) is 200%, and differences greater than 100% were deemed large. The percent difference across cases was then averaged for each image feature, and a paired, two-tailed t test was applied to 3D and RG feature data to compare the two data sets. We assumed normal distributions and that the t test was applicable. The concordance correlation coefficient (CCC) was calculated for all features across 3D and RG feature data to determine the correlation between the two data sets. 
The scale used for determining strength of agreement was as follows: high strength of agreement, CCC > 0.99; substantial strength of agreement, CCC: 0.95 to 0.99; moderate strength of agreement, CCC: 0.90 to 0.95; poor strength of agreement, CCC < 0.90 [49].
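The two comparison metrics described above can be sketched as follows. The percent-difference function uses a symmetric form, chosen here because it is bounded at 200% and remains defined when a feature changes sign, as the text requires; the exact form of Equation (2) is not reproduced in this text, so this form is an assumption. The CCC follows Lin's standard definition with population moments, and the treatment of the boundary values 0.95 and 0.99 in the agreement scale is also an assumption.

```python
def percent_difference(rg_value, static_value):
    """Symmetric percent difference between an RG and a 3D feature value.

    ASSUMPTION: |a - b| divided by the mean of |a| and |b|, which caps the
    result at 200% when the two values have opposite signs.
    """
    denom = (abs(rg_value) + abs(static_value)) / 2.0
    if denom == 0:
        return 0.0
    return 100.0 * abs(rg_value - static_value) / denom


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def strength_of_agreement(ccc):
    """Map a CCC value onto the four-level scale quoted in the text."""
    if ccc > 0.99:
        return "high"
    if ccc > 0.95:  # boundary handling is an assumption
        return "substantial"
    if ccc >= 0.90:
        return "moderate"
    return "poor"


# Hypothetical paired feature values (3D versus RG) for three patients.
static = [1.0, 2.0, 3.0]
gated = [2.0, 3.0, 4.0]
ccc = concordance_cc(static, gated)
```

In this toy example the two series are perfectly linearly related but offset by a constant, so the CCC falls well below 1; unlike Pearson's R, the CCC penalizes systematic shifts between protocols.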

RG (4D) PET Phase Analysis

The previously described procedure of image feature extraction was applied to all RG PET bins. Mean percent difference was used to compare features between phases [Equation (3)], where i represents the bin, j represents the specific image feature, TV represents the value for bin i and feature j, and μ represents the mean value for image feature j. Image feature values were also normalized by the average value across all bins [Equation (4)]. The subscript definitions for Equation (3) also apply to Equation (4). In addition, a paired, two-tailed t test was applied to RG inhale (phase 1) feature data and RG exhale (phase 5) feature data to compare the two data sets. The CCC was calculated for phase 1 and phase 5 of the feature data to determine correlation between the two data sets.
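A minimal sketch of the phase analysis, assuming that the percent difference of each bin is the absolute deviation from the across-bin mean expressed as a percentage of that mean, and that normalization divides each bin value by the same mean. Equations (3) and (4) are not reproduced in this text, so the use of absolute values and the function name are assumptions.

```python
def phase_variability(bin_values):
    """Compare one feature across RG bins against its across-bin mean.

    Returns (percent differences per bin, normalized values per bin).
    ASSUMPTION: deviations are taken as absolute values.
    """
    mu = sum(bin_values) / len(bin_values)
    percent = [100.0 * abs(v - mu) / abs(mu) for v in bin_values]
    normalized = [v / mu for v in bin_values]
    return percent, normalized


# Hypothetical values of one feature in three respiratory bins.
percent, normalized = phase_variability([9.0, 10.0, 11.0])
mean_percent = sum(percent) / len(percent)
```

Normalized values near 1 in every bin indicate a feature that is insensitive to respiratory phase.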

Long Axis Calculation and Rotation Analysis

The long axis length (through the center of mass) was calculated with an internally developed program for each bin of the RG cycle (PET only). The tumor’s center of mass location was calculated for the inhale and exhale phases of the RG PET image sets [Equation (5)], where CM (x, y, z) is the center of mass for a tumor in a PET phase and I is the number of counts per voxel i. The center of mass motion (CMM) was calculated as the displacement between the center of mass for the inhale phase and the center of mass for the exhale phase. The differences in long axis length and CMM were used to assess changes in internal tumor morphology. Tumor angle was defined as the angle between the long axis of the tumor and the XY plane (Figure 2). A Pearson’s correlation test was applied to identify correlation in tumor angle and long axis length between inhale and exhale images.
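The geometric quantities in this subsection can be sketched as follows: a count-weighted center of mass, the CMM as a Euclidean displacement between two centers of mass, and the angle between a long-axis segment and the XY plane. The voxel representation, the function names, and the sign convention of the angle (positive when the segment rises in z from its first to its second endpoint) are assumptions; the internally developed program is not described at this level of detail.

```python
from math import atan2, degrees, sqrt


def center_of_mass(voxels):
    """Count-weighted center of mass of a list of (x, y, z, counts) voxels."""
    total = sum(c for *_, c in voxels)
    return tuple(sum(v[k] * v[3] for v in voxels) / total for k in range(3))


def displacement(a, b):
    """Euclidean distance between two 3D points (used here for the CMM)."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def angle_to_xy_plane(p, q):
    """Angle in degrees between the segment p -> q and the XY plane.

    ASSUMPTION: the sign follows the z direction from p to q.
    """
    dx, dy, dz = (q[k] - p[k] for k in range(3))
    return degrees(atan2(dz, sqrt(dx * dx + dy * dy)))


# Hypothetical tumor voxels (coordinates in mm, counts per voxel).
inhale = [(0.0, 0.0, 0.0, 1.0), (4.0, 0.0, 0.0, 3.0)]
exhale = [(0.0, 4.0, 0.0, 1.0), (4.0, 4.0, 0.0, 3.0)]
cmm = displacement(center_of_mass(inhale), center_of_mass(exhale))
```

Because the center of mass is weighted by counts, a redistribution of uptake within the tumor can shift it even if the tumor outline does not move.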

Figure 2

Tumor rotation calculation method. First, the tumor volume is delineated at exhale (phase 1) and inhale (phase 5) on RG PET images. Second, the center of mass of each volume is calculated. Third, the long axis length (longest diameter) through the center of mass of the tumor is calculated. Then, the angle between the long axis and the XY plane is calculated. This angle is compared between exhale (phase 1) and inhale (phase 5) to determine the pseudotumor rotation.

Results

3D and RG PET/CT Image Feature Analysis

Features from both PET and CT images demonstrated dependency on whether the acquisition was 3D, which is conventional (also called static), or RG (4D), where the coincidence counts are binned in multiple phases/bins composing the respiratory cycle. Large differences in some features were found between 3D PET/CT and one of the phases/bins of the corresponding RG data set. The percent differences between 3D and RG modalities were usually larger for CT than for PET. For PET, 10 of 56 features had a percentage difference (between 3D PET and RG PET for each patient) of less than 5% for more than half of the cases. In comparison, 11 of 56 CT features had a percentage difference (between 3D CT and RG CT for each patient) of less than 5% for more than half the cases. The percent differences between 3D PET and RG PET varied from 0% to 193%. The outlier of 193% was kurtosis. For 4 of 17 cases, kurtosis demonstrated the greatest percent difference between 3D and RG PET. Image feature average differences between 3D PET and RG PET are shown in Table 3.

Table 3

Features Presenting Average Differences between 3D and RG PET Image Features

< 2% Difference	< 5% Difference	< 10% Difference	< 15% Difference	< 20% Difference	> 50% Difference
SRE	Sphericity	Surface area/volume	Volume	V10-V90	Minimum intensity
–	Spherical disproportion	Compactness	Surface area	Contrast (1st order)	Mean intensity
–	Entropy (1st order)	Convexity	Long axis	Co-occurrence mean	Kurtosis
–	Information measure of correlation 2	Entropy (2nd order)	Short axis	Sum average	TGV
–	RPC	Sum entropy	Local homogeneity (1st order)	Information measure of correlation 1	RMS
–	–	Difference entropy	Difference average	–	I30
–	–	–	Difference variance	–	I10-I90
–	–	–	–	–	LRLGE

Percent differences between 3D CT and RG CT varied from 0% to 176%, kurtosis again being the outlier. Figure 3 shows selected feature percent differences, and Table 4 shows image feature average differences between 3D and RG CT. Forty-six percent of the CT features between 3D CT and RG CT presented average percent differences larger than 20%. In some cases, average percent differences were larger than 50%. Table 5 displays the number and percent of total features with specific percent differences for CT, PET, and PET RG phases.

Figure 3

Average differences between 3D and RG image features. % Diff_i^{3D/RG} between selected image features from 3D PET/CT and RG PET/CT.

Table 4

Features Presenting Average Differences between 3D and RG CT Image Features

< 2% Difference	< 5% Difference	< 10% Difference	< 15% Difference	< 20% Difference	> 50% Difference
Minimum intensity	Mean intensity	Convexity	Surface area/volume	Volume	Kurtosis
SRE	RMS	LRE	Sphericity	SD	TGV
–	I30	–	Compactness	Coefficient of variation	V70
–	RPC	–	Spherical disproportion	I10-I90	V80
–	–	–	Difference entropy	Local homogeneity (2nd order)	Energy (1st order)
–	–	–	–	Sum average	Cluster shade
–	–	–	–	–	Cluster prominence
–	–	–	–	–	Co-occurrence mean
–	–	–	–	–	Co-occurrence variance
–	–	–	–	–	GLNU
–	–	–	–	–	RLNU

Table 5

Percent Differences (% Diff_i^{3D/RG}) between Image Features of 3D and RG, PET and CT Images, and Conglomerate Image Features of RG PET Phases for All Cases (% Diff_ij^{Mean})

	CT		PET		PET RG Phases
Percent Difference	No. of Features⁎(1288 Total)	% Total Features⁎	No. of Features⁎(952 Total)	% Total Features⁎	No. of Features⁎(9464 Total)	% of Total Features⁎
< 5%	342	26.6%	249	26.2%	5051	53.4%
< 10%	498	38.7%	405	42.5%	7258	76.7%
< 15%	617	47.9%	515	54.1%	8043	85.0%
< 20%	697	54.1%	585	61.4%	8410	88.9%
> 20%	591	45.9%	367	38.6%	998	10.5%

⁎ Total number of features refers to 56 image features per tumor.

Overall, 249 of 952 (26.2%) of all PET features (56 features per patient) had a percent difference of less than 5% between 3D and RG protocols, whereas 342 of 1288 (26.6%) of all CT features (56 features per patient) had a percent difference of less than 5% between 3D and RG scans. Table 6 shows common features with percent differences between 3D and RG protocols for all cases and for both PET and CT modalities.
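The totals and percentages quoted here can be checked with simple arithmetic. The sketch below assumes that the CT total corresponds to 23 tumors and the PET total to the 17 tumors remaining after six PET lesions were excluded, each with 56 features; this reading is inferred from the reported totals rather than stated explicitly.

```python
FEATURES_PER_TUMOR = 56

# ASSUMPTION: 23 CT tumors; 23 - 6 = 17 PET tumors after exclusions.
ct_total = 23 * FEATURES_PER_TUMOR
pet_total = (23 - 6) * FEATURES_PER_TUMOR


def percent_of_total(count, total):
    """Share of feature comparisons in a category, to one decimal place."""
    return round(100.0 * count / total, 1)


ct_under_5 = percent_of_total(342, ct_total)
pet_under_5 = percent_of_total(249, pet_total)
phase_under_5 = percent_of_total(5051, 9464)
```

Under these assumptions the products reproduce the 1288 and 952 totals of Table 5, and the three ratios round to 26.6%, 26.2%, and 53.4%.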

Table 6

Image Features with Common Average Differences in 3D/RG PET and CT

Percent Difference	Common Features
< 2%	SRE
< 5%	–
< 10%	Convexity, 1st and 2nd order entropy, sum entropy, LRE, RPC
< 15%	Surface area/volume, sphericity, compactness, spherical disproportion, difference entropy, information measure of correlation 2
< 20%	Volume, long axis length,V10-V90, sum average
> 50%	Kurtosis, TGV

According to the CCC strength-of-agreement scale by McBride et al., PET and CT feature subtypes demonstrated poor correlation between 3D and RG images [49] (Figure 4). This was demonstrated by CCC strength-of-agreement values less than 0.90 for each feature subtype (shape, first order, GLCM, and RLM). However, there were specific features that demonstrated substantial strength of agreement. These were from the shape and first-order features in PET and shape features only in CT.

Figure 4

Concordance correlation coefficients for each feature with mean and standard deviation for each feature subtype for (A) 3D/4D CT and (B) 3D/4D PET.

The paired, two-tailed t test for 3D PET and RG PET features revealed 17 PET features with P values < .05 (indicating that these data sets are different). The t test for 3D CT and RG CT features revealed 12 CT features with P values < .05. Features with P values < .05 for both PET and CT were entropy (first order), compactness, and information measure of correlation 1 and 2. Results indicated a weak dependency (relative to the differences between 3D and RG presented above) of all PET features on respiration phase in RG scans of 10 phases (Figure 5). The most robust features (less than 5% difference among RG phases) belonged to select features from all categories (shape descriptors, and first- and second-order features). Sphericity, spherical disproportion, information measure of correlation 2, SRE, and LRE were within 10% difference of the average value for all cases across all phases. Normalized image features across 10 phases for RG PET demonstrated that, for all patients, 77% (7258:9464) of image features (56 features per phase per patient) varied less than 10% from the average values and 10.5% (998:9464) demonstrated more than 20% difference from average values (Table 5). Features with the largest difference (> 50%) were kurtosis, Low Gray-level Run Emphasis (LGRE), Short Run Low Gray-level Emphasis (SRLGE), and Long Run Low Gray-level Emphasis (LRLGE). The paired, two-tailed t test for RG PET inhale and RG PET exhale feature data revealed one PET feature, namely, short axis length, with P value < .05. The CCC revealed that the shape features had the highest CCC strength of agreement between image data sets from phase 1 and phase 5 (mean CCC strength of agreement, 0.95; moderate). First-order features and GLCM had mean strength-of-agreement values of 0.93 (moderate), and RLM features exhibited mean CCC strength of agreement of 0.86 (poor).

Figure 5

Feature dependency on respiration phase for selected features. (A) Normalized GLNU across 10 phases of RG PET image sets. (B) Normalized correlation across 10 phases of a RG PET image set.

Overall Feature Results

In comparisons of results among respiratory-phases and 3D-to-RG PET features, we concluded that the features with the least variability overall for PET images were sphericity, spherical disproportion, first-order entropy, information measure of correlation 2, and SRE. Features demonstrating the greatest variability were kurtosis and LRLGE. For CT images, features with the least variability were minimum intensity, mean intensity, RMS, SRE, and RPC, whereas features with the greatest variability were kurtosis, V70, V80, energy (first order), cluster shade, cluster prominence, co-occurrence mean, co-occurrence variance, Gray-level Nonuniformity (GLNU), and Run Length Nonuniformity (RLNU).

Long Axis Tumor Length, Rotation, and CMM

The long axis tumor length and rotation results demonstrated that tumors exhibited deformation over RG phases (Table 7). A Pearson’s correlation test demonstrated that there was a weak correlation between the tumor angle with respect to the XY plane at inhale and the same angle in the corresponding 3D image (R = 0.350) and a weak correlation between the tumor angle at the exhale phase and the corresponding 3D image (R = 0.319). There was a weak correlation between 3D image tumor volume and the 3D image tumor angle (R = −0.399), and long axis length was not correlated to the breathing cycle. Table 7 shows that the long axis length of the tumor was inconsistent across inhalation phase (phase 1), 3D scan, and exhalation phase (phase 5). The long axis lengths of the tumor for 3D, phase 1, and phase 5 were highly correlated (3D and phase 1: R = 0.936; 3D and phase 5: R = 0.954; phase 1 and phase 5: R = 0.986); however, long axis lengths between phase 1 and phase 5 differed by more than 5 mm in some cases, indicating possible changes in tumor shape during the respiratory cycle. The largest difference was case 11 with long axis lengths of 124.5 mm and 139.9 mm for phase 1 and phase 5, respectively, whereas the long axis length for 3D was 147.4 mm. There was a weak to moderate correlation between tumor angle at the inhale phase and the exhale phase (R = 0.438), indicating tumor rotation during the respiratory cycle. Moreover, the long axis angle also changed in some cases from positive to negative, indicating tumor rotation. There was also a weak to moderate negative correlation between average percent difference in 3D and RG images (in PET) and center of mass motion (R = −0.445).

Table 7

Long Axis Lengths of Lung Tumors on 3D PET Images and RG PET Images at Exhale and Inhale

Case	Length, 3D (mm)	Length, Exhale (mm)	Length, Inhale (mm)	Angle, 3D	Angle, Exhale	Angle, Inhale	Volume, 3D (cc)	Volume, Exhale (cc)	Volume, Inhale (cc)	CMM (mm)
1	31.58	46.29	35.65	18.10	−25.08	−39.95	12.57	12.79	12.36	3.70
2	67.19	69.78	66.97	−25.98	−48.58	−22.99	45.77	40.45	40.02	4.21
3	62.73	76.31	71.78	38.72	0.00	−43.10	82.56	81.77	77.24	6.22
4	24.67	25.85	24.92	7.62	−14.65	−7.54	4.69	4.44	4.14	1.87
5	41.00	41.98	40.38	−13.84	22.92	34.53	23.47	24.94	21.81	13.30
6	49.21	44.18	43.71	−36.73	31.20	−8.61	30.41	30.18	29.75	2.99
7	125.95	133.22	126.72	−65.32	−68.87	−61.33	140.78	119.62	113.54	3.35
8	55.03	46.76	46.66	−40.82	−24.81	−57.25	24.84	20.84	19.90	1.29
9	28.11	21.67	21.96	13.45	26.91	−17.33	6.45	4.38	4.31	0.28
10	29.05	24.78	21.96	19.74	−15.30	17.33	6.85	4.67	4.74	1.71
11	147.42	124.53	139.92	−36.79	6.03	−37.42	571.04	419.89	427.80	0.16
12	63.53	68.11	63.27	27.60	22.59	14.98	64.35	58.92	53.38	4.08
13	46.96	48.25	55.16	16.17	−7.79	−62.77	26.21	24.21	28.17	2.05
14	54.12	54.02	54.02	17.58	14.01	14.01	33.25	33.34	32.91	0.49
15	10.94	47.98	47.98	0.00	−15.82	−15.82	35.40	27.38	30.46	0.29
16	24.59	20.40	19.86	23.51	−18.70	−19.23	6.75	3.81	2.80	4.60
17	41.55	47.61	39.99	18.35	−33.33	19.09	26.01	22.70	22.99	2.84

Angles are relative to the XY plane.
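
The length correlations and phase-to-phase length differences discussed for Table 7 can be recomputed from the table's three length columns. The following is a minimal Python sketch, not the authors' analysis code: the `pearson` helper and all variable names are illustrative assumptions, and the values are transcribed from the table as printed here. Because the table lists 17 cases, the computed coefficients need not match the values reported in the text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Long axis lengths (mm) for cases 1-17, transcribed from Table 7.
length_3d = [31.58, 67.19, 62.73, 24.67, 41.00, 49.21, 125.95, 55.03, 28.11,
             29.05, 147.42, 63.53, 46.96, 54.12, 10.94, 24.59, 41.55]
length_exhale = [46.29, 69.78, 76.31, 25.85, 41.98, 44.18, 133.22, 46.76, 21.67,
                 24.78, 124.53, 68.11, 48.25, 54.02, 47.98, 20.40, 47.61]
length_inhale = [35.65, 66.97, 71.78, 24.92, 40.38, 43.71, 126.72, 46.66, 21.96,
                 21.96, 139.92, 63.27, 55.16, 54.02, 47.98, 19.86, 39.99]

r_exhale_inhale = pearson(length_exhale, length_inhale)
r_3d_exhale = pearson(length_3d, length_exhale)
r_3d_inhale = pearson(length_3d, length_inhale)

# Cases whose exhale and inhale long axis lengths differ by more than 5 mm.
cases_over_5mm = [
    case for case, (ex, inh) in enumerate(zip(length_exhale, length_inhale), start=1)
    if abs(ex - inh) > 5.0
]

print(f"R(exhale, inhale) = {r_exhale_inhale:.3f}")
print(f"R(3D, exhale)     = {r_3d_exhale:.3f}")
print(f"R(3D, inhale)     = {r_3d_inhale:.3f}")
print("Cases differing by > 5 mm:", cases_over_5mm)  # -> [1, 7, 11, 13, 17]
```

The largest exhale-to-inhale difference in this listing is case 11 (about 15.4 mm), in line with the case singled out in the text.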

Discussion

RG PET scans can provide a “snapshot” of the tumor within a phase along the breathing cycle, thereby greatly reducing the effects of motion on a tumor’s shape, volume, and image feature values. In contrast, 3D PET scans convolve the absorbed activity distribution with the motion/deformation pattern that a tumor and its surroundings experience during multiple respiration cycles [50]. Consequently, a 3D (static) PET may fail to provide accurate position, volume, and absorbed activity distribution for a mobile tumor. This is especially true in the thoracic region and other regions with substantial internal motion. This agrees with Adams’ finding that respiratory motion affects the SUV with changes up to 30% and that any moving lesion would be inaccurately measured because of the effects of blurring [51]. In addition, patient motion/breathing is known to cause image artifacts due to a mismatch in registration between CT attenuation correction and emission scans [38]. Internal motion, as the results support, notably affected the image feature values of PET and CT images. The percent differences between 3D and RG CT were generally greater than those in PET. Because CT images have higher spatial resolution than PET images, they contain more voxels for texture formation and are thus more sensitive to motion. Moreover, 3D CT may be affected by motion depending on the acquisition protocol [52]. In addition to the affine tumor motion caused by respiration, we identified deformation of tumors (characterized by varying tumor axis lengths and angles with respect to the XY plane between 3D PET, RG PET at inhale, and RG PET at exhale). Conceivably, rotations and deformations also affect image feature values. Our results demonstrated a weak correlation between the long axis angles of RG images at inhalation and exhalation.
There was also an inconsistency of long axis length between 3D images, RG images at inhale, and RG images at exhale, thus indicating that tumor shape and rotation varied between phases. The degree to which rotations and/or deformations affect image features, and in particular texture values, requires further investigation. Interestingly, Pearson’s correlation tests showed no correlation of CMM, tumor volume, or long axis length with 3D/RG feature value differences. There was, however, a weak to moderate correlation between CMM and average percent difference. Nonetheless, it is clear from our data that the feature value differences between RG phases are smaller than the differences between 3D images and RG images at a given phase. In other words, the rotational motion and/or deformation of the tumors in our patient cohort had a smaller effect on image feature values than the averaging effects of the static acquisition. Yip et al. also investigated variability of texture features between 3D and RG imaging [36]. In contrast to our study, they tested only five image features (contrast, busyness, coarseness, maximal correlation coefficient, and long run low gray). They found that differences between 3D and RG PET were significant [36] after having accounted for noise differences due to different acquisition times. This agrees with our findings that certain features (e.g., kurtosis and LRLGE) demonstrated large variability between 3D and RG protocols. There were, however, certain features in our study (e.g., SRE, first-order entropy, and RPC) that did not demonstrate large variability between protocols. Figure 3 and Tables 3 and 5 show differences between feature values from 3D and RG protocols. The features with the smallest change across PET for all RG bins and for 3D PET were sphericity, spherical disproportion, entropy (first and second order), sum entropy, information measure of correlation 2, SRE, LRE, and RPC.
Interestingly, a study by Galavis et al. on the variability of PET texture features caused by different acquisition modes and reconstruction parameters demonstrated that first-order entropy exhibited small variation (≤ 5%), whereas second-order entropy and sum entropy exhibited intermediate variability (10%-25%) [31]. Our results were comparable, showing that first-order entropy exhibited variation smaller than 5% and that second-order entropy and sum entropy exhibited less than 10% difference between 3D and RG PET protocols. Sum entropy, second-order entropy, and the information measure of correlation 2 are based on entropy calculations, which measure randomness in a pattern. A portion of the randomness can be attributed to the noise intrinsic to the scanner, whereas the remainder can be attributed to statistical differences in counts (quantum noise). Because percentage image noise is given by $100/\sqrt{N}$, where $N$ is the count density (counts/cm²), 3D images, which accumulate more counts, are less noisy than RG images. Thus, 3D/RG feature differences are a combination of both tumor motion and count statistics. This suggests that it would be informative to normalize for count density. Unfortunately, this study was retrospective and list-mode data were not accessible for normalization. Nevertheless, the number of counts and therefore the noise among RG images from the 10 phases can be assumed to be similar. Therefore, the differences in feature values from phase to phase may be attributed to the effect of motion and/or deformation. The features LRE, SRE, and RPC, which demonstrated small change across PET for all RG bins and 3D PET, are features from the RLM. LRE measures the long run emphasis distribution. Correspondingly, SRE measures the short run emphasis distribution. Run percentage is the ratio of the number of runs to the number of pixels in an image (Table 2).
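
To make the three RLM features concrete, the sketch below computes SRE, LRE, and RPC for a small 2D image. It is an illustration under stated assumptions rather than the implementation used in this study: it uses the commonly cited run-length definitions (SRE and LRE weight each run by the inverse square and the square of its length, respectively, normalized by the number of runs), it counts runs in the horizontal direction only instead of averaging over 13 directions, and the function names and the toy image are hypothetical.

```python
from collections import Counter
from itertools import groupby

def run_length_counts(image):
    """Count horizontal runs; maps (gray_level, run_length) -> number of runs."""
    counts = Counter()
    for row in image:
        for level, group in groupby(row):
            counts[(level, len(list(group)))] += 1
    return counts

def run_length_features(image):
    """Return SRE, LRE, and RPC for one direction (horizontal) of a 2D image."""
    counts = run_length_counts(image)
    n_runs = sum(counts.values())
    n_pixels = sum(len(row) for row in image)
    # Short runs dominate SRE (weight 1 / length^2); long runs dominate LRE.
    sre = sum(c / length ** 2 for (_, length), c in counts.items()) / n_runs
    lre = sum(c * length ** 2 for (_, length), c in counts.items()) / n_runs
    # Run percentage: number of runs divided by number of pixels.
    rpc = n_runs / n_pixels
    return {"SRE": sre, "LRE": lre, "RPC": rpc}

# Hypothetical 4 x 4 image with three gray levels; it contains 7 horizontal runs.
image = [
    [1, 1, 1, 1],
    [1, 2, 2, 3],
    [3, 3, 3, 3],
    [2, 2, 1, 1],
]

features = run_length_features(image)
print(features)  # RPC is 7 runs / 16 pixels = 0.4375
```

An image made of many short runs pushes SRE and RPC up and LRE down, whereas smoother images do the opposite, which is why motion blur could in principle shift all three.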
We conclude that the cumulative number and length of short runs and cumulative number and length of long runs do not vary significantly between 3D and RG images and that the total number of runs does not vary significantly between 3D and RG images. These conclusions may depend on the algorithms used to calculate these features. For example, in this paper, we averaged runs from 13 directions; other definitions are possible. Features that showed large differences (> 50%) between 3D and RG in PET and CT images were typically features from intensity volume histograms such as kurtosis and TGV. Thus, the intensity histogram distributions between 3D and RG features were quite different in terms of symmetry about their means and the degrees of “peakedness” of their distributions. Cluster shade and cluster prominence exhibited large differences in CT. These features measure the skewness of the GLCM [53]. According to Ion, a high cluster shade value reveals an asymmetric image [53]. Overall, it is clear that image feature values are different between 3D and RG images. As discussed above, this is due to the smearing effects of tumor motion (affine and nonaffine) and to noise intrinsic to image acquisition, with the former apparently having a larger effect [36]. This is also supported by the relatively small variation in feature values from different phases of the RG scans even though the tumor VOIs varied from phase to phase because of motion and deformation. Thus, the motion convolved into the 3D images seems to have a greater effect on feature values than noise, given that the RG images are intrinsically noisier because of lower counts (shorter acquisition times). This study suggests that it would be important to account for motion in quantitative image feature analysis, regardless of modality (PET or CT), as attempted by other investigators [23].
Alternatively, if the definition of any one feature includes details of the acquisition protocol, then 3D and RG features may be treated as “different” sets of features. Further studies are needed to elucidate the potential usefulness of this alternative definition.
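
The count-statistics argument in this discussion can be illustrated numerically. The sketch below assumes Poisson counting noise, for which percentage noise scales as 100/√N, and further assumes that a gated acquisition divides the total counts equally among its 10 phase bins. The count value and the function name are hypothetical and are not taken from this study's data.

```python
from math import sqrt

def percent_noise(counts):
    """Percentage noise for a Poisson-distributed count value N: 100 / sqrt(N)."""
    return 100.0 / sqrt(counts)

total_counts = 1_000_000  # hypothetical counts in the static (3D) acquisition
n_bins = 10               # number of respiratory phases in the gated acquisition

noise_3d = percent_noise(total_counts)
# Assumption: each gated bin receives an equal share of the total counts.
noise_rg = percent_noise(total_counts / n_bins)
ratio = noise_rg / noise_3d

print(f"3D noise: {noise_3d:.3f}%")
print(f"RG noise per bin: {noise_rg:.3f}%")
print(f"RG / 3D noise ratio: {ratio:.3f}")  # sqrt(10), about 3.162
```

Under these assumptions each gated bin is about 3.2 times noisier than the static image, regardless of the total count value, while the 10 bins are about equally noisy relative to one another.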

Limitations

Though our results clearly demonstrated that image feature values were different between 3D and RG protocols, the study had limitations. First, we were unable to normalize for count density between 3D and RG protocols. Another limitation was that the uptake time did not conform to the protocol, mainly because of clinical logistics. Also, partial volume effects were not taken into account: because the 3D and RG data for each patient were acquired on the same scanner, partial volume effects were expected to be similar in both sets of images except for the effect of motion. In addition, binning artifacts and breathing irregularities were assumed negligible because only patients with regular breathing patterns are candidates for RG PET for radiotherapy at our institution [54]. Another limitation was that 4D PET received 3D CT attenuation correction; this is currently standard procedure at our institution. Lastly, our patient cohort was small but comparable in size to those of other published studies [22], [23], [36]. We plan to address these limitations in future studies.

Conclusions

This study investigated the variation of image features between 3D and RG PET/CT images of lung tumors. To our knowledge, this is the first study that evaluates how 3D and RG acquisitions affect a large number of image features currently being used and tested in several medical applications. The data showed that image feature analysis using a static acquisition (3D) versus an RG acquisition (to account for motion of the ROI) revealed notably different feature values. The results suggest that these differences are mainly due to the effect that respiratory motion has on image features. We also conclude that rotational motion and deformation of the tumor affect the features of an image. However, the effect of rotational motion and deformation from phase to phase appears to be smaller than the averaging/smearing effects of static acquisition. In sum, this study calls attention to the differences in 3D and RG image feature values for mobile tumors. The predictive and/or prognostic power of RG versus 3D image feature values will be explored in future studies.
