Yoganand Balagurunathan, Matthew B Schabath, Hua Wang, Ying Liu, Robert J Gillies.
Abstract
Pulmonary nodules are frequently detected radiological abnormalities in lung cancer screening. Although nodules at the highest and lowest risk for cancer are often easily diagnosed by a trained radiologist, there is still a high rate of indeterminate pulmonary nodules (IPN) of unknown risk. Here, we test the hypothesis that computer-extracted quantitative features ("radiomics") can provide improved risk assessment in the diagnostic setting. Nodules were segmented in 3D, and 219 quantitative features were extracted from these volumes. Using these features, novel malignancy risk predictors were formed with various stratifications based on size, shape and texture feature categories. We used images and data from the National Lung Screening Trial (NLST) and curated a subset of 479 participants (244 for training and 235 for testing) that included incident lung cancers and nodule-positive controls. After removing redundant and non-reproducible features, optimal linear classifiers, evaluated by the area under the receiver operating characteristic curve (AUROC), were used with an exhaustive search approach to find a discriminant set of image features, which were validated in an independent test dataset. We identified several strong predictive models: using size and shape features, the highest AUROC was 0.80; using non-size-based features, the highest AUROC was 0.85; and combining features from all the categories, the highest AUROC was 0.83.
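The exhaustive search described in the abstract can be illustrated with a minimal sketch. It assumes an unweighted sum of oriented z-scores as a stand-in linear score, which is not the classifier used in the study, and it runs on fabricated toy values rather than NLST data. The function names (`auroc`, `oriented`, `exhaustive_search`) and the feature names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def auroc(scores, labels):
    """AUROC via the Mann-Whitney statistic: the fraction of (cancer, benign)
    pairs ranked correctly, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def standardize(column):
    """Z-score one feature column (constant columns map to zeros)."""
    mu, sd = mean(column), pstdev(column)
    return [(v - mu) / sd if sd else 0.0 for v in column]


def oriented(column, labels):
    """Flip a standardized feature so that higher values point toward cancer."""
    z = standardize(column)
    return z if auroc(z, labels) >= 0.5 else [-v for v in z]


def exhaustive_search(features, labels, max_size=2):
    """Score every feature subset up to max_size and return (best AUROC, subset).

    Stand-in linear score: unweighted sum of oriented z-scores (an assumption,
    not the classifier used in the study).
    """
    cols = {name: oriented(col, labels) for name, col in features.items()}
    best = (0.0, ())
    for k in range(1, max_size + 1):
        for subset in combinations(sorted(cols), k):
            scores = [sum(vals) for vals in zip(*(cols[f] for f in subset))]
            best = max(best, (auroc(scores, labels), subset))
    return best


# Toy data, fabricated for illustration only (not NLST values).
labels = [1, 1, 1, 0, 0, 0]
features = {
    "longest_diameter": [14.0, 9.0, 11.0, 6.0, 10.0, 5.0],
    "mean_hu": [-300.0, -450.0, -500.0, -480.0, -520.0, -600.0],
    "noise": [0.3, 0.9, 0.1, 0.8, 0.2, 0.5],
}
best_auc, best_subset = exhaustive_search(features, labels)
print(best_subset, round(best_auc, 3))
```

In this sketch the search selects and scores on the same samples, so the reported AUROC is optimistic; the study addresses this by validating the selected features in an independent test dataset.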
Year: 2019 PMID: 31189944 PMCID: PMC6561979 DOI: 10.1038/s41598-019-44562-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Systematic process followed in the study. (A) Cohort identification flowchart. (B) Quantitative imaging (radiomic) predictor workflow. N+/C+ indicates the presence of a nodule confirmed to be malignant, while N+/C− indicates the presence of a nodule that remains benign, following NLST inclusion criteria.
Demographic information for the training and test cohorts used in the study (for more details, refer to Supplemental Table S1).
| # | Category | Training: Overall | Training: Lung Cancer Cases | Training: Nodule + Controls | P-value | Testing: Overall | Testing: Lung Cancer Cases | Testing: Nodule + Controls | P-value |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Demographic Patients | 244 | 78 | 166 | | 235 | 88 | 147 | |
| 2 | | | | | | | | | |
| | | 171 | 32 | 139 | | 162 | 31 | 131 | |
| | | 56 | 38 | 18 | | 54 | 46 | 8 | |
| | | 17 | | | | 19 | | | |
| 3 | Age at Diagnosis (Yrs) (Mean (STD), Median) | 63.9 (5.2), 64 | 64.4 (5.4), 64.5 | 63.6 (5.1), 63 | | 63.3 (4.9), 63 | 63.3 (4.7), 63.5 | 63.2 (4.9), 63 | |
| 4 | Gender (Male:Female) | 1.12 | 1 | 1.18 | | 1.67 | 1.44 | 1.82 | |
| 5 | Male, N | 129 | 39 | 90 | | 147 | 52 | 95 | |
| 6 | Female, N | 115 | 39 | 76 | | 88 | 36 | 52 | |
| | Ratio (Male/Female) | 1.122 | 1 | 1.184 | | 1.671 | 1.44 | 1.826 | |
| 7 | Pack Years smoking (Mean (STD), Median) | 63.8 (25.2), 57 | 62.8 (25.7), 54.5 | 64.2 (24.9), 58 | | 61.8 (22.1), 55.5 | 62.5 (22.2), 61.5 | 61.4 (22.1), 54 | |
| 8 | Race (White/Black/Native/Asian/Others), N | 238/3/0/3/0 | 77/1/0/0/0 | 161/2/0/3/0 | | 221/7/3/0/4 | 82/5/0/0/1 | 139/2/3/0/3 | |
| | Race (White/Black/Native/Asian/Others), Percent (%) | 97/1.23/0/1.23/0 | 98.7/1.28/0/0/0 | 96.98/1.21/0/1.81/0 | | 94/2.97/1.28/0/1.7 | 93.18/5.68/0/0/1.14 | 94.56/1.36/2.04/0/2.04 | |
| 9 | Smoke status (current/former), N (ratio) | 130/114 (1.140) | 35/43 (0.813) | 95/71 (1.338) | | 125/110 (1.136) | 50/38 (1.315) | 75/72 (1.042) | |
The p-values were computed using Student's t-test for continuous variables and Fisher's exact test for categorical data.
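As an illustration of the categorical comparison named above, the following is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2x2 table. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule (summing all tables no more likely than the observed one) is one common convention that is assumed here. The example counts are the training-cohort male and female counts from the demographic table; the resulting value is not claimed to match the p-values reported by the authors.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    cutoff = prob(a) * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= cutoff))


# Training cohort, sex by case status, using the counts in the table above:
# cases 39 male / 39 female, controls 90 male / 76 female.
p_sex_training = fisher_exact_two_sided(39, 39, 90, 76)
print(f"Fisher exact p (sex, training cohort): {p_sex_training:.3f}")
```

Where SciPy is available, `scipy.stats.fisher_exact` and `scipy.stats.ttest_ind` provide the same tests without hand-written code.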
Qualitative comparison of CT reconstruction kernel across scanner manufacturers.
| Manufacturer | Followed Choice | Alternative Choice |
|---|---|---|
| GE | Standard | Lung/Bone |
| SIEMENS | B30f | B50f |
| PHILIPS | C | D |
| TOSHIBA | FC01/FC10 | FC30/FC51 |
Figure 2. Representative 2D slices of patient LDCT with 3D rendering of the pulmonary nodule of interest, where each row corresponds to a patient. (A) Patient with a confirmed malignant nodule. (B) Patients with benign nodules, as reported by NLST.
Top image features discriminating cancer from benign nodules, using image features across categories.
| All Categories (129 Reproducible, Non-Redundant Features) | |||||
|---|---|---|---|---|---|
| Features | Training | Testing | |||
| Broad description | Detailed feature | Image | Image + Clinical | Image | Image + Clinical |
| E[AUC], CI | E[AUC], CI | E[AUC], CI | AUC | ||
| Size, Shape and Texture | (F3: ShortAx; F4: Mn-Hu; F26: EllipticFit; F176: 3D-Laws-127) | 0.83 [0.71, 0.94] | 0.82, [0.68, 0.94] | 0.90 [0.89, 0.91] | 0.89 [0.88, 0.9] |
| Location, Texture | (F19: 10a-3D-Relat-Vol-Airspaces;F21: 10c-3D-Av-Vol-AirSpaces;F22: 10d-3D-SD-Vol-AirSpaces; F201:3D-WaveP2-L2-12) | 0.83 [0.70, 0.93] | 0.81, [0.67, 0.92] | 0.87 [0.86, 0.88] | 0.86 [0.85, 0.87] |
| Location, Size, Texture-Runlength | (F19: 10a-3D-Relat-Vol-Airspaces;F34: Volume-pxl; F37: Thickness-Pxl; F48: AvgGLN) | 0.80, [0.65, 0.92] | 0.80, [0.68, 0.89] | 0.89 [0.89, 0.90] | 0.88 [0.869, 0.893] |
|
| Longest Diameter | 0.76, [0.62, 0.93] | 0.85, [0.35, 0.95] | ||
| Volume | 0.78 [0.63, 0.91] | 0.87 [0.37, 0.96] | |||
Statistical estimates were computed using hold-out cross-validation for training and testing, randomized multiple times.
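The E[AUC] and CI columns in these tables come from repeated, randomized hold-out splits. A minimal sketch of that idea follows, under several assumptions that are not taken from the study: a single feature is used, "training" only learns the direction of that feature, the hold-out fraction is 30%, 200 repeats are drawn, and the interval is a simple 2.5 to 97.5 percentile range. The data are fabricated toy values and the function names are hypothetical.

```python
import random
from statistics import mean


def auroc(scores, labels):
    """Fraction of (cancer, benign) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_holdout_auc(values, labels, n_repeats=200, holdout_fraction=0.3, seed=0):
    """Return the mean hold-out AUC and a percentile interval over random splits."""
    rng = random.Random(seed)
    indices = list(range(len(values)))
    n_hold = max(2, round(holdout_fraction * len(indices)))
    aucs = []
    while len(aucs) < n_repeats:
        rng.shuffle(indices)
        hold, fit = indices[:n_hold], indices[n_hold:]
        # Skip splits in which either part is missing one of the two classes.
        if len({labels[i] for i in hold}) < 2 or len({labels[i] for i in fit}) < 2:
            continue
        # "Training" step (assumption): learn only the direction of the feature.
        fit_auc = auroc([values[i] for i in fit], [labels[i] for i in fit])
        sign = 1.0 if fit_auc >= 0.5 else -1.0
        aucs.append(auroc([sign * values[i] for i in hold], [labels[i] for i in hold]))
    aucs.sort()
    lower = aucs[int(0.025 * (n_repeats - 1))]
    upper = aucs[int(0.975 * (n_repeats - 1))]
    return mean(aucs), (lower, upper)


# Toy longest diameters in mm, fabricated for illustration (not NLST values).
diameters = [8, 12, 15, 9, 20, 11, 7, 14, 18, 10, 5, 6, 9, 4, 7, 12, 6, 5, 8, 10]
labels = [1] * 10 + [0] * 10
expected_auc, interval = repeated_holdout_auc(diameters, labels)
print(f"E[AUC] = {expected_auc:.2f}, interval = [{interval[0]:.2f}, {interval[1]:.2f}]")
```

Small hold-out sets give wide intervals, which is consistent with the broad training intervals seen in the tables for the narrower size ranges.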
Top image features discriminating cancer from benign nodules, using features in each category (size & shape; texture: co-occurrence, run length, pixel HU; texture: wavelets & Laws).
| Image-feature-based predictors across different descriptive features | ||||
|---|---|---|---|---|
| Features | Training | Testing | ||
| Image | Image + Clinical | Image | Image + Clinical | |
| E[AUC], CI | E[AUC], CI | E[AUC], CI | E[AUC], CI | |
| 0.80 [0.66, 0.91] | 0.78, [0.651, 0.92] | 0.86 [0.85, 0.86] | 0.85, [0.84, 0.86] | |
| 0.79, [0.65, 0.90] | 0.79, [0.60, 0.91] | 0.86, [0.85, 0.86] | 0.84 [0.83, 0.86] | |
| 0.85, [0.73, 0.95] | 0.84, [0.71, 0.94] | 0.88 [0.87, 0.88] | 0.87 [0.87, 0.88] | |
| 0.79, [0.59, 0.93] | 0.77, [0.64, 0.93] | 0.86 [0.84, 0.87] | 0.85 [0.84, 0.86] | |
| 0.761, [0.63, 0.92] | 0.75, [0.54, 0.87] | 0.79 [0.79, 0.80] | 0.774 [0.75, 0.79] | |
| 0.774, [0.62, 0.9] | 0.76, [0.61, 0.9] | 0.819 [0.82, 0.83] | 0.804 [0.79, 0.82] | |
Statistical estimates were computed using cross validation (hold out), repeated multiple times.
Top image features discriminating cancer from benign nodules across different size ranges, with mixed image categories.
| Different nodule size ranges. | ||||
|---|---|---|---|---|
| Features | Training | Testing | ||
| Image | Image + Clinical | Image | Image + Clinical | |
| E[AUC], CI | E[AUC], CI | E[AUC], CI | E[AUC], CI | |
| 0.75, [0.39, 0.96] | 0.74, [0.46, 0.97] | 0.78, [0.76, 0.79] | 0.75, [0.71, 0.78] | |
| 0.74, [0.47, 0.95] | 0.73, [0.48, 0.96] | 0.79 [0.77, 0.81] | 0.759, [0.72, 0.79] | |
| 0.595, [0.29, 0.87] | 0.67, [0.17, 0.86] | |||
| Volume | 0.589, [0.33, 0.87] | 0.68 [0.51, 0.69] | ||
| 0.70, [0.24, 0.96] | 0.69, [0.2, 0.95] | 0.61, [0.51, 0.65] | 0.65, [0.44, 0.65] | |
| 0.71, [0.37, 0.97] | 0.545, [0.24, 0.86] | 0.57, [0.50, 0.6] | 0.55, [0.44, 0.65] | |
| 0.53 [0.05, 0.90] | 0.57, [0.56, 0.57] | |||
| Volume | 0.57 [0.07, 0.95] | 0.65 [0.34, 0.66] | ||
Statistical estimates were computed using cross validation (hold out), randomized multiple times.
Top image features discriminating cancer from benign nodules across different size ranges, using features within each image category.
| Feature categories and nodule size ranges. | ||||
|---|---|---|---|---|
| Features | Training | Testing | ||
| Image | Image + Clinical | Image | Image + Clinical | |
| E [AUC], CI | E [AUC], CI | E [AUC], CI | E[AUC],CI | |
| | 0.62, [0.415, 0.811] | 0.61, [0.39, 0.82] | 0.71 [0.68, 0.72] | 0.66, [0.61, 0.69] |
| | 0.83, [0.63, 0.97] | 0.81, [0.61, 0.97] | 0.764 [0.74, 0.78] | 0.665 [0.02, 0.67] |
| | 0.61, [0.27, 0.89] | 0.64, [0.40, 0.86] | 0.649 [0.62, 0.66] | 0.58, [0.53, 0.625] |
| | 0.71, [0.19, 0.96] | 0.63, [0.25, 0.96] | 0.79 [0.72, 0.83] | 0.784, [0.70, 0.86] |
| (F187: Hist-Entropy-L1; F188: Hist-Kurt-L1; F42: AvgCoOc-Homo; F48: AvgGLN) | 0.67, [0.21, 0.96] | 0.57, [0.19, 0.88] | 0.68, [0.62, 0.715] | 0.659, [0.55, 0.76] |
| | 0.76, [0.42, 0.97] | 0.68 [0.28, 0.96] | 0.591 [0.55, 0.61] | 0.634 [0.57, 0.68] |
Statistical estimates were computed using cross validation (hold out), randomized multiple times.
Figure 3. Comparison of area under the receiver operating characteristic curve (AUC) and clinical net benefit curves using radiomic image markers to predict cancer status in lung nodules of different sizes. (A) All size ranges, (B) indeterminate range (4 to 12 mm), (C) larger range (>12 to 30 mm).
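For readers unfamiliar with net benefit curves, a minimal sketch follows. It assumes the standard decision-curve definition, net benefit = TP/N − (FP/N) × pt/(1 − pt) at threshold probability pt, which may differ in detail from the computation behind the figure. The predicted probabilities and labels are fabricated toy values, and the function names are hypothetical.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating every nodule whose predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every nodule regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predicted risks and outcomes, fabricated for illustration only.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.7, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0, 1]

for threshold in (0.2, 0.5):
    model_nb = net_benefit(probs, labels, threshold)
    all_nb = net_benefit_treat_all(labels, threshold)
    print(f"pt={threshold:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).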
Figure 4. Comparison of area under the receiver operating characteristic curve and clinical net benefit curves to predict cancer status of lung nodules in the indeterminate range (4 to 12 mm) using different categories of radiomic features: (A) C1: Size & Shape, (B) C2: Run length & Co-Occurrence, (C) C3: Texture (Wavelets & Laws).
Figure 5. Analysis workflow followed in the study. The extracted features are filtered for reproducibility and redundancy. The data are partitioned into three size groups based on the nodule's longest diameter and further stratified by feature category.
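The size partition in this workflow can be sketched as follows, using the ranges given in the Figure 3 caption (all sizes, 4 to 12 mm, and >12 to 30 mm). The record layout, the key `longest_diameter_mm`, the function names, and the handling of nodules outside both ranges (kept only in the "all" group) are assumptions made for illustration.

```python
def size_group(longest_diameter_mm):
    """Assign a nodule to a size group by its longest diameter in mm."""
    if 4 <= longest_diameter_mm <= 12:
        return "indeterminate"
    if 12 < longest_diameter_mm <= 30:
        return "larger"
    return "outside_ranges"  # assumption: such nodules appear only in "all"


def partition_by_size(nodules):
    """Split nodule records into the three size groups used for modeling."""
    groups = {"all": [], "indeterminate": [], "larger": []}
    for nodule in nodules:
        groups["all"].append(nodule)
        group = size_group(nodule["longest_diameter_mm"])
        if group in groups:
            groups[group].append(nodule)
    return groups


# Toy records, fabricated for illustration only.
nodules = [
    {"id": "n1", "longest_diameter_mm": 6.5},
    {"id": "n2", "longest_diameter_mm": 12.0},
    {"id": "n3", "longest_diameter_mm": 12.5},
    {"id": "n4", "longest_diameter_mm": 31.0},
]
groups = partition_by_size(nodules)
for name, members in groups.items():
    print(name, [n["id"] for n in members])
```

Each size group would then be modeled separately, once with features from all categories and once per feature category.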