Usman Bashir, Bhavin Kawa, Muhammad Siddique, Sze Mun Mak, Arjun Nair, Emma Mclean, Andrea Bille, Vicky Goh, Gary Cook.
Abstract
OBJECTIVE: Non-invasive distinction between squamous cell carcinoma and adenocarcinoma subtypes of non-small-cell lung cancer (NSCLC) may be beneficial to patients unfit for invasive diagnostic procedures or when tissue is insufficient for diagnosis. The purpose of our study was to compare the performance of random forest algorithms utilizing CT radiomics and/or semantic features in classifying NSCLC.
Year: 2019 PMID: 31166787 PMCID: PMC6636267 DOI: 10.1259/bjr.20190159
Source DB: PubMed Journal: Br J Radiol ISSN: 0007-1285 Impact factor: 3.039
Figure 1. Patient inclusion workflow in our study for training and validation data sets. ADCA, adenocarcinoma; NOS, not otherwise specified; NSCLC, non-small-cell lung cancer; SCCA, squamous cell carcinoma; TCIA, The Cancer Imaging Archive.
Nodule semantic features and their descriptions
| Feature | Description |
| Air bronchogram | Presence of visible air-filled bronchi within the lesion. Measured as being present or absent. |
| Ground-glass component | Presence of hazy attenuation, higher than background, but not sufficiently high to obscure bronchial and vascular margins within the lesion. |
| Location | Central or peripheral, based on whether the tumour was closer to the hilum than the nearest segmental bronchus or not. |
| Margins | Irregular, smooth, or lobulated. Lobulation was defined as the presence of at least three undulations with a height of more than 2 mm. |
| Pleural indentation | Retraction of pleura near the tumour margin. |
| Satellite nodules | Presence of smaller nodules in the immediate vicinity of the main lesion. |
| Spiculation | Presence of linear strands at least 2 mm thick extending from the tumour margin into adjacent parenchyma. |
| Cavitation | Presence of a round lucency inside the lesion, usually within the centre of the lesion and larger than pseudocavitation; suggests necrosis. |
| Pseudocavitation | Presence of bubble-like areas of low attenuation within the nodule. |
Clinical and demographic features of patients in training data set
| | ADCA | SCCA |
| Age in years, mean (range, SD) | 69 (40.2–84.75, 10.2) | 70.8 (52.35–85.54, 8.1) |
| Sex (M : F) | 32 : 32 | 24 : 18 |
| Smokers | 65.6% | 71.4% |
| T1a | 10 | 7 |
| T1b | 12 | 6 |
| T2a | 27 | 15 |
| T2b | 3 | 5 |
| T3 | 10 | 8 |
| T4 | 2 | 1 |
| N0 | 50 | 35 |
| N1 | 3 | 3 |
| N2 | 11 | 3 |
| N3 | 0 | 1 |
| M0 | 64 | 40 |
| M1 | 0 | 2 |
ADCA, adenocarcinoma; SCCA, squamous cell carcinoma; SD, standard deviation.
Frequencies of semantic features according to tumour type
| Feature | Category | ADCA | SCCA | p-value | Weighted-κ (95% CI) |
| Air bronchogram | Absent | 31 (48.44%) | 36 (85.71%) | <0.0001 | 0.34 (0.16 to 0.52) |
| | Present | 33 (51.56%) | 6 (14.29%) | | |
| Airway thickening | Absent | 31 (48.44%) | 15 (35.71%) | 0.2 | 0.44 (0.25 to 0.63) |
| | Present | 30 (46.88%) | 20 (47.62%) | | |
| Emphysema | Absent | 24 (37.5%) | 10 (23.81%) | 0.2 | 0.78 (0.69 to 0.86) |
| | Present | 20 (31.25%) | 16 (38.1%) | | |
| Ground-glass component | Absent | 50 (78.13%) | 42 (100%) | 0.0006 | 0.74 (0.54 to 0.94) |
| | Present | 14 (21.88%) | 0 (0%) | | |
| Location | Central third | 20 (31.25%) | 10 (23.81%) | 0.5 | 0.35 (0.16 to 0.55) |
| | Peripheral two-thirds | 44 (68.75%) | 32 (76.19%) | | |
| Margins | Irregular | 35 (54.69%) | 22 (52.38%) | 0.9 | 0.2 (0.04 to 0.35) |
| | Lobulated | 27 (42.19%) | 18 (42.86%) | | |
| | Smooth | 2 (3.13%) | 2 (4.76%) | | |
| Pleural indentation | Absent | 18 (28.13%) | 10 (23.81%) | 0.65 | 0.44 (0.24 to 0.63) |
| | Present | 46 (71.88%) | 32 (76.19%) | | |
| Satellite nodules | Absent | 50 (78.13%) | 41 (97.62%) | 0.004 | 0.74 (0.55 to 0.92) |
| | Present | 14 (21.88%) | 1 (2.38%) | | |
| Spiculation | Absent | 38 (59.38%) | 23 (54.76%) | 0.69 | 0.27 (0.11 to 0.42) |
| | Present | 26 (40.63%) | 19 (45.24%) | | |
| Cavitation | Absent | 63 (98.44%) | 34 (80.95%) | 0.002 | 0.78 (0.57 to 0.99) |
| | Present | 1 (1.56%) | 8 (19.05%) | | |
| Pseudocavitation | Absent | 51 (79.69%) | 39 (92.86%) | 0.09 | 0.23 (0.01 to 0.45) |
| | Present | 13 (20.31%) | 3 (7.14%) | | |
IQR, interquartile range; SD, standard deviation.
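The association between a binary semantic feature and tumour type can be checked from the counts in the table with a test on the 2×2 contingency table. The following is a minimal sketch, assuming a two-sided Fisher's exact test; this record does not state which test produced the reported p-values, so the result need not match the table exactly. The function name `fisher_exact_two_sided` is illustrative and not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]].

    Assumption: rows are feature categories (e.g. absent/present) and
    columns are tumour types (e.g. ADCA/SCCA).
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    # The small tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


# Air bronchogram counts from the table: rows absent/present, columns ADCA/SCCA.
air_bronchogram = [[31, 36], [33, 6]]
print(f"air bronchogram: p = {fisher_exact_two_sided(air_bronchogram):.2g}")
```

An exact test is a reasonable choice here because several cells are small (for example, one ADCA case with cavitation), where large-sample approximations are less reliable.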
Figure 2. Performance curves of RF models on test data (A) and training data (B) show that RF models containing radiomic features (i.e. RF-rad and RF-all) yielded perfect discrimination (AUC 1) on training data (B), but very poor discrimination (AUC 0.52 and 0.56, respectively) on test data (A), similar to a random guess (black line in A and B). RF-sem gave consistently good performance on training (B; AUC 0.78) as well as test data (A; AUC 0.82). AUC, area under the curve; RF, random forest.
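The AUC values quoted in the caption can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes AUC that way in plain Python; the function name `auc` and the example scores are hypothetical and are not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: predicted probabilities or scores for the positive class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores, for illustration only.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Under this reading, an AUC of 1 on training data combined with an AUC near 0.5 on test data means the model ranks unseen cases no better than chance, which is the overfitting pattern the caption describes for the radiomics-based models.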
Figure 3. Two cases of ADCA (A, B) and two of SCCA (C, D). All cases were assigned a high probability of their respective histologies by the RF-sem model (inset). Among other semantic features, these tumours displayed features well known for ADCA, i.e. ground-glass component (arrow in A) and air bronchogram (arrow in B), and for SCCA, i.e. spiculation (arrow in C) and cavitation (arrow in D). Since spiculation was not strongly correlated with SCCA histopathology, the RF-sem model relied on the absence of ADCA-specific features in C, although the overall confidence for SCCA (probability = 75%) was relatively low. ADCA, adenocarcinoma; SCCA, squamous cell carcinoma.
Variable importance determined by random forests classifier using MDA
| Feature | MDA |
| Air bronchogram | 0.039 |
| Ground-glass component | 0.023 |
| Cavitation | 0.019 |
| Satellite nodules | 0.015 |
| Airway thickening | 0.008 |
| Pleural indentation | 0.006 |
| Emphysema | 0.004 |
| Pseudocavitation | 0.002 |
| Location | −0.002 |
| Spiculation | −0.005 |
| Margin | −0.011 |
| db1 LLL GLSZM Short Zone | 0.005 |
| db1 HLH Coefficient of Variation | 0.004 |
| db1 LLL NGTDM Coarseness | 0.003 |
| db1 HHH GLCM Cluster Shade | 0.003 |
| db1 HHH NGTDM Coarseness | 0.003 |
| db1 HHH GLCM Correlation | 0.003 |
| NGTDM Contrast | 0.003 |
| Maximum intensity | 0.003 |
| db1 HHL Coefficient of Variation | 0.002 |
GLCM, grey-level co-occurrence matrix; GLSZM, grey-level size zone matrix; MDA, mean decrease in accuracy; NGTDM, neighbourhood grey-tone difference matrix.
A higher MDA score for a variable corresponds to greater predictive power.
A negative MDA means the variable did not perform better than random chance. Note: Only the top 10 radiomic features are given here. For the full table, please see the supplemental file.
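The idea behind MDA can be sketched as a permutation test: shuffle one variable, re-score the model, and record how much accuracy drops. The code below is a simplified illustration under stated assumptions, not the study's implementation. Random forest software typically computes MDA on out-of-bag samples per tree, whereas this sketch permutes columns of a single data set for an arbitrary `predict` function. The toy data, `toy_predict`, and all function names are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean decrease in accuracy after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: feature 0 determines the label, feature 1 is ignored.
rows = [[i % 2, (i // 2) % 2] for i in range(40)]
labels = [r[0] for r in rows]


def toy_predict(row):
    # Stand-in for a fitted classifier; it only looks at feature 0.
    return row[0]


scores = permutation_importance(toy_predict, rows, labels)
print(scores)
```

In this toy setting the informative feature shows a clear accuracy drop when shuffled, while the ignored feature scores exactly zero. With a real model and finite data, uninformative variables fluctuate around zero, which is how small negative values such as those reported for margin and spiculation can arise.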