Xiangxue Wang, Andrew Janowczyk, Yu Zhou, Rajat Thawani, Pingfu Fu, Kurt Schalper, Vamsidhar Velcheti, Anant Madabhushi.
Abstract
Identification of patients with early stage non-small cell lung cancer (NSCLC) at high risk of recurrence could help identify patients who would receive additional benefit from adjuvant therapy. In this work, we present a computational histomorphometric image classifier using nuclear orientation, texture, shape, and tumor architecture to predict disease recurrence in early stage NSCLC from digitized H&E tissue microarray (TMA) slides. Using a retrospective cohort of early stage NSCLC patients (Cohort #1, n = 70), we constructed a supervised classification model involving the most predictive features associated with disease recurrence. This model was then validated on two independent sets of early stage NSCLC patients, Cohort #2 (n = 119) and Cohort #3 (n = 116). The model yielded an accuracy of 81% for prediction of recurrence in the training Cohort #1, and 82% and 75% in the validation Cohorts #2 and #3, respectively. A multivariable Cox proportional hazard model of Cohort #2, incorporating gender and traditional prognostic variables such as nodal status and stage, indicated that the computer-extracted histomorphometric score was an independent prognostic factor (hazard ratio = 20.81, 95% CI: 6.42-67.52, P < 0.001).
Year: 2017 PMID: 29051570 PMCID: PMC5648794 DOI: 10.1038/s41598-017-13773-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Inclusion and exclusion criteria for patient selection for the training and test sets.
Demographic and clinical characteristics of patients in Cohort #1, Cohort #2 and Cohort #3.
| Characteristic | Training cohort (Cohort #1, N = 70) | Validation cohort (Cohort #2, N = 119) | Validation cohort (Cohort #3, N = 116) |
|---|---|---|---|
| Median age | 62.4 | 65.4 | 66.8 |
| Gender | | | |
| Male (%) | 84.3 | 47.9 | 55.2 |
| Female (%) | 15.7 | 52.1 | 44.8 |
| Stage | | | |
| I (%) | 64.3 | 81.1 | 66.4 |
| II (%) | 35.7 | 18.9 | 33.6 |
| T-Stage | | | |
| T1 (%) | 19.1 | 51.3 | 51.7 |
| T2 (%) | 80.9 | 48.7 | 48.3 |
| N-Stage | | | |
| N0 (%) | 74.3 | 53.8 | 79.3 |
| N1 (%) | 25.7 | 46.2 | 20.7 |
| Non-recurrence (%) | 51.4 | 54.6 | 67.2 |
| Recurrence (%) | 48.6 | 45.4 | 32.8 |
Figure 2. Flowchart illustrating the procedure for training and validating the quantitative histomorphometric classifier for distinguishing early versus no/late recurrence in early stage lung cancer.
Fisher's exact test of association between gender, major pathological characteristics, and disease outcome for patients in Cohort #2 and Cohort #3.
| Characteristic | Cohort #2 (n = 119), N (%) | No recurrence (n = 65), N (%) | Recurrence (n = 54), N (%) | P | Cohort #3 (n = 116), N | No recurrence (n = 78), N | Recurrence (n = 38), N | P |
|---|---|---|---|---|---|---|---|---|
| Gender | | | | | | | | |
| Male | 57 (47.9) | 30 (46.2) | 27 (50.0) | 0.7151 | 64 | 44 | 20 | 0.8425 |
| Female | 62 (52.1) | 35 (53.8) | 27 (50.0) | | 52 | 34 | 18 | |
| T Pathological | | | | | | | | |
| T1 | 61 (51.3) | 35 (53.8) | 26 (48.1) | 0.5834 | 60 | 43 | 17 | 0.3267 |
| T2 | 58 (48.7) | 30 (46.2) | 28 (51.9) | | 56 | 35 | 21 | |
| N Pathological | | | | | | | | |
| N0 | 64 (53.8) | 36 (55.4) | 28 (51.9) | 0.7159 | 92 | 63 | 29 | 0.6287 |
| N1 | 55 (46.2) | 29 (44.6) | 26 (48.1) | | 24 | 15 | 9 | |
Two-sided P < 0.05 was considered statistically significant.
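As an illustration of how the P values in this table can be obtained, the following is a minimal pure-Python sketch of a two-sided Fisher's exact test on a 2×2 table. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption about the convention; the source does not state which software or convention was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P for the 2x2 table [[a, b], [c, d]].

    Assumed convention: sum the hypergeometric probabilities of every table
    (with the same margins) that is no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Gender vs. outcome in Cohort #2 (rows: male, female;
# columns: no recurrence, recurrence), taken from the table above
p_gender = fisher_exact_two_sided(30, 27, 35, 27)
print(f"Cohort #2, gender: P = {p_gender:.4f}")
```

The printed value can be compared against the tabulated P for gender in Cohort #2; small differences would point to a different two-sided convention.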
A subset of 7 features selected from the entire feature set.
| Feature category | Description |
|---|---|
| Graph | Voronoi: Area Ratio Minimum / Maximum |
| Graph | Arch: Average Nearest Neighbors in a 40 Pixel Radius |
| Shape | Min/max ratio of Fourier Descriptor 8 |
| Shape | Mean of Fourier Descriptor 4 |
| Texture | Haralick standard deviation intensity contrast variance |
| Texture | Haralick standard deviation intensity contrast energy |
| Texture | Haralick standard deviation intensity contrast inverse moment |
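To make one of the graph features concrete, here is a rough sketch of an "average nearest neighbors in a 40 pixel radius" measure computed from nuclear centroids. This is only one plausible reading of the feature name: the function `mean_neighbors_within_radius`, the use of Euclidean distance, and the toy coordinates are all assumptions, not the authors' implementation.

```python
from math import hypot


def mean_neighbors_within_radius(centroids, radius=40.0):
    """Average number of other nuclei lying within `radius` pixels of each nucleus.

    `centroids` is a list of (x, y) nuclear centroid coordinates in pixels.
    """
    if not centroids:
        return 0.0
    counts = []
    for i, (xi, yi) in enumerate(centroids):
        # Count every other centroid whose Euclidean distance is within the radius
        count = sum(
            1
            for j, (xj, yj) in enumerate(centroids)
            if j != i and hypot(xi - xj, yi - yj) <= radius
        )
        counts.append(count)
    return sum(counts) / len(counts)


# Toy example (made-up coordinates): three clustered nuclei and one isolated nucleus
nuclei = [(0, 0), (30, 0), (0, 30), (200, 200)]
density = mean_neighbors_within_radius(nuclei)
print(density)
```

Under this reading, denser nuclear clustering yields a larger value; the brute-force pairwise loop is fine for a single TMA spot but a spatial index would be preferable for whole-slide images.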
Figure 3. Representative TMA tissue spots of recurrent (top row) and non-recurrent (bottom row) NSCLC with corresponding feature maps: (a,e) nuclear shape feature map, (b,f) texture feature map (Haralick standard deviation intensity correlation), (c,g) nuclear cluster graph feature map, and (d,h) nuclear orientation feature map.
Figure 4. ROC analysis of the classifier predicting recurrence on (a) training set Cohort #1, (b) independent validation set Cohort #2, (c) independent validation set Cohort #3 batch #1, and (d) independent validation set Cohort #3 batch #2 shows consistent predictive ability across classifiers and across tumor sections. Kaplan-Meier survival analysis for (e) training set Cohort #1, (f) validation set Cohort #2, and (g,h) batch #1 and batch #2 from Cohort #3 shows good visual separation, and the log-rank test indicates that the two groups were statistically different (p-value ≪ 0.05).
Classification results compared with actual 5-year patient outcomes for validation sets Cohort #2 and Cohort #3.
| Classifier prediction | 5-year recurrence | No 5-year recurrence |
|---|---|---|
| Classifier-recurrence, Cohort #2 | 51 | 19 |
| Classifier-non-recurrence, Cohort #2 | 3 | 46 |
| Classifier-recurrence, Cohort #3 batch #1 | 17 | 8 |
| Classifier-non-recurrence, Cohort #3 batch #1 | 21 | 70 |
| Classifier-recurrence, Cohort #3 batch #2 | 20 | 11 |
| Classifier-non-recurrence, Cohort #3 batch #2 | 18 | 67 |
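The counts in this table are enough to recompute accuracy, sensitivity, and specificity for each validation set. The sketch below does so directly; the helper name `summarize` and the dictionary layout are illustrative choices, and "recurrence" is treated as the positive class.

```python
def summarize(tp, fp, fn, tn):
    """Accuracy, sensitivity, and specificity from 2x2 confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # recurrences correctly flagged
        "specificity": tn / (tn + fp),  # non-recurrences correctly cleared
    }


# (TP, FP, FN, TN) read from the table above, with recurrence as the positive class
tables = {
    "Cohort #2": (51, 19, 3, 46),
    "Cohort #3 batch #1": (17, 8, 21, 70),
    "Cohort #3 batch #2": (20, 11, 18, 67),
}

results = {name: summarize(*cells) for name, cells in tables.items()}
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

The recomputed accuracies round to 82% for Cohort #2 and 75% for both batches of Cohort #3, in line with the figures quoted in the abstract.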
Multivariable Cox proportional hazard model controlling for major pathological variables on validation set Cohort #2 and Cohort #3, batch #1 and batch #2.
| Cohort | Characteristic | Hazard ratio (95% CI) | P-value |
|---|---|---|---|
| Cohort #2 | Gender (Male vs. Female) | 1.3046 (0.753, 2.26) | 0.343 |
| | T Pathological (T1 vs. T2) | 1.0737 (0.433, 2.664) | 0.878 |
| | N Pathological (N0 vs. N1) | 1.1961 (0.477, 3) | 0.703 |
| | Classified (Non-Recurrence vs. Recurrence) | 20.812 (6.415, 67.52) | **<0.0001** |
| Cohort #3, batch #1 | Gender (Male vs. Female) | 0.9274 (0.449, 1.916) | 0.839 |
| | T Pathological (T1 vs. T2) | 1.7725 (0.905, 3.473) | 0.095 |
| | N Pathological (N0 vs. N1) | 1.8686 (0.811, 4.307) | 0.142 |
| | Classified (Non-Recurrence vs. Recurrence) | 4.6532 (2.294, 9.44) | **<0.0001** |
| Cohort #3, batch #2 | Gender (Male vs. Female) | 0.7885 (0.385, 1.614) | 0.516 |
| | T Pathological (T1 vs. T2) | 1.4941 (0.768, 2.909) | 0.237 |
| | N Pathological (N0 vs. N1) | 1.8984 (0.842, 4.28) | 0.122 |
| | Classified (Non-Recurrence vs. Recurrence) | 2.9239 (1.476, 5.791) | **0.002** |
P-values in bold are statistically significant.
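The hazard ratios, confidence intervals, and P values in this table are linked through the Cox regression coefficient. As a rough consistency check, the sketch below recovers the log hazard ratio, its standard error, and an approximate two-sided P value from a reported hazard ratio and 95% CI. It assumes Wald-type intervals that are symmetric on the log scale, which is common in Cox software but is not confirmed by the source; the function name `wald_from_ci` is hypothetical.

```python
from math import erfc, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_ci(hr, lower, upper):
    """Recover (beta, se, z, p) from a hazard ratio and its 95% CI.

    Assumes CI = exp(beta +/- Z_95 * se), i.e. a Wald interval on the log scale.
    """
    beta = log(hr)
    se = (log(upper) - log(lower)) / (2 * Z_95)
    z = beta / se
    p = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


# Classifier rows from the table above
for label, row in {
    "Cohort #2": (20.812, 6.415, 67.52),
    "Cohort #3, batch #1": (4.6532, 2.294, 9.44),
    "Cohort #3, batch #2": (2.9239, 1.476, 5.791),
}.items():
    beta, se, z, p = wald_from_ci(*row)
    print(f"{label}: beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, P = {p:.2g}")
```

Under this assumption the hazard ratio should sit at the geometric mean of its CI bounds, which offers a quick check on transcription errors in such tables.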