Xin Li, Gengen Shi, Qingsong Chu, Wenbin Jiang, Yixin Liu, Sainan Zhang, Zheyang Zhang, Zixin Wei, Fei He, Zheng Guo, Lishuang Qi.
Abstract
BACKGROUND: Targeted therapy for non-small cell lung cancer is histology dependent. However, histological classification by routine pathological assessment with hematoxylin-eosin staining and immunostaining remains challenging for poorly differentiated tumors, particularly those from small biopsies. Additionally, the effectiveness of immunomarkers is limited by technical inconsistencies of immunostaining and a lack of standardization for staining interpretation.
Keywords: Histological subtype; Non-small cell lung cancer; Pathological assessment; Qualitative transcriptional signature; Relative gene expression orderings
Year: 2019 PMID: 31752667 PMCID: PMC6868745 DOI: 10.1186/s12864-019-6086-2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 The flowchart of this study. Using gene expression profiles of pADC and pSCC samples, we developed a qualitative transcriptional signature to distinguish ADC from SCC in individual samples. The signature was tested in the “golden” standard dataset, in fresh frozen samples with survival data, and in clinically challenging cases, including FFPE specimens, mixed tumors, small biopsy specimens and poorly differentiated specimens. pADC and pSCC denote pathologically-determined adenocarcinoma and pathologically-determined squamous cell carcinoma, respectively
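As a rough illustration of how a qualitative signature built on relative expression orderings (REOs) can classify a single sample, the following minimal Python sketch votes on within-sample gene-pair orderings. The gene pair, its direction, the majority-vote rule and the name `classify_by_reo` are assumptions made for illustration only; they are not taken from the study, whose actual signature is defined in its Methods.

```python
from typing import Dict, List, Tuple

# Hypothetical gene pairs (illustration only, not the published signature):
# a pair (a, b) casts one SCC vote when gene a is expressed above gene b
# within the same sample.
HYPOTHETICAL_PAIRS: List[Tuple[str, str]] = [("KRT5", "AGR2")]


def classify_by_reo(expr: Dict[str, float],
                    pairs: List[Tuple[str, str]] = HYPOTHETICAL_PAIRS) -> str:
    """Classify one sample by majority vote over within-sample orderings."""
    scc_votes = sum(1 for a, b in pairs if expr[a] > expr[b])
    # Assumed rule: a strict majority of SCC votes gives SCC, otherwise ADC.
    return "SCC" if scc_votes > len(pairs) / 2 else "ADC"


sample = {"KRT5": 9.1, "AGR2": 4.2}  # made-up expression values
print(classify_by_reo(sample))
```

Because only the ordering within a sample matters, any strictly increasing transformation of the measurements (for example a platform-specific scaling) leaves the call unchanged, which is the usual motivation for this kind of signature.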
The datasets analyzed in this study
| Types | Data Source | Database | Platform | pADC | pSCC | Normal |
|---|---|---|---|---|---|---|
| Train (frozen) | GSE30219 | GEO | Affy. Plus 2 | 85 | 61 | 14 |
| | GSE18842 | GEO | Affy. Plus 2 | 14 | 31 | 45 |
| Total | – | – | – | 99 | 92 | 59 |
| “Golden” standard data | GSE19188 | GEO | Affy. Plus 2 | 45 | 27 | – |
| | E-MTAB-2435 | ArrayExpress | Affy. Plus 2 | 0 | 63 | – |
| Total | – | – | – | 45 | 90 | – |
| Integrated data (frozen) | GSE42127 (a) | GEO | Illu. WG V3.0 | 90 | 32 | – |
| | GSE50081 (a) | GEO | Affy. Plus 2 | 127 | 43 | – |
| | GSE37745 (a) | GEO | Affy. Plus 2 | 40 | 24 | – |
| | GSE31210 (a) | GEO | Affy. Plus 2 | 204 | 0 | – |
| | GSE31546 (a) | GEO | Affy. Plus 2 | 13 | 0 | – |
| | GSE14814 (a) | GEO | Affy. U133A | 32 | 26 | – |
| | GSE68465 (a) | GEO | Affy. U133A | 299 | 0 | – |
| Total | – | – | – | 805 | 125 | – |
| FFPE | GSE44170 | GEO | Affy. U133A | 0 | 38 | – |
| Mixed | TCGA | TCGA | Illu. HiSeqV2 | 498 | 499 | – |
| Biopsy | GSE58661 | GEO | Affy. 2.0 | 42 | 36 | – |
| Poorly differentiated | GSE94601 | GEO | Illu. HT V4.0 | 19 (b) | 4 (b) | – |
| Total | – | – | – | 1364 | 702 | – |
pADC, pathologically-determined ADC; pSCC, pathologically-determined SCC; Affy. Plus 2, Affymetrix Plus 2; Affy. U133A, Affymetrix U133A; Affy. 2.0, Rosetta/Merck Human RSTA Custom Affymetrix 2.0; Illu. WG V3.0, Illumina HumanWG-6 V3.0; Illu. HT V3.0, Illumina HumanHT-12 V3.0; Illu. HiSeqV2, Illumina HiSeqV2; Illu. HT V4.0, Illumina HumanHT-12 V4.0
(a) The dataset records survival information for patients treated with curative surgical resection only
(b) The 19 pADC and 4 pSCC samples are poorly differentiated tumors that had previously been improperly assigned to the LCC subtype and were reclassified by the authors using ADC and SCC immunomarkers
Fig. 2 Immunohistochemical analysis of Krt5 and Agr2 protein expression in a human lung cancer tissue microarray. a Expression profiles of Krt5 and Agr2 proteins in the lung cancer tissue array. The samples in the red frame (A1-E8) are pSCC. The samples in the green frame (E18-K5) are pADC. The remaining samples are other subtypes of lung cancer and normal controls. b, c Inverse correlation between Krt5 and Agr2 protein expression in pADC (b) and pSCC (c) samples. The protein expression score was quantified and classified as low, medium or high, based on a multiplicative index of the average staining intensity and the extent of staining (see Methods). d, e Representative immunohistochemical staining of Krt5 and Agr2 proteins in pADC (d) and pSCC (e) samples. Scale bar, 1 mm
The performance of our signature for pSCC and pADC samples in test datasets
| Types | Data Source | pADC | pSCC | A-Sen | A-Spe | A-Acc | Re (SCC) (rate) | Re (ADC) (rate) |
|---|---|---|---|---|---|---|---|---|
| “Golden” standard data | GSE19188 | 45 | 27 | 93.33% | 96.30% | 94.44% | 3 (6.67%) | 1 (3.70%) |
| | E-MTAB-2435 | 0 | 63 | – | 98.41% | 98.41% | – | 1 (1.59%) |
| Total | – | 45 | 90 | 93.33% | 97.78% | 96.30% | 3 (6.67%) | 2 (2.22%) |
| Integrated data (frozen) | GSE42127 | 90 | 32 | 90.00% | 84.38% | 88.52% | 9 (10.00%) | 5 (15.62%) |
| | GSE50081 | 127 | 43 | 88.19% | 86.05% | 87.65% | 15 (11.81%) | 6 (13.95%) |
| | GSE37745 | 40 | 24 | 95.00% | 87.50% | 92.19% | 2 (5.00%) | 3 (12.50%) |
| | GSE31210 | 204 | 0 | 99.02% | – | 99.02% | 2 (0.98%) | – |
| | GSE31546 | 13 | 0 | 100% | – | 100% | 0 (0.00%) | – |
| | GSE14814 | 32 | 26 | 93.75% | 96.15% | 94.83% | 2 (6.25%) | 1 (3.85%) |
| | GSE68465 | 299 | 0 | 98.66% | – | 98.66% | 4 (1.34%) | – |
| Total | – | 805 | 125 | 95.78% | 88.00% | 94.73% | 34 (4.22%) | 15 (12.00%) |
| FFPE | GSE44170 | 0 | 38 | – | 92.11% | 92.11% | – | 3 (7.89%) |
| Mixed | TCGA | 498 | 499 | 97.59% | 83.57% | 90.75% | 12 (2.41%) | 82 (16.43%) |
| Biopsy | GSE58661 | 42 | 36 | 95.24% | 88.89% | 92.31% | 2 (4.76%) | 4 (11.11%) |
| Poorly differentiated | GSE94601 | 19 (a) | 4 (a) | 100% | 50.00% | 91.30% | 0 (0.00%) | 2 (50.00%) |
| Total | – | 1364 | 702 | 96.48% | 84.90% | 92.55% | 48 (3.52%) | 106 (15.10%) |
A-Sen represents the apparent sensitivity, A-Spe represents the apparent specificity and A-Acc represents the apparent accuracy
Re (SCC) represents the number of pADC samples reclassified as SCC by the signature
Re (ADC) represents the number of pSCC samples reclassified as ADC by the signature
(a) The 19 pADC and 4 pSCC samples are poorly differentiated tumors that had previously been improperly assigned to the LCC subtype and were reclassified by the authors using ADC and SCC immunomarkers
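The apparent metrics in this table can be recomputed from the sample counts and the reclassification counts. The short Python sketch below assumes, as the rows with a single subtype suggest, that A-Sen is the share of pADC samples the signature also calls ADC, A-Spe is the share of pSCC samples it also calls SCC, and A-Acc is the overall agreement; the function name `apparent_metrics` is illustrative.

```python
from typing import Optional, Tuple


def apparent_metrics(n_padc: int, n_pscc: int, re_scc: int, re_adc: int
                     ) -> Tuple[Optional[float], Optional[float], float]:
    """Return (A-Sen, A-Spe, A-Acc) in percent.

    re_scc: pADC samples reclassified as SCC by the signature.
    re_adc: pSCC samples reclassified as ADC by the signature.
    A metric is None when its class has no samples (shown as a dash in the table).
    """
    sen = 100 * (n_padc - re_scc) / n_padc if n_padc else None
    spe = 100 * (n_pscc - re_adc) / n_pscc if n_pscc else None
    acc = 100 * (n_padc + n_pscc - re_scc - re_adc) / (n_padc + n_pscc)
    return sen, spe, acc


# GSE19188 row: 45 pADC, 27 pSCC, 3 reclassified as SCC, 1 reclassified as ADC
sen, spe, acc = apparent_metrics(45, 27, 3, 1)
print(f"{sen:.2f}% {spe:.2f}% {acc:.2f}%")  # 93.33% 96.30% 94.44%
```

The metrics are called "apparent" because the pathological label is itself the reference being questioned, so a disagreement is counted as a reclassification rather than as an error of the signature.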
Fig. 3 Validation of the reclassifications by the signature for fresh frozen samples with survival data. a Kaplan-Meier curves of overall survival (OS) for the pADC samples reclassified as SCC and the signature-confirmed pADC samples. b Kaplan-Meier curves of OS for the pSCC samples reclassified as ADC and the signature-confirmed pSCC samples. c, d Kaplan-Meier curves of OS for the SCC and ADC groups as classified by the signature (c) and by the original pathological assessment (d). e Violin plot of proliferation scores in the reclassified and signature-confirmed samples in the GSE50081 dataset, which has the higher reclassification rate among the fresh frozen samples. The Wilcoxon rank sum test was used to test the difference in proliferation scores between the two groups. f Violin plot of mRNA expression of the seven subtype-specific marker genes in the GSE50081 dataset. The subtype-specific marker genes include ADC marker genes (NAPSA, TTF1), SCC marker genes (KRT5, TP63) and neuroendocrine marker genes (CD56, SYP, CHGA). The RankProd (RP) algorithm was used to test the difference in expression of the subtype-specific marker genes between reclassified samples and signature-confirmed samples
Multivariate Cox regression analysis for the pADC reclassified as SCC samples in the integrated dataset
| Variable | Hazard ratio | P value | 95% CI |
|---|---|---|---|
| Histological classification by the signature (reclassified as SCC vs. signature-confirmed ADC) | 1.72 | 0.0458 | 1.01–2.93 |
| Data centers | 1.05 | 0.0887 | 0.99–1.12 |
| Stage (III vs. II vs. I) | 2.32 | < 0.0001 | 1.95–2.76 |
| Age (> 65 vs. ≤65) | 1.56 | 0.0009 | 1.20–2.04 |
| Gender (Male vs. Female) | 1.54 | 0.0013 | 1.18–2.01 |
CI confidence interval
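The hazard ratio, its 95% CI and the P value in a Cox table are linked, so one can be used as a back-of-envelope check on the others. The Python sketch below assumes a Wald-type interval that is symmetric on the log scale; the function name `wald_p_from_ci` is illustrative, and because the published values are rounded the result is only approximate.

```python
import math


def wald_p_from_ci(hr: float, lower: float, upper: float,
                   z_crit: float = 1.959964) -> float:
    """Approximate two-sided P value implied by a hazard ratio and its 95% CI.

    Assumes the CI is exp(log(HR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)  # SE of log(HR)
    z = math.log(hr) / se                                    # Wald statistic
    return math.erfc(abs(z) / math.sqrt(2))                  # two-sided normal tail


# First row of the table: HR 1.72, 95% CI 1.01-2.93, reported P = 0.0458
p = wald_p_from_ci(1.72, 1.01, 2.93)
print(f"{p:.3f}")
```

For the first row this gives roughly 0.046, in line with the reported 0.0458; a lower CI bound just above 1 corresponds to a P value just below 0.05.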
Multivariate Cox regression analysis for the histological classification by the signature in the integrated dataset
| Variable | Hazard ratio | P value | 95% CI |
|---|---|---|---|
| Histological classification by the signature (SCC vs. ADC) | 1.36 | 0.0500 | 1.00–1.85 |
| Data centers | 1.05 | 0.0729 | 1.00–1.11 |
| Stage (III vs. II vs. I) | 2.08 | < 0.0001 | 1.78–2.45 |
| Age (> 65 vs. ≤65) | 1.54 | 0.0004 | 1.21–1.97 |
| Gender (Male vs. Female) | 1.54 | 0.0005 | 1.21–1.97 |
CI confidence interval
Fig. 4 Validation of the reclassifications by the signature for the FFPE and mixed tumor specimens. a Violin plot of proliferation scores and b mRNA expression of the subtype-specific marker genes in the GSE44170 dataset derived from FFPE specimens. c Violin plot of proliferation scores and d mRNA expression of the subtype-specific marker genes in the TCGA-ADC dataset derived from mixed tumor specimens. e Violin plot of proliferation scores and f mRNA expression of the subtype-specific marker genes in the TCGA-SCC dataset derived from mixed tumor specimens
Fig. 5 Validation of the reclassifications by the signature for small biopsy and poorly differentiated specimens. a Violin plot of proliferation scores and b mRNA expression of the subtype-specific marker genes in the GSE58661 dataset derived from small biopsy specimens. c Violin plot of proliferation scores of the 23 poorly differentiated specimens in the GSE94601 dataset. d Volcano plot of the differential expression of the 44 proliferation-related genes in the pSCC samples reclassified as ADC compared with the signature-confirmed pSCC samples. Of the 44 proliferation-related genes, 20 were significantly differentially expressed, and all of these were down-regulated in the reclassified pSCC samples