Radoslaw Charkiewicz, Jacek Niklinski, Jürgen Claesen, Anetta Sulewska, Miroslaw Kozlowski, Anna Michalska-Falkowska, Joanna Reszec, Marcin Moniuszko, Wojciech Naumnik, Wieslawa Niklinska.
Abstract
Advances in molecular analyses based on high-throughput technologies can contribute to a more accurate classification of non-small cell lung cancer (NSCLC), as well as a better prediction of both the disease course and the efficacy of targeted therapies. Here we set out to analyze whether global gene expression profiling performed in a group of early-stage NSCLC patients can contribute to classifying tumor subtypes and predicting the disease prognosis. Gene expression profiling was performed using microarray technology in a training set of 108 NSCLC samples. Subsequently, the findings were validated in an independent cohort of 44 samples. We demonstrated that the specific gene patterns differed significantly between lung adenocarcinoma (AC) and squamous cell lung carcinoma (SCC) samples. Furthermore, we developed and validated a novel 53-gene signature distinguishing SCC from AC with 93% accuracy. Evaluation of the classifier performance in the validation set showed that our predictor classified the AC patients with 100% sensitivity and 88% specificity. We revealed that gene expression patterns observed in the early stages of NSCLC may help elucidate the histological distinctions of tumors through identification of different gene-mediated biological processes involved in the pathogenesis of histologically distinct tumors. However, we showed here that the gene expression profiles did not provide additional value in predicting the progression status of early-stage NSCLC. Nevertheless, the gene expression signature analysis enabled us to perform a reliable subclassification of NSCLC tumors, and it can therefore become a useful diagnostic tool for a more accurate selection of patients for targeted therapies.
Year: 2017 PMID: 28456114 PMCID: PMC5408153 DOI: 10.1016/j.tranon.2017.01.015
Source DB: PubMed Journal: Transl Oncol ISSN: 1936-5233 Impact factor: 4.243
Patient Characteristics for the Training Set (n = 108) and the Validation Set (n = 44)
| Characteristic | | Set 1 (training), n = 108 | Set 2 (validation), n = 44 | All, n = 152 | P value |
|---|---|---|---|---|---|
| Age (years) | Mean ± SD | 62.27 ± 8.36 | 64.78 ± 8.32 | 62.99 ± 8.39 | .095 |
| | Median | 62.92 | 64.30 | 63.51 | |
| | Range | 39.83-78.08 | 46.3-78.8 | 39.83-78.8 | |
| Gender | Female | 22 (20%) | 10 (23%) | 32 (21%) | .747 |
| | Male | 86 (80%) | 34 (77%) | 120 (79%) | |
| Histology | SCC | 56 (52%) | 25 (57%) | 81 (53%) | .813 |
| | AC | 42 (39%) | 16 (36%) | 58 (38%) | |
| | LCC | 10 (9%) | 3 (7%) | 13 (9%) | |
| Tumor stage | IA | 21 (19%) | 8 (18%) | 29 (19%) | .806 |
| | IB | 30 (28%) | 11 (25%) | 41 (27%) | |
| | IIA | 24 (22%) | 8 (18%) | 32 (21%) | |
| | IIB | 33 (31%) | 17 (39%) | 50 (33%) | |
| Progression at 3 years | Yes | 45 (42%) | 14 (32%) | 59 (39%) | .258 |
| | No | 63 (58%) | 30 (68%) | 93 (61%) | |
SD, standard deviation. Progression at 3 years: yes: recurrence and/or cancer-related death at 3 years; no: free from recurrence and/or cancer-related death at 3 years.
P values were calculated with Pearson's chi-squared test of independence for the categorical variables and with the independent-samples t test for age.
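The chi-squared comparison between the two sets can be illustrated with a short Python sketch. It applies Pearson's test to the gender and 3-year progression counts from the table above. The function name `pearson_chi2_2x2` is hypothetical, and the absence of a continuity correction is an assumption, since the source does not state whether one was applied.

```python
import math


def pearson_chi2_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied
    (an assumption; the source does not say whether one was used).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Counts from the table: rows = category, columns = (Set 1, Set 2).
gender = [[22, 10], [86, 34]]        # female, male
progression = [[45, 14], [63, 30]]   # progression yes, no

gender_stat, gender_p = pearson_chi2_2x2(gender)
progression_stat, progression_p = pearson_chi2_2x2(progression)
print(f"gender: chi2 = {gender_stat:.3f}, P = {gender_p:.3f}")
print(f"progression: chi2 = {progression_stat:.3f}, P = {progression_p:.3f}")
```

Under these assumptions the computed P values should land close to the tabulated .747 and .258. The histology and tumor stage comparisons have more than one degree of freedom and would need a general chi-squared distribution function, which this sketch does not include.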
Figure 1. Hierarchical clustering of expression fold changes for genes with statistically significant differential expression (adjusted P value < .05) between AC and SCC samples in the training set (A), in the validation set (B), and in both sets (C). Columns correspond to individual genes, and rows represent AC or SCC tumors in the respective sample group. To allow comparison between the two data sets, each heat map shows expression fold changes in both the training and validation sets. (A) The top and bottom rows show expression data for genes significantly altered in AC samples compared with SCC samples (adjusted P value < .05) in the training set. The two middle rows show expression levels of the same genes in the validation set. Although the differences between AC and SCC were not statistically significant in this set of samples, most of the genes change in the same direction as in the training set. (B) The top and bottom rows show the differential expression pattern in SCC and AC samples (adjusted P value < .05) in the validation set. The two middle rows show that the great majority of these genes change in the same direction in the training set as in the validation set. (C) Heat map of the statistically significant (adjusted P value < .05) differential gene expression profiles in AC and SCC samples in both sets. The scale represents the intensity of the expression fold changes (log2 scale).
Figure 2. Construction of the gene expression classifier using the PAM algorithm with the adopted shrinkage threshold, trained on the two classes (AC and SCC histologies) in the training group. The optimal number of genes in the signature was selected as the one giving the minimum number of misclassification errors in a cross-validation procedure. Red represents AC samples, and green represents SCC tumors.
Figure 3. Gene expression-based signature for classification of SCC/AC subtypes using the PAM method. The shrunken centroid algorithm was used to select 53 genes. The Y axis represents the individual genes according to their class scores, and the X axis displays the SCC and AC histologies.
Figure 4. Classification of SCC/AC subtypes in the validation set based on the 53-gene signature using the PAM algorithm. Red represents AC samples, and green represents SCC tumors.
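The gene selection step of the shrunken centroid (PAM) approach named in the captions above can be sketched in a few lines of Python. This is a simplified illustration only, not the authors' pipeline: the helper names, the toy gene names, the scores, and the threshold are all hypothetical, and the standardization of centroid differences that PAM performs before shrinkage is assumed to have been done already.

```python
def soft_threshold(score, delta):
    """Shrink a standardized class score toward zero by delta."""
    magnitude = abs(score) - delta
    if magnitude <= 0:
        return 0.0
    return magnitude if score > 0 else -magnitude


def select_genes(class_scores, delta):
    """Keep genes whose shrunken score is nonzero in at least one class."""
    selected = {}
    for gene, scores in class_scores.items():
        shrunken = [soft_threshold(s, delta) for s in scores]
        if any(value != 0.0 for value in shrunken):
            selected[gene] = shrunken
    return selected


# Hypothetical standardized scores (AC, SCC) for three made-up genes.
toy_scores = {
    "gene_a": [-2.5, 2.5],
    "gene_b": [0.4, -0.4],
    "gene_c": [1.5, -1.5],
}

kept = select_genes(toy_scores, delta=1.0)
print(sorted(kept))
```

A larger threshold removes more genes from the signature. In the procedure described for Figure 2, the threshold is the one that minimizes cross-validated misclassification errors, which is not modeled here.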
Classification and Performance Evaluation of Predictive Model Using PAM Method in the Validation Cohort
| Histology | Classified as AC (PAM_AC), n | Classified as SCC (PAM_SCC), n | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Accuracy (%) |
|---|---|---|---|---|---|---|---|
| AC, n = 16 | 16 | 0 | 100 | 88 | 84.2 | 100 | 92.68 |
| SCC, n = 25 | 3 | 22 | | | | | |
PPV, positive predictive value; NPV, negative predictive value.
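The performance figures follow directly from the four classification counts, treating AC as the positive class. The Python sketch below recomputes them. The function name `binary_metrics` is hypothetical. The counts cover the 41 AC and SCC samples, which is consistent with the 3 LCC samples of the 44-sample validation set not being part of the two-class evaluation.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return standard binary classification metrics as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
        "accuracy": 100.0 * (tp + tn) / (tp + fn + fp + tn),
    }


# Counts from the table, with AC as the positive class:
# 16 AC called AC, 0 AC called SCC, 3 SCC called AC, 22 SCC called SCC.
metrics = binary_metrics(tp=16, fn=0, fp=3, tn=22)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The accuracy of 38 correct calls out of 41 corresponds to the 92.68% in the table and the rounded 93% quoted in the abstract.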
Functional GO Enrichment Analysis for Set of 53 Genes Included in the Histotypic Signature
| Biological Process | Genes Involved | Adj. P value |
|---|---|---|
| Epidermis development | | <.0001 |
| Keratin filament | | <.0001 |
| Intermediate filament | | <.0001 |
| Structural constituent of cytoskeleton | | <.0001 |
| Extracellular exosome | | <.0001 |
Significantly overrepresented GO terms in SCC are linked to genes.