Duncan H Whitney, Michael R Elashoff, Kate Porta-Smith, Adam C Gower, Anil Vachani, J Scott Ferguson, Gerard A Silvestri, Jerome S Brody, Marc E Lenburg, Avrum Spira.
Abstract
BACKGROUND: The gene expression profile of cytologically normal bronchial airway epithelial cells has previously been shown to be altered in patients with lung cancer. Although bronchoscopy is often used to diagnose lung cancer, its sensitivity is imperfect, especially for small and peripheral suspicious lesions. In this study, we derived a gene expression classifier from airway epithelial cells that detects the presence of cancer in current and former smokers undergoing bronchoscopy for suspected lung cancer, and we evaluated its sensitivity for detecting lung cancer among patients from an independent cohort.
Year: 2015 PMID: 25944280 PMCID: PMC4434538 DOI: 10.1186/s12920-015-0091-3
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Clinical and demographic characteristics of the patients used to train the classifier
| Category | Sub-category | Lung cancer | Benign disease | p-value |
|---|---|---|---|---|
| N | | 223 | 76 | |
| Sex | Female | 97 | 26 | 0.178 |
| | Male | 126 | 50 | |
| Age (median years) | | 65 | 56 | <0.001 |
| Race | Caucasian | 168 | 59 | 0.757 |
| | African-American | 47 | 13 | |
| | Other | 5 | 3 | |
| | Unknown | 3 | 1 | |
| Smoking status | Current | 101 | 26 | 0.107 |
| | Former | 122 | 50 | |
| Smoking history (median PY) | | 43 | 30 | <0.001 |
| Mass size | <2 cm | 46 | 23 | <0.001 |
| | >2 to <3 cm | 30 | 12 | |
| | ≥3 cm | 122 | 19 | |
| | Ill-defined infiltrate | 10 | 13 | |
| | Unknown | 15 | 9 | |
| Mass location | Central | 86 | 16 | 0.018 |
| | Peripheral | 60 | 30 | |
| | Central & peripheral | 60 | 18 | |
| | Unknown | 17 | 12 | |
| **Histology** | **Sub-type** | | | |
| SCLC | | 40 | | |
| NSCLC | | 180 | | |
| | Adenocarcinoma | 83 | | |
| | Squamous | 73 | | |
| | Large cell | 6 | | |
| | Mixed/undefined | 18 | | |
| Unknown | | 3 | | |
| **Histology** | **Stage** | | | |
| SCLC | Limited | 16 | | |
| | Extensive | 18 | | |
| | Unknown | 6 | | |
| NSCLC | 1 | 28 | | |
| | 2 | 16 | | |
| | 3 | 42 | | |
| | 4 | 62 | | |
| | Unknown | 32 | | |
| **Benign disease** | **Sub-category** | | | |
| Alternative diagnosis | | | 54 | |
| | Infection | | 23 | |
| | Sarcoid | | 14 | |
| | Inflammation | | 7 | |
| | Fibrosis | | 4 | |
| | Other | | 4 | |
| | Benign growths | | 2 | |
| Resolution/Stability | | | 22 | |
The classifier training set included 223 patients diagnosed with lung cancer and 76 patients diagnosed with benign disease. The table lists clinical and demographic factors for the patients in the training set, shown separately for patients with lung cancer and patients with benign disease. The p-value for race is calculated for Caucasian versus non-Caucasian.
Figure 1. Pairwise correlation of genes with cancer-associated gene expression. The correlations between all possible pairs of genes with cancer-associated gene expression (n = 232) were assessed to identify groups of genes that share a similar pattern of gene expression. Unsupervised hierarchical clustering was used to group correlated genes into 11 clusters; the dendrogram threshold used to establish the clusters is indicated on the y-axis (green line). Genes were selected from the clusters in a parsimonious manner to predict lung cancer status using linear regression. The classifier genes came from specific clusters (outlined in blue), using 2–4 genes from each cluster. Clusters 4 and 7 contain genes that were up-regulated in lung cancer, and clusters 1, 2, 9, and 10 contain genes that were down-regulated in lung cancer.
Description of the gene expression classifier
| Feature | Coefficient | Genes | | | |
|---|---|---|---|---|---|
| Age | 0.0623 | ||||
| GG | 0.5450 | RPS4Y1 | |||
| GS | 0.1661 | SLC7A11 | CLND10 | TKT | |
| GPY | 3.0205 | RUNX1T1 | AKR1C2 | ||
| CA (1) | −0.4406 | BST1 | CD177.1 | CD177.2 | |
| CA (2) | −0.3402 | ATP12A | TSPAN2 | ||
| CA (4) | 0.1725 | GABBR1 | MCAM | NOVA1 | SDC2 |
| CA (7) | 0.5670 | CDR1 | CGREF1 | CLND22 | NKX3-1 |
| CA (9) | −0.3160 | EPHX3 | LYPD2 | ||
| CA (10) | −0.3791 | MIA | RNF150 | ||
| Intercept (b0) | 3.3173 | ||||
a) Genomic gender was defined as GG = 1 (female) if RPS4Y1 < 7.5, and 0 (male) otherwise. The predicted genomic smoking (GS) value was derived as GS = exp(x)/(1 + exp(x)), where x = 40.8579 − 0.4462*SLC7A11 − 2.1298*CLND10 − 1.8256*TKT. The predicted genomic pack years (GPY) value was derived as GPY = exp(x)/(1 + exp(x)), where x = −5.1429 + 2.1891*RUNX1T1 − 0.9506*AKR1C2. The generalized equation for the prediction classifier was Score = exp(y)/(1 + exp(y)), where y = b0 + Σ(bi*xi), b0 is the intercept, bi is the coefficient of feature i, and xi is the value of feature i (as shown).
b) Features include patient age (as reported), GG, GS, and GPY as described in the methods, and CA (i), the lung cancer gene clusters (shown in Figure 1).
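The footnote formulas can be sketched in code as follows. This is a minimal illustration rather than the authors' implementation: the function and variable names are hypothetical, the coefficients are copied from the table above, and the cluster features CA (i) are taken as already-computed inputs because the table does not state how each cluster's genes are combined into a single value.

```python
import math

# Coefficients copied from the classifier table; CA keys are cluster numbers.
INTERCEPT = 3.3173
COEF = {"age": 0.0623, "GG": 0.5450, "GS": 0.1661, "GPY": 3.0205}
CA_COEF = {1: -0.4406, 2: -0.3402, 4: 0.1725, 7: 0.5670, 9: -0.3160, 10: -0.3791}


def logistic(x):
    """exp(x) / (1 + exp(x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def genomic_gender(rps4y1):
    """GG = 1 (female) if RPS4Y1 < 7.5, otherwise 0 (male)."""
    return 1 if rps4y1 < 7.5 else 0


def genomic_smoking(slc7a11, clnd10, tkt):
    """GS from the three smoking-status genes named in the footnote."""
    x = 40.8579 - 0.4462 * slc7a11 - 2.1298 * clnd10 - 1.8256 * tkt
    return logistic(x)


def genomic_pack_years(runx1t1, akr1c2):
    """GPY from the two pack-year genes named in the footnote."""
    x = -5.1429 + 2.1891 * runx1t1 - 0.9506 * akr1c2
    return logistic(x)


def classifier_score(age, gg, gs, gpy, ca):
    """Score = logistic(b0 + sum(bi * xi)).

    `ca` maps cluster number -> cluster feature value. How that value is
    derived from the cluster's genes is NOT specified here (assumption:
    the caller supplies it).
    """
    y = INTERCEPT
    y += COEF["age"] * age + COEF["GG"] * gg + COEF["GS"] * gs + COEF["GPY"] * gpy
    y += sum(CA_COEF[k] * ca[k] for k in CA_COEF)
    return logistic(y)
```

A caller would first derive `gg`, `gs`, and `gpy` from the relevant expression values and then pass them, together with the patient's age and the six cluster values, to `classifier_score`, which returns a value between 0 and 1.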
Figure 2. ROC curve for patients with a non-diagnostic bronchoscopy in the test set. The AUC was 0.81 for the 123 patients whose bronchoscopy did not result in a diagnosis of lung cancer (lung cancer prevalence in this group = 31%).
Performance of bronchoscopy, classifier, and the combined procedures in the test set
| Measure | Bronchoscopy | Classifier* | Bronchoscopy + classifier |
|---|---|---|---|
| N, total | 163 | 123 | 163 |
| N, Lung cancer | 78 | 38 | 78 |
| N, Benign disease | 85 | 85 | 85 |
| Sensitivity (95% CI) | 51% (40-62%) | 92% (78-98%) | 96% (89-99%) |
| Specificity (95% CI) | 100% (95-100%) | 53% (42-63%) | 53% (42-63%) |
| NPV (95% CI) | 69% (60-77%) | 94% (83-98%) | 94% (83-98%) |
| PPV (95% CI) | 100% (90-100%) | 47% (36-58%) | 65% (56-73%) |
*The performance of the classifier was evaluated in patients in whom bronchoscopy did not result in a finding of cancer (n = 123).
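As a worked example of how the four measures in the table relate to the underlying counts, the sketch below recomputes the bronchoscopy column. The counts follow from the figures reported on this page (78 cancers, 40 of them diagnosed at bronchoscopy, 85 benign cases, and 100% specificity). The function names are hypothetical, and the Wilson score interval is an assumption: the source does not say which confidence-interval method was used.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 confusion table.

    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, ~95% for z=1.96)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Bronchoscopy in the test set: 40 of 78 cancers detected, no false positives
# among the 85 patients with benign disease.
bronch = diagnostic_metrics(tp=40, fp=0, fn=38, tn=85)
sens_lo, sens_hi = wilson_interval(40, 78)

for name, value in bronch.items():
    print(f"{name}: {value:.0%}")
print(f"sensitivity interval (Wilson): {sens_lo:.0%} to {sens_hi:.0%}")
```

The same two functions can be applied to the classifier and combined columns once the corresponding true-positive, false-positive, false-negative, and true-negative counts are known.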
Sensitivity of bronchoscopy, the classifier, and the combined procedures for patients with lung cancer in the test set
| Histology | Sub-type | N | Bronchoscopy | Classifier | Combined |
|---|---|---|---|---|---|
| All cancers | | 78 | 51% (a) | 92% (b) | 96% (c) |
| SCLC | | 14 | 64% | 100% | 100% |
| NSCLC | | 64 | 48% | 91% | 95% |
| | Adenocarcinoma | 18 | 33% | 83% | 89% |
| | Squamous | 27 | 56% | 92% | 96% |
| | Large cell | 4 | 25% | 100% | 100% |
| | Undefined | 15 | 60% | 83% | 93% |
| **Histology** | **Stage** | | | | |
| SCLC | | | | | |
| | Limited | 9 | 78% | 100% | 100% |
| | Extensive | 5 | 40% | 100% | 100% |
| NSCLC | | | | | |
| | 1 | 14 | 36% | 100% | 100% |
| | 2 | 2 | 50% | 100% | 100% |
| | 3 | 25 | 52% | 92% | 96% |
| | 4 | 22 | 55% | 80% | 91% |
| | Unknown | 1 | 0% | 100% | 100% |
Of 163 patients who underwent a diagnostic bronchoscopy procedure for suspicion of lung cancer, 78 were diagnosed with cancer. A lung cancer diagnosis was made at bronchoscopy (a) in 40 patients (51%; 95% CI, 40-62%), and in the remaining lung cancer patients where no diagnosis was made at bronchoscopy, (b) the classifier correctly predicted 34 of them (89%; 95% CI, 75-96%). The classifier combined with bronchoscopy (c) yielded a detection of 74 of 78 (95%; 95% CI, 87-98%) patients with lung cancer. The sensitivities of bronchoscopy, the classifier, and the combined procedures are also shown for lung cancers according to sub-type and stage.
Sensitivity of bronchoscopy, the classifier, and the combined procedures in the test set stratified by size of suspicious lesions
| Lesion size | N* | Bronchoscopy | Classifier | Combined |
|---|---|---|---|---|
| <3 cm | 99 | 44% | 87% | 93% |
| >3 cm | 48 | 58% | 94% | 98% |
| Ill-def Infiltrate | 16 | 38% | 100% | 100% |
*Includes patients diagnosed with lung cancer and those with benign disease.