Literature DB >> 20161782

Transcriptome profiles of carcinoma-in-situ and invasive non-small cell lung cancer as revealed by SAGE.

Kim M Lonergan1, Raj Chari, Bradley P Coe, Ian M Wilson, Ming-Sound Tsao, Raymond T Ng, Calum Macaulay, Stephen Lam, Wan L Lam.   

Abstract

BACKGROUND: Non-small cell lung cancer (NSCLC) presents as a progressive disease spanning precancerous, preinvasive, locally invasive, and metastatic lesions. Identification of biological pathways reflective of these progressive stages, and aberrantly expressed genes associated with these pathways, would conceivably enhance therapeutic approaches to this devastating disease. METHODOLOGY/PRINCIPAL
FINDINGS: Through the construction and analysis of SAGE libraries, we have determined transcriptome profiles for preinvasive carcinoma-in-situ (CIS) and invasive squamous cell carcinoma (SCC) of the lung, and compared these with expression profiles generated from both bronchial epithelium, and precancerous metaplastic and dysplastic lesions using Ingenuity Pathway Analysis. Expression of genes associated with epidermal development, and loss of expression of genes associated with mucociliary biology, are predominant features of CIS, largely shared with precancerous lesions. Additionally, expression of genes associated with xenobiotic metabolism/detoxification is a notable feature of CIS, and is largely maintained in invasive cancer. Genes related to tissue fibrosis and acute phase immune response are characteristic of the invasive SCC phenotype. Moreover, the data presented here suggests that tissue remodeling/fibrosis is initiated at the early stages of CIS. Additionally, this study indicates that alteration in copy-number status represents a plausible mechanism for differential gene expression in CIS and invasive SCC.
CONCLUSIONS/SIGNIFICANCE: This study is the first report of large-scale expression profiling of CIS of the lung. Unbiased expression profiling of these preinvasive and invasive lesions provides a platform for further investigations into the molecular genetic events relevant to early stages of squamous NSCLC development. Additionally, up-regulated genes detected at extreme differences between CIS and invasive cancer may have potential to serve as biomarkers for early detection.

Entities:  

Mesh:

Year:  2010        PMID: 20161782      PMCID: PMC2820080          DOI: 10.1371/journal.pone.0009162

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Lung cancer is estimated to inflict the highest cancer mortality rate in the United States in 2009 [1]. Molecular-targeted therapies directed towards signal transduction pathways active in cellular proliferation, differentiation, apoptosis, and angiogenesis, have been used in the treatment of non-small cell lung cancer (NSCLC), but with varied and often disappointing results [2], [3], [4], [5], [6]. As the simultaneous targeting of multiple signaling pathways has shown improvements in clinical response [6], further knowledge of biological pathways involved in lung cancer development, along with identification of aberrantly expressed genes within these pathways, would be expected to facilitate development of novel therapeutic intervention [7], [8]. Carcinoma-in-situ (CIS) of the lung, are preinvasive lesions of squamous NSCLC, frequently associated with histological, cytological, and genetic aberrations, and progression to invasive cancer typically ensues [9], [10], [11], [12]. As these minute lesions are optimally visible in the central airways by fluorescent bronchoscopy or LIFE (lung-imaging fluorescent endoscopy) [13], [14], experimental and clinical studies are rare ( ). Although many expression profiling studies have been reported for advanced stage lung tumors [15], [16], early stage (CIS and locally invasive) lesions remain largely unexplored. Molecular genetic analysis of preinvasive lesions, free from background noise associated with commonly studied advanced tumors, is essential to the identification of key genes and corresponding molecular pathways underlying early events in neoplastic transformation and cancer development. An understanding of these early aberrations is essential for prompt therapeutic intervention of this devastating disease.
Figure 1

Detection of carcinoma-in-situ bronchial lesions.

Bronchoscopy using A. white light for detection of CIS lesions (indicated by arrow), or B. LIFE (lung-imagine fluorescent endoscopy) for detection of CIS lesions (indicated by arrow). C. Histological section identifying a CIS lesion within the bronchial epithelium, typified by extensive squamous stratification.

Detection of carcinoma-in-situ bronchial lesions.

Bronchoscopy using A. white light for detection of CIS lesions (indicated by arrow), or B. LIFE (lung-imagine fluorescent endoscopy) for detection of CIS lesions (indicated by arrow). C. Histological section identifying a CIS lesion within the bronchial epithelium, typified by extensive squamous stratification. Large-scale gene expression studies frequently employ microarray technologies. Recent studies emphasize the value of tailor-made arrays, with target selection based upon prior knowledge of gene expression, to more accurately reflect the transcriptome of the tissue of interest; specifically applied to the study of NSCLC [17]. SAGE (serial analysis of gene expression) offers a sequence-based, non-biased approach to comprehensive transcriptome analysis, with no prior knowledge of expression required [18]. Moreover, information gained from SAGE analysis potentially contributes to design of more appropriate microarrays for focused research and diagnostic purposes. In a previous study, we described the transcriptomes of smoke-damaged bronchial epithelium and lung parenchyma by way of SAGE [19]. Here we extend our analysis to the unexplored preinvasive stage in lung cancer development, and present the first report of large-scale expression profiling of carcinoma-in-situ of the lung. Additionally, we describe transcriptomes for invasive squamous cell carcinoma (SCC), and precancerous metaplastic and dysplastic (PC) lesions, derived from the construction and analysis of multiple SAGE libraries, culminating in greater than 22 megabases of sequence information. Through the utilization of Ingenuity Pathway Analysis, we have identified genes associated with epidermal development and xenobiotic metabolism/detoxification as components of CIS lesions, and genes associated with immune response and tissue remodeling/fibrosis as components of invasive SCC. Additionally, we found genes associated with mucociliary differentiation to be down-regulated in both CIS and PC lesions. We discuss the potential relevance of these transcriptional aberrations to the early stages of lung cancer development, and present a view of the CIS transcriptome previously unknown.

Materials and Methods

Ethics statement

This study was approved by the University of British Columbia-British Columbia Cancer Agency Research Ethics Board (UBC-BCCA REB). Written consent was obtained from all subjects.

Specimens

Bronchial epithelial (BE) brushing specimens were previously described [19]. Biopsy specimens included precancerous (PC) lesions (metaplasia, dysplasia), carcinoma-in-situ (CIS) lesions, and invasive squamous cell carcinoma (SCC) tumor tissue. PC and CIS specimens were obtained by autofluorescent bronchoscopy (Figure 1). All biopsies (except for those used in constructing libraries CIS-3, SCC-5, and SCC-6) were collected in RNAlater® [Applied Biosystems (Ambion Inc., Canada)] and stored at −85°C. The biopsy used in construction of library CIS-3 was embedded in OCT media, and stored at −85°C. For construction of libraries SCC-5 and SCC-6, RNA was isolated from tumor tissue from eight individuals by guanidium isothiocyanate and phenol/chloroform extraction. All subjects contributing to this study were either former or current smokers ( ).
Table 1

Summary of patient demographics and library descriptions.

Library1 GenderAgeSmoking statusPathologyUseful tags sequencedGEO Series Accession
BE-32 M68formerNA152,086GSE5473
BE-4A&4BM69formerNA222,395GSE3707
BE-52 M70formerNA158,288GSE5473
BE-6M67formerNA91,571GSE3707
BE-7M56currentNA81,309GSE3707
BE8BM72formerNA81,799GSE5473
BE-92 M68formerNA162,903GSE5473
BE-10M65formerNA86,725GSE3707
BE-11AF56formerNA89,622GSE3707
BE-12F63currentNA88,186GSE3707
BE-132 F63currentNA93,345GSE5473
BE-14F63formerNA155,462GSE3707
BE-15M72formerNA143,129GSE3707
BE-16F71formerNA131,285GSE3707
BE total  = 1,738,105
CIS-1M61formerin-situ163,460GSE78983
CIS-2M68formerin-situ160,466GSE7898
CIS-3M69currentin-situ201,617GSE7898
CIS-4M70formerin-situ174,246GSE7898
CIS-5M72formerin-situ211,034GSE7898
CIS total  = 910,823GSE7898
SCC-1M78currentinvasive150,712GSE7898
SCC-2M62formerinvasive152,220GSE7898
SCC-3M70currentinvasive208,451GSE7898
SCC-4M68formerinvasive176,874GSE7898
SCC-51M/3F54, 64, 75, 84former/currentinvasive152,786GSE7898
SCC-64M65, 69, 74, 77former/currentinvasive150,233GSE7898
SCC total  = 991,276GSE7898
MetM51currentmetaplasia202,340GSE7898
DysM71formerdysplasia155,185GSE7898
Precancer total  = 357,525

BE, bronchial epithelial; CIS, carcinoma in-situ; SCC, invasive squamous cell carcinoma; Met, metaplasia; Dys, dysplasia.

With the exception of SCC-5 and SCC-6, all libraries were constructed from single individuals; the former of which were constructed from two pools of four individuals each.

With deeper sequencing of previously reported libraries generated in previous studies [19], [37].

All libraries constructed in this study were submitted to GEO as a single series.

BE, bronchial epithelial; CIS, carcinoma in-situ; SCC, invasive squamous cell carcinoma; Met, metaplasia; Dys, dysplasia. With the exception of SCC-5 and SCC-6, all libraries were constructed from single individuals; the former of which were constructed from two pools of four individuals each. With deeper sequencing of previously reported libraries generated in previous studies [19], [37]. All libraries constructed in this study were submitted to GEO as a single series.

SAGE library construction and sequence processing

With the exception of construction of libraries SCC-5 and SCC-6, each specimen (biopsy or bronchial brushing as previously described [19]) was retrieved from RNAlater® (or OCT for library CIS-3), and homogenized in lysis/binding solution. For libraries SCC-5 and SCC-6, RNA was pooled from four specimens in equal amount, and ∼19 µg of total RNA were immersed in lysis/binding solution and used for construction of each library. The resultant lysates were used directly for SAGE library construction according to the MicroSAGE protocol, using Nla III as the anchoring enzyme and Bsm FI as the tagging enzyme (www.ncbi.gov/SAGE). This approach was shown to yield highly reproducible SAGE libraries [19]. On average, 105 SAGE tags, excluding linker and duplicate ditags, were sequenced per library; all raw SAGE data has been deposited with GEO (Table 1). For normalization, tag counts were scaled to 106 tags for each library, i.e. as tags per million (TPM).

Cluster analysis

To evaluate the degree of similarity among the lung SAGE libraries generated in this study (two PC libraries, five CIS libraries, and six invasive SCC libraries) and those generated in a previous study (14 BE libraries, and two lung parenchyma libraries), cluster analysis was used. For this analysis, the 300 most abundant tags were retained from each library, yielding a merged list of 1128 unique tags. The data were then log10 transformed and clustered using Genesis, using an average-linkage algorithm and a Euclidean distance metric [20]. For tags with counts of zero, the data was not transformed, and was retained as zero.

Differential expression analysis

Unless stated otherwise, differential gene expression was defined by a minimal three-fold difference (rounded to one decimal place) in average normalized tag counts (TPM) between any two datasets being compared. In addition, a minimal average normalized tag count of 40 TPM was required in the over-expressing dataset. Tag-to-gene mapping was according to the SAGE Genie database, September 17, 2009 version [21] (cgap.nci.nih.gov/SAGE). As the newest release of SAGE Genie does not automatically align to mitochondrial transcripts, we utilized a manual mapping approach to identify these transcripts.

Data analysis

Datasets of differentially expressed genes were analyzed primarily through the use of Ingenuity Pathway Analysis (IPA), version 8.0 (Ingenuity® Systems, www.ingenuity.com). Functional analysis of entire datasets identified the biological functions and/or diseases that were most significant to the dataset. Genes from the dataset that were associated with biological functions and/or diseases in the Ingenuity Pathways Knowledge Base were considered for the analysis (IPA eligible mapped IDs). Canonical pathway analysis of entire datasets identified the pathways from the IPA library of canonical pathways that were most significant to the dataset, based upon genes within the dataset that were associated with a canonical pathway in the Ingenuity Pathways Knowledge Base. Toxicity list analysis of entire datasets identified the Tox lists from the IPA library of Tox Lists that were most significant to the dataset. Genes from the dataset that were associated with a List were considered for the analysis. For each of these analyses, the significance of the association between the genes in the dataset and the assigned biological function and/or disease/canonical pathway/toxicity list, was measured by Fischer's exact test to calculate a p-value to determine the probability that the association was explained by chance alone. My Pathways is a graphical representation of the biological relationships between gene products, which are supported by at least one reference from the literature, from a textbook, or from canonical information stored in the Ingenuity Pathways Knowledge Base. Genes are displayed using various shapes that represent the functional class of the gene product as indicated in the legend within the specific figures.

RNA isolation from clinical specimens

For quantitative RT-PCR analysis, RNA from matched tumor and normal lung parenchyma were collected from resected tissues. Briefly, multiple sections of each tumor were cut, with the first and last in a series stained with H&E for inspection by a lung pathologist. After confirming diagnosis and assessing tumor heterogeneity with pathology review, we captured those portions of the tumor having a minimum of 70% cancer cells. RNA was extracted from both microdissected tumor tissue and from associated normal tissue with the RNeasy kit (Qiagen, Mississauga, ON, Canada). In total, RNA from nine paired tumor and normal parenchyma samples was used for RT-PCR analysis. In addition, RNA was extracted as previously described [19], from six bronchial brushings representing three current and three former smokers, for quantitative RT-PCR analysis.

RT-PCR analysis

For quantitative RT-PCR (qPCR), approximately 1 µg total RNA was converted into cDNA using the High-Capacity cDNA Archive kit (cat# 4322171, Applied Biosystems), and gene-specific quantitative PCR was performed using TaqMan Universal PCR Master Mix and TaqMan primers (cat# 4326708; Applied Biosystems), according to manufacturer's recommendation. Beta-actin was used as an endogenous control (primer product code 4352935E). Primer product codes for test genes were as follows: ECE2 (Hs00981189_g1), MAGEA9 (Hs00245619_s1), MAGEA11 (Hs00377815_m1), CLDN1 (Hs00221623_m1), CKS1B (Hs01029137_g1), POSTN (Hs00170815_m1), ARTN (Hs00754699_s1), SFRP2 (Hs00293258_m1), UBE2S (Hs00819350_m1), C19orf48 (Hs00364147_m1), FBXO27 (Hs00381091_m1), MCM2 (Hs00170472_m1), NTS (Hs00175048_m1), SLC6A8 (Hs00373917_g1), and SLC2A1 (Hs00197884_m1, Hs00892681_m1). The reactions were run on an iCycler iQ Real-Time PCR Detection System (Bio-Rad Laboratories (Canada) Ltd., Mississauga, ON, Canada). Differential expression was determined using the delta-delta CT method. For the tumor/normal parenchyma pairs, fold changes were calculated for each pair. When comparing the tumors to the brushings, fold changes were calculated comparing each tumor to the average expression of the six BE samples. The average tumor over normal parenchyma fold change and the average tumor over BE fold change are reported for each gene.

Gene dosage determination

Carcinoma-in-situ specimens used for copy-number profiling were collected in 10% buffered formalin. Microdissection was performed on paraffin sections to obtain cancer cells. Typically greater than 20 serial sections were necessary to yield sufficient material. DNA was isolated from collected cells by proteinase K phenol/chloroform extraction as previously described [22]. Whole genome tiling path array CGH analysis was performed using SMRT array version 2 as previously described [23], [24]. This platform is suitable for profiling formalin fixed paraffin embedded material [22], [23], [25], [26], [27], [28], [29], [30]. Genome segmentation and copy number status was performed using aCGH-Smooth on array image data and visualized using SIGMA software [31], [32], [33], [34]. Loss array elements were assigned a value of −1, retained elements a value of 0, and gained elements a value of 1. Twenty CIS specimens were profiled in total, and a threshold for gene-specific copy number gain/loss was set at 20%. The 20% threshold was imposed in an effort to reduce the detection of spurious or random events due to background genomic instability inherent to the samples, and thereby selecting for those events which occur with some degree of regularity.

Public microarray data comparisons

Microarray expression data for 53 primary squamous tumors was retrieved from the Lung Cancer Dataset at NCBI, GEO accession number GSE3141 [35]. Microarray expression data for 67 bronchial brushings retrieved from a mixed population of current and former smokers, was profiled internally. All microarray data was RMA normalized [36].

Results and Discussion

SAGE offers an unbiased and comprehensive approach to expression profiling, limited only by the depth of sequencing chosen by the researcher, and offers an unprecedented opportunity for transcript discovery. This is in sharp contrast to microarray analysis, where expression profile and scope of analysis is predetermined by target design typically employing a limited number of the most commonly characterized genes [17]. In a previous study, we described the transcriptomes of smoke-damaged bronchial epithelium and lung parenchyma by way of SAGE [19]. Here we extend our analysis to the largely unexplored area of early-stage lung cancer development, and present the first report of large-scale gene expression profiling of carcinoma-in-situ (CIS) of the lung. An understanding of the molecular genetics governing the preinvasive stages is critical to facilitate early detection and immediate therapeutic intervention before progression to invasive cancer ensues. In the current study, we present a comparative analyses, with emphasis on genes over-expressed in CIS and invasive cancer transcriptomes, relative to non-cancerous transcriptomes of the lung including bronchial epithelium (BE), and precancerous lesions (PC: squamous metaplasia and dysplasia). Twenty-seven lung SAGE libraries comprised of 3,997,729 total sequence tags (∼40 megabases of high quality DNA sequence) were analyzed in this study. Normal lung is represented by 14 bronchial epithelial libraries (BE-1 through BE-14) [19], [37]. Precancer stage is represented by two libraries derived from squamous metaplasia (Met) and squamous dysplasia (Dys). Squamous cell carcinoma of the lung is represented by five carcinoma in-situ libraries (CIS-1 through CIS-5), and six invasive carcinoma libraries (SCC-1 through SCC-6) (detailed in and ). (It is noted that specimens comprising the BE, PC, CIS, and SCC datasets were from a mixed population of current and former smokers.) This data has identified greater than 129,000 unique sequence tags/potential transcripts in CIS lesions, and nearly 140,000 unique sequence tags/potential transcripts in invasive squamous NSCLC.
Table 2

Summary of SAGE libraries generated and tags sequenced.

BE3 PrecancerCISSCCTotal
Libraries1425627
Tags Sequenced1,738,105357,525910,823991,2763,997,729
Unique tags (UT)1 177,71370,043129,683139,843304,568
UTs excl. singleton2 76,47124,66149,05452,017150,331

BE, bronchial epithelial; Precancer (squamous metaplasia, squamous dysplasia); CIS, carcinoma-in-situ; SCC, invasive squamous cell carcinoma.

Unique tags are defined by the 10 nucleotide long sequence, and represent the maximum number of unique transcripts within the respective SAGE dataset.

Exclusion of singletons; singleton is defined as sequence tags having a raw tag count of one within an individual dataset (comprised of multiple libraries as indicated).

BE libraries were generated in previous studies [19], [37].

BE, bronchial epithelial; Precancer (squamous metaplasia, squamous dysplasia); CIS, carcinoma-in-situ; SCC, invasive squamous cell carcinoma. Unique tags are defined by the 10 nucleotide long sequence, and represent the maximum number of unique transcripts within the respective SAGE dataset. Exclusion of singletons; singleton is defined as sequence tags having a raw tag count of one within an individual dataset (comprised of multiple libraries as indicated). BE libraries were generated in previous studies [19], [37].

Analysis of the top 300 most abundant tags in BE, CIS and SCC SAGE datasets

Cluster analysis

Cluster analysis yielded anticipated grouping of SAGE libraries, attesting to sample quality. For this analysis, the 300 most abundant tags were retained from each library, yielding a merged list of 1128 unique tags. Average linkage clustering analysis based on the 1128 most abundant SAGE tags, reveals that all cancer libraries (both CIS and invasive SCC) cluster together, and separately from the BE libraries ( ). We note some clustering of the invasive SCC libraries (four out of six). Similar clustering is observed when using the top 500 or top 1000 unique tags per library (data not shown).
Figure 2

Analysis of the top 300 most abundant tags from the BE, CIS, and invasive cancer datasets.

A. Cluster analysis of lung SAGE libraries. All SAGE libraries from this study, including five carcinoma in-situ libraries (CIS-1 through CIS-5), six invasive squamous cell carcinoma libraries (SCC-1 through SCC-6), one squamous metaplasia library (Met), and one squamous dysplasia library (Dys), as well as 14 bronchial epithelial libraries (BE-1 through BE-14), and two normal lung parenchyma SAGE libraries (LP-1, LP-2; accession GSE3708) generated in a previous study [19], [37], were analyzed by cluster analysis using an average-linkage algorithm. The top 300 most abundant tags were retained from each library, and analysis was based on 1128 unique tags in total. In the dendrogram, branch length represents distance. B–D. IPA functional analysis of the most abundant genes in the BE, CIS, and invasive cancer datasets. Tag-to-gene mappings for the top 300 most abundant tags from the BE, CIS, and SCC datasets, were used for IPA core analysis, consisting of 220, 231, and 233 IPA eligible mapped IDs, respectively. The three sets of data were displayed together using IPA core comparisons, and the five most significant functions within Physiological System Development and Function are shown for each of the three datasets. The data in B is sorted according to highest significance in BE, the data in C is sorted according to highest significance in CIS, and the data in D is sorted according to highest significance in invasive SCC. The orange line indicates the threshold limit of significance, preset at a p-value of 0.05. For a complete listing of the tags/mapped IDs used for this analysis see Table S1.

Analysis of the top 300 most abundant tags from the BE, CIS, and invasive cancer datasets.

A. Cluster analysis of lung SAGE libraries. All SAGE libraries from this study, including five carcinoma in-situ libraries (CIS-1 through CIS-5), six invasive squamous cell carcinoma libraries (SCC-1 through SCC-6), one squamous metaplasia library (Met), and one squamous dysplasia library (Dys), as well as 14 bronchial epithelial libraries (BE-1 through BE-14), and two normal lung parenchyma SAGE libraries (LP-1, LP-2; accession GSE3708) generated in a previous study [19], [37], were analyzed by cluster analysis using an average-linkage algorithm. The top 300 most abundant tags were retained from each library, and analysis was based on 1128 unique tags in total. In the dendrogram, branch length represents distance. B–D. IPA functional analysis of the most abundant genes in the BE, CIS, and invasive cancer datasets. Tag-to-gene mappings for the top 300 most abundant tags from the BE, CIS, and SCC datasets, were used for IPA core analysis, consisting of 220, 231, and 233 IPA eligible mapped IDs, respectively. The three sets of data were displayed together using IPA core comparisons, and the five most significant functions within Physiological System Development and Function are shown for each of the three datasets. The data in B is sorted according to highest significance in BE, the data in C is sorted according to highest significance in CIS, and the data in D is sorted according to highest significance in invasive SCC. The orange line indicates the threshold limit of significance, preset at a p-value of 0.05. For a complete listing of the tags/mapped IDs used for this analysis see Table S1.

Ingenuity pathway analysis

To characterize and compare the bronchial epithelial and cancer transcriptomes, we computated average normalized tag counts for BE (14 libraries), CIS (5 libraries), and invasive SCC (6 libraries) datasets, and subsequently selected the most abundant 300 unique tags from each dataset for analysis (). Tags mapping to mitochondrial-encoded genes and ribosomal protein genes, were found at similar frequencies within the top 300 most abundant tags, across all three datasets of BE, CIS, and SCC, at ∼8% and ∼18%, respectively. We used the core analysis component of Ingenuity Pathway Analysis (IPA) to categorize these genes according to biological functions. Only those molecules having at least one functional annotation in the IPA Knowledge Base qualify for analysis, and included 220 (BE), 231 (CIS), and 233 (SCC) IPA eligible genes. These analyses reveal that genes within the category of Hair and Skin Development and Function are highly expressed in the CIS transcriptome relative to both BE and SCC, and genes within the categories of Hematological System Development and Function and Immune Cell Trafficking are highly expressed in the SCC transcriptome relative to both BE and CIS. Hence, high expression of genes associated with epidermal development is identified here as a characteristic feature of the CIS transcriptome, and high expression of genes associated with cellular movement as a characteristic feature of the SCC transcriptome. See for a summary of these analyses. A notable feature of both the CIS and SCC datasets is the abundance of tags mapping to the constant region of immunoglobulin heavy chains. In fact, the SAGE tag for IGHG1 is the most abundant tag in both the CIS and the invasive cancer datasets (Table S1). Although it is possible that expression of these immunoglobulin chains originates from infiltrating lymphoid tissue and surrounding stroma, previous studies have demonstrated expression of heavy chain constant and variable regions of immunoglobulins in breast cancer epithelial cells [38], [39], [40], and expression of IgG heavy and light chains in various epithelial cancer cells including lung SCC [41]. These immunoglobulin chains may represent cancer cell autoantibodies, and stimulate growth in an autocrine/paracrine fashion [40]. Intriguingly, the abundance of tags mapping to immunoglobulin heavy chain transcripts in CIS preinvasive lesions of the lung, in contrast to the relatively low detection in both BE and precancerous metaplastic and dysplastic lesions (detailed in subsequent tables), suggests that up-regulation of these transcripts may have relevance to initiation of NSCLC.

Gene expression changes common to carcinoma-in-situ and precancerous lesions

Transition from a healthy bronchial epithelium to invasive cancer is thought to proceed via progression of histological and genetic abnormalities: BE to PC to CIS to SCC, where PC represents precancerous lesions (squamous metaplasia and dysplasia). Squamous metaplasia is a transient component of normal wound healing of the bronchial epithelium, and typically resolves to a re-differentiated epithelium composed of pseudostratified ciliated and secretory cells, restoring bronchial function [42]. (Use of the term PC here does not imply an obligatory progression to cancer, but rather refers to lesions/abnormalities that despite infrequent progression to cancer, are considered as precursors to cancer.) Conversely, CIS lesions demonstrate a low regression frequency with a high incidence of progression to invasive cancer [43], [44] and are characterized by a more extensive stratification of squamous cell types compared to PC [9], [11], [12]. By identifying gene expression changes common to both PC and CIS relative to BE, we focus on those genetic events which occur early and persist through to CIS. In accordance with our selection criteria (minimal three-fold difference in average normalized tag abundance; minimal average normalized tag abundance of 40 TPM in the over-expressing dataset), 868 SAGE tags were found to be similarly differentially expressed in PC and CIS relative to BE, consisting of 190 up-regulated tags, and 678 down-regulated tags ( ).
Figure 3

Venn diagrams of differentially expressed genes discussed in this manuscript.

Criteria for differential gene expression was defined as a minimal three-fold difference in normalized mean tag counts, and with a minimal mean tag abundance of 40 TPM in the over-expressing datasets. Up-arrows indicate up-regulated gene expression changes; down-arrows indicate down-regulated gene expression changes; numerical values refer to the number of differentially expressed tags. Areas of interception reflect gene expression changes in common between the two datasets. A. Expression changes in carcinoma-in-situ and precancerous lesions relative to BE. B. Expression changes in the cancer datasets relative to both bronchial epithelium and precancerous datasets. BE: bronchial epithelial; PC: precancer; CIS: carcinoma-in-situ; SCC: invasive squamous cell carcinoma.

Venn diagrams of differentially expressed genes discussed in this manuscript.

Criteria for differential gene expression was defined as a minimal three-fold difference in normalized mean tag counts, and with a minimal mean tag abundance of 40 TPM in the over-expressing datasets. Up-arrows indicate up-regulated gene expression changes; down-arrows indicate down-regulated gene expression changes; numerical values refer to the number of differentially expressed tags. Areas of interception reflect gene expression changes in common between the two datasets. A. Expression changes in carcinoma-in-situ and precancerous lesions relative to BE. B. Expression changes in the cancer datasets relative to both bronchial epithelium and precancerous datasets. BE: bronchial epithelial; PC: precancer; CIS: carcinoma-in-situ; SCC: invasive squamous cell carcinoma.

Up-regulated expression changes

Approximately 35% of the tags up-regulated in CIS relative to BE (190 out of 529 tags) were found to be commonly up-regulated in PC lesions, and approximately 25% of tags up-regulated in PC lesions relative to BE (190 out of 754 tags) were found to be similarly up-regulated in CIS (Figure 3A). See for a description of these tags up-regulated in both PC and CIS. IPA functional analysis (based on 143 eligible mapped IDs) indicates that roughly 35% of these commonly up-regulated genes are associated with epidermal development and associated disorders, as described in . In addition to those genes described in Table 3, a review of the literature identified other genes within the PC/CIS up-regulated dataset to be associated with epidermal development, including SBSN, CNFN, CRCT1, additional members of the small proline-rich family of proteins (SPRR2E and SPRR3), and additional members of the S100A family of calcium-binding proteins. Many of these genes are encoded either within the epidermal differentiation complex (EDC) locus on 1q21 [45], [46] or within a conserved locus on 19q13 [47], and specify components of the cornified cell envelope, a structure that provides barrier protection to epidermis and internal epithelium in response to insult or injury [48], [49].
Table 3

Genes associated with epidermal development in the CIS_PC over BE dataset by IPA functional analysis.

Function Annotationp-valueMolecules3 # Molecules4
Category: Dermatological Diseases and Conditions 1
dermatological disorder4.18E-17COL1A1, COL1A2, COL3A1, COL6A1, COL7A1, DEFB103A, DEFB4, DSG1, DSG3, DSP, FYN, GJA1, HIF1A, IGHG1, IGL@, ITGA6, JUP, KRT5, KRT14, KRT17, KRT6A, KRT6B, LGALS1, LMNA, LTBP2, MMP1, PKP1, S100A7, S100A8, SELL, SFN, TP63, TUBA1C, TUBA4A, TYMS35
dermatological disorder of mammalia2.98E-03DSG1, DSG3, IGHG1, MMP1, SELL5
epidermolysis bullosa3.66E-09COL7A1, DSP, ITGA6, KRT5, KRT14, MMP16
recessive epidermolysis bullosa dystrophica8.12E-05COL7A1, MMP12
burn1.04E-04COL1A1, COL1A2, COL3A1, COL6A1, COL7A15
Ehlers-Danlos syndrome1.94E-04COL1A1, COL1A2, COL3A13
skin cancer4.53E-04CD44, FYN, HIF1A, KLK6, MCL1, MMP2, TUBA1C, TUBA4A, TYMS9
epidermolysis bullosa simplex4.81E-04KRT5, KRT142
skin tumor1.62E-03CD44, FYN, HIF1A, MCL1, MMP2, TUBA1C, TUBA4A, TYMS8
disease of skin2.04E-03CAV1, DSG3, KRT14, MMP1, SELL5
acanthosis2.80E-03DSG3, MMP12
psoriasis3.90E-03DEFB103A, DEFB4, IGHG1, S100A74
pemphigus of mice4.23E-03DSG1, DSG32
Category: Hair and skin Development and Function 2
development of epidermis7.70E-13COL1A1, COL7A1, DSP, EMP1, EVPL, FABP5, KRT5, KRT14, KRT17, S100A7, SPRR1A, SPRR1B, TP6313
development of skin1.07E-03COL1A1, COL3A1, SFN, TP634
differentiation of keratinocytes1.12E-08CSTA, DSP, EVPL, FABP5, IVL, SFN, SPRR1A, SPRR1B, TP639
cell movement of keratinocytes4.23E-03JUP, TP632
proliferation of epidermal cells6.78E-03JUP, KLK6, SFN, TP634

IPA Diseases and Disorders.

IPA Physiological System Development and Function.

See Table S2 for corresponding tag abundance.

47 unique genes were identified out of 143 IPA eligible mapped IDs.

IPA Diseases and Disorders. IPA Physiological System Development and Function. See Table S2 for corresponding tag abundance. 47 unique genes were identified out of 143 IPA eligible mapped IDs. IPA pathway graphical representation of the genes commonly up-regulated in CIS and PC relative to BE, is presented in . Considering the molecular interactions identified by IPA, functional associations among desmosomal cadherins and catenins are prominent within the PC/CIS up-regulated dataset. Desmosomes are intercellular adhesion junctions that provide mechanical integrity to the epithelium, and studies indicate that desmosomal cadherins modulate keratinocyte differentiation and epidermal morphogenesis [50]. This analysis also suggests that a signaling cascade mediated by members of the 14-3-3 family of proteins, may be active here. 14-3-3 sigma (SFN) mediates keratinocyte differentiation and stratification of epidermis [51]. The pathway diagram also suggests that specific aspects of keratinocyte terminal differentiation may be mediated by the AP-1 transcription factor FOSL2, which may have an additional role in extracellular matrix remodeling. Indeed, FOSL2 has been identified as a mediator of pulmonary fibrosis [52]. A consideration of genes associated with the transcription factor HIF1A in the PC/CIS lesions, is suggestive of a remodeling/profibrotic response to hypoxic growth conditions [53], [54], [55]. Indeed, a functional link between hypoxia and fibrosis is documented in the literature [56], [57], [58]. Expression of other genes identified here, such as NCF1 (the regulatory subunit of NADPH oxidase), and the heme catabolic enzyme HMOX1, is also indicative of an oxygen-related stress response and tissue remodeling/fibrosis [59], [60], [61], [62]. It is noted that additional IPA analysis identified Hepatic Fibrosis/Hepatic Stellate Cell Activation and 14-3-3-mediated Signaling as significant categories within Canonical Pathways, and identified Hepatic Fibrosis as the most significant category within Toxicity Lists (data not shown).
Figure 4

IPA pathway graphical representation for the CIS_PC over BE dataset of up-regulated genes.

155 genes (IPA mapped IDs) are represented out of 190 SAGE tags up-regulated in both CIS and PC relative to BE. (See Table S2 for tag data.) Gene products are positioned according to subcellular localization. Only direct connections (i.e., direct physical contact between two molecules) among the individual gene products are shown for clarity of presentation; lines indicate protein-protein binding interactions, and arrows refer to “acts on” interactions such as proteolysis, expression, and protein-DNA/RNA interactions. Genes associated with epidermal development (see Table 3) are highlighted.

IPA pathway graphical representation for the CIS_PC over BE dataset of up-regulated genes.

155 genes (IPA mapped IDs) are represented out of 190 SAGE tags up-regulated in both CIS and PC relative to BE. (See Table S2 for tag data.) Gene products are positioned according to subcellular localization. Only direct connections (i.e., direct physical contact between two molecules) among the individual gene products are shown for clarity of presentation; lines indicate protein-protein binding interactions, and arrows refer to “acts on” interactions such as proteolysis, expression, and protein-DNA/RNA interactions. Genes associated with epidermal development (see Table 3) are highlighted. Additional analysis by Gene Ontology using the GATHER annotation tool [63], similarly identified epidermal development as a prominent component of the transcriptome of commonly up-regulated genes in CIS and PC lesions, relative to BE ().

Down-regulated expression changes

The majority of tags down-regulated in CIS relative to BE, were found to be commonly down-regulated in PC lesions (678 tags out of 904 tags), and vice versa (678 tags out of 912 tags down-regulated in PC relative to BE) (Figure 3A). See for a description of these tags down-regulated in both PC and CIS. IPA functional analysis (based on 347 eligible mapped IDs) identified Cellular Assembly and Organization and Embryonic Development as the two most significant functional categories (p-values 1.29E-05–4.67E-02 and 1.30E-05–4.32E-02, respectively) for this dataset of commonly down-regulated genes (data not shown). Within the former, seven genes associated with biogenesis and formation of cilia were identified. These include dynein components of the cilium axoneme (DNAI2, DYNC2H1), FOXJ1 transcription factor and master regulator of motile ciliogenesis [64], intraflagellar transport proteins IFT172 and IFT88, kinesin family member KIF3A, and BBS5, a protein family member linked to Bardet-Biedl Syndrome and localized to ciliary basal bodies. Genes identified within the functional category of Embryonic Development are associated with patterning, specification of the midline axis, and formation of the neural tube. As these developmental processes have been linked to ciliary activity, specifically primary cilia-mediated Hedgehog signaling [65], [66], presumably this data overall reflects loss of the ciliated cell phenotype as common to PC and CIS lesions. We also note at a lower significance, down-regulation of genes associated with DNA recombination and repair in CIS and PC lesions (p-value 1.16E-04-4.42E-02). These include DNA repair genes (CCNO and NEIL1, glycosylases associated with base-excision repair; cyclin-dependent kinase CDK2; glycoprotein CLU; p53-inducible ribosomal protein RPS27L associated with the G1 DNA damage checkpoint; helicase RUVBL2; TRIP13, a regulator of double-strand break repair and meiotic checkpoint control; antioxidant SOD1), genes associated with DNA modification (editing enzyme APOBEC3G; antioxidant CAT), and DNA catabolism (exoribonuclease XRN2), mediators of ATP hydrolysis (ATPIF1, MAPK1, N4BP2, RUVBL1, RUVBL2, TGM2), centriole duplication (AKAP9, CETN2), and folate receptor FOLR1. IPA pathway graphical representation of genes commonly down-regulated in CIS and PC also highlights genes associated with ciliogenesis including axonemal components and centrosomal proteins, genes associated with goblet cell differentiation, and genes associated with epithelial cell polarization and ion transport (). Down-regulation of these genes, as well as many others in the BE over PC_CIS dataset also associated with ciliogenesis but not identified as such by IPA, reflect a pronounced loss of mucociliary differentiation in both PC and CIS lesions, presumably accompanied by deficiency in clearance and defense of the airways [67]. For further description of the genes identified in Figure S2, see . Additional analysis by Gene Ontology using the GATHER annotation tool [63], identified processes associated with cilia function such as gametogenesis and spermatogenesis, and microtubule-based processes, as prominent components of the transcriptome of commonly down-regulated genes in CIS and PC lesions, relative to BE (, Table S3).

Identification of differentially expressed genes in carcinoma-in-situ and invasive cancer transcriptomes key to cancer development

By identifying genes differentially expressed between preinvasive and invasive stages of lung cancer development (CIS and SCC, respectively), relative to both non-cancerous bronchial epithelium and precancerous metaplasia/dysplasia lesions (BE and PC, respectively), we propose to identify expression changes instrumental to both initiation (CIS) and progression (SCC) of lung cancer. In accordance with our selection criteria (minimal three-fold difference in average normalized tag abundance; minimal average normalized tag abundance of 40 TPM in the over-expressing dataset), 309 SAGE tags were found to be differentially expressed in CIS relative to BE and PC, and 280 tags were differentially expressed in SCC relative to BE and PC, with 116 tags similarly differentially expressed ( ). It is noted that the stringent selection criteria imposed in this study for differential expression would preclude certain genes, although present in the SAGE datasets and relevant to cancer development, from further analysis (see example below). However, a high stringency within the selection process, typically lends greater confidence to the relevance of those genes identified as differentially expressed in the cancer datasets. We identified 225 SAGE tags to be over-expressed in CIS relative to both BE and PC (), and 232 tags to be over-expressed in invasive SCC relative to both BE and PC (). It is noted that greater than 35% of the over-expressed tags within the CIS dataset (85 tags) were commonly up-regulated in SCC (Figure 3B), suggesting that significant expression changes relating to advanced cancer have already occurred by the time a diagnosis of CIS has been made, in accordance with irreversibility of CIS lesions. Discrepancy between the number of up-regulated tags and the number of IPA mapped IDs within each dataset, indicates that a significant proportion of potentially up-regulated genes in early-stage lung cancer remain to be identified ( ). IPA pathway graphical representation for up-regulated tags with mapped IDs for the two cancer datasets, is presented in . A higher proportion of up-regulated gene products are localized to the extracellular space in the SCC dataset relative to the CIS dataset. Considering the molecular interactions identified by IPA, functional networks involving the cell surface/extracellular matrix adhesion protein FN1, and transcriptional/cell cycle regulator CDKN2A, highlight the SCC dataset. Up-regulation of FN1-interacting proteins associated with tissue remodeling/fibrosis, and FN1-interacting proteins associated with acute phase response, suggests a link between these processes in SCC. A link between acute phase response and tissue repair has been previously proposed [68]. Activation of a CDKN2A functional network associated with cellular senescence, may reflect a protective response of the involved organ to acute tissue injury [69]. No outstanding molecular interactions were apparent for the CIS up-regulated dataset. Differential expression for a subset of the up-regulated genes in Figure 5B was validated by real-time quantitative RT-PCR ().
Figure 5

Genes up-regulated in the CIS and invasive SCC datasets relative to BE and PC.

A. Venn diagram of up-regulated SAGE tags and corresponding IPA mapped IDs for the CIS and SCC datasets. (See Table S5 and Table S6 for description of up-regulated tags in the CIS and SCC datasets, respectively.) B. IPA pathway graphical representation for the CIS over BE_PC dataset (80 unique IDs displayed in green; 58 shared IDs displayed in gray), and the SCC over BE_PC dataset (112 unique IDs displayed in red; 58 shared IDs displayed in gray). Gene products are positioned according to subcellular localization. Only direct connections (i.e., direct physical contact between two molecules) among the individual gene products are shown for clarity of presentation; lines indicate protein-protein binding interactions, and arrows refer to “acts on” interactions such as proteolysis, expression, and protein-DNA/RNA interactions. Eleven genes were detected at levels 20-fold or greater in the CIS over BE_PC dataset relative to the invasive cancer dataset (indicated by dark green), and 10 genes were detected at levels 20-fold or greater in the SCC over BE_PC dataset relative to the CIS dataset (indicated by dark red).

Genes up-regulated in the CIS and invasive SCC datasets relative to BE and PC.

A. Venn diagram of up-regulated SAGE tags and corresponding IPA mapped IDs for the CIS and SCC datasets. (See Table S5 and Table S6 for description of up-regulated tags in the CIS and SCC datasets, respectively.) B. IPA pathway graphical representation for the CIS over BE_PC dataset (80 unique IDs displayed in green; 58 shared IDs displayed in gray), and the SCC over BE_PC dataset (112 unique IDs displayed in red; 58 shared IDs displayed in gray). Gene products are positioned according to subcellular localization. Only direct connections (i.e., direct physical contact between two molecules) among the individual gene products are shown for clarity of presentation; lines indicate protein-protein binding interactions, and arrows refer to “acts on” interactions such as proteolysis, expression, and protein-DNA/RNA interactions. Eleven genes were detected at levels 20-fold or greater in the CIS over BE_PC dataset relative to the invasive cancer dataset (indicated by dark green), and 10 genes were detected at levels 20-fold or greater in the SCC over BE_PC dataset relative to the CIS dataset (indicated by dark red). We performed analysis by Gene Ontology using the GATHER annotation tool [63] for genes up-regulated in CIS and SCC relative to BE and PC (). This analysis identified fatty acid biosynthesis (described by a cluster of 5 genes) as a component of the CIS dataset. In agreement with IPA analysis described above, gene ontology analysis identified defense response (described by a cluster of 20 genes) as a notable component of the invasive cancer dataset. Additionally, skeletal development, response to wounding, anion transport, and carbohydrate catabolism were also identified by GATHER gene ontology analysis, as components of the invasive cancer dataset.

Epidermal development

Considering the notable expression of genes associated with epidermal development common to the PC and CIS datasets, we investigated whether functionally related genes are also enriched in the cancer datasets (CIS and SCC) relative to both BE and PC. The genes identified from this investigation using IPA functional analysis are described in . These data indicate that gene expression patterns reflective of epidermal development are not restricted to precancerous lesions, but rather also present as a component of CIS (∼15% of IPA eligible mapped IDs), and invasive cancer (∼26% of IPA eligible mapped IDs) apart from precancerous lesions.
Table 4

Up-regulated genes associated with epidermal development in CIS and invasive SCC according to IPA functional analysis.

Function AnnotationCIS/BE_PC3 p-valueSCC/BE_PC4 p-valueMolecules5 # Molecules6
Category: Dermatological Diseases and Conditions 1
burn 3.35E-08 COL12A1, COL17A1, COL1A1, COL1A2, COL3A1, COL4A1, COL5A1, COL6A3 8
Ehlers-Danlos syndrome 5.63E-06 COL1A1, COL1A2, COL3A1, COL5A1 4
Ehlers-Danlos syndrome, type I 2.77E-04 COL1A1, COL5A1 2
psoriatic arthritis 8.43E-05 CD27, IGHG1, IGKC, IL23A, MAP4K1 5
psoriatic arthritis of humans 1.37E-02 IGHG1 1
malignant cutaneous melanoma 5.51E-04 CDK4, CDKN2A 2
dermatological disorder 9.67E-04 5.55E-10 APOE, C1R, C1S, CCL5, CCL20, CD27, CDC25B, CDKN2A, CLDN1, COL12A1, COL17A1, COL1A1, COL1A2, COL3A1, COL4A1, COL5A1, COL6A3, CRP, DSG1, FN1, GDF15, GPX2 , GRIN2C, IGHG1, IGKC , IGL@ , IL23A, JUNB, MAP4K1, ODC1, PERP, PKP1, PSMB4, TP63 34
dermatological disorder of mammalia 3.86E-02 DSG1, IGHG1, IL23A 3
blister 2.49E-03 4.83E-03 COL17A1, DSG1, PLAT 3
blistering of epithelial tissue 6.89E-03 PERP 1
psoriasis 1.30E-02 IGHG1, IL23A, ODC1 3
fibrosis of dermis 1.36E-03 CDK4, COL1A1 2
hereditary angioedema 1.36E-03 C1R, C1S 2
hirsutism 1.37E-02 ODC1 1
metaplasia of squamous epithelium 1.37E-02 TP63 1
skin tumor 2.61E-02 9.26E-03 CDK4, CDKN2A, EFNA1 , GPX2 , HSP90B1, IL23A, ODC1 , RRM2, SPP1 9
Category: Hair and Skin Development and Function2
cell spreading of epithelial cell lines 1.01E-04 ANTXR1, CYR61, FN1 3
adhesion of epithelial cell lines 2.16E-04 EPHB3, FN1, POSTN, TIMP3 4
presence of hair follicle 6.89E-03 TP63 1
development of skin 1.18E-04 CLDN1, COL1A1, COL3A1, COL5A1, IRF6 5
tensile strength of skin 2.77E-04 COL5A1, DCN 2
stratification of skin 6.89E-03 TP63 1
proliferation of epidermal cells 8.57E-03 CDK4, EFNA1, IRF6, JUNB 4
thickness of skin tissue 1.37E-02 ODC1 1
mitogenesis of skin cell lines 2.73E-02 IGHG1 1
survival of melanocytes 2.73E-02 EFNA1 1
vascularization of skin 2.73E-02 5.51E-04 MFGE8, ODC1 2
senescence of keratinocytes 1.36E-03 CDK4, CDKN2A 2
arrest in growth of keratinocytes 4.07E-02 TP63 1
differentiation of keratinocytes 4.47E-03 CDKN2A, FN1, IRF6, JUNB 4
growth of melanocytes 4.73E-02 1.89E-03 CDKN2A, EFNA1 2

IPA Diseases and Disorders. 2IPA Physiological System Development and Function.

Dermatological Diseases and Conditions: 8.43E-05–3.86E-02; Hair and Skin Development and Function: 6.89E-03–4.73E-02.

Dermatological Diseases and Conditions: 5.55E-10–9.26E-03; Hair and Skin Development and Function: 1.01E-04–8.57E-03.

type in bold denotes those genes associated with the corresponding function annotation in the CIS over BE_PC dataset; type in italics denotes those genes associated with the corresponding function annotation in the SCC over BE_PC dataset; type in both bold and italics denotes those genes associated with the corresponding function annotation in both the CIS over BE_PC and the SCC over BE_PC datasets. See Table S5 and Table S6 for corresponding tag abundance values.

16 unique genes identified out of 109 IPA eligible mapped IDs within the CIS over BE_PC dataset; 40 unique genes identified out of 153 IPA eligible mapped IDs within the SCC over BE_PC dataset.

IPA Diseases and Disorders. 2IPA Physiological System Development and Function. Dermatological Diseases and Conditions: 8.43E-05–3.86E-02; Hair and Skin Development and Function: 6.89E-03–4.73E-02. Dermatological Diseases and Conditions: 5.55E-10–9.26E-03; Hair and Skin Development and Function: 1.01E-04–8.57E-03. type in bold denotes those genes associated with the corresponding function annotation in the CIS over BE_PC dataset; type in italics denotes those genes associated with the corresponding function annotation in the SCC over BE_PC dataset; type in both bold and italics denotes those genes associated with the corresponding function annotation in both the CIS over BE_PC and the SCC over BE_PC datasets. See Table S5 and Table S6 for corresponding tag abundance values. 16 unique genes identified out of 109 IPA eligible mapped IDs within the CIS over BE_PC dataset; 40 unique genes identified out of 153 IPA eligible mapped IDs within the SCC over BE_PC dataset. In addition to those genes described in Table 4, other genes up-regulated in CIS relative to both BE and PC that are associated with epidermal development, include KRTDAP (encoded on19q13), KPRP, SPRR2F, SPRR2G, and LCE3D (all encoded on 1q21). KRTDAP is associated with epidermal morphogenesis, and is a potential regulator of keratinocyte differentiation [70], [71], [72], [73], [74]. KPRP is an epidermal marker expressed in stratified squamous epithelia [75], and has a potential role in calcium-induced keratinocyte differentiation, and expression is increased in psoriasis [76]. It is noted that genes associated with development of the cornified cell envelope encoded on 1q21/19q13, tend to be expressed at notably lower levels in invasive cancer relative to CIS (see Table S2 and Table S5). As described in Table 4, transcriptional regulators associated with keratinocyte differentiation over-expressed within invasive cancer include IRF6, CDKN2A, and JUNB. IRF6 is an interferon-induced transcription factor associated with the switch between keratinocyte proliferation and differentiation [77]. JUNB, a member of the AP-1 transcription factor family, is associated with keratinocyte differentiation during wound healing and psoriasis [78], [79]. The cyclin-dependent protein kinase inhibitor/transcription factor CDKN2A (see above), is associated with cell cycle arrest/senescence of keratinocytes, and differentiation of epidermis [80], [81]. Although considered a tumor suppressor protein, increased expression of CDKN2A at the invasive front of basal cell carcinomas and colon cancer has been reported, and the correlation of increased invasiveness with decreased proliferation, suggests that CDKN2A may play a role in cancer cell invasion [82], [83]. An association between epidermal development and squamous cell lung cancer development is frequently studied through analysis of the EGFR pathway [84]. Anti-EGFR therapies have been initiated for various types of cancer, including NSCLC [85], [86], [87], [88]. Also, up-regulation of KGF, a member of the fibroblast growth factor family and mediator of epidermal differentiation, is associated with pancreatic cancer [89]. The identification of additional genes associated with epidermal development and up-regulated in the early stages of NSCLC, may enhance our understanding of the role of this pathway in lung cancer development, and broaden treatment options.

Additional IPA analysis

We utilized IPA core analysis to identify additional functions within the CIS and invasive SCC datasets of up-regulated genes ( ). Canonical Pathways analysis and Toxicity Lists analysis identified metabolism/detoxification of xenobiotics as a component of the CIS dataset, and hepatic fibrosis as a major characteristic of the invasive cancer phenotype. Specific genes associated with these phenotypes are listed in and and described below.
Figure 6

IPA canonical pathways analysis and toxicity lists analysis of the CIS over BE_PC and the SCC over BE_PC datasets.

For analysis of the CIS over BE_PC dataset, 109 IPA mapped IDs were eligible; for analysis of the SCC over BE_PC dataset, 153 IPA mapped IDs were eligible. The two sets of data were displayed together using IPA core comparisons, and the 10 most significant functions within Canonical Pathways and Toxicity Lists are shown above for each dataset. The data in A and C is sorted according to highest significance in CIS over BE_PC, and the data in B and D is sorted according to highest significance in SCC over BE_PC. The orange line indicates the threshold limit of significance, preset at a p-value of 0.05.

Table 5

Up-regulated genes in CIS and invasive SCC associated with metabolism/detoxification of xenobiotics according to specific categories within IPA canonical pathways and toxicity lists as indicated.

IPA Canonical & ToxicityCIS/BE_PC1 SCC/BE_PC2 Entrez Gene NameLocationType(s)
Aryl Hydrocarbon Receptor Signaling CDK4cyclin-dependent kinase 4Nucleuskinase
CDKN2Acyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4)Nucleustranscription regulator
FASNfatty acid synthaseCytoplasmenzyme
MCM7MCM7minichromosome maintenance complex component 7Nucleusenzyme
Xenobiotic Metabolism Signaling CHST2CHST2carbohydrate (N-acetylglucosamine-6-O) sulfotransferase 2Cytoplasmenzyme
CHST3carbohydrate (chondroitin 6) sulfotransferase 3Cytoplasmenzyme
PPP2R1Bprotein phosphatase 2 (formerly 2A), regulatory subunit A, beta isoformUnknownphosphatase
Metabolism of Xenobiotics by Cytochrome P450; Xenobiotic Metabolism AKR1C1aldo-keto reductase family 1, member C1 (dihydrodiol dehydrogenase 1; 20-alpha (3-alpha)-hydroxysteroid dehydrogenase)Cytoplasmenzyme
AKR1C3AKR1C3aldo-keto reductase family 1, member C3 (3-alpha hydroxy-steroid dehydrogenase, type II)Cytoplasmenzyme
Cytochrome P450 Panel-substrate is a Vitamin (Human) CYP26A1cytochrome P450, family 26, subfamily A, polypeptide 1Cytoplasmenzyme
Glutathione Metabolism G6PDG6PDglucose-6-phosphate dehydrogenaseCytoplasmenzyme
Glutathione Metabolism; NRF2-mediated Oxidative Stress Response; Oxidative Stress GPX2GPX2glutathione peroxidase 2 (gastrointestinal)Cytoplasmenzyme
Glutathione Metabolism; Metabolism of Xenobiotics by Cytochrome P450; Xenobiotic Metabolism Signaling; Xenobiotic Metabolism; Aryl Hydrocarbon Receptor Signaling; NRF2-mediated Oxidative Stress Response; Oxidative Stress GSTM1GSTM1glutathione S-transferase mu 1Cytoplasmenzyme
GSTM3GSTM3glutathione S-transferase mu 3 (brain)Cytoplasmenzyme
GSTM4glutathione S-transferase mu 4Cytoplasmenzyme
NRF2-mediated Oxidative Stress Response ACTA1actin, alpha 1, skeletal muscleCytoplasmother
JUNBjun B proto-oncogeneNucleustranscription regulator
Oxidative Stress CCL5chemokine (C-C motif) ligand 5Extracellular Spacecytokine

13 unique genes identified out of 109 IPA eligible mapped IDs. See Table S5 for corresponding tag abundance values.

12 unique genes identified out of 153 IPA eligible mapped IDs. See Table S6 for corresponding tag abundance values.

Table 6

Up-regulated genes in invasive SCC associated with tissue fibrosis according to IPA functions, canonical pathways, and toxicity lists1.

Gene Symbol2 Entrez Gene NameLocationType(s)SCC/BE5 SCC/PC6 CIS/BE5 CIS/PC6
A2Malpha-2-macroglobulinExtracellular Spacetransporter17.43.24.10.8
APOEapolipoprotein EExtracellular Spacetransporter4.130.90.7
BGNbiglycanExtracellular Spaceother>434.3>50.5
CCL5chemokine (C-C motif) ligand 5Extracellular Spacecytokine3.14.21.52
CDK4cyclin-dependent kinase 4Nucleuskinase3.931.71.3
CDKN2Acyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4)Nucleustranscription regulator11.23.55.51.7
COL1A1collagen, type I, alpha 1Extracellular Spaceother456.420.3
COL1A2collagen, type I, alpha 2Extracellular Spaceother160.58.290.5
COL3A1collagen, type III, alpha 1Extracellular Spaceother>28440.6>131.9
COL4A1collagen, type IV, alpha 1Extracellular Spaceother28510.5461.7
COL5A1collagen, type V, alpha 1Extracellular Spaceother>466.6>10.1
COL6A3collagen, type VI, alpha 3Extracellular Spaceother>1594.5>180.5
DCNdecorinExtracellular Spaceother40416.71.7
ECE23 endothelin converting enzyme 2Plasma Membranepeptidase19.44.422.45.1
FBN1fibrillin 1Extracellular Spaceother18.53.57.71.5
FN1fibronectin 1Plasma Membraneenzyme38.63.90.70.1
GLIS2GLIS family zinc finger 2Nucleustranscription regulator5.4101.73.2
IFI6interferon, alpha-inducible protein 6Cytoplasmother3.25.92.44.5
IFITM1interferon induced transmembrane protein 1 (9–27)Plasma Membraneother5.16.81.62.2
MMP12matrix metallopeptidase 12 (macrophage elastase)Extracellular Spacepeptidase274.59.51.6
MYH7Bmyosin, heavy chain 7B, cardiac muscle, betaUnknownother>514.2>282.3
SPARCsecreted protein, acidic, cysteine-rich (osteonectin)Extracellular Spaceother1163330.8
SPP1secreted phosphoprotein 1Extracellular Spacecytokine17.330.21.93.3
TIMP34 TIMP metallopeptidase inhibitor 3Extracellular Spaceother18.5 & 5.53.5 & 6.93 & 0.90.6 & 1.1

Functions: Organismal Injury and Abnormalities (p value 2.59E-04); Canonical Pathways: Hepatic Fibrosis/Hepatic Stellate Cell Activation; Toxicity Lists: Hepatic Fibrosis.

24 unique genes identified out of 153 IPA eligible mapped IDs.

Also up-regulated in CIS.

Two unique tags per gene.

Ratio of mean TPM for 6 SCC libraries (or 5 CIS libraries) versus mean TPM for 14 BE libraries. Where mean TPM for BE libraries is 0, ratio is cited as “> mean TPM value” for SCC libraries (or CIS libraries).

Ratio of mean TPM for 6 SCC libraries (or 5 CIS libraries) versus average TPM for 2 PC libraries.

See Table S5 and Table S6 for corresponding tag abundance values.

IPA canonical pathways analysis and toxicity lists analysis of the CIS over BE_PC and the SCC over BE_PC datasets.

For analysis of the CIS over BE_PC dataset, 109 IPA mapped IDs were eligible; for analysis of the SCC over BE_PC dataset, 153 IPA mapped IDs were eligible. The two sets of data were displayed together using IPA core comparisons, and the 10 most significant functions within Canonical Pathways and Toxicity Lists are shown above for each dataset. The data in A and C is sorted according to highest significance in CIS over BE_PC, and the data in B and D is sorted according to highest significance in SCC over BE_PC. The orange line indicates the threshold limit of significance, preset at a p-value of 0.05. 13 unique genes identified out of 109 IPA eligible mapped IDs. See Table S5 for corresponding tag abundance values. 12 unique genes identified out of 153 IPA eligible mapped IDs. See Table S6 for corresponding tag abundance values. Functions: Organismal Injury and Abnormalities (p value 2.59E-04); Canonical Pathways: Hepatic Fibrosis/Hepatic Stellate Cell Activation; Toxicity Lists: Hepatic Fibrosis. 24 unique genes identified out of 153 IPA eligible mapped IDs. Also up-regulated in CIS. Two unique tags per gene. Ratio of mean TPM for 6 SCC libraries (or 5 CIS libraries) versus mean TPM for 14 BE libraries. Where mean TPM for BE libraries is 0, ratio is cited as “> mean TPM value” for SCC libraries (or CIS libraries). Ratio of mean TPM for 6 SCC libraries (or 5 CIS libraries) versus average TPM for 2 PC libraries. See Table S5 and Table S6 for corresponding tag abundance values.

Metabolism/detoxification of xenobiotics

Metabolism/detoxification of xenobiotics is a protective cellular response to prevent damage to macromolecules upon exposure to both exogenous and an excess of endogenous stressors [such as reactive oxygen species (ROS)]. Most notably, glutathione metabolic enzymes that mediate phase II detoxification of xenobiotics via conjugation with glutathione, and members of the aldose-reductase family of oxidoreductases, have been identified to be up-regulated in the CIS dataset (Table 5). Most of these genes are also up-regulated in the invasive cancer dataset. Functionally related genes not included within Table 5, include the oxidoreductases SRXN1 and AKR1B10. SRXN1 expression protects against cigarette smoke-induced oxidative stress, and is important for redox homeostasis [90], [91], [92]. Similar to AKR1C1, AKR1B10 catalyzes NADPH-dependent reduction and inhibition of 4-HNE and other toxic aldehydes resulting from peroxidation of membrane lipids [93]. The pentose phosphate pathway (PPP) provides reducing equivalents for glutathione reduction/recycling and maintenance of redox status; three genes associated with the PPP (ALDOC, G6PD, TKTL1), are up-regulated in CIS lesions. Expression of many of these genes (AKR1B10, AKR1C1, AKR1C3, G6PD, GPX2, GSTM1, GSTM3, GSTM4, SRXN1) is regulated by the redox-sensitive NRF2 transcription factor, and is induced by cigarette smoke [94], [95], [96], [97]. However, these protective responses may also promote adaptation to adverse environmental conditions (redox stressors), and survival with propagation of damaged cells. For example, SRXN1 may play a role in development of skin malignancies [92]. Both AKR1C isoforms and AKR1B10 catalyze oxidative activation of xenobiotic proximate carcinogen PAH trans-dihydrodiols (such as B[a]P, a component of cigarette smoke), to generate reactive ortho-quinones, and mediate redox cycling with ROS amplification [98]. Over-expression of AKR1B10 and AKR1C1 has been reported for many cancer types [99]. Expression of NRF2-regulated anti-oxidant/glutathione metabolic gene HMOX1, and NCF1 (the p47phox subunit of NADPH oxidase, a major cellular source of ROS) is up-regulated in CIS lesions, but is also relatively high in precancerous lesions (Figure 4, Table S2), suggesting that xenobiotic/oxidative stress may initiate early in the pathway leading to invasive lung cancer. It has recently been suggested that NADPH oxidase may stimulate the protective activity of the NRF2-KEAP1 signaling pathway [100].

Tissue fibrosis

Tissue fibrosis, initiated in response to injury and facilitated by inflammatory mediators, is characterized by the excessive accumulation of extracellular matrix components. Genes typically associated with tissue fibrosis, and up-regulated in the invasive SCC dataset described in this study, include fibrillar collagens and other fibrillar matrix components, matrix metalloproteases, metalloprotease inhibitors, proteoglycans, chemotactic proteins, transcriptional regulators, and contractile proteins (Table 6). In addition to those genes identified by IPA, other genes within the invasive cancer dataset associated with tissue fibrosis include extracellular proteins MFGE8 [101], and POSTN, a mediator of collagen fibrillogenesis [102], [103] [104]. Related genes up-regulated in the SCC dataset include constituents of the fibrillar extracellular matrix such as MFAP2 [105], [106], FBLN1 [107], and the small leucine-rich proteoglycan FMOD, a mediator of collagen fibrillogenesis and matrix assembly [108]. In addition to structural properties, many of these components, such as MFAP2, MFGE8, and POSTN have signaling activities relevant to cancer development [109], [110], [111], [112], [113]. Tissue fibrosis is a component of various cancer types, and studies suggest that advanced fibrosis contributes to aggressiveness and resistance to chemotherapy [114], [115] [116]. The myofibroblast cell, thought to originate from various sources including transformation of resident or bone marrow-derived fibroblasts, transdifferentiation of epithelial cells to mesenchymal-type cells via EMT (epithelial-mesenchymal transition), and activation of resident stellate (astrocyte) cells, is a mediator of tissue fibrosis [117]. Activation of pancreatic stellate cells mediates the fibrotic process inherent to pancreatic cancer and contributes to cancer progression [118], [119], [120], [121]. Activated hepatic stellate cells are the major mediators of liver fibrosis and contribute to liver cancer [122], [123], [124]. The epidermal growth factor receptor regulates pancreatic fibrosis via stimulation of pancreatic stellate cells [125]. The cellular origins of the tissue fibrosis apparent from analysis of the invasive lung cancer dataset presented in this study, is not known. Some genes identified in Table 6 are associated with hepatic stellate cell activation. Others, including POSTN, contribute to pancreatic stellate cell activiation [126], and also mediate EMT [127]. MFGE8 is also a mediator of EMT [111], and the cytoskeletal intermediate filament protein, VIM (over-expressed here) is a mesenchymal cell marker, and an indicator of EMT [128], [129], [130]. EMT is associated with progression to invasive cancer [131]. We identified 84 SAGE tags to be down-regulated in CIS relative to both BE and PC (), and 48 SAGE tags to be down-regulated in invasive SCC relative to both BE and PC (), with 31 tags in common (Figure 3B). It is noted that IPA functional analysis did not identify any down-regulated genes from these datasets as specifically associated with ciliogenesis, and although multiple significant functional categories were identified, no specific biological process within these categories took prominence (data not shown). This was the case when analysed separately or as a single dataset of down-regulated genes, perhaps partially attributed to the relatively small size of the datasets. However, when based on indirect as well as direct molecular connections, IPA pathway graphical representation identified receptor tyrosine kinase ERBB2, as central to a network associated with airway biology (. In addition to associations with multiple developmental processes, ERBB2 also plays a role in airway repair including differentiation of ciliated and goblet cells, while inhibiting squamous metaplasia [132]. In this regard, it is intriguing to hypothesize that failure to initiate a potential ERBB2 signaling complex in CIS lesions, may compromise redifferentiation/restoration of the bronchial epithelium following injury, and contribute to initiation of in-situ cancer. For further description of the down-regulated genes identified in Figure S4, see . We performed analysis by Gene Ontology using the GATHER annotation tool [63] for down-regulated genes in CIS and SCC relative to BE and PC (; Table S8). Genes associated with defense response were most prominently identified by this analysis.

Potential biomarkers

Genes expressed at notably different levels among normal and cancer datasets, have the potential to serve as biomarkers for early detection. For a listing of potential biomarkers for both CIS and invasive SCC, based upon a minimal 20-fold up-regulation (and a minimal average tag abundance of 40 TPM), see . Generally, genes associated with epidermal development (KRTDAP, SPRR2G, SPRR2E) may potentially serve as biomarkers for CIS, whereas genes associated with immune response (MHC class I receptor HLA-G, acute-phase response protein CRP) may potentially serve as biomarkers for invasive SCC. It is noted that, although precise identification of the expressing cell type(s) for the genes associated with immune response may warrant further investigation, these genes may nonetheless accurately reflect the tumor cell microenvironment and provide diagnostic potential. Multiple literature reports support the potential of HLA-G and CRP to serve as biomarkers for invasive cancer of various tissue types including NSCLC [133], [134], [135], [136], [137], [138]. Additionally, the data presented in this study suggests that the neuropeptide NTS, may have potential as a biomarker for CIS lesions, whereas CST1, a peptidase inhibitor within the cystatin superfamily, may have potential as a biomarkers for invasive SCC lesions. Intriguingly, a recent study describes the potential of CST1 as a urinary marker for colorectal cancer [139].
Table 7

Potential biomarkers for CIS, invasive SCC, and squamous cell lung cancer.

Tag1 BE Mean TPMCIS Mean TPMSCC Mean TPMPC Av TPMGene SymbolMapping Reliability (%)6
CIS 2
GGCTTCTAAC 09033531SPRR2E95
ACCTCCACTG 0698010KRTDAP91
AATCTCTCAA 16318130NTS95
GTCAAGCCCA 016512SPRR2G91
CTGCCATTAA 28940OR14J147
TTTCCAGCAC 17530C6orf1595
ACGCCTACTG 06112TKTL195
CACAGGCATC 05800nu_sr5 50
TAACCAAGAG 04200TTR95
AGGAAGTCTT 14110nu_r5 48
GAAATACCCA 14010nu_r5 48
SCC 3
GATCAGGCCA 0132847COL3A194
CCACGGGATT 0102506No match
CACCTCCTAT 102390nu_r5 48
CTGAACTGCA 001332HLA-G94
GCCGTGAACA 001296SFTPC94
GAGAGAGACT 011272CRP95
AGGAAAGGTT 101020Hs.62959449
GTACACACCC 00780CST195
CTAAGAAAGT 21430No match
CIS and SCC 4
GAAATAAAGC 623801914716561IGHG194
AAGGGAGCAC 315039322932IGL@94
CTCCCCCAAG 337043196932IGHG194
CTCCCCCAAA 294772160418IGHG194
CAAACTAACC 104665162IGHG194
GGTTGAAAAA 1131734815SNAR-E89
AGAAGACGTT 17302400nu_r5 48
GCGGAGGTGG 25041763IGHG189
AAATAAAGCA 13921740nu_r5 48
CTGGGTGCCT 05271723PSMB454
GGAAATAAAG 33091342nu_r5 48
GAAGCCCCAG 03001010IGKC94
AGGGTCCCCG 1266970IGKC94
TGCCGTTTTG 2192942GSTM392
GAGATAAAGC 0250783Hs.68270748
GAAATAAGGC 2246770SLCO1A251
TTGAAACTGT 067610MID167
GAAATAGAGC 0190570nu_r5 48
TTAAATTAAT 251480NTRK294
AGGGGAGCAC 045450LOC10028792767

Criteria for potential biomarkers was set at a minimal of 20-fold enhanced expression based on normalized mean tag counts, and a minimal mean abundance level of 40 TPM, in the marker dataset.

Tags detected at a minimal of 20-fold enhanced expression in CIS relative to BE, PC, and SCC SAGE datasets.

Tags detected at a minimal of 20-fold enhanced expression in SCC relative to BE, PC, and CIS SAGE datasets.

Tags detected at a minimal of 20-fold enhanced expression in both CIS and SCC relative to both BE and PC SAGE datasets.

Tags map to cDNA sequences from the database of Unclustered ESTs.

Mapping reliability as defined by SAGE Genie.

Criteria for potential biomarkers was set at a minimal of 20-fold enhanced expression based on normalized mean tag counts, and a minimal mean abundance level of 40 TPM, in the marker dataset. Tags detected at a minimal of 20-fold enhanced expression in CIS relative to BE, PC, and SCC SAGE datasets. Tags detected at a minimal of 20-fold enhanced expression in SCC relative to BE, PC, and CIS SAGE datasets. Tags detected at a minimal of 20-fold enhanced expression in both CIS and SCC relative to both BE and PC SAGE datasets. Tags map to cDNA sequences from the database of Unclustered ESTs. Mapping reliability as defined by SAGE Genie. Due to rarity of CIS specimens, we were not able to validate up-regulation of gene expression directly. However, an analysis of publically available microarray expression data from squamous lung carcinoma specimens [35], and microarray expression data from internally profiled bronchial brushings, indicates that genes associated with epidermal development and showing specific expression in the CIS SAGE dataset, such as KRTDAP and SPRR2G, may actually retain up-regulation in a subset of invasive tumors (). Thus, these genes may have potential to serve as early-stage biomarkers for SCC in a combinatorial manner. To identify potential biomarkers associated with both preinvasive and invasive squamous cell lung cancer, we selected SAGE tags on the basis of a minimal 20-fold up-regulation in both the CIS and invasive cancer datasets (relative to BE and PC) (Table 7). Intriguingly, several of these tags map to immunoglobulin heavy chain and light chain genes (see above). Other genes identified here include NTRK2, a member of the neurotrophin tyrosine kinase receptor family, and GSTM3, a mediator of glutathione/xenobiotic metabolism (see above). Genetic polymorphisms of GSTM3 and other members of the GST family of proteins, have been associated with cancer risk [140], [141], [142]. Enhanced expression of NTRK2 has been associated with poor prognosis in neuroblastoma and other cancer types [143], [144], [145]. It is noted that several of the selected tags in Table 7 are currently unmapped to a gene ID; further experimentation to resolve these mappings may provide additional genes for biomarker evaluation. In an effort to evaluate the frequency of over-expression of these candidate genes in SCC on a broader scale, we again consulted publically available microarray expression data from squamous lung carcinoma specimens [35], and microarray expression data from internally profiled bronchial brushings. Although sporadic, a trend of up-regulation was observed for COL3A1, SFTPC, CST1, IGHG1, GSTM3, SLCO1A2, and NTRK2 in squamous tumors relative to bronchial epithelium, lending support to the potential of these genes to serve as biomarkers for invasive SCC (Figure S6). See for raw microarray data used for this analysis.

Differential gene expression and genomic copy-number status in CIS lesions

To investigate whether alterations in gene dosage contributes mechanistically to the differential gene expression identified here in early-stage lung cancer, we compared segmental copy number gain/loss from 20 independent CIS specimens with locus information for up-regulated and down-regulated genes in CIS/SCC relative to BE and PC. A subset of up-regulated genes localized to regions of stable frequent gain ( ), and a subset of down-regulated genes localized to regions of stable frequent loss ( ). Most prominently identified corresponding regions of copy number gain include 1q21–1q42.13 and 3q12.1–3q29, and loci within chromosomal arms 7q, 8q, 17q, and 20q. It is noted that the region 1q21 encodes the EDC, a region over-expressed early in CIS lesions (see above). Most prominently identified corresponding regions of copy number loss include loci within chromosomal arms 3p and 6p. This data agrees with previously published data for lung cancer, particularly amplification at 3q, 7q, 8q, and loss of 3p [146], [147], [148]. [Notably, a recent study describes genomic amplification and over-expression of transcription factor SOX2 encoded at 3q26.33 in lung squamous cell carcinoma [149]. Although we detect enhanced expression of SOX2 in CIS relative to both BE and PC, the tag abundance ratio falls marginally below the three-fold threshold/cut-off applied in this study, precluding this gene from the copy-number analysis presented here.] The data presented here suggests that frequent chromosomal gain/loss of specific loci, represents a significant mechanism for differential gene expression in early, preinvasive stages of squamous cell lung cancer. See and for raw data describing copy-number status for up-regulated genes and down-regulated genes, respectively.
Figure 7

Correlation between up-regulated gene expression in CIS and SCC relative to BE and PC, with regions of frequent copy-number gain in CIS specimens.

Up-regulated genes (x-axis), plotted according to chromosomal location as indicated, were matched with segmental copy-number status (y-axis), defined by frequent copy-number gain (blue) and loss (red), from 20 independent CIS specimens. 224 genes were analyzed, and only those associated with regions gained at a minimal frequency of 0.2 are shown above. Knowledge of losses in addition to gains serves as a filter to identify those chromosomal regions that are preferentially gained rather than a reflection of general instability. See Table S12 for raw data pertaining to these analyses.

Figure 8

Correlation between down-regulated gene expression in CIS and SCC relative to BE and PC, with regions of frequent copy-number loss in CIS specimens.

Down-regulated genes (x-axis), plotted according to chromosomal location as indicated, were matched with segmental copy-number status (y-axis), defined by frequent copy-number loss (red) and gain (blue), from 20 independent CIS specimens. 81 genes were analyzed, and only those associated with regions lost at a minimal frequency of 0.2 are shown above. Knowledge of gains in addition to losses serves as a filter to identify those chromosomal regions that are preferentially lost rather than a reflection of general instability. See Table S13 for raw data pertaining to these analyses.

Correlation between up-regulated gene expression in CIS and SCC relative to BE and PC, with regions of frequent copy-number gain in CIS specimens.

Up-regulated genes (x-axis), plotted according to chromosomal location as indicated, were matched with segmental copy-number status (y-axis), defined by frequent copy-number gain (blue) and loss (red), from 20 independent CIS specimens. 224 genes were analyzed, and only those associated with regions gained at a minimal frequency of 0.2 are shown above. Knowledge of losses in addition to gains serves as a filter to identify those chromosomal regions that are preferentially gained rather than a reflection of general instability. See Table S12 for raw data pertaining to these analyses.

Correlation between down-regulated gene expression in CIS and SCC relative to BE and PC, with regions of frequent copy-number loss in CIS specimens.

Down-regulated genes (x-axis), plotted according to chromosomal location as indicated, were matched with segmental copy-number status (y-axis), defined by frequent copy-number loss (red) and gain (blue), from 20 independent CIS specimens. 81 genes were analyzed, and only those associated with regions lost at a minimal frequency of 0.2 are shown above. Knowledge of gains in addition to losses serves as a filter to identify those chromosomal regions that are preferentially lost rather than a reflection of general instability. See Table S13 for raw data pertaining to these analyses.

Summary

In this study we describe genes differentially regulated in early stages of squamous NSCLC development by way of SAGE profiling. Differentially expressed genes found in common between CIS and precancerous lesions relative to bronchial epithelium, are presumed to reflect early expression changes during CIS development. Those genes differentially expressed in CIS relative to the precancerous lesions/bronchial epithelium, and in invasive SCC relative to the precancerous lesions/bronchial epithelium, presumably reflect gene expression changes more instrumental to cancer initiation, and cancer cell invasion, respectively. In this study, data was analyzed primarily through the use of Ingenuity Pathway Analysis, complemented by literature searches pertaining to specific genes. Here we have described the up-regulation of genes associated with epidermal development, and the down-regulation of genes associated with mucociliary development, in both CIS and precancerous lesions relative to bronchial epithelium. Increased expression of genes associated with desmosomal cell-cell junctions, and epidermal barrier formation, would conceivably enhance tissue integrity, and may reflect a protective response to tissue damage occurring early in CIS lesions. Although genes associated with epidermal development are also elevated in SCC, those genes specifically associated with epidermal barrier formation and desmosomal structures, show relatively low expression in invasive SCC, suggesting further tissue architectural changes upon transition to invasive cancer. Our data also suggests that tissue remodeling/fibrosis is present in early stage CIS lesions, where it may reflect a cellular response to hypoxia/oxidative stress. Our analysis has identified up-regulation of genes associated with xenobiotic metabolism/detoxification in CIS and invasive SCC relative to bronchial epithelium and precancerous lesions, implying an enhanced requirement for protection against electrophile and/or oxidative stress upon the transition from precancer to CIS. Up-regulated genes specifying tissue fibrosis is a pronounced feature of the invasive cancer dataset, where it appears in association with acute phase immune components. Thus, the data presented here suggests that a fibrotic tissue response is initiated in early stage CIS, and is further developed in invasive cancer. Considering that many of these matrix components have signaling activities associated with regulation of cellular proliferation and migration similar to those described for EMT, the profibrotic phenotype described here may represent a defining component of advanced lung SCC. Additionally, by selecting SAGE tags showing extreme up-regulation among the various datasets, we have identified a small number of genes that may have potential as biomarkers for early diagnosis. Although some of these genes have previously been investigated as biomarkers for invasive cancer by other researchers, this is the first description of potential biomarkers for CIS. Lastly, a comparative analysis between differential gene expression in CIS lesions and invasive carcinoma with array CGH data from independent CIS specimens, suggests that copy number alterations plays a significant role in differential gene expression in CIS lesions. Most abundant 300 tags in bronchial epithelium, carcinoma-in-situ, and invasive cancer SAGE datasets. (0.39 MB DOC) Click here for additional data file. Up-regulated gene expression changes in common between carcinoma-in-situ and precancerous lesions relative to bronchial epithelium. (0.29 MB DOC) Click here for additional data file. Input for GATHER analysis of gene ontology for genes showing similar differential expression in CIS and PC. (0.39 MB DOC) Click here for additional data file. Down-regulated gene expression changes in common between carcinoma-in-situ and precancerous lesions relative to bronchial epithelium. (0.97 MB DOC) Click here for additional data file. Up-regulated gene expression changes in carcinoma-in-situ relative to bronchial epithelium and precancerous lesions. (0.34 MB DOC) Click here for additional data file. Up-regulated gene expression changes in invasive cancer relative to bronchial epithelium and precancerous lesions. (0.35 MB DOC) Click here for additional data file. Summary of real-time RT-PCR data for select genes within invasive cancer, bronchial epithelium, and lung parenchyma. (0.03 MB DOC) Click here for additional data file. Input for GATHER analysis of gene ontology for genes differentially expressed in cancer datasets relative to BE and PC. (0.16 MB DOC) Click here for additional data file. Down-regulated gene expression changes in carcinoma-in-situ relative to bronchial epithelium and precancerous lesions. (0.14 MB DOC) Click here for additional data file. Down-regulated gene expression changes in invasive cancer relative to bronchial epithelium and precancerous lesions. (0.09 MB DOC) Click here for additional data file. Microarray expression data for up-regulated genes with biomarker potential. (0.19 MB DOC) Click here for additional data file. Up-regulated genes in CIS and SCC analyzed for frequent copy-number gain (and loss) in CIS specimens. (0.33 MB DOC) Click here for additional data file. Down-regulated genes in CIS and SCC analyzed for frequent copy-number loss (and gain) in CIS specimens. (0.14 MB DOC) Click here for additional data file. Gene ontology analysis as determined by the GATHER annotation tool for genes showing similar differential expression in CIS and PC. Those gene ontology annotations cited at a minimal depth of five and associated with a positive ln(Bayes factor) value are shown. The y-axis refers to the total number of genes per GO annotation. A. Up-regulated genes in CIS and PC relative to BE (151 genes analyzed). ln(Bayes factor) range 25.15–0.06 (left to right as indicated). B. Down-regulated genes in CIS and PC relative to BE (484 genes analyzed). ln(Bayes factor) range 8.04–0.12 (left to right as indicated). A positive Bayes factor indicates support for the hypothesis that an association of an annotation with the identified gene cluster is stronger than the association to other genes in the genome. See Table S3 for exact listing of gene symbols used as input for GATHER analysis. (1.96 MB EPS) Click here for additional data file. IPA pathway graphical representation for genes commonly down-regulated in CIS and PC relative to BE. Results are displayed for 76 gene products characterized by either direct (solid lines) or indirect (broken lines) connections derived from analysis of 500 input molecules*. Lines indicate protein-protein binding interactions, and arrows refer to “acts on” interactions such as proteolysis, expression, and protein-DNA/RNA interactions. Gene products are positioned according to subcellular localization. Genes associated with cilia formation (IPA function: Cellular Assembly and Organization) are highlighted. (See Table S4 for tag data.) *To comply with the maximum input of 500 mapped IDs for IPA pathway analysis, 8 IDs from the BE over CIS_PC dataset were excluded from the total of 508 IPA mapped IDs. These included C10orf79, C13orf30, C16orf46, C1orf110, C1orf92, C22orf15, C5orf32, and C8orf40, chosen randomly from within the set of open reading frame genes. (6.95 MB EPS) Click here for additional data file. Gene ontology analysis as determined by the GATHER annotation tool for genes up-regulated in cancer datasets relative to BE and PC. Those gene ontology annotations cited at a minimal depth of five and associated with a positive ln(Bayes factor) value are shown. The y-axis refers to the total number of genes per GO annotation. A. Up-regulated genes in CIS relative to BE and PC (138 genes analyzed). ln(Bayes factor) range 3.4–0.19 (left to right as indicated). B. Up-regulated genes in SCC relative to BE and PC (173 genes analyzed). ln(Bayes factor) range 9.66–0.13 (left to right as indicated). A positive Bayes factor indicates support for the hypothesis that an association of an annotation with the identified gene cluster is stronger than the association to other genes in the genome. See Table S8 for exact listing of gene symbols used as input for GATHER analysis. (3.31 MB EPS) Click here for additional data file. Genes down-regulated in the CIS and invasive SCC datasets relative to BE and PC. A. Venn diagram of down-regulated SAGE tags and corresponding IPA mapped IDs for the CIS and SCC datasets. (See Table S9 and Table S10 for tag data.) B. IPA pathway graphical representation for the BE_PC over CIS dataset (50 unique IDs displayed in green; 26 shared IDs displayed in gray), and the BE_PC over SCC dataset (15 unique IDs displayed in red; 26 shared IDs displayed in gray). Gene products are positioned according to subcellular localization. Both direct connections (solid lines) and indirect connections (broken lines) among the individual gene products are shown; lines indicate protein-protein binding interactions, and arrows refer to “acts on” interactions such as proteolysis, expression, and protein-DNA/RNA interactions. (6.31 MB EPS) Click here for additional data file. Gene ontology analysis as determined by the GATHER annotation tool for genes down-regulated in cancer datasets relative to BE and PC. Those gene ontology annotations cited at a minimal depth of five and associated with a positive ln(Bayes factor) value are shown. The y-axis refers to the total number of genes per GO annotation. A. Down-regulated genes in CIS relative to BE and PC (74 genes analyzed). ln(Bayes factor) range 2.23–0.06 (left to right as indicated). B. Down-regulated genes in SCC relative to BE and PC (39 genes analyzed). ln(Bayes factor) range 3.5–0.07 (left to right as indicated). A positive Bayes factor indicates support for the hypothesis that an association of an annotation with the identified gene cluster is stronger than the association to other genes in the genome. See Table S8 for exact listing of gene symbols used as input for GATHER analysis. (2.74 MB EPS) Click here for additional data file. Validation of up-regulated gene expression in invasive SCC using microarray data. Microarray expression data for 53 squamous lung tumors (SqCC) was downloaded from NCBI Lung Cancer Dataset, GEO accession number GSE3141 [35], and microarray expression data for 67 bronchial brushings (Normal) was internally profiled. Results are shown above for seven genes up-regulated in the invasive SCC SAGE dataset relative to BE and PC (COL3A1, CST1, *GSTM3, *IGHG1, *NTRK2, SFTPC, and *SLCO1A2), and for two genes up-regulated in the CIS SAGE dataset relative to BE, PC, and invasive SCC (KRTDAP, SPRR2G). See Table S11 for raw microarray data. *Also up-regulated in CIS. (1.75 MB EPS) Click here for additional data file. Further description of the down-regulated genes displayed in Figure S2. (0.14 MB DOC) Click here for additional data file. Further description of the down-regulated genes displayed in Figure S4. (0.08 MB DOC) Click here for additional data file.
  149 in total

1.  Outcome of bronchial carcinoma in situ.

Authors:  B J Venmans; T J van Boxem; E F Smit; P E Postmus; T G Sutedja
Journal:  Chest       Date:  2000-06       Impact factor: 9.410

Review 2.  Wnt signaling in lung cancer.

Authors:  Julien Mazieres; Biao He; Liang You; Zhidong Xu; David M Jablons
Journal:  Cancer Lett       Date:  2005-05-10       Impact factor: 8.679

3.  Increased expression of O-acetyl disialoganglioside synthase during rat liver fibrogenesis relates to stellate cell activation.

Authors:  Sung Hee Lee; Geom Seog Seo; Pil-Hoon Park; Jeong-Yong Choi; Young Nyun Park; Hong Kuk Kim; Keon-Sang Chae; Dong Hwan Sohn
Journal:  Biochem Biophys Res Commun       Date:  2003-04-11       Impact factor: 3.575

4.  Molecular cloning and expression of keratinocyte proline-rich protein, a novel squamous epithelial marker isolated during skin development.

Authors:  Wuyi Kong; Michael T Longaker; H Peter Lorenz
Journal:  J Biol Chem       Date:  2003-03-31       Impact factor: 5.157

5.  Immunoglobulin G expression in carcinomas and cancer cell lines.

Authors:  Zhengshan Chen; Jiang Gu
Journal:  FASEB J       Date:  2007-05-02       Impact factor: 5.191

6.  Pancreatic stellate cells: partners in crime with pancreatic cancer cells.

Authors:  Alain Vonlaufen; Swapna Joshi; Changfa Qu; Phoebe A Phillips; Zhihong Xu; Nicole R Parker; Cheryl S Toi; Romano C Pirola; Jeremy S Wilson; David Goldstein; Minoti V Apte
Journal:  Cancer Res       Date:  2008-04-01       Impact factor: 12.701

7.  Periostin, a novel marker of intramembranous ossification, is expressed in fibrous dysplasia and in c-Fos-overexpressing bone lesions.

Authors:  Takeshi G Kashima; Takashi Nishiyama; Kazuhiro Shimazu; Masashi Shimazaki; Isao Kii; Agamemnon E Grigoriadis; Masashi Fukayama; Akira Kudo
Journal:  Hum Pathol       Date:  2008-09-16       Impact factor: 3.466

8.  Milk fat globule EGF-8 promotes melanoma progression through coordinated Akt and twist signaling in the tumor microenvironment.

Authors:  Masahisa Jinushi; Yukoh Nakazaki; Daniel R Carrasco; Dobrin Draganov; Nicholas Souders; Matthew Johnson; Martin C Mihm; Glenn Dranoff
Journal:  Cancer Res       Date:  2008-11-01       Impact factor: 12.701

9.  A comprehensive analysis of phase I and phase II metabolism gene polymorphisms and risk of non-small cell lung cancer in smokers.

Authors:  Shanbeh Zienolddiny; Daniele Campa; Helge Lind; David Ryberg; Vidar Skaug; Lodve B Stangeland; Federico Canzian; Aage Haugen
Journal:  Carcinogenesis       Date:  2008-02-06       Impact factor: 4.944

10.  Transcriptional profiling of developing mouse epidermis reveals novel patterns of coordinated gene expression.

Authors:  Hisham Bazzi; Katherine A Fantauzzo; Gavin D Richardson; Colin A B Jahoda; Angela M Christiano
Journal:  Dev Dyn       Date:  2007-04       Impact factor: 3.780

View more
  13 in total

Review 1.  Measurement of biomarker proteins for point-of-care early detection and monitoring of cancer.

Authors:  James F Rusling; Challa V Kumar; J Silvio Gutkind; Vyomesh Patel
Journal:  Analyst       Date:  2010-07-08       Impact factor: 4.616

2.  Lung cancer transcriptomes refined with laser capture microdissection.

Authors:  Juan Lin; Gabrielle Marquardt; Nandita Mullapudi; Tao Wang; Weiguo Han; Miao Shi; Steven Keller; Changcheng Zhu; Joseph Locker; Simon D Spivack
Journal:  Am J Pathol       Date:  2014-08-14       Impact factor: 4.307

Review 3.  Airway basal stem cells: a perspective on their roles in epithelial homeostasis and remodeling.

Authors:  Jason R Rock; Scott H Randell; Brigid L M Hogan
Journal:  Dis Model Mech       Date:  2010-08-10       Impact factor: 5.758

4.  A sequence-based approach to identify reference genes for gene expression analysis.

Authors:  Raj Chari; Kim M Lonergan; Larissa A Pikor; Bradley P Coe; Chang Qi Zhu; Timothy H W Chan; Calum E MacAulay; Ming-Sound Tsao; Stephen Lam; Raymond T Ng; Wan L Lam
Journal:  BMC Med Genomics       Date:  2010-08-03       Impact factor: 3.063

Review 5.  Epidemiology of lung cancer: Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines.

Authors:  Anthony J Alberg; Malcolm V Brock; Jean G Ford; Jonathan M Samet; Simon D Spivack
Journal:  Chest       Date:  2013-05       Impact factor: 9.410

6.  hSAGEing: an improved SAGE-based software for identification of human tissue-specific or common tumor markers and suppressors.

Authors:  Cheng-Hong Yang; Li-Yeh Chuang; Tsung-Mu Shih; Hsueh-Wei Chang
Journal:  PLoS One       Date:  2010-12-17       Impact factor: 3.240

7.  Impact of collection and storage of lung tumor tissue on whole genome expression profiling.

Authors:  Maxim B Freidin; Neesa Bhudia; Eric Lim; Andrew G Nicholson; William O Cookson; Miriam F Moffatt
Journal:  J Mol Diagn       Date:  2012-01-10       Impact factor: 5.568

8.  EYA4 is inactivated biallelically at a high frequency in sporadic lung cancer and is associated with familial lung cancer risk.

Authors:  I M Wilson; E A Vucic; K S S Enfield; K L Thu; Y A Zhang; R Chari; W W Lockwood; N Radulovich; D T Starczynowski; J P Banáth; M Zhang; A Pusic; M Fuller; K M Lonergan; D Rowbotham; J Yee; J C English; T P H Buys; S A Selamat; I A Laird-Offringa; P Liu; M Anderson; M You; M S Tsao; C J Brown; K L Bennewith; C E MacAulay; A Karsan; A F Gazdar; S Lam; W L Lam
Journal:  Oncogene       Date:  2013-10-07       Impact factor: 9.867

9.  LRIG1 regulates cadherin-dependent contact inhibition directing epithelial homeostasis and pre-invasive squamous cell carcinoma development.

Authors:  Liwen Lu; Vitor H Teixeira; ZhengQiang Yuan; Trevor A Graham; David Endesfelder; Krishna Kolluri; Noura Al-Juffali; Nicholas Hamilton; Andrew G Nicholson; Mary Falzon; Maik Kschischo; Charles Swanton; Nicholas A Wright; Bernadette Carroll; Fiona M Watt; Jeremy P George; Kim B Jensen; Adam Giangreco; Sam M Janes
Journal:  J Pathol       Date:  2013-03       Impact factor: 7.996

10.  Reinventing diagnostics for personalized therapy in oncology.

Authors:  Diponkar Banerjee
Journal:  Cancers (Basel)       Date:  2010-06-02       Impact factor: 6.639

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.