The present study aimed to explore gene and microRNA (miRNA) expression differences between lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC). Differentially expressed genes (DEGs) and differentially expressed miRNAs (DEMs) were identified by analyzing mRNA and miRNA expression data from normal and cancerous lung tissues obtained from The Cancer Genome Atlas database. A total of 778 DEGs and 7 DEMs were identified. Altered gene functions and signaling pathways were investigated using Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses, which revealed that DEGs were significantly enriched in extracellular matrix organization, cell differentiation, negative regulation of toll signaling pathway, and several other terms and pathways. Transcription factor (TF)‑miRNA‑gene networks in LUAD and LUSC were predicted using the TargetScan, miRanda, and TRANSFAC databases, which revealed the regulatory links among the TFs, DEMs, and DEGs. The central TFs (those in the middle of the TF‑miRNA‑gene network) were similar in LUAD and LUSC. Although LUAD and LUSC shared similar miRNAs in the predicted networks, miR‑29b‑3p was demonstrated to be upregulated only in LUAD, whereas miR‑1, miR‑105‑5p, and miR‑193b‑5p were altered in LUSC. These findings may improve our understanding of the different molecular mechanisms in non‑small cell lung cancers and may promote new and accurate strategies for prevention, diagnosis, and treatment.
Lung cancer is one of the most common malignant tumors and in 2012 accounted for ~1.82 million new cases and ~1.59 million mortalities worldwide (1,2). Despite the advances in treatment methods that have been made available in recent years, including minimally invasive surgical approaches, chemotherapies, and targeted therapies, the 5-year survival of patients with lung cancer is far from satisfactory, ranging between 10 and 20% for most geographic areas (3). Two major pathological subtypes constitute the majority of lung cancers: lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), which differ in a number of ways (4–7).
LUAD and LUSC originate from different cells and differ not only in biological patterns, but also in molecular characteristics and, most importantly, in therapeutic strategies (5,8). For example, activating mutations in epidermal growth factor receptor and ALK gene fusions usually occur in LUAD, but not in LUSC, rendering medications targeted at these genes ineffective for LUSC (9). Therefore, comprehensive investigations into the differences in molecular characteristics and mechanisms between these two major subtypes of lung cancer are required, which will lead to a deeper understanding and to the identification of novel molecular-targeted strategies for lung cancer therapy.
A large amount of high-throughput data on multiple types of cancer has recently been released by The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov) database, including mRNA and microRNA (miRNA) sequencing data from hundreds of LUAD and LUSC samples. These data enable the molecular differences between LUAD and LUSC to be investigated in depth. The present study used bioinformatics analyses to explore differences in gene expression, miRNA expression, Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and molecular regulatory networks. The results may facilitate a better understanding of the different molecular mechanisms of non-small cell lung cancer (NSCLC) and may promote the discovery and development of new, accurate strategies for lung cancer prevention, diagnosis, and treatment.
Materials and methods
mRNA and miRNA expression data resources and preprocessing
Level 3 RNA sequencing data from 108 normal pulmonary samples and 980 pulmonary carcinoma samples (that is, 490 LUAD with 58 normal control samples, and 490 LUSC with 50 normal control samples), and level 3 miRNA sequencing data from 91 normal pulmonary samples and 966 pulmonary carcinoma samples (that is, 499 LUAD with 46 normal control samples, and 467 LUSC with 45 normal control samples) were released by TCGA prior to April 15, 2015, and were obtained by the present study from the TCGA data portal (https://portal.gdc.cancer.gov). Data preprocessing was carried out as described in previous studies (10).
Identification of differentially expressed genes (DEGs) and differentially expressed miRNAs (DEMs)
Genes and miRNAs that were differentially expressed among the normal, LUAD, and LUSC sample groups were identified as previously reported (11). DEGs with a fold change (tumor/normal) >2 or <0.5 and DEMs with a fold change >2.5 or <0.4 qualified for subsequent analyses. A random variance model t-test was used to confirm the DEGs and DEMs, as previously described (10). Following analysis of the significance, fold change, and false discovery rate (FDR), mRNAs and miRNAs that had both P<0.05 and FDR<0.05 were considered to be significantly differentially expressed (10). Only the DEGs and DEMs that were identified in both the LUAD and LUSC groups were included when comparing gene and miRNA expression between the two subtypes.
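The selection rule above (fold-change cutoff plus P<0.05 and FDR<0.05) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes the P-values have already been computed (the random variance model t-test is not reimplemented), it assumes the Benjamini-Hochberg procedure for the FDR, which the text does not specify, and the gene names and expression values are toy data.

```python
from typing import List, Tuple

# (name, mean expression in tumor, mean expression in normal, P-value)
Record = Tuple[str, float, float, float]


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Assumption: BH is used here only as a common FDR procedure; the
    source text does not say which FDR method was applied.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_differential(records: List[Record], up: float = 2.0,
                        down: float = 0.5, alpha: float = 0.05) -> List[str]:
    """Keep entries passing the fold-change, P-value and FDR thresholds.

    Assumes every normal-tissue mean is non-zero.
    """
    fdr = benjamini_hochberg([r[3] for r in records])
    selected = []
    for (name, tumor, normal, p), q in zip(records, fdr):
        fold_change = tumor / normal
        if (fold_change > up or fold_change < down) and p < alpha and q < alpha:
            selected.append(name)
    return selected


# Toy data (hypothetical names and values, for illustration only).
toy_genes = [
    ("GENE_A", 400.0, 100.0, 0.0001),  # strongly up, significant
    ("GENE_B", 30.0, 100.0, 0.002),    # strongly down, significant
    ("GENE_C", 150.0, 100.0, 0.0005),  # significant but fold change too small
    ("GENE_D", 300.0, 100.0, 0.2),     # large fold change but not significant
]
print(select_differential(toy_genes))                    # mRNA thresholds
print(select_differential(toy_genes, up=2.5, down=0.4))  # miRNA thresholds
```

The defaults correspond to the DEG thresholds stated above; passing `up=2.5, down=0.4` applies the stricter DEM thresholds.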
GO and KEGG pathway analyses
To investigate the significantly enriched functions and the significant pathways for these DEGs, GO term (http://geneontology.org) and KEGG pathway (http://www.genome.jp/kegg) analyses were conducted as previously reported (12,13). Briefly, the two-tailed Fisher's exact test and the χ2 test were used to classify the GO categories or KEGG pathways, and the FDR was calculated for multiple testing corrections. GO terms or KEGG pathways having both P<0.05 and FDR<0.05 were considered to be significantly different. Enrichment values were calculated to identify those significant terms or pathways that provided the most concrete functional descriptions in this analysis. GO-map and PathNet analyses were conducted to further outline the functional links among the related GO terms and the significant KEGG pathways.
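As a rough illustration of the enrichment test described above, the following sketch computes a two-tailed Fisher's exact P-value for one GO term or KEGG pathway from the 2×2 table of DEG membership versus term membership. It is an assumption-laden sketch rather than the authors' implementation: the function names are hypothetical, the χ2 test and the FDR correction are omitted, and the fold-enrichment formula shown is a commonly used definition that the text does not confirm.

```python
from math import comb


def hypergeom_pmf(k: int, background: int, in_term: int, n_degs: int) -> float:
    """Probability that exactly k of n_degs DEGs fall in a term that
    annotates in_term of the background genes."""
    return (comb(in_term, k) * comb(background - in_term, n_degs - k)
            / comb(background, n_degs))


def fisher_exact_two_tailed(k: int, background: int, in_term: int,
                            n_degs: int) -> float:
    """Two-tailed Fisher's exact test: sum the probabilities of all
    tables that are no more likely than the observed one."""
    observed = hypergeom_pmf(k, background, in_term, n_degs)
    lowest = max(0, n_degs - (background - in_term))
    highest = min(in_term, n_degs)
    total = 0.0
    for i in range(lowest, highest + 1):
        p = hypergeom_pmf(i, background, in_term, n_degs)
        if p <= observed * (1 + 1e-9):  # small tolerance for float ties
            total += p
    return min(total, 1.0)


def fold_enrichment(k: int, background: int, in_term: int, n_degs: int) -> float:
    """Assumed definition: fraction of DEGs in the term divided by the
    fraction of background genes in the term."""
    return (k / n_degs) / (in_term / background)


# Toy example: 8 background genes, 4 annotated to the term, 4 DEGs, 3 overlap.
print(fisher_exact_two_tailed(3, 8, 4, 4))
print(fold_enrichment(3, 8, 4, 4))
```

In practice the resulting P-values for all terms would then be corrected for multiple testing before applying the P<0.05 and FDR<0.05 cutoffs.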
TF-miRNA-gene network
The regulation networks among transcription factors (TFs), DEMs and DEGs were established as previously described (10,14). Briefly, the target DEGs of DEMs were predicted using TargetScan (http://www.targetscan.org) and miRanda (http://www.microrna.org/microrna/home.do) (15,16). Subsequently, the TFs that may regulate the expression of DEMs and DEGs were identified using the TRANSFAC database (http://gene-regulation.com/pub/databases.html) (17). Finally, TF-miRNA-gene networks were created in LUAD and LUSC, as previously described (10).
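The network construction outlined above can be sketched with plain Python sets. The sketch below makes several assumptions that the text does not state: miRNA targets are kept only when both prediction tools agree (the study may have combined them differently), the TF-to-target mapping is supplied as a ready-made dictionary rather than queried from TRANSFAC, and nodes are ranked by their number of connections as a simple proxy for being "central". All names in the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def consensus_targets(pred_a: Dict[str, Set[str]], pred_b: Dict[str, Set[str]],
                      degs: Set[str]) -> Dict[str, Set[str]]:
    """Keep miRNA->gene pairs predicted by both tools and restricted to DEGs.

    Assumption: intersection of the two predictors; the source does not
    say whether intersection or union was used.
    """
    result = {}
    for mirna in pred_a.keys() & pred_b.keys():
        shared = pred_a[mirna] & pred_b[mirna] & degs
        if shared:
            result[mirna] = shared
    return result


def build_edges(mirna_targets: Dict[str, Set[str]],
                tf_targets: Dict[str, Set[str]]) -> Set[Edge]:
    """Collect directed regulator->target edges from both mappings."""
    edges = set()
    for regulator_map in (mirna_targets, tf_targets):
        for regulator, targets in regulator_map.items():
            for target in targets:
                edges.add((regulator, target))
    return edges


def rank_by_degree(edges: Set[Edge]) -> List[Tuple[str, int]]:
    """Rank nodes by number of connections (ties broken by name)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Toy inputs (hypothetical predictions, for illustration only).
tool_a = {"miR-X": {"GENE_A", "GENE_B", "GENE_C"}, "miR-Y": {"GENE_C"}}
tool_b = {"miR-X": {"GENE_A", "GENE_B"}, "miR-Y": {"GENE_C", "GENE_D"}}
degs = {"GENE_A", "GENE_B", "GENE_C"}
tf_map = {"TF_1": {"miR-X", "miR-Y", "GENE_A"}}

edges = build_edges(consensus_targets(tool_a, tool_b, degs), tf_map)
for node, degree in rank_by_degree(edges):
    print(node, degree)
```

Ranking by degree mirrors the way node size is used in the network figures, where larger shapes indicate more interacting factors.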
Statistical analysis
Statistical analysis was performed using the software IBM SPSS version 20 (IBM SPSS, Armonk, NY, USA).
Results
Identification of DEGs and DEMs
Significant differences in expression levels were detected for certain genes and miRNAs that may be used as biomarkers for the early diagnosis, assessment, and monitoring of lung cancer. As shown in Table I, 1,492 DEGs and 36 DEMs were identified as being significantly different between LUAD and normal lung tissues; for LUSC vs. normal lung tissue, 2,726 DEGs and 45 DEMs were identified. The top 20 DEGs and the top 10 DEMs exhibiting the most significant differential expressions in LUAD and LUSC are shown in Figs. 1 and 2, respectively.
Table I.
DEGs and DEMs identified between LUAD, LUSC and normal samples.
Figure 1.
Top 20 genes and top 10 microRNAs identified as the most differentially expressed in LUAD compared with normal lung tissue samples. Data are expressed as the mean ± standard deviation. ***P<0.001 vs. normal. DEGs, differentially expressed genes; DEMs, differentially expressed microRNAs; miR, microRNA; FDR, false discovery rate; LUAD, lung adenocarcinoma; RSEM, RNA-Seq by expectation maximization; RPM, reads per million miRNA mapped.
Figure 2.
Top 20 genes and top 10 microRNAs identified as the most differentially expressed in LUSC compared with normal lung tissue samples. Data are expressed as the mean ± standard deviation. ***P<0.001 vs. normal. DEGs, differentially expressed genes; DEMs, differentially expressed microRNAs; miR, microRNA; FDR, false discovery rate; LUSC, lung squamous cell carcinoma; RSEM, RNA-Seq by expectation maximization; RPM, reads per million miRNA mapped.
A total of 778 DEGs and 7 DEMs were identified in both LUAD and LUSC (Table I). As demonstrated in Fig. 3, transmembrane 4 L six family member 4 (TM4SF4), diffuse panbronchiolitis critical region 1 (DPCR1), progastricsin (PGC), galectin 4 (LGALS4), and interleukin 37 (IL37) were the top five DEGs that were more upregulated in LUAD than in LUSC (fold change, 63.2–214.5; TM4SF4 was the highest at 214.5 and IL37 the lowest at 63.2). Serpin family B member 12 (SERPINB12), amelotin (AMTN), small proline-rich protein 4 (SPRR4), transmembrane protease, serine 11A (TMPRSS11A), and embryonic stem cell related (ESRG) were the top five DEGs identified as being more upregulated in LUSC compared with LUAD (fold change, 172.0–322.9). As expected, a number of well-established biomarkers for LUAD and LUSC were also identified, including thyroid transcription factor 1 (TTF-1; fold change LUAD/LUSC, 10.5), keratin 7 (KRT7; fold change, 3.63), SRY-box 2 (SOX2; fold change, 0.11), p63 (fold change, 0.03), and KRT5 (fold change, 0.01).
Figure 3.
Top 10 genes and top 5 microRNAs identified as the most differentially expressed between LUAD and LUSC. Data are expressed as the mean ± standard deviation. ***P<0.001 vs. LUSC. DEGs, differentially expressed genes; DEMs, differentially expressed microRNAs; miR, microRNA; FDR, false discovery rate; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma; RSEM, RNA-Seq by expectation maximization; RPM, reads per million miRNA mapped.
miR-375 was the only DEM that was demonstrated to be more upregulated in LUAD compared with LUSC (fold change, 5.62); the six other DEMs, miR-205-5p, miR-205-3p, miR-149-5p, miR-196b-5p, miR-1269a, and miR-105-5p, were upregulated in LUSC vs. LUAD (fold change, 2.3–25.2; Fig. 3).
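The subtype comparison reported above (restricting to DEGs found in both LUAD and LUSC, then expressing the difference as a LUAD/LUSC fold change) can be sketched as follows. This is an illustrative sketch only: the function name, gene names, and expression values are hypothetical, the same >2 or <0.5 cutoff as for tumor vs. normal is assumed for the subtype contrast, and the significance testing between subtypes is left out.

```python
from typing import Dict, List, Set, Tuple


def subtype_contrast(luad_expr: Dict[str, float], lusc_expr: Dict[str, float],
                     luad_degs: Set[str], lusc_degs: Set[str],
                     up: float = 2.0, down: float = 0.5) -> List[Tuple[str, float]]:
    """Return (gene, LUAD/LUSC fold change) for genes that are DEGs in both
    subtypes and pass the fold-change cutoff, highest ratio first.

    Assumes every LUSC mean expression value is non-zero.
    """
    rows = []
    for gene in luad_degs & lusc_degs:
        ratio = luad_expr[gene] / lusc_expr[gene]
        if ratio > up or ratio < down:
            rows.append((gene, ratio))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Toy mean expression values (hypothetical, for illustration only).
luad_expr = {"GENE_A": 50.0, "GENE_B": 2.0, "GENE_C": 10.0, "GENE_D": 80.0}
lusc_expr = {"GENE_A": 5.0, "GENE_B": 40.0, "GENE_C": 9.0, "GENE_D": 8.0}
luad_degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
lusc_degs = {"GENE_A", "GENE_B", "GENE_C"}  # GENE_D is a DEG in LUAD only

for gene, ratio in subtype_contrast(luad_expr, lusc_expr, luad_degs, lusc_degs):
    print(gene, round(ratio, 2))
```

A ratio above 1 indicates higher expression in LUAD, which is the direction reported for TM4SF4 and miR-375; a ratio below 1 indicates higher expression in LUSC, as reported for SOX2, p63, and KRT5.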
Enriched GOs and pathways
GO and KEGG analyses were used in the present study to provide a preliminary perspective on the altered biological functions and pathways in which the DEGs are enriched. In LUAD vs. normal lung tissue, DEGs were enriched in 806 GO terms and 84 pathways, whereas in LUSC vs. normal lung tissue, DEGs were enriched in 1,266 GO terms and 146 pathways (Table II).
Table II.
Basic information of the GOs and KEGG pathways in which the DEGs and DEMs were enriched.
Comparing LUAD and LUSC, DEGs were enriched in 409 GOs and 47 pathways (Table II), and the links among these GOs and pathways are integrated in Figs. 4 and 5, respectively. The DEGs identified as upregulated in LUAD vs. LUSC were enriched in 124 GOs, such as negative regulation of the toll signaling pathway (GO:0045751) and negative regulation of nuclear factor (NF)-κB activity (GO:0032088), and in 22 pathways, such as peroxisome proliferator-activated receptor (PPAR) signaling pathway (id:03320) and glycolysis/gluconeogenesis (id:00010). The upregulated DEGs in LUSC vs. LUAD were enriched in 285 GOs, such as extracellular matrix organization (GO:0030198) and cell differentiation (GO:0030154), and in 25 pathways, such as cell adhesion molecules (id:04514) and p53 signaling pathway (id:04115).
Figure 4.
GO map of LUAD vs. LUSC. Red circles denote GOs in which the upregulated genes in LUAD are significantly enriched; blue circles indicate GOs in which the upregulated genes in LUSC are significantly enriched; yellow circles indicate genes with upregulated expression in both LUAD and LUSC. Arrows denote the relationships between GOs, where the tail end is the source GO and the arrow end is the target GO. The diameter of each circle represents the number of GOs that interact closely with a GO; larger circles indicate more interactions. P<0.05; FDR<0.05. FDR, false discovery rate; GO, gene ontology; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma.
Figure 5.
KEGG analysis in LUAD vs. LUSC. Red circles denote pathways in which the upregulated genes in LUAD are significantly enriched; blue circles indicate pathways in which the upregulated genes in LUSC are significantly enriched; yellow circles indicate genes with upregulated expression in both LUAD and LUSC. Arrows denote the relationships between pathways, where the tail end is the source pathway and the arrow end is the target pathway. The diameter of each circle represents the number of pathways that interact closely with a pathway; larger circles indicate more interactions. KEGG analysis results: P<0.05; FDR<0.05. FDR, false discovery rate; KEGG, Kyoto Encyclopedia of Genes and Genomes; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma.
The present study constructed TF-miRNA-gene networks of LUAD and LUSC (Figs. 6 and 7), using the large amount of interrelated expression data of miRNAs and genes in the TCGA database, to predict regulatory networks among the TFs, DEMs, and DEGs. As the proposed networks demonstrated, the central TFs and DEMs (those in the middle of the TF-miRNA-gene network) in LUAD and LUSC were quite similar. The top six central TFs were core promoter element-binding protein (CPBP), gut-enriched Krüppel-like factor (GKLF), Churchill, nuclear factor of activated T-cells 1 (NF-AT1), zinc finger protein 333 (ZNF333), and inhibitor of growth protein 4 (ING4); these TFs were the same in LUAD and in LUSC, indicating that common regulatory mechanisms are still shared between these two subtypes of lung cancer. LUAD and LUSC shared 19 DEMs (data not shown), of which miR-486-5p, miR-133a-3p, and miR-196a-5p were centrally positioned in the predicted TF-miRNA-gene regulatory networks of both LUAD and LUSC. However, LUAD and LUSC also had different patterns of DEMs. miR-29b-3p was upregulated and was predicted to regulate the most DEGs in the LUAD network, but not in the LUSC network, whereas miR-1, miR-105-5p, and miR-193b-5p were central only in the LUSC network.
Figure 6.
TF-microRNA-gene network in LUAD. The regulatory network of TFs (triangles), DEMs (squares) and DEGs (circles); the size of each shape represents the number of closely interacting factors, with larger shapes indicating a higher number of interactions. Red represents upregulated expression in LUAD; blue represents downregulated expression; and yellow is uncertain. Lines denote the regulatory links among these factors. DEGs, differentially expressed genes; DEMs, differentially expressed microRNAs; LUAD, lung adenocarcinoma; TF, transcription factor.
Figure 7.
TF-microRNA-gene network in LUSC. The regulatory network of TFs (triangles), DEMs (squares) and DEGs (circles); the size of each shape represents the number of closely interacting factors, with larger shapes indicating a higher number of interactions. Red represents upregulated expression in LUSC; blue represents downregulated expression; and yellow is uncertain. Lines denote the regulatory links among these factors. DEGs, differentially expressed genes; DEMs, differentially expressed microRNAs; LUSC, lung squamous cell carcinoma; TF, transcription factor.
Discussion
The present study analyzed the differences in DEG and DEM expression in LUAD and LUSC compared with normal lung tissue and, most importantly, the differences between LUAD and LUSC. To further elucidate the functions of these DEGs and DEMs, GO and KEGG pathway analyses were performed. In addition, TF-miRNA-gene networks were constructed for LUAD and LUSC; however, further independent validation with experimental data is still required.
Cytology and pathology have traditionally been used in the differential diagnosis of LUAD and LUSC; however, in some cases, such as small biopsy samples or aspiration cytology samples, additional tests of molecular characteristics are required (18). Although several genes have been used as biomarkers in lung cancer diagnosis and differential diagnosis, more accurate and convenient biomarkers are still needed. The present study identified 778 DEGs and 7 DEMs that were differentially expressed between LUAD and LUSC. These DEGs and DEMs may be possible candidates for differential diagnosis between LUAD and LUSC, and several have already been used in clinical practice, such as TTF-1, SOX2, p63, KRT5, and KRT7 (6,19). Although a high fold change is not the only criterion for biomarkers, those exhibiting significant differences in expression, such as TM4SF4, DPCR1, SERPINB12, and AMTN, may be worth further investigation. In our previous study, several of the DEGs, such as melanophilin (MLPH), transmembrane channel-like 5 (TMC5), surfactant associated 3 (SFTA3), desmoglein 3 (DSG3), desmocollin 3 (DSC3), and calmodulin-like 3 (CALML3), were confirmed to be differentially expressed in LUAD and LUSC by immunohistochemical staining (7). miR-205-5p expression levels were previously reported to be significantly higher in LUSC compared with LUAD, both in serum and tissue (20); miR-375 was also demonstrated to be highly expressed in LUAD (21), which was consistent with the current data. The present study offers a list of DEGs and DEMs in which better biomarkers may exist.
Various gene functions and pathways that are greatly altered in LUAD and LUSC have been identified in the present study, suggesting that these GOs and pathways serve primary roles in lung cancer pathogenesis, as do the DEGs that participate in these GOs and pathways. Although LUAD and LUSC share many commonly activated GOs and pathways, they also display their own features. According to the results, genes related to extracellular matrix organization, such as matrix metalloproteinase (MMP) 3, MMP10 and MMP12, were upregulated in LUAD and were even more strongly upregulated in LUSC, compared with normal tissue. MMPs are key factors in the development of the tumor microenvironment and drive cancer progression and metastasis, and have been identified as prognostic factors for poor survival in many types of cancer (22–24). In the present study, the PPAR pathway was demonstrated to be activated in LUAD, but not in LUSC. PPARs have been reported to be associated with breast, ovary, prostate, bladder, gastric and colon adenocarcinoma carcinogenesis, as well as with leukemia (25). Tsubouchi et al (26) proposed that a PPARγ agonist may be a useful therapeutic agent in the treatment of human lung cancer. The p53 signaling pathway was also identified as upregulated in LUSC; a critical role of p53 mutation in malignant transformation, histologic progression, invasion, and metastasis has been previously demonstrated in both in vitro and in vivo models of lung cancer (27–29). Smoking was revealed to be closely related to p53 mutation (4,30), which may explain the prevalence of p53 alterations in LUSC.
miR-29b acts as a tumor suppressor in breast cancer and is a potential marker for recurrence and metastasis (31). miR-29b-3p in peripheral blood mononuclear cells was reported to be a novel target for the diagnosis of NSCLC (32). miR-1 was revealed to be downregulated in various types of cancer, including LUSC, and could act as a tumor suppressor (33). Previous studies have suggested that miR-1 functions through the regulation of oncogenic coronin 1C (34), and that silencing of miR-1 resulted in sensitization of LUSC to traditional chemotherapeutics (35). miR-375 appears to serve many different roles in carcinogenesis, and functions as an oncogene or a tumor suppressor depending on the type of cancer (36). miR-375 was previously revealed to inhibit cell proliferation, invasion and motility in several types of cancer, including NSCLC (37), whereas upregulated miR-375 expression may stimulate cell proliferation in thyroid carcinoma, small-cell lung, breast and prostate cancers (38). Conversely, miR-375 was reported to be downregulated in NSCLC, but the prognostic significance remains unclear (39). Further research into these TFs and miRNAs may lead to novel treatments for NSCLC.
In conclusion, the present study investigated the differences between the gene and miRNA expression patterns in LUAD and LUSC, and explored their different biological characteristics. Further understanding of these differences may promote the discovery and development of new, accurate strategies for the prevention, diagnosis and treatment of lung cancer. Further experiments are required to validate the results of the present bioinformatics analysis.