Exploration of estrogen receptor-associated hub genes and potential molecular mechanisms in non-smoking females with lung adenocarcinoma using integrated bioinformatics analysis.

Hao Wang1, Zhihong Zhang1, Ke Xu1, Song Wei1, Lailing Li1, Lijun Wang2.   

Abstract

The present study aimed to explore important estrogen receptor-associated genes and to determine the potential pathogenic and prognostic factors for lung adenocarcinoma in non-smoking females. The gene expression profiles of the two datasets (GSE32863 and GSE75037) were downloaded from the Gene Expression Omnibus (GEO) database. Data for non-smoking female patients with lung adenocarcinoma from The Cancer Genome Atlas (TCGA) database were also downloaded. The Linear Models for Microarray Data package in R was used to explore the differentially expressed genes (DEGs) between samples from non-smoking female patients with lung adenocarcinoma and samples of adjacent non-cancerous lung tissue. The Database for Annotation, Visualization and Integrated Discovery was used for functional enrichment of the DEGs. The Search Tool for the Retrieval of Interacting Genes/Proteins and Cytoscape software were used to obtain a protein-protein interaction (PPI) network and to identify the hub genes. In addition, the network between the estrogen receptor and the DEGs was constructed. A Kaplan-Meier survival plot was used to analyze the overall survival (OS). In total, 248 DEGs were identified in the GEO database, and 2,362 DEGs were identified in TCGA database. The intersection of the two datasets (DEGs in GEO and TCGA) revealed 170 DEGs, and these were selected for further investigation. Gene Ontology was used to group the 170 DEGs into biological process, molecular function and cellular component categories. Kyoto Encyclopedia of Genes and Genomes pathway analysis was subsequently performed. A total of 27 hub genes, including caveolin 1 (CAV1), matrix metallopeptidase 9 (MMP9), secreted phosphoprotein 1 (SPP1) and collagen type I α 1 chain (COL1A1), were closely associated with the estrogen receptor. CAV1 and SPP1 were associated with the OS. However, MMP9 and COL1A1 did not have any significant effect on OS. 
In summary, the identification of CAV1, MMP9, SPP1 and COL1A1 may provide novel insights into the molecular mechanism of lung adenocarcinoma in non-smoking female patients, and the results obtained in the current study may guide future clinical studies. Copyright: © Wang et al.

Keywords:  bioinformatics; lung adenocarcinoma; non-smoking females

Year:  2019        PMID: 31611968      PMCID: PMC6781748          DOI: 10.3892/ol.2019.10845

Source DB:  PubMed          Journal:  Oncol Lett        ISSN: 1792-1074            Impact factor:   2.967


Introduction

Lung cancer remains the leading cause of cancer-associated mortality worldwide. Small cell lung cancer and non-small cell lung cancer (NSCLC) are the two main types of lung cancer. NSCLC includes adenocarcinoma, squamous cell cancer and large cell lung cancer (1). Chemotherapy, radiotherapy, targeted therapy and immunotherapy are the main treatment strategies for NSCLC. However, NSCLC is often diagnosed at a late stage; consequently, the five-year survival rate is low (2). Therefore, investigating the etiology and prognostic factors of NSCLC is important. Squamous cell lung cancer is strongly associated with smoking. While lung adenocarcinoma is associated with smoking, this type of cancer also occurs in non-smokers (3). Lung adenocarcinoma in non-smokers is strongly associated with the female sex (4). Specific molecules (CXCR2 and PPBP) and pathways (cell adhesion molecules; CAMs) play important roles in the pathogenesis of lung adenocarcinoma in non-smoking female patients (5). The prevalence of lung adenocarcinoma in non-smoking females is higher than that in non-smoking males, suggesting that sex hormones may be involved in tumorigenesis (3). In vitro studies have revealed that estrogen promotes the proliferation of NSCLC cells through estrogen receptor-mediated signaling pathways, whereas anti-estrogens exhibit the opposite effect (6,7). Downregulation of estrogen receptor β (ERβ) inhibits cell growth in lung adenocarcinoma (8). 17β-estradiol upregulates the expression of interleukin-6 through the ERβ signaling pathway and promotes the progression of lung adenocarcinoma (9). Previous studies have demonstrated that EGFR (epidermal growth factor receptor) and HER2 (human epidermal growth factor receptor 2) mutations, and anaplastic lymphoma kinase rearrangements, are more commonly observed in lung cancer in non-smokers compared with that in smokers (10,11).
Tumor protein P53 and breast cancer types 1 and 2 susceptibility protein variants are likely to contribute to the development of lung adenocarcinoma in non-smoking females (12). Osteopontin (OPN), hypoxia inducible factor-1 and several energy metabolism-associated proteins have been associated with estrogen receptor function (13). However, the pathogenesis and prognostic factors of non-smoking female patients with lung adenocarcinoma remain unclear. In the present study, bioinformatics analysis was used to explore estrogen receptor-associated genes that are related to prognosis in non-smoking female patients with lung adenocarcinoma. The results may improve the understanding of the pathogenic and prognostic factors associated with lung adenocarcinoma in non-smoking females.

Materials and methods

Analysis of microarray data and RNA-sequencing data

Microarray data and the corresponding clinical data for non-smoking female patients with lung adenocarcinoma were downloaded from the Gene Expression Omnibus (GEO) (ncbi.nlm.nih.gov/geo/) for the GSE32863 (14) and GSE75037 (15) datasets, each comprising 24 non-smoking female patients and both based on the GPL6884 Illumina Human WG-6 v3.0 expression beadchip platform (Illumina, Inc.). In total, microarray data for 48 non-smoking female patients with lung adenocarcinoma from the GEO database (GSE32863 and GSE75037) and RNA-sequencing data for 160 non-smoking female patients with lung adenocarcinoma from The Cancer Genome Atlas (TCGA) database (portal.gdc.cancer.gov; last updated in July 2017) were downloaded. The SVA package (version 3.32.1; www.bioconductor.org/help/search/index.html?q=sva/) in Bioconductor (version 3.9; www.bioconductor) was used to normalize the gene expression profile data.

Identification of differentially expressed genes (DEGs)

The Linear Models for Microarray Data (LIMMA) package (version 3.1; www.bioconductor.org/help/search/index.html?q=limma) in Bioconductor was used to identify DEGs between samples from non-smoking female patients with lung adenocarcinoma and samples of adjacent non-cancerous lung tissue. Adjusted P-values and fold-change (FC) values were calculated. The DEGs screening criteria were an adjusted P<0.05 and absolute value of Log2FC >2. The DEGs at the intersection of the datasets (DEGs in GEO and TCGA) were selected for subsequent investigation. The pheatmap package (version 1.0.12; http://cran.r-project.org/web/packages/pheatmap/index.html) was used to draw the heat map.
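
The screening and intersection steps described above can be sketched as follows. This is a minimal Python illustration rather than the R/LIMMA workflow used in the study; the function name `select_degs`, the placeholder gene labels and all numeric values are hypothetical, and only the thresholds (adjusted P<0.05 and |log2FC| >2) are taken from the text.

```python
def select_degs(results, adj_p_cutoff=0.05, log2fc_cutoff=2.0):
    """Return (up, down) sets of genes passing the screening criteria.

    `results` maps a gene label to (log2 fold change, adjusted P-value).
    """
    up, down = set(), set()
    for gene, (log2fc, adj_p) in results.items():
        if adj_p < adj_p_cutoff and abs(log2fc) > log2fc_cutoff:
            (up if log2fc > 0 else down).add(gene)
    return up, down


# Hypothetical values for illustration only (not the study's data).
geo = {"GENE_A": (3.1, 1e-6), "GENE_B": (-2.8, 1e-5), "GENE_C": (1.2, 1e-4)}
tcga = {"GENE_A": (2.9, 1e-8), "GENE_B": (-3.0, 1e-9), "GENE_D": (4.0, 0.2)}

geo_up, geo_down = select_degs(geo)
tcga_up, tcga_down = select_degs(tcga)

# DEGs at the intersection of the two sources are kept for further analysis.
common = (geo_up | geo_down) & (tcga_up | tcga_down)
```

In this toy input, GENE_C fails the fold-change criterion and GENE_D fails the adjusted P-value criterion, so only GENE_A and GENE_B remain in `common`.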

Enrichment analyses of DEGs

The Database for Annotation, Visualization and Integrated Discovery (version 6.8; david.ncifcrf.gov) was used to perform functional enrichment analysis of the DEGs in non-smoking female patients with lung adenocarcinoma. Gene Ontology (GO; version 6.8; www.geneontology.org) and Kyoto Encyclopedia of Genes and Genomes (KEGG; version 6.8; www.genome.ad.jp/kegg/) pathway analyses were conducted using the WEB-based GEne SeT AnaLysis Toolkit (www.webgestalt.org). P<0.05 was considered to indicate strong enrichment in the annotation categories.
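
The idea behind functional enrichment can be illustrated with a generic over-representation test. The sketch below computes a one-sided hypergeometric P-value in Python; it is an assumption made for illustration and is not necessarily the exact statistic computed by the tools named above. The function name and all counts except the 170 DEGs are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_background genes, of which n_term carry the annotation term."""
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 500 annotated to a term,
# 170 DEGs (from the text), 15 of which carry the term.
p_value = hypergeom_enrichment_p(20000, 500, 170, 15)
```

A term would be reported as enriched under the stated criterion when the resulting P-value is <0.05.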

Analysis of protein-protein interaction (PPI) networks

The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (version 3.6.0; string-db.org/cgi/input.pl) provides experimental and predicted interactions among proteins. STRING analyses were performed to form a PPI network, with the criterion of a combined score of >0.4. The DEGs, ESR1 (estrogen receptor 1), ESR2 (estrogen receptor 2) and GPER (G protein-coupled estrogen receptor) were used as queries in the STRING database and the resultant PPI network was subsequently visualized using Cytoscape software (version 3.6.0; cytoscape.org). CytoHubba (a plug-in in Cytoscape; version 1.6; http://apps.cytoscape.org/apps/cytohubba) was used to identify the estrogen receptor-related hub genes (the top 20 genes with the most connections in the PPI network).
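
The hub-gene selection can be sketched as a degree ranking over the filtered network. The Python example below is a simplified illustration under stated assumptions: it treats "most connections" as node degree, assumes a deduplicated edge list with combined scores on a 0-1 scale, and uses hypothetical edges and gene labels (apart from ESR1 and ESR2). It does not reproduce CytoHubba itself.

```python
from collections import Counter


def hub_genes(edges, min_score=0.4, top_n=20):
    """Rank nodes by degree after keeping edges with score > min_score."""
    degree = Counter()
    for a, b, score in edges:
        if score > min_score and a != b:
            degree[a] += 1
            degree[b] += 1
    # Sort by degree (descending), then by name so that ties are deterministic.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical interactions for illustration only.
edges = [
    ("ESR1", "GENE_A", 0.9),
    ("ESR1", "GENE_B", 0.7),
    ("ESR2", "GENE_A", 0.5),
    ("GENE_A", "GENE_C", 0.45),
    ("GENE_B", "GENE_C", 0.3),  # dropped: combined score not above 0.4
]
top = hub_genes(edges, top_n=2)
```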

DEGs prognosis analysis

The Kaplan-Meier plotter (kmplot.com) (16) is an online database including clinical and expression data. The median expression level of each gene was used to divide patients into high and low groups. The Kaplan-Meier plotter was used to identify the hub genes with significant effects on prognosis. P<0.05 was considered to indicate a statistically significant difference.
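
The median split and the survival curve estimate can be sketched in plain Python. This is an illustrative implementation of the standard Kaplan-Meier product-limit estimator, not the code behind the online tool; the function names, the rule that values equal to the median fall into the low group, and the example data are assumptions.

```python
from statistics import median


def split_by_median(expression):
    """Split patient IDs into (high, low) groups at the median expression."""
    cutoff = median(expression.values())
    high = [pid for pid, value in expression.items() if value > cutoff]
    low = [pid for pid, value in expression.items() if value <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    curve, survival = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times (months) and event indicators.
high, low = split_by_median({"P1": 1.0, "P2": 2.0, "P3": 3.0, "P4": 4.0})
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
```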

Statistical analysis

Statistical analysis was performed using R (version 3.4.3; www.r-project.org). For the microarray and RNA-sequencing data analysis, the LIMMA package in Bioconductor was used to identify the DEGs. The thresholds for identifying DEGs were a false discovery rate-adjusted P<0.05 and an absolute log2FC >2. The SVA package was used for batch normalization of GSE32863 and GSE75037. The log-rank test was used to compare survival between groups. P<0.05 was considered to indicate a statistically significant difference.
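
For readers unfamiliar with the log-rank test, a two-group version can be sketched as follows. This is a minimal Python illustration of the standard test, not the R code used in the study; the function name and the example data are hypothetical. The P-value uses the chi-square distribution with one degree of freedom, whose upper tail equals erfc(sqrt(x/2)).

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e, _ in pooled if e}):
        n = sum(1 for x, _, _ in pooled if x >= t)  # at risk, both groups
        n_a = sum(1 for x, _, g in pooled if x >= t and g == 0)
        d = sum(1 for x, e, _ in pooled if x == t and e)  # deaths at time t
        d_a = sum(1 for x, e, g in pooled if x == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical groups: early deaths in one group, late deaths in the other.
chi2, p_value = logrank_test([1, 2, 3, 4, 5], [1] * 5,
                             [10, 11, 12, 13, 14], [1] * 5)
```

With the significance level stated above, the two hypothetical groups would be considered to differ when the returned P-value is <0.05.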

Results

Identification of DEGs

According to the screening criteria, 248 DEGs (57 upregulated and 191 downregulated) were identified in the GEO data, and 2,362 DEGs (1,773 upregulated and 589 downregulated) were identified in the TCGA data. The DEGs at the intersection of the two databases were selected for further investigation, revealing 170 DEGs between lung adenocarcinoma and normal lung tissues from non-smoking female patients. All 248 DEGs in the GEO database and the 2,362 DEGs in the TCGA database were visualized using a heat map, and a Venn diagram was used to present the DEGs at the intersection of the two databases (Fig. 1). The top 10 (by fold change) upregulated and downregulated DEGs in the GEO and TCGA databases are presented in Table I.
Figure 1.

DEGs in non-smoking female patients with lung adenocarcinoma. (A) The DEGs in the GSE32863 and GSE75037 datasets, which were downloaded from the GEO database. There were 248 DEGs, comprising 57 upregulated and 191 downregulated genes. (B) The DEGs from the TCGA database. There were 2,362 DEGs, comprising 1,773 upregulated and 589 downregulated genes. Red indicates genes that were upregulated and green indicates downregulated genes. Each column represents a tissue sample; each row represents a gene. (C) Venn diagram presenting the 170 common DEGs between the GEO and TCGA datasets. DEGs, differentially expressed genes; GEO, Gene Expression Omnibus; TCGA, The Cancer Genome Atlas.

Table I.

The top 10 (by fold change) upregulated and downregulated DEGs in the GEO and the TCGA databases.

A, Top 10 upregulated DEGs

GEO: GCNT3, MMP11, CST1, CEACAM5, SPINK1, SPP1, CRABP2, COMP, EEF1A2, ATP10B
TCGA: MAGEA4, PSG3, PSG1, HOXC12, REG4, PSG5, TFF2, FGB, CGB5, PSG4

B, Top 10 downregulated DEGs

GEO: LOC401286, ITLN2, MCEMP1, FABP4, CA4, FAM107A, GKN2, HBB, CLEC3B, SCGB1A1
TCGA: SLC6A4, SLC6A5, FABP4, AGER, UMOD, ITLN2, CYP1A2, FREM3, OR6K3, CHRM1

DEGs, differentially expressed genes; GEO, Gene Expression Omnibus; TCGA, The Cancer Genome Atlas.

DEGs enrichment analysis

The 170 DEGs were grouped into BP (biological process; 56 BP terms were significantly enriched), MF (molecular function; 17 MF terms were significantly enriched) and CC (cellular component; 23 CC terms were significantly enriched) categories. The most enriched GO terms in the BP category were ‘cell adhesion’, ‘receptor-mediated endocytosis’ and ‘angiogenesis’. The most enriched GO terms in the MF category were ‘heparin binding’ and ‘carbohydrate binding’. The most enriched GO terms in the CC category were ‘extracellular region’, ‘proteinaceous extracellular matrix’ and ‘extracellular space’ (Fig. 2).
Figure 2.

Enrichment analyses of differentially expressed genes. The 170 differentially expressed genes were grouped into (A) BP, (B) MF, (C) CC and (D) KEGG categories. BP, biological process; MF, molecular function; CC, cellular component; KEGG, Kyoto Encyclopedia of Genes and Genomes.

KEGG pathway enrichment analysis (8 KEGG terms were significantly enriched) revealed that the majority of the DEGs were enriched in pathways including ‘malaria’ and ‘ECM-receptor interaction’ (Fig. 2).

PPI network analysis

A PPI network was constructed to understand the biological significance of the DEGs. The PPI network consisted of 124 nodes and 266 interactions (Fig. 3). The network between the estrogen receptor (ESR1, ESR2 and GPER) and DEGs was also constructed. In addition, a network of hub genes associated with the estrogen receptor was constructed. The PPI network analysis identified 27 DEGs that were considered as hub genes in the network. Four hub genes were closely associated with the estrogen receptor, namely caveolin 1 (CAV1), matrix metallopeptidase 9 (MMP9), secreted phosphoprotein 1 (SPP1) and collagen type I α 1 chain (COL1A1), as presented in Fig. 4.
Figure 3.

PPI network showing experimentally verified and predicted interaction information among the proteins encoded by the differentially expressed genes. There were 124 nodes and 266 interactions in the PPI network. The key below the network shows which lines indicate experimentally verified interactions and which indicate predicted interactions. PPI, protein-protein interaction.

Figure 4.

The network between DEGs and genes encoding estrogen receptors. Estrogen receptors comprise three subtypes: ESR1, ESR2, and GPER (presented in red). The hub genes (presented in green) make up the small circle. The big circle together with the small circle represents all the DEGs. DEGs, differentially expressed genes; ESR1, estrogen receptor 1; ESR2, estrogen receptor 2; GPER, G protein-coupled estrogen receptor.

Kaplan-Meier survival analysis

Kaplan-Meier curves were used to assess the effect of estrogen receptor-associated hub genes on the overall survival (OS) of 121 non-smoking females with lung adenocarcinoma. Low expression of SPP1 and high expression of CAV1 were associated with improved OS. However, there was no significant association between MMP9 or COL1A1 expression status and OS (Fig. 5).
Figure 5.

Effect on OS of four hub genes associated with the estrogen receptor in the network. The prognostic effects of (A) CAV1, (B) SPP1, (C) COL1A1 and (D) MMP9 were plotted using the Kaplan-Meier plotter. Red lines indicate patients with high expression of the gene and black lines indicate patients with low expression. Low expression of SPP1 and high expression of CAV1 were associated with increased OS, whereas COL1A1 and MMP9 expression were not significantly associated with OS. The 95% confidence interval of the HR is also presented. OS, overall survival; SPP1, secreted phosphoprotein 1; CAV1, caveolin 1; HR, hazard ratio.

Discussion

The pathogenesis and prognostic factors for lung adenocarcinoma in non-smoking females remain controversial (3). Previous studies have suggested that estrogen and its receptor (ER) may serve important roles. In vitro studies have revealed that the ER promotes NSCLC vasculogenic mimicry and cell invasion (17). The ER is also activated in EGFR-tyrosine kinase inhibitor-mediated secondary resistance (18). In addition, a high expression level of ER is a significant prognostic factor for survival in advanced NSCLC (19). There are three types of estrogen receptors: ERα, ERβ and GPER. ERα and ERβ are important nuclear transcription factors located in the cell nucleus. GPER is a G protein-coupled receptor containing seven transmembrane domains located in the cell membrane (19). Few studies have investigated the association between ERs and lung adenocarcinoma in non-smoking female patients (20,21).

In the present study, gene expression profiles from the GSE32863 and GSE75037 datasets and data from the TCGA database were analyzed using bioinformatic methods. A total of 170 DEGs between lung adenocarcinoma and normal lung tissue samples from non-smoking women were common to both databases. Additionally, the GO terms and KEGG pathways associated with these DEGs, which might significantly affect lung adenocarcinoma in non-smoking females, were identified. The PPI network analysis identified 27 DEGs as hub genes in the network. A network consisting of the ERs, DEGs and hub genes was constructed, and it revealed that the hub genes CAV1, SPP1, MMP9 and COL1A1 are significantly associated with ER function.

CAV1 is the main structural component of the caveolae, which form flask-shaped invaginations that are involved in cell signaling and transport (22).
Low expression levels of CAV1 induce a hyper-proliferative state, promoting cell proliferation, angiogenesis and tumor progression in certain tumors, suggesting that loss of CAV1 regulation is an important step in the acquisition of a transformed phenotype (23). Ramírez et al (24) demonstrated that ERα is present in caveolae and is stabilized by CAV1. Interactions between ERα and CAV1 were demonstrated using epitope proximity ligation assays (25). In vitro, the association between ERα and CAV1 increased in tumors that regressed in response to estradiol (26). In addition, CAV1 is associated with prognosis in cancer, such as in breast, colon and ovarian carcinoma (22). In the present study, non-smoking female patients with lung adenocarcinoma with high CAV1 expression had an improved prognosis compared with patients with low CAV1 expression. The interaction between CAV1 and ERα may therefore serve an important role in the pathogenesis and prognosis of lung adenocarcinoma in non-smoking female patients.

SPP1 encodes secreted phosphoprotein 1, also known as OPN. OPN is a highly phosphorylated glycophosphoprotein rich in aspartic acid, which facilitates cell-matrix interactions (27). Previous studies investigating lung cancer have reported that tumor development, progression and metastasis are promoted by increasing the release of OPN (28–30). In addition, OPN expression levels were significantly associated with lung cancer differentiation and the efficacy of platinum-based treatment (31). OPN levels may have significant predictive potential in estimating survival in NSCLC, and high OPN expression levels were significantly associated with poor prognosis in NSCLC compared with low expression (32). The survival analysis of non-smoking female patients with lung adenocarcinoma in the current study supported the aforementioned study in NSCLC, as a low expression level of OPN was associated with improved prognosis compared with high expression.
SPP1 may be a candidate molecular marker associated with the pathogenesis and prognosis of lung adenocarcinoma in non-smoking female patients.

MMP9 regulates various cellular behaviors associated with cancer cell differentiation, migration, invasion and immune system surveillance (33). Suppression of ESR2 in breast cancer cells may affect the expression of MMP9 through microRNA-145 (34). In lung adenocarcinoma, downregulation of ESR2 inhibits cell growth through decreased expression of MMP9 (8). MMP9 may therefore be implicated in lung adenocarcinoma in non-smoking women.

COL1A1 is dysregulated in a variety of tumors, including breast and gastric cancer (35). COL1A1 gene expression is inhibited by halofuginone, resulting in inhibition of the proliferation of bladder carcinoma cells (36). Thus, COL1A1 may serve as a potential therapeutic target for lung adenocarcinoma in non-smoking female patients.

The present study had a number of limitations. The expression levels of the DEGs, their functions, the hub genes and the association between the ERs and DEGs require experimental validation. A further limitation was the lack of tissue samples and clinical data collected from newly diagnosed non-smoking female patients with lung adenocarcinoma; such samples and data should be collected in future studies.

In conclusion, the present study used bioinformatics analysis to explore the pathogenesis of lung adenocarcinoma in non-smoking female patients and to identify prognostic biomarkers for this disease. Additionally, the genetic and molecular effects of estrogen in non-smoking female patients with lung adenocarcinoma were investigated. The results obtained in the present study provide novel insights into the molecular mechanisms of lung adenocarcinoma in non-smoking female patients.

References (36 in total; 10 listed)

1.  Interaction of estrogen receptor β5 and interleukin 6 receptor in the progression of non-small cell lung cancer.

Authors:  Hexiao Tang; Yuquan Bai; Lecai Xiong; Li Zhang; Yanhong Wei; Minglin Zhu; Xiaoling Wu; Ding Long; Junhui Yang; Li Yu; Shufang Xu; Jinping Zhao
Journal:  J Cell Biochem       Date:  2018-09-14       Impact factor: 4.429

2.  Expression of Estrogen Receptor-α and Survival in Advanced-stage Non-small Cell Lung Cancer.

Authors:  Marius Lund-Iversen; Helge Scott; Erik H Strøm; Noah Theiss; Odd Terje Brustugun; Bjørn H Grønberg
Journal:  Anticancer Res       Date:  2018-04       Impact factor: 2.480

3.  Current status of research and treatment for non-small cell lung cancer in never-smoking females (Review).

Authors:  Shin Saito; Fernando Espinoza-Mercado; Hui Liu; Naohiro Sata; Xiaojiang Cui; Harmik J Soukiasian
Journal:  Cancer Biol Ther       Date:  2017-05-11       Impact factor: 4.742

4.  Estrogen receptor beta as epigenetic mediator of miR-10b and miR-145 in mammary cancer.

Authors:  Zoi Piperigkou; Marco Franchi; Martin Götte; Nikos K Karamanos
Journal:  Matrix Biol       Date:  2017-08-08       Impact factor: 11.583

5.  Expression of estrogen receptors alpha and beta in human lung tissue and cell lines.

Authors:  Steen Mollerup; Kjersti Jørgensen; Gisle Berge; Aage Haugen
Journal:  Lung Cancer       Date:  2002-08       Impact factor: 5.705

6.  Estrogen Receptor Gene Polymorphisms and Lung Adenocarcinoma Risk in Never-Smoking Women.

Authors:  Kuan-Yu Chen; Chin-Fu Hsiao; Gee-Chen Chang; Ying-Huang Tsai; Wu-Chou Su; Yuh-Min Chen; Ming-Shyan Huang; Fang-Yu Tsai; Shih-Sheng Jiang; I-Shou Chang; Chih-Yi Chen; Chao A Hsiung; Chien-Jen Chen; Pan-Chyr Yang
Journal:  J Thorac Oncol       Date:  2015-10       Impact factor: 15.609

7.  Adjuvant Therapy in Patients With Completely Resected Non-small-cell Lung Cancer: Current Status and Perspectives.

Authors:  Robert Pirker; Martin Filipits
Journal:  Clin Lung Cancer       Date:  2018-09-24       Impact factor: 4.785

8.  17β-estradiol upregulates IL6 expression through the ERβ pathway to promote lung adenocarcinoma progression.

Authors:  Quanfu Huang; Zheng Zhang; Yongde Liao; Changyu Liu; Sheng Fan; Xiao Wei; Bo Ai; Jing Xiong
Journal:  J Exp Clin Cancer Res       Date:  2018-07-03

9.  Extranuclear ERα is associated with regression of T47D PKCα-overexpressing, tamoxifen-resistant breast cancer.

Authors:  Bethany Perez White; Mary Ellen Molloy; Huiping Zhao; Yiyun Zhang; Debra A Tonetti
Journal:  Mol Cancer       Date:  2013-05-01       Impact factor: 27.401

10.  Identification of Key Genes and Pathways in Female Lung Cancer Patients Who Never Smoked by a Bioinformatics Analysis.

Authors:  Ke Shi; Na Li; Meilan Yang; Wei Li
Journal:  J Cancer       Date:  2019-01-01       Impact factor: 4.207
