Jie Li1, Ben Wang2, Xin Li3, Yuxi Zhu1. 1. Department of Oncology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China (mainland). 2. Department of Orthopedics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China (mainland). 3. Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China (mainland).
Abstract
BACKGROUND In recent years, the morbidity and mortality rates of lung adenocarcinoma in non-smoking females have been increasing dramatically. Although much research has been done with some progress, the molecular mechanism remains unclear. In this study we aimed to identify hub genes and infiltrating immune cells in non-smoking females with lung adenocarcinoma. MATERIAL AND METHODS First, we obtained differentially expressed genes (DEGs) by GEO2R analysis of 3 independent mRNA microarray datasets: GSE10072, GSE31547, and GSE32863. The DAVID database was used for functional enrichment analysis of the DEGs. We then identified hub genes with prognostic value using STRING, Cytoscape, and Kaplan Meier plotter. Subsequently, these genes were further analyzed with Gene Expression Profiling Interactive Analysis, Oncomine, Tumor Immune Estimation Resource, and Human Protein Atlas. Finally, immune infiltration analysis was performed with CIBERSORT and The Cancer Genome Atlas using R packages. RESULTS We found 315 DEGs enriched in extracellular matrix organization, cell adhesion, integrin binding, angiogenesis, and hypoxic response. Among these DEGs, we identified 10 hub genes (SPP1, ENG, ATF3, TOP2A, COL1A1, PAICS, CAV1, CAT, TGFBR2, and ANGPT1) of significant prognostic value. We also illustrated the distribution and differential expression of 22 immune cell subtypes, and dendritic cells resting and macrophages M1 were identified as having prognostic significance. CONCLUSIONS The results indicated that 10 hub genes and 2 immune cell subtypes might be promising biomarkers for lung adenocarcinoma in non-smoking females. This finding needs to be further evaluated.
Lung cancer has become the leading cause of cancer-related deaths worldwide, and adenocarcinoma is the most common histologic type of lung cancer [1]. Previously, smoking was thought to be the major cause of lung adenocarcinoma (LUAD). However, the morbidity of LUAD has increased in never-smokers, especially in females [2]. Studies have shown that lung cancer in non-smokers should be considered a separate subtype [3]. Epidemiological, pathological, and molecular evidence suggests that, besides smoking, estrogen participates in lung carcinogenesis [4,5]. A study in South Korea found that the morbidity of smoking-related lung cancer differed by gender and histological subtype. Compared with males, females were more likely to develop non-smoking-related LUAD, and gender was also an independent prognostic factor [6,7]. One study reported that females benefited significantly more from immunotherapy for lung cancer than males [8]. Additionally, some studies have indicated that anti-estrogen treatment could reduce non-small cell lung cancer (NSCLC) cell proliferation [9]. Therefore, more attention should be paid to the treatment and prognostic evaluation of LUAD in non-smoking females [10].
Although the pathogenesis of LUAD in non-smoking females remains unclear, the application of bioinformatics analysis in precision medicine might help identify key biomarkers in the big data era [11]. Data mining has played a vital part in cancer diagnosis and management [12]. Consequently, we explored the potential molecular mechanism of LUAD in non-smoking females by bioinformatics. We identified the differentially expressed genes (DEGs) between LUAD and normal samples of non-smoking females by data mining. CIBERSORT and The Cancer Genome Atlas (TCGA) were used for immune infiltration analysis. Finally, we found 10 hub genes and 2 immune cell subtypes as promising biomarkers for LUAD in non-smoking females, which provides useful information for further exploration.
Material and Methods
Microarray data
We downloaded qualified datasets from the Gene Expression Omnibus (GEO) database. Datasets that met the following criteria were included: a) they contained LUAD tissue samples and normal lung tissue samples of non-smoking females; b) they included at least 10 samples. GSE10072, GSE31547, and GSE32863 qualified for further analysis. GSE10072 contained 13 LUAD tissue samples and 11 normal lung tissue samples of non-smoking females. GSE31547 contained 6 LUAD tissue samples and 5 normal lung tissue samples of non-smoking females. GSE32863 contained 23 LUAD tissue samples and 23 normal lung tissue samples of non-smoking females.
Identification of DEGs
GEO2R, a web application built on Bioconductor R packages [13], compares 2 or more groups of samples in a GEO series to identify DEGs. It has been widely applied in bioinformatics analyses [14-16], and it provides the underlying R script so that researchers can replicate their analyses. We used GEO2R to screen DEGs between LUAD tissue samples and normal tissue samples of non-smoking females. |log FC| >1 and P<0.01 were set as the cutoff criteria. Moreover, we replicated this analysis with the R script to ensure the reliability of the present study.
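The cutoff and the overlap across datasets can be sketched in a few lines. This is a minimal illustration, not the GEO2R script used in the study: the function names, the input layout (gene mapped to a log fold change and a P-value), and the toy values are all assumptions made for the example. Requiring the same direction of regulation in every dataset mirrors the consistency reported in the Results.

```python
from typing import Dict, Tuple

# Hypothetical per-dataset result table: gene -> (log fold change, P-value).
Result = Dict[str, Tuple[float, float]]


def select_degs(result: Result, lfc_cutoff: float = 1.0,
                p_cutoff: float = 0.01) -> Dict[str, str]:
    """Return gene -> 'up'/'down' for genes with |logFC| > cutoff and P < cutoff."""
    degs = {}
    for gene, (log_fc, p_value) in result.items():
        if abs(log_fc) > lfc_cutoff and p_value < p_cutoff:
            degs[gene] = "up" if log_fc > 0 else "down"
    return degs


def overlapping_degs(*datasets: Result) -> Dict[str, str]:
    """Keep genes that pass the cutoff in every dataset with the same direction."""
    per_dataset = [select_degs(d) for d in datasets]
    shared = set(per_dataset[0])
    for degs in per_dataset[1:]:
        shared &= set(degs)
    first = per_dataset[0]
    return {g: first[g] for g in sorted(shared)
            if all(d[g] == first[g] for d in per_dataset)}


# Toy values invented for illustration only (not taken from the GEO datasets).
ds_a = {"SPP1": (2.3, 1e-5), "CAV1": (-2.8, 1e-6), "GENE_X": (0.4, 0.2)}
ds_b = {"SPP1": (1.9, 3e-4), "CAV1": (-2.1, 2e-3), "GENE_X": (1.5, 0.001)}
ds_c = {"SPP1": (2.7, 1e-7), "CAV1": (-3.0, 1e-8), "GENE_X": (-1.2, 0.005)}

print(overlapping_degs(ds_a, ds_b, ds_c))
```

In this toy input, GENE_X is dropped because it fails the cutoff in the first dataset, so only SPP1 (up) and CAV1 (down) remain.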
Functional enrichment analyses of DEGs
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of the DEGs were performed with the DAVID database (version 6.8) [17]. P<0.05 was considered statistically significant.
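The idea behind an enrichment P-value can be illustrated with a plain one-sided hypergeometric test. This is a simplified sketch: DAVID reports a modified Fisher exact P-value (the EASE score), which is more conservative than the test below, and the background size and annotation counts used here are hypothetical numbers chosen only for the example.

```python
from math import comb


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): chance of seeing at least k annotated genes in a list of n genes
    drawn from a background of N genes, K of which carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical numbers: 315 DEGs, 24 of them in a term that annotates
# 300 of 20000 background genes.
p = hypergeom_pvalue(k=24, n=315, K=300, N=20000)
print(f"enrichment P-value: {p:.3e}")
```

A term would then be called enriched if this P-value falls below the chosen threshold (P<0.05 in this study).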
PPI network and module analysis
STRING (version 10.0) provides quality-controlled predictions of protein-protein association networks. We used the STRING database to construct a protein-protein interaction (PPI) network for the DEGs, with a combined score >0.4 as the cutoff. Cytoscape (version 3.6.1) [18] was used to visualize the molecular interaction network. The Cytoscape plugin Molecular Complex Detection (MCODE) (version 1.5.1) was used to identify the most important module in the PPI network, with the following settings: degree cutoff=2, k-core=5, max depth=100, and node score cutoff=0.2. Subsequently, functional enrichment analysis of the genes in this module was performed with the online bioinformatics database Metascape [19].
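Two of the steps above, filtering interactions by combined score and pruning to a k-core, can be sketched as follows. This is not the MCODE algorithm, which also weights nodes by local density and grows complexes from seed nodes; it only illustrates what the score cutoff and a k-core constraint do to a network. The edge format and the toy proteins are assumptions for the example, with scores expressed on a 0 to 1 scale.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str, float]  # (protein A, protein B, combined score in [0, 1])


def build_network(edges: Iterable[Edge], min_score: float = 0.4) -> Dict[str, Set[str]]:
    """Keep interactions with combined score > min_score as an undirected graph."""
    adj: Dict[str, Set[str]] = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def k_core(adj: Dict[str, Set[str]], k: int) -> Dict[str, Set[str]]:
    """Repeatedly remove nodes with fewer than k neighbours until none remain."""
    core = {node: set(nbrs) for node, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        for node in [n for n, nbrs in core.items() if len(nbrs) < k]:
            for nbr in core.pop(node):
                core[nbr].discard(node)
            changed = True
    return core


# Toy interactions invented for illustration.
toy_edges = [
    ("A", "B", 0.90), ("B", "C", 0.70), ("A", "C", 0.55),  # a triangle
    ("A", "D", 0.80),                                      # a pendant node
    ("A", "E", 0.30),                                      # below the score cutoff
]
network = build_network(toy_edges)
print(sorted(k_core(network, k=2)))
```

With k=2 the pendant node D is pruned and the triangle A, B, C survives; E never enters the network because its only interaction scores below 0.4.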
Hub genes selection and analysis
The Cytoscape plugin cytoHubba was used to calculate the degree of each gene in the PPI network. DEGs with degrees >10 were selected as hub genes. The Kaplan Meier plotter [20] is an online platform for estimating the prognostic value of thousands of genes in several cancer types based on data from GEO, Genomic Expression Archive, and the TCGA database. We performed overall survival (OS) analysis of the hub genes in non-smoking females with LUAD using the Kaplan Meier plotter. Gene Expression Profiling Interactive Analysis (GEPIA) [21] provides differential expression analysis based on the Genotype-Tissue Expression and TCGA databases. We visualized the differential expression of the most significant hub genes in LUAD with GEPIA. Finally, further analyses were performed on SPP1, the hub gene with the highest degree found by cytoHubba. Tumor Immune Estimation Resource (TIMER) [22] was used to assess the expression profile of SPP1 in various human tumors based on the TCGA database. A meta-analysis of SPP1 expression in LUAD compared with normal tissues across different datasets was performed with the Oncomine database [23]. SPP1 protein expression in LUAD tissues and normal tissues was analyzed using the Human Protein Atlas [24].
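Degree-based hub selection amounts to counting each gene's distinct interaction partners and keeping genes above the threshold. The sketch below assumes a simple list of undirected edges; the function names and the toy network are hypothetical and stand in for the degree ranking that cytoHubba provides.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def node_degrees(edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct interaction partners for every gene in an undirected network."""
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {gene: len(nbrs) for gene, nbrs in neighbours.items()}


def select_hubs(degrees: Dict[str, int], min_degree: int = 10) -> List[Tuple[str, int]]:
    """Return (gene, degree) pairs with degree > min_degree, highest degree first."""
    hubs = [(gene, deg) for gene, deg in degrees.items() if deg > min_degree]
    return sorted(hubs, key=lambda item: (-item[1], item[0]))


# Toy network invented for illustration: HUB_A has 12 partners, HUB_B has 10.
toy_edges = ([("HUB_A", f"P{i}") for i in range(12)]
             + [("HUB_B", f"Q{i}") for i in range(10)])
hubs = select_hubs(node_degrees(toy_edges))
print(hubs)
```

Because the criterion is strictly greater than 10, HUB_B (degree exactly 10) is excluded and only HUB_A is returned.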
Distribution and prognostic analysis of infiltrating immune cells in non-smoking female LUAD
First, we downloaded the transcriptome profiling data and clinical data of female LUAD patients from the TCGA database. Among them, 34 normal female lung tissue samples and 47 non-smoking female LUAD tissue samples were included in this study (because only 5 normal lung tissue samples from non-smoking females were available in TCGA, we included all 34 normal female lung tissue samples as the control group). The raw data were converted with Practical Extraction and Report Language (Perl) into a format compatible with CIBERSORT [25]. Moreover, we randomized the converted data with the limma package (version 3.8). After deleting samples with P>0.05, 32 normal samples and 42 tumor samples were left. We then predicted the distribution of 22 infiltrating immune cell types in these samples with CIBERSORT. Finally, the vioplot and survival packages were used to illustrate the distribution and prognostic value of the 22 infiltrating immune cell types in non-smoking female LUAD.
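The sample filtering and the survival comparison described above can be sketched without any external packages. Several points here are assumptions for illustration and are not stated in the text: that the P>0.05 filter refers to a per-sample deconvolution P-value, that samples are split into low and high groups at the median cell fraction, and the sample layout itself. The Kaplan-Meier estimator is the standard product-limit formula; the study used the R survival package, and a log-rank test between the two groups is omitted here.

```python
from statistics import median
from typing import List, Tuple

# Hypothetical sample layout: (P-value, cell fraction, follow-up time, died).
Sample = Tuple[float, float, float, bool]
Observation = Tuple[float, bool]  # (follow-up time, died)


def kaplan_meier(observations: List[Observation]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival probability) at each death time."""
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for t in sorted({time for time, _ in observations}):
        deaths = sum(1 for time, died in observations if time == t and died)
        leaving = sum(1 for time, _ in observations if time == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored samples both leave the risk set
    return curve


def split_by_median(samples: List[Sample], p_cutoff: float = 0.05):
    """Drop samples with P > p_cutoff, then split at the median cell fraction."""
    kept = [s for s in samples if s[0] <= p_cutoff]
    cut = median(s[1] for s in kept)
    low = [(t, died) for _, frac, t, died in kept if frac <= cut]
    high = [(t, died) for _, frac, t, died in kept if frac > cut]
    return low, high


# Toy samples invented for illustration; the last one fails the P-value filter.
samples = [
    (0.01, 0.10, 5, True), (0.02, 0.20, 8, False),
    (0.03, 0.30, 12, True), (0.04, 0.40, 20, False),
    (0.30, 0.50, 3, True),
]
low, high = split_by_median(samples)
print("low fraction group:", kaplan_meier(low))
print("high fraction group:", kaplan_meier(high))
```

Plotting the two curves and testing their difference would correspond to the prognostic analysis performed for each of the 22 cell types.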
Results
The detailed sample information of the included datasets is presented in Supplementary Table 1. We identified 315 overlapping DEGs among the 3 datasets (Figure 1), consisting of 254 downregulated DEGs and 61 upregulated DEGs (Supplementary Table 2). Notably, the direction of regulation of these 315 DEGs was consistent across all 3 datasets.
Table 1
GO and KEGG pathway enrichment analysis of 315 DEGs.
| Term | Description | Gene count | P-value |
|---|---|---|---|
| GO-CC: 0005615 | Extracellular space | 64 | 1.21E-13 |
| GO-CC: 0070062 | Extracellular exosome | 96 | 8.15E-12 |
| GO-CC: 0005578 | Proteinaceous extracellular matrix | 24 | 2.17E-10 |
| GO-CC: 0005576 | Extracellular region | 60 | 1.26E-08 |
| GO-CC: 0005887 | Integral component of plasma membrane | 55 | 1.52E-08 |
| GO-BP: 0030198 | Extracellular matrix organization | 24 | 1.10E-12 |
| GO-BP: 0007155 | Cell adhesion | 34 | 1.36E-11 |
| GO-BP: 0016337 | Single organismal cell-cell adhesion | 13 | 2.73E-07 |
| GO-BP: 0050900 | Leukocyte migration | 14 | 3.14E-07 |
| GO-BP: 0001666 | Response to hypoxia | 16 | 5.41E-07 |
| GO-MF: 0008201 | Heparin binding | 17 | 1.31E-08 |
| GO-MF: 0005539 | Glycosaminoglycan binding | 7 | 3.44E-07 |
| GO-MF: 0005515 | Protein binding | 189 | 1.11E-06 |
| GO-MF: 0005178 | Integrin binding | 12 | 1.60E-06 |
| GO-MF: 0050431 | Transforming growth factor beta binding | 6 | 4.97E-06 |
| KEGG_pathway | | | |
| hsa05144 | Malaria | 8 | 1.37E-04 |
| hsa04610 | Complement and coagulation cascades | 8 | 1.16E-03 |
| hsa04514 | Cell adhesion molecules (CAMs) | 11 | 1.91E-03 |
| hsa04530 | Tight junction | 8 | 4.40E-03 |
| hsa04512 | ECM-receptor interaction | 8 | 4.40E-03 |
GO – Gene Ontology; KEGG – Kyoto Encyclopedia of Genes and Genomes; DEGs – differentially expressed genes; BP – biological processes; CC – cell component; MF – molecular function; ECM – extracellular matrix.
Figure 1
Venn diagram for overlapping DEGs in 3 microarray datasets. |log FC| >1 and P<0.01 was set as the cutoff criterion. There were 315 overlapped DEGs among 3 datasets (GSE10072, GSE31547, GSE32863) identified. DEGs – differentially expressed genes.
Table 2
Functional roles of 10 hub genes with prognostic significance.
| No. | Gene symbol | Regulation | Degree | Full name | Function |
|---|---|---|---|---|---|
| 1 | SPP1 | Up | 21 | Secreted Phosphoprotein 1 | SPP1 participates in a range of biological functions, including cell proliferation, adhesion, invasion, migration, and tumor angiogenesis |
| 2 | ENG | Down | 18 | Endoglin | ENG activated the TGF-β/ALK1 signaling pathway and promoted endothelial cell proliferation and migration in cancers |
| 3 | CAV1 | Down | 18 | Caveolin 1 | CAV1 transforms suppressor activity abnormal expressed in T cell leukemia in lung carcinoma and in breast carcinoma |
| 4 | ATF3 | Down | 17 | Activating Transcription Factor 3 | Over-expression of ATF3 upregulated p53 and inhibited the tumorigenesis of lung cancer |
| 5 | TOP2A | Up | 17 | DNA Topoisomerase II Alpha | TOP2A is involved in the process of chromosome condensation, chromatid separation, DNA transcription and replication. It is the target of several anti-cancer drugs |
| 6 | COL1A1 | Up | 16 | Collagen Type I Alpha 1 | COL1A1 encodes the pro-alpha1 chains of type I collagen. It is related to hypoxia and is significantly overexpressed in non-small cell lung cancer |
| 7 | CAT | Down | 15 | Catalase | CAT serves to protect cells from the toxic effects of hydrogen peroxide, and it changes the migration and invasion ability of lung cancer cell |
| 8 | TGFBR2 | Down | 15 | Transforming Growth Factor Beta Receptor 2 | TGFBR2 often alters during adenoma-carcinoma progression of some cancers |
| 9 | ANGPT1 | Down | 14 | Angiopoietin 1 | ANGPT1, a member of the angiopoietin family, plays an important role in vascular development and angiogenesis |
| 10 | PAICS | Up | 11 | Phosphoribosyl Aminoimidazole Carboxylase | PAICS is identified as an oncogene of various tumor types |
The complete results of the GO and KEGG enrichment analyses for the 315 DEGs are presented in Supplementary Table 3, and the top 5 GO and KEGG terms are listed in Table 1.
The PPI network of the 315 DEGs was constructed (Figure 2), including 315 nodes and 708 edges. The most significant module is illustrated in Figure 3A. The DEGs in this module were mostly enriched in the ERK1 and ERK2 cascade, vasculature development, cellular response to hormone stimulus, and the adrenomedullin receptor signaling pathway (Figure 3B).
Figure 2
PPI network of 315 DEGs construction using STRING and Cytoscape. This network includes 315 nodes and 708 edges. Nodes stand for the DEGs and edges stand for the association of DEGs. Red nodes represent upregulated DEGs, while blue nodes represent downregulated DEGs. PPI – protein–protein interactions; DEGs – differentially expressed genes.
Figure 3
Analysis of the most significant module. (A) Identification of the most significant module from the PPI network using the MCODE plugin of Cytoscape. Red nodes represent upregulated DEGs, while blue nodes represent downregulated DEGs. (B) Functional enrichment analysis of the most significant module performed by Metascape. PPI – protein–protein interactions; MCODE – Molecular Complex Detection; DEGs – differentially expressed genes.
Hub gene screening and analysis
In total, we identified 36 DEGs with degrees >10 as hub genes (Supplementary Table 4). Of these, 10 hub genes (Table 2) had prognostic value (Figure 4). Our results indicated that over-expression of SPP1, ENG, ATF3, TOP2A, COL1A1, and PAICS was related to worse OS for non-smoking females with LUAD (P<0.05). On the other hand, under-expression of CAV1, CAT, TGFBR2, and ANGPT1 was associated with poorer OS for non-smoking females with LUAD (P<0.05). As illustrated in Figure 5, based on GEPIA, the expression of SPP1, TOP2A, COL1A1, and PAICS was increased in LUAD tissues compared with normal tissues, while the expression of ENG, ATF3, CAV1, CAT, TGFBR2, and ANGPT1 was decreased. These results were consistent with the differential expression analysis based on the GEO database, which indirectly supported the reliability of the GEO analysis. Among these 10 most significant hub genes, SPP1 had the highest degree (21), suggesting its potential significance. The TIMER results indicated that SPP1 was overexpressed compared with normal tissues in several cancers, including LUAD, breast invasive carcinoma (BRCA), colon adenocarcinoma (COAD), liver hepatocellular carcinoma (LIHC), stomach adenocarcinoma (STAD), and uterine corpus endometrial carcinoma (UCEC) (Figure 6A). A meta-analysis based on Oncomine datasets revealed that SPP1 was over-expressed in LUAD compared with normal tissues (Figure 6B). As shown in Figure 6C, SPP1 protein expression was higher in LUAD tissue than in normal tissue.
Figure 4
Overall survival analysis of 10 most significant hub genes in non-smoking females with LUAD based on Kaplan Meier plotter platform using the data of GEO, GEA, and TCGA databases, including SPP1 (A), ENG (B), ATF3 (C), TOP2A (D), COL1A1 (E), PAICS (F), CAV1 (G), CAT (H), TGFBR2 (I), ANGPT1 (J). LUAD – lung adenocarcinoma; GEO – Gene Expression Omnibus; GEA – Genomic Expression Archive; TCGA – The Cancer Genome Atlas.
Figure 5
The differential expression analysis of 10 most significant hub genes in LUAD based on GEPIA using the data of TCGA and GTEx databases, including SPP1 (A), ENG (B), ATF3 (C), TOP2A (D), COL1A1 (E), PAICS (F), CAV1 (G), CAT (H), TGFBR2 (I), ANGPT1 (J). LUAD – lung adenocarcinoma; GEPIA – Gene Expression Profiling Interactive Analysis; TCGA – The Cancer Genome Atlas; GTEx – Genotype-Tissue Expression.
Figure 6
The upregulation of SPP1 was validated in different databases. (A) The expression profiling of SPP1 in various tumor types performed by TIMER according to the TCGA database. The red columns stand for tumor samples and the blue columns stand for normal samples. *** Represents that the P<0.001, and ** represents that the P<0.01. (B) A meta-analysis of SPP1 across 5 analyses in the Oncomine database showed that SPP1 was upregulated in LUAD. (C) Immunohistochemistry results of SPP1 protein expression in LUAD tissue and normal tissue from HPA database. TIMER – Tumor Immune Estimation Resource; TCGA – The Cancer Genome Atlas; HPA – Human Protein Atlas; LUAD – lung adenocarcinoma.
The detailed clinical information of the included samples is shown in Supplementary Table 5. The distribution of the 22 types of infiltrating immune cells in non-smoking female LUAD indicated that T cells CD4 memory resting, macrophages M2, and macrophages M0 accounted for the largest proportions (Figure 7A). The proportions of several cell types differed between tumor tissues and normal tissues (P<0.05). Plasma cells, T cells regulatory, macrophages M1, and dendritic cells resting were more abundant in tumor tissue than in normal tissue. In contrast, T cells CD4 memory resting, natural killer (NK) cells resting, monocytes, macrophages M0, mast cells resting, and neutrophils were less abundant in tumor tissue than in normal tissue. In addition, among these 22 infiltrating immune cell types, only dendritic cells resting and macrophages M1 had statistically significant prognostic value (P<0.05). As shown in Figure 7B and 7C, a lower level of dendritic cells resting indicated a poor prognosis, while a lower level of macrophages M1 suggested a better prognosis.
Figure 7
Distribution and prognostic analysis of infiltrating immune cells in non-smoking females with LUAD based on TCGA database and CIBERSORT. (A) Distribution landscape of infiltrating immune cell in non-smoking females with LUAD. The overall survival analysis of dendritic cells resting (B) and macrophages M1 (C). LUAD – lung adenocarcinoma; TCGA – The Cancer Genome Atlas.
Discussion
In the present study, we found 10 prognostic hub genes and 2 types of significant infiltrating immune cells in LUAD in non-smoking females, which were verified with multiple databases. The biological functions and signaling pathways enriched in the DEGs might participate in the tumorigenesis and development of LUAD in non-smoking females. Notably, this work was repeated 3 times by 3 individual researchers to ensure the reliability of the results.
Among the 10 most significant hub genes with prognostic value, SPP1 had the highest degree, suggesting its potential significance in non-smoking females with LUAD. SPP1, also known as osteopontin, has been reported to be upregulated in some tumors, such as colorectal cancer [26], cervical cancer [27], and breast cancer [28], which was consistent with our results (Figure 6). Both experimental and clinical analyses revealed that high expression of SPP1 predicted a poor prognosis. Immunohistochemical analysis of 318 NSCLC tumor samples indicated that SPP1 was significantly over-expressed in NSCLC tissues compared with normal tissues [29]. In clinical investigations, elevated plasma SPP1 levels were found in early-stage and relapsed NSCLC patients, suggesting the potential diagnostic and prognostic value of SPP1 [30]. However, the mechanism of SPP1 in non-smoking female LUAD is poorly understood. As illustrated in Table 1 and Supplementary Table 3, DEGs in our study were enriched in integrin binding, extracellular matrix (ECM) organization, angiogenesis, the phosphoinositide 3-kinase (PI3K)-Akt signaling pathway, and ECM-receptor interaction; this finding might help us understand the mechanism of LUAD in non-smoking females. It has been reported that SPP1 interacts with various integrins and CD44 to participate in a range of biological processes [31]. This interaction contributes to cell proliferation via the PI3K/Akt signaling pathway [32].
Also, SPP1-induced cell motility and ECM invasion are essential for tumor metastasis [33,34]. Moreover, vascular endothelial growth factor could induce tumor angiogenesis by promoting endothelial cell migration and capillary formation via SPP1, in a domino-like cascade [35]. Studies have shown that siSPP1 increased the sensitivity of afatinib-resistant lung cancer cells to afatinib [36], suggesting that SPP1 might also be involved in tyrosine kinase inhibitor (TKI) resistance mechanisms. Additionally, SPP1 mediated tumor immunity, including polarization of tumor-associated macrophages (TAMs), upregulation of PD-L1, and promotion of the immune escape of LUAD cells [37]. As a result, we speculated that SPP1 might be involved in the progression of LUAD via these signaling pathways and biological functions. Interestingly, correlations between SPP1 and both gender and smoking have also been observed. An SPP1 polymorphism was found to be related to a higher risk of gastric precancerous lesions in males [38]. However, another study indicated that estrogen may upregulate the expression of SPP1 [39]. Among lung cancer patients, non-smokers exhibited lower expression levels of SPP1 [40]. In addition, tobacco extract could induce the expression of SPP1 in vitro [41]. Because little is known about the role of SPP1 in the development of LUAD in non-smoking females, further investigations are required to confirm the function of SPP1.
To further explore the pathogenesis of LUAD in non-smoking females, we screened the most important module (Figure 3A) from the PPI network. The DEGs in this module were mostly enriched in vasculature development, the transforming growth factor (TGF)-beta signaling pathway, and cellular response to hormone stimulus (Figure 3B), which could be involved in the oncogenesis and progression of LUAD in non-smoking females. Among the hub genes, ENG, ATF3, and ANGPT1 were downregulated and were all included in the most important module.
ANGPT1, a member of the angiopoietin family, plays a vital role in vascular development and angiogenesis [42]. ANGPT1 was identified as a tumor suppressor gene related to female lung cancer in a sex-specific SNP-SNP interaction analysis based on the same dataset (GSE10072) [43], which is in line with our results. Moreover, tumor metastasis increased significantly in ANGPT1 knockout mice compared with the control group, which suggested that ANGPT1 might be a prognostic marker [44]. However, the prognostic value of ENG and ATF3 in lung cancer remains controversial. Over-expression of ENG was found to activate the TGF-β/ALK1 signaling pathway and promote endothelial cell proliferation and migration in one study [45], while another study showed that tumor size and vascular density were decreased in ENG-haploinsufficient mice with lung cancer [46]. It was also reported that over-expression of ATF3 significantly upregulated p53 and inhibited the tumorigenesis of lung cancer [47]. On the other hand, immunohistochemical staining results suggested that lung cancer cell proliferation was markedly inhibited by ATF3 knockdown in vitro [48]. Therefore, the function of ENG and ATF3 in tumorigenesis is complicated, which might be related to specific regulatory signals and variable tumor microenvironments. The specific molecular mechanisms deserve further exploration, and smoking and gender could be considered as regulators.
Our results indicated that over-expression of COL1A1, PAICS, and TOP2A, and under-expression of CAT, CAV1, and TGFBR2 were associated with poor prognosis (Figure 4). Using the same dataset (GSE32863), Qiong Wu et al. identified COL1A1 as a prognostic biomarker in NSCLC [49], which supports the significance of COL1A1 in lung cancer and the reliability of the present study. In addition, COL1A1 was demonstrated to promote the migration of colorectal cancer cells in Transwell assays in vitro [50].
The proliferation of breast cancer cells and colon cancer cells was clearly inhibited by knockdown of PAICS and TOP2A, respectively [51,52]. It was reported that low expression of CAT, which is involved in defense against oxidative stress, enhanced the invasion of lung cancer cells [53]. Moreover, CAV1 was shown to inhibit LUAD cell proliferation [54]. Stephen et al. found that TGFBR2 deletion in mouse airway epithelia increased migration and invasion, and led to poor survival of NSCLC patients [55]. Correlations between these genes and smoking have also been reported. Compared with smokers, the expression levels of CAV1 [56] and TOP2A [57] were reported to be lower in non-smokers, while the expression levels of TGFBR2 [58] and CAT [59] were reported to be higher in non-smokers. Taken together, although these hub genes all play essential parts in the development and progression of LUAD, the specific mechanisms remain unclear, and further experimental analyses are still needed.
Furthermore, the immune microenvironment of LUAD in non-smoking females might also contribute to tumorigenesis. A model of ovarian cancer indicated that increased immune infiltrates, including dendritic cells and macrophages, contributed to tumor progression [60], which supported our findings (Figure 7A). However, Figure 7B illustrated that a lower level of dendritic cells resting was associated with poor prognosis. In relapsed colorectal cancer patients, fewer tumor-infiltrating dendritic cells were detected [61]. On the other hand, increased tumor-infiltrating dendritic cells were observed in a mouse model of ovarian cancer as the tumor progressed [62]. This suggested that the amount, subtypes, and functions of dendritic cells change during tumorigenesis and development [63], and gender and smoking might need to be considered as influencing factors to understand their complex functions.
Accumulating evidence has revealed the essential role of TAMs, and the M1 phenotype of TAMs was identified as a tumor-suppressing factor in LUAD [64]. Based on TCGA and CIBERSORT, our results suggested that a lower level of macrophages M1 was associated with better prognosis, which is contrary to previous reports and suggests that female sex and smoking status might be independent factors. It has been reported that the ratio of macrophages M1 increased from 26% to 84% with smoking severity [65]. Regrettably, the impact of gender and smoking on macrophages M1 in LUAD has not been assessed in previous studies, and further research is urgently needed.
Conclusions
In this study, the mechanisms of LUAD in non-smoking females were explored by bioinformatics methods, and promising biomarkers and possible signaling pathways were identified and validated using multiple databases. Ten hub genes and 2 immune cell subtypes were found to have prognostic significance: SPP1, ENG, ATF3, TOP2A, COL1A1, PAICS, CAV1, CAT, TGFBR2, ANGPT1, dendritic cells resting, and macrophages M1. However, because of the relatively small sample size of the online data available in this field, the specific mechanisms of these hub genes and infiltrating immune cells remain unclear. Therefore, further research on the mechanism of LUAD in non-smoking females is necessary.