Identification of Susceptible Genes for Chronic Obstructive Pulmonary Disease with Lung Adenocarcinoma by Weighted Gene Co-Expression Network Analysis.

Ping Li1, Youyu Wang2, Xiaoli Wang3, Lin Liu4, Lei Chen1.   

Abstract

PURPOSE: Chronic obstructive pulmonary disease (COPD) and lung adenocarcinoma (LUAD) are common disorders that often co-exist. However, the genetic mechanisms linking COPD and LUAD are rarely reported. This study aims to identify susceptible genes of COPD with LUAD.
METHODS: Using the published data of GSE106899, co-expression modules were constructed by weighted gene co-expression network analysis (WGCNA). Subsequently, the top 50 genes in the most tumor-related module were identified, among which hub genes were selected and validated.
RESULTS: Twenty co-expression modules were constructed on 13,865 genes from 62 lung tissues of COPD patients with or without LUAD, of which one module (blue) was most related to tumorigenesis. Functional enrichment analyses showed that the genes in the blue module were mainly enriched in cell cycle, DNA transcription/replication and cancer pathways, etc. When intramodular connectivity was combined with the protein-protein interaction network, MTA1, PKMYT1 and FZR1 had the highest connectivity and were regarded as the hub genes. However, only FZR1 was validated to be overexpressed in lung tissues of COPD with LUAD and in cigarette smoke extract-stimulated A549 cells, a human LUAD cell line.
CONCLUSION: This study suggests overexpression of FZR1 may play a key role in the tumorigenesis of LUAD in patients with COPD.
© 2021 Li et al.

Keywords:  COPD; WGCNA; chronic obstructive pulmonary disease; lung adenocarcinoma; weighted gene co-expression network analysis

Year:  2021        PMID: 34113128      PMCID: PMC8187107          DOI: 10.2147/OTT.S303544

Source DB:  PubMed          Journal:  Onco Targets Ther        ISSN: 1178-6930            Impact factor:   4.147


Plain Language Summary

What is Already Known on the Subject?

COPD and LUAD are common disorders that often co-exist, and genetic mechanisms potentially play important roles in their co-existence.

What is the Current Concern?

The genetic mechanisms underlying the co-existence of COPD and LUAD have not been fully elucidated, and no genetic biomarkers for screening COPD patients at high risk of LUAD tumorigenesis have been investigated.

What is Added to This Topic?

Three susceptible genes for COPD and LUAD (MTA1, PKMYT1 and FZR1) were identified by WGCNA. However, only FZR1, and not MTA1 or PKMYT1, was validated to be overexpressed in lung tissues from COPD with LUAD and in cigarette smoke extract-treated A549 cells.

Introduction

Chronic obstructive pulmonary disease (COPD) and lung cancer are common causes of death worldwide.1 Previous studies reported that patients with COPD were more likely to develop lung cancer than those with normal lung function,2–5 indicating that COPD is an independent risk factor for lung cancer,3,6–8 which may result in the co-existence of COPD and lung cancer. Genetic susceptibility is one of the well-described mechanisms for this co-existence.9–11 However, there are limited reports on genetic mechanisms between COPD and lung adenocarcinoma (LUAD), the most common pathologic type of lung cancer.12 Weighted gene co-expression network analysis (WGCNA) is an efficient systems biology method for analyzing the relationship between genes and diseases.13 WGCNA clusters genes with similar expression profiles into modules, which makes it possible to analyze the relationship between modules and specific phenotypes and to further explore the signaling pathways related to phenotypic features.14 Therefore, in this study, we constructed a co-expression network by WGCNA using available databases to identify susceptible genes in COPD with LUAD. Lung tissues from COPD patients with LUAD and a human LUAD cell line (A549) were used to validate the findings.

Materials and Methods

Microarray Data and Processing

Raw RNA sequencing data of GSE106899 were downloaded from Gene Expression Omnibus (GEO, ), including 66 lung tissues from COPD patients with or without LUAD. Raw data (raw reads) in FASTQ format were first processed through in-house Perl scripts. Next, clean data (clean reads) were obtained by removing reads containing adapters, reads containing poly-N and low-quality reads. All downstream analyses were based on high-quality clean reads. Clean reads were aligned to the reference genome () using HISAT2 version 2.1.0. Then, HTSeq version 0.6.1 was used to count the reads mapped to each gene (Gene annotation file: ). Fragments per kilobase per million (FPKM) is commonly used to estimate gene expression levels, and the FPKM of each gene was calculated from the length of the gene and the read count mapped to it. Subsequently, four samples (GSM2857291, GSM2857323, GSM2857324, GSM2857325) were removed during FPKM evaluation, since their gene expression was found to be discrete in the subsequent analyses. Afterwards, we selected the genes with an FPKM value greater than 1 in the remaining 62 samples for construction of the co-expression network.
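The FPKM calculation and the FPKM > 1 filter can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. The function names `fpkm` and `keep_expressed` are hypothetical, the total mapped reads per sample are approximated by the column sums of the count matrix, and the filter is read as "FPKM greater than 1 in every sample", which is an assumption because the text does not state it explicitly. The counts and gene lengths are toy values.

```python
import numpy as np

def fpkm(counts, gene_lengths_bp):
    """FPKM from a genes x samples count matrix and gene lengths in bp.

    FPKM = count / (gene length in kb) / (mapped fragments in millions).
    Mapped fragments per sample are approximated by the column sums
    (assumption made for this sketch).
    """
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float)[:, None] / 1e3
    mapped_millions = counts.sum(axis=0, keepdims=True) / 1e6
    return counts / lengths_kb / mapped_millions

def keep_expressed(fpkm_matrix, threshold=1.0):
    """Boolean mask of genes whose FPKM exceeds the threshold in every sample."""
    return (np.asarray(fpkm_matrix) > threshold).all(axis=1)

# Toy data: 3 genes x 2 samples (values are illustrative only).
counts = np.array([[500, 1000],
                   [1500, 1000],
                   [0, 2000]])
gene_lengths_bp = np.array([1000, 2000, 4000])

values = fpkm(counts, gene_lengths_bp)
mask = keep_expressed(values, threshold=1.0)
```

In this toy matrix the third gene has zero counts in the first sample, so it is dropped by the filter while the other two genes are kept.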

Weighted Gene Co-Expression Network Analysis (WGCNA)

To find clusters (modules) of highly correlated genes, WGCNA was carried out on the selected 13,865 genes in R (version 3.5.3) with the WGCNA package (version 1.67).13,15 The scale-free co-expression network was established using the WGCNA algorithm.13 Firstly, after removing the outlier samples, the correlation matrix between gene pairs was constructed using the average linkage method and Pearson’s correlation. Secondly, hierarchical clustering analysis was performed with the hclust R function, and the soft-threshold power was determined by analysis of network topology. The adjacency matrix was transformed into a topological overlap matrix.16 Then, network construction and module detection were performed. Modules smaller than minModuleSize (=30) were merged into one module. The calculation was performed with the blockwiseConsensusModules function in the WGCNA package. According to the gene expression level, the flashClust toolkit in R was used for cluster analysis of samples.
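The two matrix transformations named above (soft-thresholded adjacency and topological overlap) can be sketched in Python with NumPy. This is an illustration of the standard unsigned formulas, not a reimplementation of the WGCNA package. Whether the study used a signed or unsigned network is not stated, so the unsigned form is an assumption, and the function names and random toy data are hypothetical.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=4):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta.

    expr is a samples x genes matrix; the diagonal is set to zero.
    """
    cor = np.corrcoef(expr, rowvar=False)
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)
    return adj

def topological_overlap(adj):
    """TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij),

    where l_ij = sum_u a_iu * a_uj and k_i = sum_u a_iu.
    """
    shared = adj @ adj                 # l_ij: strength of shared neighbours
    k = adj.sum(axis=1)                # connectivity of each gene
    k_min = np.minimum.outer(k, k)
    tom = (shared + adj) / (k_min + 1.0 - adj)
    np.fill_diagonal(tom, 1.0)
    return tom

# Toy data: 62 samples x 6 genes of random expression values.
rng = np.random.default_rng(0)
expr = rng.normal(size=(62, 6))

adj = soft_threshold_adjacency(expr, beta=4)
tom = topological_overlap(adj)
dissimilarity = 1.0 - tom              # typical input to hierarchical clustering
```

The dissimilarity `1 - TOM` is what is usually passed to hierarchical clustering before modules are cut from the dendrogram.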

Correlation Analysis of Modules and Traits

To assess the potential correlation between modules and traits, the module eigengene (ME) was used. The ME is the first principal component of a gene module in principal component analysis, so that the expression patterns of all genes within a given module are summarized into a single characteristic expression profile. We evaluated the correlation between MEs and tumor progression by Pearson’s correlation test to identify the tumor-related module. Then, the correlation between MEs and individual genes was assessed using signed module membership (MM). The module most highly related to the phenotype was selected as the tumor-related module for further analyses. Tissue types were coded as follows: “1” for normal tissue in COPD, “2” for adjacent non-malignant tissue in COPD, and “3” for lung tumor in COPD. All work was done in R with the WGCNA package.13
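The relationship between module eigengene, trait correlation and module membership can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the WGCNA implementation: the function names are hypothetical, the expression values are simulated, and only the correlation coefficient is computed (the P-value of Pearson's test is omitted). The trait uses the 1/2/3 tissue coding described above.

```python
import numpy as np

def module_eigengene(expr_module):
    """First principal component of a samples x genes module matrix.

    Genes are standardized first; the sign is oriented so that the
    eigengene correlates positively with the average expression.
    """
    x = np.asarray(expr_module, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    me = u[:, 0]
    if np.corrcoef(me, z.mean(axis=1))[0, 1] < 0:
        me = -me
    return me

def pearson(a, b):
    """Pearson correlation coefficient between two vectors."""
    return float(np.corrcoef(a, b)[0, 1])

# Simulated module: 12 samples x 5 genes that follow the tissue coding.
rng = np.random.default_rng(1)
trait = np.repeat([1, 2, 3], 4).astype(float)
loadings = np.array([1.0, 0.8, 1.2, 0.9, 1.1])
expr_module = trait[:, None] * loadings + rng.normal(scale=0.3, size=(12, 5))

me = module_eigengene(expr_module)
module_trait_cor = pearson(me, trait)
# Signed module membership: correlation of each gene with the eigengene.
mm = np.array([pearson(expr_module[:, j], me) for j in range(expr_module.shape[1])])
```

Because the simulated genes all rise with the tissue code, the eigengene tracks the trait closely and every gene has a high positive module membership.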

Functional Enrichment Analyses

To investigate the potential mechanisms by which module genes regulate tumor progression, GO and KEGG enrichment analyses were performed by uploading all correlated genes in the tumor-related module to the Database for Annotation, Visualization and Integrated Discovery (DAVID, version 6.8) and the KEGG orthology-based annotation system (KOBAS, version 3.0), respectively.17 An adjusted P-value < 0.05 was set as the threshold.

Screening for Hub Genes

To identify the hub genes, we selected the module most relevant to the trait, and a protein–protein interaction (PPI) network was then built for this module using the online database STRING.18 A protein interaction relationship network table was downloaded and visualized using Cytoscape software.19 The intramodular connectivity of the top 50 genes and their connectivity in the PPI network were calculated according to the visualization results, and the genes were then sorted by total intramodular connectivity. The top three genes were regarded as “real” hub genes for further analyses.

Immunohistochemistry

Immunohistochemical analysis for “real” hub genes was performed on paraffin sections of resected lung tissues from patients with COPD and matched patients with COPD and LUAD (Table 1). As described previously, the immunostained areas for hub genes were treated as areas of interest (AOI) and measured independently, followed by quantitative analysis with Image-Pro Plus 6.0 (Media Cybernetics Inc., Bethesda, MD, USA).20 The immunohistochemical study was approved by the Ethics Committee of Sichuan Provincial People’s Hospital, and all patients provided written informed consent for participation in the study.
Table 1

Clinical Data of Subjects

Characteristic    COPD        COPD with LUAD
Number            5           4
Gender (male)     5           4
Age               62.5±6.3    66.2±5.3*
Pack-year         27.7±9.7    28.9±10.5*
FEV1/FVC%         62.6±6.6    60.8±8.4
FEV1 % pred       69.4±7.9    66.9±3.3
TNM stage         -           T1-2N0M0

Note: *p>0.05 vs COPD group.

Cell Culture and Real-Time PCR

Human LUAD cell line (A549), purchased from the American Type Culture Collection (ATCC, Manassas, VA, USA), was cultured in RPMI 1640 medium supplemented with 10% fetal bovine serum and 1% antibiotics (penicillin and streptomycin sulfate) in 5% CO2 at 37 °C. Cigarette smoke extract (CSE) was prepared according to a previous article.21 Briefly, smoke was generated from commercial cigarettes containing 1.0 mg nicotine and 14 mg tar per cigarette. Each cigarette yielded five draws with a 50 mL syringe, at almost 10 seconds per draw, and the smoke was dissolved in 5 mL of serum-free RPMI 1640 culture medium, adjusted to pH 7.4 and finally sterilized with a 0.22-μm syringe filter. The prepared CSE was defined as 100% concentration and used in this study at dilutions between 2% and 16%. Cell viability was determined with a Cell Counting Kit-8 (CCK-8, Dojindo Laboratories, Tokyo, Japan) according to the manufacturer’s protocol. Briefly, A549 cells were seeded into 96-well plates at a density of 5000 per well and cultured in serum-containing RPMI 1640 medium for 24 h, followed by treatment with different concentrations of CSE (0, 2%, 4%, 8%, 10%, 12% and 16%) for 24, 48 or 72 hours. Thereafter, 10 μL of CCK-8 solution was added to each well for another 1-hour incubation, and the absorbance at 450 nm was measured. Total RNA was isolated from A549 cells using the Total RNA Kit I (Omega Bio-Tek, USA), and cDNA was then synthesized from total RNA using the PrimeScript RT reagent Kit (TaKaRa), according to the manufacturer’s protocol. Next, the PCR reaction was performed in triplicate with FastStart Essential DNA Green Master (Roche), using the specific primers (). Amplification was conducted on the LightCycler® 96 PCR system (Roche Molecular Systems, USA).
Briefly, preincubation (95°C for 10 minutes, 1 cycle) was performed first, followed by a 3-step amplification of 95°C for 10 seconds, Tm for 15 seconds and 72°C for 15 seconds for 40 cycles, and a final melting step of 95°C for 10 seconds, 65°C for 60 seconds and 97°C for 1 second for 1 cycle. All data were normalized to GAPDH gene expression, and relative expression levels were determined using the 2^-ΔΔCt method.
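The relative quantification step can be illustrated with a short Python sketch of the standard 2^-ΔΔCt arithmetic, using GAPDH as the reference gene as in the text. The function name and all Ct values below are hypothetical and serve only to show the calculation. They are not data from this study.

```python
import math
from statistics import mean

def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^-ddCt method from mean Ct values.

    dCt  = Ct(target) - Ct(reference), computed per condition
    ddCt = dCt(treated) - dCt(control)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)

# Hypothetical triplicate Ct values (illustration only).
target_cse = mean([24.1, 23.9, 24.0])       # target gene, CSE-treated cells
gapdh_cse = mean([18.0, 18.1, 17.9])        # GAPDH, CSE-treated cells
target_control = mean([25.0, 25.1, 24.9])   # target gene, control cells
gapdh_control = mean([18.0, 17.9, 18.1])    # GAPDH, control cells

fold_change = relative_expression(target_cse, gapdh_cse,
                                  target_control, gapdh_control)
```

With these toy values ΔΔCt is about -1, so the target gene is roughly twice as abundant in the treated cells as in the controls.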

Statistical Analyses

All data were presented as mean ± SD and analyzed using SPSS 21 for Windows. Comparisons between two groups were performed with the t-test. Data that did not follow a normal distribution were compared using the Kruskal–Wallis test. P<0.05 was considered statistically significant.

Results

Construction of Weighted Gene Co-Expression Network

The analysis process of this study is shown in Figure 1. A total of 13,865 genes in 62 samples were considered as candidate genes and were used for the construction of weighted gene co-expression networks with the WGCNA package in R. Cluster analysis was performed on these samples using the flashClust toolkit, and the results are shown in Figure 2. The soft-threshold power β is an important parameter for constructing co-expression modules, which mainly affects the scale independence and average connectivity of modules. As shown in Figure 3, when β=4, the scale independence reached 0.9 (Figure 3A) and the mean connectivity was high (Figure 3B). Therefore, β=4 was used to construct the co-expression modules, and WGCNA identified a total of 20 modules, labeled with different colors (Figure 3C). The gray module housed genes that were not co-expressed with other genes; these genes could not be assigned to any other module and were ignored in the following analyses. The number of genes in each module was shown in .
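The two quantities used to choose β (scale independence and mean connectivity) can be approximated with the following Python sketch. It is a simplified illustration and not the WGCNA routine. The scale-free fit is taken as the R² of a straight-line fit of log10 p(k) against log10 k over 10 equal-width connectivity bins, the network is assumed to be unsigned, and the function name and random toy expression matrix are hypothetical. Random data are not expected to reach a fit of 0.9.

```python
import numpy as np

def scale_free_fit(connectivity, n_bins=10):
    """Slope and R^2 of log10 p(k) versus log10 k over binned connectivity."""
    k = np.asarray(connectivity, dtype=float)
    counts, edges = np.histogram(k, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0                          # empty bins have no log
    log_k = np.log10(centers[keep])
    log_p = np.log10(counts[keep] / counts.sum())
    slope, intercept = np.polyfit(log_k, log_p, 1)
    residual = log_p - (slope * log_k + intercept)
    r2 = 1.0 - residual.var() / log_p.var()
    return float(slope), float(r2)

# Toy data: 62 samples x 200 genes of random expression values.
rng = np.random.default_rng(0)
expr = rng.normal(size=(62, 200))
abs_cor = np.abs(np.corrcoef(expr, rowvar=False))
np.fill_diagonal(abs_cor, 0.0)

fits = {}
for beta in (1, 2, 4, 6, 8):
    k = (abs_cor ** beta).sum(axis=1)          # connectivity at this power
    slope, r2 = scale_free_fit(k)
    fits[beta] = (slope, r2, float(k.mean()))  # (slope, fit, mean connectivity)
```

Raising β always lowers the mean connectivity, which is why the choice is a trade-off between a good scale-free fit and a network that remains sufficiently connected.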
Figure 1

The analysis process of this study.

Figure 2

Sample clustering to detect outliers.

Figure 3

Weighted gene co-expression network analysis (WGCNA). (A) Analysis of scale independence of co-expressed module genes under different soft-threshold powers. (B) Analysis of average connectivity of co-expression module genes under different soft-threshold powers. (C) Construction of genes coexpression modules by WGCNA. (D) Heatmap of module–trait relationships between module eigengenes and clinical traits. (E) Scatterplots of gene significance (GS) for COPD profile traits and module membership (MM) in the blue module.

Correlation Between Co-Expression Modules and Traits

WGCNA was used to correlate each module with all clinical traits. The blue module (3475 genes) had the highest positive correlation with the clinical phenotype (Figure 3D). A P-value was then calculated for each module-trait correlation. Consequently, the blue module (cor=0.4, P=0.001), which was most related to tumor progression in COPD with LUAD, was selected as the target module. Gene significance (GS), defined as the correlation between gene expression and COPD traits, was plotted against MM, defined as the correlation between the ME and the gene expression profile (Figure 3E).

Functional Enrichment Analyses on the Blue Module

GO and KEGG enrichment analyses were performed on the blue module, which was most strongly correlated with tumor progression in COPD with LUAD. GO enrichment analysis showed that the genes in the blue module were significantly enriched in multiple biological processes related to cell division, regulation of transcription, DNA replication, etc. (Figure 4A and ). Moreover, KEGG enrichment analysis indicated that the genes mainly participated in pathways involving cancer, the cell cycle, the AMPK signaling pathway, etc. (Figure 4B and ).
Figure 4

(A) GO biological process analysis and (B) KEGG analysis of genes in the blue module.

Visualization for the Hub Genes

In addition, we selected the top 50 genes in the blue module, which strongly correlated with tumor progression in COPD with LUAD, and imported them into Cytoscape for visualization (Figure 5A). The PPI network in the tumor-related module was also visualized with STRING (Figure 5B). According to the total intramodular connectivity in gene–gene interactions and the PPI network in the blue module, the top 3 genes (MTA1, PKMYT1 and FZR1) were selected as hub genes (Table 2 and ) for further analyses.
Figure 5

Visualization for (A) gene–gene interactions and (B) PPI network of the top 50 genes in the blue module.

Table 2

Hub Genes Selected by Total Number of Intramodular Connectivity

Gene      Hub Genes in Blue Module     Hub Genes in PPI Network     Total Number
          Intramodular Connectivity    Intramodular Connectivity
MTA1      48                           6                            54
PKMYT1    49                           4                            53
FZR1      49                           4                            53

Abbreviations: MTA1, metastasis associated 1; PKMYT1, protein kinase, membrane associated tyrosine/threonine 1; FZR1, fizzy and cell division cycle 20 related 1; PPI, protein–protein interaction.
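The ranking in Table 2 is a simple sum of the two connectivity counts. The short Python sketch below reproduces that arithmetic from the values reported in Table 2. The variable names are hypothetical, and the study itself derived these numbers from Cytoscape and STRING rather than from code like this.

```python
# Connectivity values as reported in Table 2.
module_connectivity = {"MTA1": 48, "PKMYT1": 49, "FZR1": 49}
ppi_connectivity = {"MTA1": 6, "PKMYT1": 4, "FZR1": 4}

# Total connectivity = connectivity in the blue module + connectivity in the PPI network.
total_connectivity = {
    gene: module_connectivity[gene] + ppi_connectivity[gene]
    for gene in module_connectivity
}

# Sort genes from highest to lowest total connectivity.
ranked = sorted(total_connectivity.items(), key=lambda item: item[1], reverse=True)
```

MTA1 ranks first with a total of 54, while PKMYT1 and FZR1 tie at 53.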

Expression of MTA1, PKMYT1 and FZR1 in Lung Tissues

Lung tissues from patients with COPD and from patients with COPD and LUAD were immunostained with MTA1, PKMYT1 and FZR1 antibodies (Figure 6A). Immunohistochemical analysis revealed that only FZR1 was significantly overexpressed in lung tissues of COPD with LUAD compared with COPD (Figure 6B).
Figure 6

(A) The immunostaining of MTA1, PKMYT1 and FZR1 in lung tissues of COPD patients with or without lung adenocarcinoma. ×600 magnification. (B) The mean density of the immunohistochemical images of MTA1, PKMYT1 and FZR1. *P<0.05 vs COPD group. (C) The expression of MTA1, PKMYT1 and FZR1 gene in CSE-exposed A549 cells. *P<0.05 vs control group.

Expression of MTA1, PKMYT1 and FZR1 Genes in A549 Cells After CSE Stimulation

A549 cells were exposed to CSE at different concentrations (0–16%) and incubated for up to 72h. Cell growth was assessed by CCK8 assay (). According to the cell growth assessment, we treated A549 cells with 6%, 8% and 10% CSE for 48h. Real-time PCR showed that only FZR1 gene expression was increased relative to controls, in a dose-dependent manner, after 48h of CSE exposure (Figure 6C).

Discussion

In this study, we used RNA sequencing data of 62 lung tissue samples from COPD patients with or without LUAD to construct co-expression modules by WGCNA. As a result, one co-expression module (blue) was identified as most strongly associated with tumor progression in COPD with LUAD. Functional enrichment analyses further indicated that the genes in the blue module were mainly enriched in biological processes and pathways related to tumorigenesis. According to the visualization of intramodular connectivity, three genes (MTA1, PKMYT1 and FZR1) in the blue module were finally selected as hub genes. Overexpression of MTA1 and PKMYT1 has been reported in a variety of carcinomas, including lung cancer (adenocarcinoma), where it contributes to tumorigenesis, invasion, metastasis, recurrence and poor prognosis,22–25 mainly via regulation of p53 stability and function, and of the G2/M transition of the cell cycle, respectively.26,27 However, the overexpression of MTA1 and PKMYT1 was not validated in the present study. In contrast, FZR1 plays a crucial but controversial role in tumorigenesis by regulating cell proliferation via targeting multiple cell cycle regulators for dependent degradation.28,29 Some studies suggested that genomic stability was associated with FZR1,30,31 and that FZR1 might be a tumor suppressor gene in tumor development.30,32 However, others reported that FZR1 was overexpressed in malignant tumors,33,34 and might contribute to over-replication of the genome and lead to genomic instability,29 which may promote tumor development in specific tumors.33 In this study, the overexpression of the FZR1 gene was validated for the first time in lung tissues from COPD with LUAD and in A549 cells exposed to CSE, a common risk factor for COPD and lung cancer. Notably, FZR1 expression was significantly increased in both airways and alveoli in COPD patients with LUAD. These data indicate that FZR1 might play a key role in the tumorigenesis of LUAD in patients with COPD.
Overall, among the three hub genes (MTA1, PKMYT1 and FZR1) identified in the present study, only FZR1, but not MTA1 or PKMYT1, was validated to be associated with tumor progression in COPD with LUAD. FZR1 may therefore provide a novel genetic target for investigating the co-existence mechanisms of COPD and LUAD, and a new genetic biomarker for screening COPD patients at high risk of LUAD tumorigenesis.
  34 in total

1.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

2.  Fast R Functions for Robust Correlations and Hierarchical Clustering.

Authors:  Peter Langfelder; Steve Horvath
Journal:  J Stat Softw       Date:  2012-03       Impact factor: 6.440

Review 3.  Role of MTA1 in cancer progression and metastasis.

Authors:  Nirmalya Sen; Bin Gui; Rakesh Kumar
Journal:  Cancer Metastasis Rev       Date:  2014-12       Impact factor: 9.264

4.  Nonperiodic activity of the human anaphase-promoting complex-Cdh1 ubiquitin ligase results in continuous DNA synthesis uncoupled from mitosis.

Authors:  C S Sorensen; C Lukas; E R Kramer; J M Peters; J Bartek; J Lukas
Journal:  Mol Cell Biol       Date:  2000-10       Impact factor: 4.272

5.  The APC/C E3 Ligase Complex Activator FZR1 Restricts BRAF Oncogenic Function.

Authors:  Lixin Wan; Ming Chen; Juxiang Cao; Xiangpeng Dai; Qing Yin; Jinfang Zhang; Su-Jung Song; Ying Lu; Jing Liu; Hiroyuki Inuzuka; Jesse M Katon; Kelsey Berry; Jacqueline Fung; Christopher Ng; Pengda Liu; Min Sup Song; Lian Xue; Roderick T Bronson; Marc W Kirschner; Rutao Cui; Pier Paolo Pandolfi; Wenyi Wei
Journal:  Cancer Discov       Date:  2017-02-07       Impact factor: 39.397

6.  Static lung hyperinflation is an independent risk factor for lung cancer in patients with chronic obstructive pulmonary disease.

Authors:  Ester Zamarrón; Eva Prats; Elena Tejero; Paloma Pardo; Raúl Galera; Raquel Casitas; Elisabet Martínez-Cerón; Delia Romera; Ana Jaureguizar; Francisco García-Río
Journal:  Lung Cancer       Date:  2018-12-14       Impact factor: 5.705

7.  Features of COPD as Predictors of Lung Cancer.

Authors:  Laurie L Carr; Sean Jacobson; David A Lynch; Marilyn G Foreman; Eric L Flenaugh; Craig P Hersh; Frank C Sciurba; David O Wilson; Jessica C Sieren; Patrick Mulhall; Victor Kim; C Matthew Kinsey; Russell P Bowler
Journal:  Chest       Date:  2018-02-13       Impact factor: 9.410

8.  Genomic stability and tumour suppression by the APC/C cofactor Cdh1.

Authors:  Irene García-Higuera; Eusebio Manchado; Pierre Dubus; Marta Cañamero; Juan Méndez; Sergio Moreno; Marcos Malumbres
Journal:  Nat Cell Biol       Date:  2008-06-15       Impact factor: 28.824

9.  An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

Authors:  Juan A Botía; Jana Vandrovcova; Paola Forabosco; Sebastian Guelfi; Karishma D'Sa; John Hardy; Cathryn M Lewis; Mina Ryten; Michael E Weale
Journal:  BMC Syst Biol       Date:  2017-04-12

Review 10.  The relationship between COPD and lung cancer.

Authors:  A L Durham; I M Adcock
Journal:  Lung Cancer       Date:  2015-08-29       Impact factor: 5.705
