
Identification of novel candidate genes involved in the progression of emphysema by bioinformatic methods.

Wei-Ping Hu1, Ying-Ying Zeng1, Yi-Hui Zuo1, Jing Zhang1.   

Abstract

PURPOSE: By reanalyzing the gene expression profile GSE76925 in the Gene Expression Omnibus database using bioinformatic methods, we attempted to identify novel candidate genes promoting the development of emphysema in patients with COPD. PATIENTS AND METHODS: According to the quantitative CT data in GSE76925, patients were divided into a mild emphysema group (%LAA-950<20%, n=12) and a severe emphysema group (%LAA-950>50%, n=11). Differentially expressed genes (DEGs) were identified using Agilent GeneSpring GX v11.5 (corrected P-value <0.05 and |Fold Change|>1.3). Known driver genes of COPD were acquired by mining the literature and searching databases. A direct protein-protein interaction network (PPi) of DEGs and known driver genes was constructed with STRING to screen for DEGs directly interacting with driver genes. In addition, we used STRING to obtain the first-layer proteins interacting with the DEGs' products and constructed an indirect PPi of these interacting proteins. By merging the indirect PPi with the driver genes' PPi using Cytoscape v3.6.1, we attempted to discover potential pathways promoting the development of emphysema.
RESULTS: All the patients had COPD with severe airflow limitation (age=62±8, FEV1%=28±12). A total of 57 DEGs (including 12 pseudogenes) and 135 known driver genes were identified. The direct PPi suggested that GPR65, GNB4, P2RY13, NPSR1, BCR, BAG4, and IMPDH2 were potential pathogenic genes. GPR65 could regulate the response of immune cells to the acidic microenvironment, and NPSR1's expression on eosinophils was associated with asthma severity and IgE level. The merged indirect PPi demonstrated that the interacting network of TP53, IL8, CCR2, HSPA1A, ELANE, and PIK3CA was associated with the development of emphysema. IL8, ELANE, and PIK3CA are involved in the pathological mechanisms of emphysema, which in turn supports a role for TP53 in emphysema.
CONCLUSION: Candidate genes such as GPR65, NPSR1, and TP53 may be involved in the progression of emphysema.

Keywords:  candidate genes; chronic obstructive pulmonary disease; differentially expressed genes; emphysema; protein-protein interaction network analysis

Year:  2018        PMID: 30532529      PMCID: PMC6241693          DOI: 10.2147/COPD.S183100

Source DB:  PubMed          Journal:  Int J Chron Obstruct Pulmon Dis        ISSN: 1176-9106


Introduction

COPD, characterized by persistent respiratory symptoms and airflow limitation, is the third leading cause of mortality worldwide.1 Airflow limitation is mainly due to small airway obstruction and emphysema, which have distinct pathophysiological mechanisms.2,3 Most patients with COPD have pathological alterations of both emphysema and small airway obstruction, while some have only one or no obvious change.4 Therefore, the two pathological phenotypes are regarded as potential subtypes of COPD.5 In contrast to small airway remodeling, emphysema is due to decreased deposition and excessive destruction of extracellular matrix, leading to loss of alveolar septa and attachments.6,7 However, many studies show that the pathogenesis and progression mechanisms of emphysema are complex and heterogeneous and need to be further elucidated.6,8,9 As a noninvasive tool for measuring morphological indices, quantitative computed tomography (QCT) is an effective approach for determining the severity of COPD and distinguishing the above subtypes.10 Its assessment of emphysema has been demonstrated to be reliable, correlating well with indices of lung function, microscopic manifestations of emphysema, and the clinical status of COPD patients. In addition, its assessment of small airway obstruction is also well associated with FEV1%.11,12 After searching the Gene Expression Omnibus (GEO) database, one of the largest gene expression databases in the world, we found the gene expression profile GSE76925, which includes QCT indices.13 To investigate the molecular mechanisms underlying the emphysema subtype of COPD, we used several bioinformatics methods14–16 to construct the interaction network of differentially expressed genes (DEGs) in this profile and known COPD driver genes, with the aim of identifying novel candidate genes promoting the progression of emphysema.

Materials and methods

Acquisition of microarray data

The GEO database (http://www.ncbi.nlm.nih.gov/geo, May 18, 2017) was searched to obtain gene expression profiles of lung tissues from COPD patients. The dataset GSE76925, the only one with QCT indices, was downloaded.13 The surgically resected lung tissue samples in the GSE76925 dataset were profiled on the GPL10558 platform (Illumina HumanHT-12 V4.0 expression beadchip).

Group division and statistical analysis

After screening the samples' phenotype information (GSM2040796 to GSM2040942), samples without a recorded percentage of low attenuation areas <−950 Hounsfield units on inspiratory CT (%LAA-950) were excluded. Based on the value of %LAA-950, we divided the remaining samples into two groups: a severe emphysema group (%LAA-950>50%, n=11) and a mild emphysema group (%LAA-950<20%, n=12). Continuous variables were expressed as mean ± standard deviation and compared between the two groups using t-tests. Categorical variables were described as proportions and analyzed with the Pearson chi-squared test. All statistical analyses were performed using GraphPad Prism 7 (GraphPad Software Inc, La Jolla, CA, USA). A two-sided P<0.05 was considered statistically significant.
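The grouping rule above can be sketched in a few lines of Python. This is only an illustration of the two thresholds, not the authors' workflow (the statistics were run in GraphPad Prism): the function name `assign_group`, the sample IDs, and the %LAA-950 values are hypothetical, and treating values between 20% and 50% as belonging to neither group is an assumption drawn from the stated cutoffs.

```python
from statistics import mean, stdev


def assign_group(laa_950):
    """Return 'severe', 'mild', or None for a %LAA-950 value given in percent."""
    if laa_950 is None:
        return None  # no %LAA-950 record: the sample is excluded
    if laa_950 > 50:
        return "severe"
    if laa_950 < 20:
        return "mild"
    return None  # assumption: 20-50% falls in neither group


# Hypothetical sample IDs and %LAA-950 values, for illustration only.
samples = {"S1": 55.2, "S2": 7.9, "S3": 31.0, "S4": None, "S5": 51.4, "S6": 12.3}

groups = {"severe": [], "mild": []}
for sample_id, laa in samples.items():
    group = assign_group(laa)
    if group is not None:
        groups[group].append(laa)

# Summarize each group as mean ± standard deviation, as in the Methods.
for name, values in groups.items():
    print(f"{name}: n={len(values)}, %LAA-950 = {mean(values):.1f} ± {stdev(values):.1f}")
```

Samples with a missing value or an intermediate value are dropped before any group comparison is made.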

Identification of DEGs between severe and mild emphysema groups

To explore the underlying genes, we filtered DEGs between the severe and mild emphysema groups using GeneSpring GX software v11.5 (Agilent Technologies, Santa Clara, CA, USA) at a cutoff of corrected P-value <0.05 and |Fold Change|>1.3. We annotated them with Gene Ontology terms by manually searching the Gene database (http://www.ncbi.nlm.nih.gov/gene, July 16, 2017) and roughly classified them according to the biological process section of Gene Ontology17 using the Database for Annotation, Visualization and Integrated Discovery (DAVID)18 v6.8 (https://david.ncifcrf.gov/, October 9, 2018).
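As a minimal sketch of the cutoff described above, the filter can be written as a single predicate. This is not GeneSpring GX; the gene names and values are hypothetical, and the signed fold-change convention (negative meaning lower expression in the severe group) is an assumption based on how fold changes are reported in Table 2.

```python
def is_deg(corrected_p, fold_change, p_cutoff=0.05, fc_cutoff=1.3):
    """Apply the cutoff: corrected P-value < 0.05 and |fold change| > 1.3."""
    return corrected_p < p_cutoff and abs(fold_change) > fc_cutoff


# Hypothetical records; fold change is signed (negative = downregulated).
records = [
    {"gene": "GENE_A", "corrected_p": 0.010, "fold_change": 2.10},
    {"gene": "GENE_B", "corrected_p": 0.030, "fold_change": -1.45},
    {"gene": "GENE_C", "corrected_p": 0.200, "fold_change": 3.00},   # fails the P-value cutoff
    {"gene": "GENE_D", "corrected_p": 0.001, "fold_change": -1.10},  # fails the fold-change cutoff
]

degs = [r for r in records if is_deg(r["corrected_p"], r["fold_change"])]
up = [r["gene"] for r in degs if r["fold_change"] > 0]
down = [r["gene"] for r in degs if r["fold_change"] < 0]

print("upregulated:", up)
print("downregulated:", down)
```

Both conditions must hold, so a large fold change with a non-significant corrected P-value is not counted as a DEG.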

Retrieval of COPD driver genes

A variety of known COPD-related genes are described in the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guideline,3 the peer-reviewed literature,19–23 the Online Mendelian Inheritance in Man (OMIM) database24 (https://www.ncbi.nlm.nih.gov/omim/, July 4, 2017), and the Genetic Association Database25 (GAD, http://geneticassociationdb.nih.gov/). The GOLD guideline illustrates some mainstream mechanisms of COPD and emphysema, such as protease–antiprotease imbalance, which guided us to search further for specific genes in canonical reviews. In addition, OMIM and GAD are open access databases that provide a comprehensive and authoritative compendium of genetic alterations associated with disease phenotypes. Using the keywords COPD or emphysema, we searched the above literature and databases and identified driver genes of COPD.

Direct protein–protein interaction network of DEGs and known driver genes

The topological and functional analysis of protein interaction networks is helpful in identifying key genes and functional modules that participate in disease onset and progression.16 In network pharmacology, merging the interaction networks of a drug's predicted targets and the driver genes of a disease is an effective method to identify the specific genes or pathways through which the drug affects the disease.15,16 Inspired by this analytical method, we analyzed the interactions between DEGs and the accepted mechanisms of COPD in order to identify more credible DEGs participating in emphysema development. STRING v10.526 (https://www.string-db.org/, July 20, 2017), a web database recording physical and functional protein–protein interaction (PPi) information, was used to predict the interactions between driver genes and DEGs. A variety of active interaction sources in STRING were included in our search strategy: text mining, experiments, databases, coexpression, neighborhood, gene fusion, and co-occurrence. The interaction network was further visualized with Cytoscape27 v3.6.1, an open access software package for annotating and visualizing biological pathways and molecular interaction networks.
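The screening step described above (keeping DEGs that have at least one interaction with a driver gene) can be sketched as follows. This is an assumption-laden illustration rather than the STRING or Cytoscape workflow: the function `degs_linked_to_drivers`, the node names, and the scores are all hypothetical, and the edge list stands in for an exported interaction table with a combined score per edge.

```python
def degs_linked_to_drivers(edges, degs, drivers, min_score=0.0):
    """Map each DEG to the driver genes it interacts with at or above min_score."""
    linked = {}
    for a, b, score in edges:
        if score < min_score:
            continue
        # Edges are undirected, so check both orientations.
        for deg, partner in ((a, b), (b, a)):
            if deg in degs and partner in drivers:
                linked.setdefault(deg, set()).add(partner)
    return linked


# Hypothetical node names and combined scores, for illustration only.
degs = {"DEG_1", "DEG_2", "DEG_3"}
drivers = {"DRV_1", "DRV_2", "DRV_3"}
edges = [
    ("DEG_1", "DRV_1", 0.95),
    ("DRV_2", "DEG_1", 0.60),
    ("DEG_2", "DEG_3", 0.99),  # DEG-DEG edge: not a link to a driver gene
    ("DEG_3", "DRV_3", 0.45),
    ("DRV_1", "DRV_2", 0.97),  # driver-driver edge: ignored by the screen
]

all_links = degs_linked_to_drivers(edges, degs, drivers)
strict_links = degs_linked_to_drivers(edges, degs, drivers, min_score=0.9)

print(all_links)
print(strict_links)
```

Raising `min_score` mirrors the idea of restricting the network to high-confidence interactions; DEGs whose only driver links fall below the cutoff drop out of the result.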

Indirect PPi of DEGs and known driver genes

Weighted protein–protein interaction network analysis has been regarded as a novel approach to highlight key functional genes of complex disorders such as frontotemporal dementia.14 It indicates that analyzing disease-spectrum genes, also known as the first-layer interacting proteins of key genes, is a promising approach to validate previous findings and explore novel disease-related mechanisms. Thus, we searched the STRING database v10.5 to obtain the first-layer proteins associated with the DEGs' products and constructed an indirect PPi of these proteins. The first-layer interacting proteins were roughly classified according to the clustering annotation of Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes28 using the Functional Annotation tool in the DAVID database. Then, the merge tool of Cytoscape v3.6.127 was applied to merge the indirect PPi with the driver genes' PPi in order to discover interconnected and intersecting functional modules and to target the core genes. In addition, highly connected nodes with a large number of edges in the network are likely to be functionally significant in the disease context and are defined as hub genes.29 The number of edges of each gene node in the indirect PPi network was ranked with Cytoscape to identify hub genes with functional significance in emphysema.
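The two network operations described above, ranking nodes by their number of edges and finding what two networks share, can be sketched with the standard library. This is not Cytoscape's merge tool; the helper names and the node labels are hypothetical, and the merge is modeled here simply as a union of undirected edges with the shared nodes taken as the intersection of the two node sets.

```python
from collections import Counter


def normalize(edges):
    """Treat edges as undirected, dropping duplicates and self-loops."""
    return {frozenset(edge) for edge in edges if edge[0] != edge[1]}


def degree_ranking(edges):
    """Rank nodes by number of edges (highest first, ties broken by name)."""
    degree = Counter()
    for edge in normalize(edges):
        for node in edge:
            degree[node] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


def node_set(edges):
    """Collect every node that appears in an edge list."""
    return {node for edge in edges for node in edge}


# Hypothetical networks, for illustration only.
indirect_ppi = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
                ("P2", "P3"), ("P4", "P5"), ("P2", "P1")]  # last edge duplicates the first
driver_ppi = [("P1", "D1"), ("P4", "D1"), ("D1", "D2")]

ranking = degree_ranking(indirect_ppi)
shared_nodes = node_set(indirect_ppi) & node_set(driver_ppi)
merged_edges = normalize(indirect_ppi) | normalize(driver_ppi)

print("hub candidates:", ranking[:3])
print("nodes common to both networks:", sorted(shared_nodes))
print("edges in merged network:", len(merged_edges))
```

Counting each undirected edge once matters for the ranking: without the normalization step, a duplicated edge would inflate a node's degree.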

Identification of candidate transcription factors

The TRANSFAC® Professional database30 is an authoritative commercial database that records comprehensive information on transcription factors (TFs), their regulated genes, and binding site prediction profiles. We performed TF prediction for the core genes using the Gene Radar tool on the GCBI website (Genminix Informatics Ltd., Shanghai, China). Based on all transcripts of each gene (Ensembl database, GRCh38 version), the Gene Radar tool acquires comprehensive TF prediction results from the TRANSFAC Professional database. In addition, the Gene Radar tool can screen out highly recommended TFs by integrating the scores from the TRANSFAC database with the presence of single-nucleotide polymorphism (SNP) loci and methylation modifications in TF binding sites. We therefore identified the candidate TFs of the core genes that had a high recommendation grade.

Results

Baseline characteristics of the severe and mild emphysema groups

As Table 1 shows, all patients were former smokers and presented with severe to very severe airflow limitation according to the GOLD guideline.3 Despite the relatively small sample size, significant differences between the two groups were observed for several characteristics, such as the FEV1/FVC ratio and body mass index, supporting the credibility of the %LAA-950-dependent grouping method.
Table 1

Comparisons of baseline characteristics suggested the credibility of %LAA-950-dependent grouping method

| Characteristics | Severe emphysema group (n=11) | Mild emphysema group (n=12) | P-value |
| --- | --- | --- | --- |
| Age (years) | 61.6±5.9 | 63.1±10.4 | >0.05 |
| Male/female | 8/3 | 6/6 | >0.05 |
| BMI (kg/m2) | 23.0±3.3 | 27.9±5.3 | 0.0102 |
| Smoking history (pack-years) | 66.5±21.2 | 58.6±29.2 | 0.0013 |
| %LAA-950 | 52.7±2.0 | 7.3±5.3 | <0.0001 |
| FEV1 (%predicted) | 23.0±8.4 | 32.7±13.3 | 0.051 |
| FEV1/FVC (%) | 25.5±4.8 | 42.3±14.2 | 0.0012 |

Abbreviations: BMI, body mass index; %LAA-950, percents of low attenuation areas <−950 Hounsfield unit on inspiratory CT.

DEGs between two groups and the list of COPD driver genes

We identified 57 DEGs in the severe emphysema group compared with the mild emphysema group, including 15 upregulated genes, 30 downregulated genes, and 12 pseudogenes (unlisted; shown in Table 2). The Gene Ontology annotations of the 45 genes are shown in Table S1.
Table 2

Forty-five DEGs were identified between severe and mild emphysema groups

| Functional category | Gene symbol | Dysregulation | P-value | Fold change |
| --- | --- | --- | --- | --- |
| Transcriptional regulation | KANK1, PHF1, PHF6, TADA2A, TRIM34, ZFHX3, ZNF322, ZNF451 | Down, Down, Up, Up, Up, Down, Up, Up | 5.58E-05, 4.45E-05, 2.08E-05, 8.32E-05, 3.85E-06, 4.31E-05, 2.88E-06, 8.66E-05 | −1.82, −1.63, 2.01, 2.42, 1.54, −1.54, 1.58, 2.54 |
| Membrane receptor and signal pathway | BAG4, BCR, FYB1, GNB4, GPR65, NPSR1, NPHP4, P2RY13, RNF213, ZFP106, ZC3HAV1 | Up, Down, Up, Up, Up, Down, Down, Up, Up, Down, Down | 0.000061, 6.05E-05, 5.24E-06, 7.66E-05, 7.58E-05, 7.72E-05, 8.48E-05, 3.03E-05, 6.32E-05, 2.71E-05, 8.41E-05 | 1.71, −1.59, 2.43, 2.49, 2.54, −1.73, −1.93, 4.22, −1.5, −1.43 |
| Metabolism | DPM3, ELOVL3, ETNK2, IMPDH2 | Down, Down, Down, Down | 7.09E-05, 4.72E-05, 2.63E-05, 7.49E-05 | −1.44, −1.47, −1.74, −1.5 |
| Cilium | IFT140, TMEM80 | Down, Down | 1.65E-05, 5.48E-05 | −1.93, −1.75 |
| Protein modification | USP33, NUP58, PARP16 | Up, Up, Down | 2.29E-05, 3.89E-05, 6.93E-05 | 2.64, 2.96, −1.73 |
| Others | DNAJB14, ZBTB8OS, EHBP1, ATP1B2, FAM149A, TLN1, SF3A1, FAM168B, CYB5D2, KCNJ4, ZCCHC3, MRPS24, swi5, SERPINI1, SVEP1, VPS28, OGFOD3 | Up, Up, Down, Down, Down, Down, Down, Down, Down, Down, Down, Down, Down, Down, Down, Down, Down | 4.21E-05, 6.47E-05, 9.96E-07, 7.85E-06, 7.86E-06, 2.68E-05, 2.46E-05, 4.84E-05, 3.72E-05, 8.54E-05, 7.39E-05, 4.46E-05, 3.21E-05, 7.31E-05, 5.94E-05, 6.99E-05, 7.25E-05 | 2.16, 1.65, −1.42, −3.09, −3.32, −1.47, −1.49, −1.71, −1.33, −1.77, −1.57, −1.36, −1.34, −1.58, −1.67, −1.52, −1.57 |

Note: DEGs were roughly classified according to the BP and MF terms of Gene Ontology by using the Functional Annotation tool in the DAVID database.

Abbreviations: BP, biological process; DAVID, the Database for Annotation, Visualization and Integrated Discovery; DEGs, differentially expressed genes; MF, molecular function.

According to involved pathways, 135 retrieved COPD driver genes were separately placed in extracellular matrix-associated column, oxidative stress column, inflammation column, and others column (shown in Table 3).
Table 3

Known driver genes of COPD as grouped into four categories

Synthesis and degradation of ECM (n=37): ELN, FBLN4, FBLN5, FBN1, ATP7A, TGFB1, TGFBR3, LTBP4, SERPINE2, ELANE, MMP1, TIMP1, MMP2, MMP3, MMP8, MMP9, MMP10, MMP14, MMP12, COL1A1, COL1A2, FBN2, FBN3, COL3A1, COL8A1, COL4A1, FN4, FN1, DCN, BGN, TGFBR1, SMAD3, SMAD7, VCAN, TNC, SPP1, TIMP2

Oxidative stress (n=12): GSTP1, GSTM1, HMOX1, NOS2, NOS3, SOD2, MMAC1, SOD3, PIK3CA, PIK3R1, NFE2L2, EPHX1

Abnormal inflammation (n=36): TNF, TNFRSF1A, TNFRSF1B, IL-17A, IL-18, IL1b, HDAC2, IL12, IL21, IL22, IL-23, IL27, IL32, IL-4, IL-6, IL-10, CCL11, CLL2, CCL5, IL17F, IL1RN, IFNG, TSLP, CCR2, IL8RB, IL13, IL11, CCR5, CCR6, CXCL8, CXCR1, CXCR2, CXCR3, TLR9, IL8RA, CCL1

Others (n=50): LTA4H, MUC5AC, MUC5B, SLC6A4, EGF, EGFR, FGF10, CHRNA3, CHRNA5, IREB2, FAM13A, FTO, BICD1, HHIP, ACE, KCNIP4, CRHR1, CYP1A2, TP53, CLASP1, ADRB2, GC, DEFB1, CYP21A2, CFTR, APOE, AGTR1, ADRB3, TF, SFTPB, SERPINE1, SERPINA1, NAT2, LTA, HSPA1L, HSPA1A, HRAS, SCGB1A1, VEGFA, GABPA, MTHFR, SPAR, HSPA1B, CAT, OGG1, PDE4D, TCEAL1, BCL2, DBP, HCK

Abbreviation: ECM, extracellular matrix.

Candidate genes directly interacted with driver genes

A total of 180 genes (45 DEGs + 135 driver genes) were used to construct the network; 7 were withdrawn because their gene symbols could not be identified, and 147 were found to interact with other genes. Eight of the 45 DEGs interacted with driver genes: G protein-coupled receptor 65 (GPR65), neuropeptide S receptor 1 (NPSR1), purinergic receptor P2RY13, RhoGEF and GTPase activating protein (BCR), G protein subunit β4 (GNB4), BCL2-associated athanogene 4 (BAG4), inosine monophosphate dehydrogenase 2 (IMPDH2), and Hsp40 member 14 (DNAJB14; shown in Figure 1).
Figure 1

Candidate genes were screened by direct PPi of DEGs and COPD driver genes.

Notes: The left panel shows the 147 genes-constructed interaction network. Eight driver genes-associated DEGs are highlighted at the center of circle and the red lines identify the interaction relationship of the eight highlighted DEGs and the corresponding driver genes. The right panel amplifies the mutual relationship of eight DEGs and their interacted COPD driver genes.

Abbreviations: DEGs, differentially expressed genes; PPi, protein–protein interaction.

GPR65, NPSR1, P2RY13, and GNB4 formed a relatively separate interacting set. In addition, BCR and BAG4 interacted with each other, as did IMPDH2 and DNAJB14. When the cutoff value of the combined interaction score was set at 0.9, we found that GNB4 and P2RY13 mostly interacted with chemokines and chemokine receptors, such as CXCR1, CCR2, CXCR2, IL8, CXCR3, CCR5, CCL5, and CCR6. BAG4 interacted with TNFα and its receptor as well as with the heat shock protein (HSP) family. In addition, PIK3CA and PIK3R1 may play an important role by interacting with GPR65, GNB4, BCR, NPSR1, and BAG4.

Common key genes and their TFs filtered by merging of indirect PPi and driver PPi

A total of 422 first-layer interacting proteins were obtained by searching the STRING database v10.5 (shown in Table S2). Among them, 375 proteins were used to construct the indirect PPi, and the remaining proteins were withdrawn because they could not be identified or were isolated from the interaction network. The 375 proteins in the indirect PPi were ranked by the number of edges of each node in the topological network, and the top 20 are shown in Table S3. PIK3CA, TP53, and MAPK1, the top three genes in this ranking, had 86, 74, and 72 interacting nodes, respectively, which suggests potentially predominant and interconnected roles in the mechanism of emphysema progression. The merged network illustrated in Figure 2 shows a total of 10 genes that constituted the intersection of the two networks: TP53, IL8, CCR2, CXCR2, PIK3CA, ELANE, HSPA1A, HSPA1B, HSPA1L, and ADRB2.
Figure 2

Common key genes are screened by merging of indirect PPi and driver PPi.

Notes: The left panel represents the 125 driver genes-constructed interaction network, and the right panel shows the 375 genes-constructed network of first-layer proteins of DEGs. The top three genes in the rank of topological network node stand out at the center of right circle and the red lines identify their interaction relationship with other first-layer proteins. The upper panel shows the merge network of the lower two PPis, representing the common genes and pathways involved in the two networks.

Abbreviations: DEGs, differentially expressed genes; PPi, protein–protein interaction.

TFs that could bind to the promoter regions of eight of the above genes were retrieved and are shown in Table S4. ADRB2 was isolated from the network and no TF with a high recommendation score was retrieved for HSPA1B, so these two genes were omitted from the presentation. Moreover, SPIB, CPBP, SATB1, ZNF333, HOXA13, KID3, SOX4, and FOXO1A were potentially meaningful TFs, each of which could regulate no less than half of the above nine genes.
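The selection rule in the last sentence (keeping TFs that regulate no less than half of the key genes) can be sketched as a set-overlap filter. This is an illustration only, not the Gene Radar tool: the function name `shared_tfs`, the TF and gene labels, and the target sets are hypothetical, and the TF-to-target mapping stands in for the high-recommendation predictions described in the Methods.

```python
def shared_tfs(tf_targets, core_genes, min_fraction=0.5):
    """Keep TFs whose predicted targets cover at least min_fraction of the core genes."""
    threshold = min_fraction * len(core_genes)
    selected = {}
    for tf, targets in tf_targets.items():
        hits = targets & core_genes  # core genes this TF is predicted to regulate
        if len(hits) >= threshold:
            selected[tf] = hits
    return selected


# Hypothetical core genes and TF-target predictions, for illustration only.
core_genes = {"G1", "G2", "G3", "G4"}
tf_targets = {
    "TF_A": {"G1", "G2", "G3"},  # covers 3 of 4 core genes
    "TF_B": {"G1", "G9"},        # covers 1 of 4 core genes
    "TF_C": {"G2", "G4"},        # covers exactly half
}

selected = shared_tfs(tf_targets, core_genes)
for tf, hits in sorted(selected.items()):
    print(tf, "regulates", sorted(hits))
```

Using `>=` rather than `>` matches the "no less than half" wording, so a TF covering exactly half of the genes is retained.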

Discussion

We identified eight novel candidate genes (GPR65, GNB4, P2RY13, NPSR1, BCR, BAG4, IMPDH2, and TP53) that may promote the progression of emphysema by means of network analysis of DEGs and COPD driver genes. This is the first study in which a QCT index was applied to classify emphysema for the analysis of DEGs and in which known COPD driver genes were retrieved to construct interaction networks with DEGs.

Our method of direct and indirect network analysis has some merits. When analyzing DEGs, it is difficult to interpret the biological role of a single identified gene in the pathogenic mechanism, and performing external and experimental validation for all DEGs is cumbersome and inefficient. Incorporating driver genes into a direct network analysis with DEGs helps to quickly highlight causative DEGs and exclude random DEGs caused by covariates, making the role of the identified DEGs more credible. In addition, protein function is regulated not only at the transcriptional level but also at the posttranscriptional level, which DNA microarrays cannot detect. A previous study of breast cancer demonstrated that known driver genes, although their expression profiles were unchanged, were still capable of interconnecting many transcriptionally dysregulated genes in the protein interaction network.31 Therefore, we used the first-layer protein interaction network to further explore the indirect effects of DEGs and to seek potentially overlooked genes. Furthermore, for a polygenic complex disease, a single gene cannot comprehensively explain the molecular mechanism of a phenotype. By merging the indirect PPi and the driver PPi, we could efficiently extract the candidate protein networks involved in a specific disease phenotype. This method,14 whose efficacy has been confirmed in a study of frontotemporal dementia, could be applied to other complex disorders or extended to other phenotypes of COPD, such as airway remodeling. By comparing the critical protein interactomes of different phenotypes, we could unveil the different molecular mechanisms promoting complex pathological processes, which is crucial for biomarker and drug discovery.

As a proton-sensing receptor, GPR65 can regulate the immune response of T cells and macrophages and induce the production of MMP3 in the acidic microenvironment.32–34 Asthma, another chronic airway disease with obstructive airflow limitation, has been demonstrated to have a local acidic microenvironment,35 in which eosinophils show decreased apoptosis and increased viability in a GPR65-dependent manner.36 An SNP of NPSR1 was associated with the decline of FEV1 after adjusting for covariates in a normal aging population.37 Moreover, the DNA methylation status of NPSR1 in adults with severe asthma and in children with allergic asthma was distinct from that of control populations.38 NPSR1's expression on peripheral blood eosinophils was positively correlated with asthma severity and serum IgE level.39 Asthma and COPD have many common traits in terms of risk factors, inflammatory responses, clinical features, and therapeutic methods.3,40 Furthermore, the role of eosinophils in the pathogenesis and treatment of COPD is gradually being recognized.3,41 Therefore, we speculated that the above asthma-related genes were highly likely to be involved in the pathogenesis of COPD and emphysema.

As an extracellular ADP receptor, P2RY13 participates in the purinergic signaling pathway, resulting in the apoptosis of pancreatic β-cells42 and the differentiation of marrow stem cells into osteoblasts.43 Since the roles of extracellular ATP and its receptor P2RX in COPD have been confirmed,44,45 ADP, an intermediate in purinergic metabolic pathways, may also have pathogenic effects in COPD.

In addition, the common key genes identified by the indirect method matched well with the two-hit hypothesis of COPD,46 especially the part concerning senescence and the senescence-associated secretory phenotype (SASP).47 Senescence is an irreversible cell state in which a cell is deprived of its replicative capacity through cell cycle arrest.48 The p53 (encoded by TP53)/p21 pathway participates in all types of senescence mechanisms, arresting the cell cycle at the G1/S and G2/M checkpoints.49 SASP refers to the alteration of the aging cell's secretome toward greater production of proinflammatory cytokines, including IL-8 and monocyte chemotactic protein 1 (MCP-1).49 IL-8 and its receptor CXCR2, which mediate neutrophil chemotaxis, are among the most important chemokine-receptor pairs in COPD pathogenesis, as are MCP-1 and its receptor, which is encoded by CCR2.6 Moreover, phosphoinositide 3-kinase (PI3K), the product of PIK3CA, is known as a pro-senescent kinase because it inactivates HDAC-2, an antiaging molecule; knockdown of HDAC-2 can induce cellular senescence by enhancing p53-dependent transcriptional responses.50

Based on this evidence, we hypothesized that TP53 might play a central role in promoting the progression of emphysema. Firstly, besides IL-8, CXCR2, and CCR2, elastase (the product of ELANE) and PI3K are also well-recognized COPD drivers that play important roles in the protease–antiprotease imbalance and chronic inflammation of COPD.6 The involvement of these genes supports our results and, in turn, the role of TP53. Secondly, TP53 can induce cell cycle arrest, apoptosis, senescence, DNA repair, or metabolic alterations in response to oxidative stress and DNA damage.51 Some studies confirmed that TP53 was overexpressed in emphysematous lung tissue.52 A genome-wide association study of 365 patients with emphysema showed an association of a TP53 SNP with apoptotic signaling and smoking-related emphysematous changes in smokers' lungs.53 Furthermore, an RNA-sequencing study of lung tissues from COPD patients identified enrichment of the p53/hypoxia pathway and frequent alternative splicing of molecules in this pathway.54 Thirdly, the role of TP53 in senescence might reveal its effects in COPD. Much evidence has shown an association between senescence and the pathogenesis of COPD. Cellular experiments showed that alveolar epithelial cells, endothelial cells, and fibroblasts undergo accelerated senescence in the emphysematous lung.55,56 Epidemiological surveys indicated that the incidence of COPD and the decline of FEV1 increase with age.3 Moreover, the airways and parenchyma of patients with COPD and of healthy senior citizens have similar structural changes.50,57

Many studies have searched for key genes associated with emphysema. In a study of differentially expressed miRNAs in emphysema, miR-638 was identified as an effector molecule that could regulate accelerated senescence, which is partially consistent with our hypothesis.58 However, we did not reproduce the DEGs reported by other studies of emphysema. On the one hand, this was due to different grouping methods;59 on the other hand, their samples mainly came from patients with moderate COPD (FEV1% of about 60%),60,61 so their results mainly explain the early mechanisms of emphysema progression.

Our study has some limitations. Firstly, the sample size is relatively small,62 which is due to the limited number of accessible datasets in the GEO database. Secondly, the selection of COPD driver genes is potentially biased and incomplete, so some meaningful DEGs may have been overlooked. Thirdly, PPi prediction has false positives and false negatives. The web tool STRING v10.5 defines PPi on the basis of text mining, experiments, databases, coexpression, neighborhood, gene fusion, and co-occurrence, which may be somewhat controversial. In addition, interactions demonstrated by experiments in vitro may differ from those in vivo. Despite these limitations, our study has put forward some novel candidate genes, and follow-up experiments or larger datasets are needed to verify the role of these candidate genes in the mechanism of emphysema progression.

Conclusion

We have identified several novel candidate genes that may promote emphysema, such as GPR65, NPSR1, and TP53, which may help to fill the knowledge gap in the field of COPD.
Table S1

GO annotation of DEGs between severe and mild emphysema groups

| Functional category | Gene symbol | GO annotation |
| --- | --- | --- |
| Transcriptional regulation | KANK1 | Positive regulation of Wnt signaling pathway, negative regulation of actin filament polymerization and so on |
| Transcriptional regulation | PHF1 | Involved in regulation of histone H3-K27 methylation and cellular response to DNA damage stimulus |
| Transcriptional regulation | PHF6 | Negative regulation of transcription from RNA polymerase II promoter, an oncogene |
| Transcriptional regulation | TADA2A | A transcriptional activator adaptor; acetylating and destabilizing nucleosomes |
| Transcriptional regulation | TRIM34 | Involved in interferon signaling pathways |
| Transcriptional regulation | ZFHX3 | Transcription factor activity, RNA polymerase II distal enhancer sequence-specific binding |
| Transcriptional regulation | ZNF322 | Regulate transcriptional activation in MAPK signaling pathways |
| Transcriptional regulation | ZNF451 | Negative regulation of transcription initiation from RNA polymerase II promoter, histone H3-K9 acetylation and TGF-β signaling pathway |
| Membrane receptor and signal pathway | BAG4 | Negative regulation of apoptotic process, response to TNFα |
| Membrane receptor and signal pathway | BCR | GTPase activator activity and Rho guanyl-nucleotide exchange factor activity |
| Membrane receptor and signal pathway | FYB1 | Involved in TCR signaling pathways, the expression of IL-2 and process of NLS-bearing protein importing into nucleus |
| Membrane receptor and signal pathway | GNB4 | A subunit of heterotrimeric guanine nucleotide-binding proteins involved in cellular response to glucagon stimulus |
| Membrane receptor and signal pathway | GPR65 | Involved in G-protein coupled receptor signaling pathway, actin cytoskeleton reorganization, and apoptotic process |
| Membrane receptor and signal pathway | NPSR1 | Neuropeptide and vasopressin receptor activity, increased expression in lung for asthma |
| Membrane receptor and signal pathway | NPHP4 | Involved in actin cytoskeleton organization, hippo signaling, and negative regulation of canonical Wnt signaling pathway |
| Membrane receptor and signal pathway | P2RY13 | G-protein coupled purinergic nucleotide receptor and negative regulation of adenylate cyclase activity signaling pathway |
| Membrane receptor and signal pathway | RNF213 | ATPase activity, ubiquitin-protein transferase activity, and negative regulation of noncanonical Wnt signaling pathway |
| Membrane receptor and signal pathway | ZFP106 | Insulin receptor signaling pathway |
| Membrane receptor and signal pathway | ZC3HAV1 | Defense response to virus |
| Metabolism | DPM3 | GPI anchor biosynthetic process and protein mannosylation |
| Metabolism | ELOVL3 | Fatty acid elongase activity providing precursors for synthesis of sphingolipids and ceramides |
| Metabolism | ETNK2 | A member of choline/ethanolamine kinase family that catalyses phosphatidylethanolamine biosynthetic process |
| Metabolism | IMPDH2 | Purine ribonucleoside monophosphate biosynthetic process, neutrophil degranulation, and oxidation-reduced process |
| Cilium | IFT140 | Intraciliary transport involved in cilium assembly |
| Cilium | TMEM80 | Integral component of membrane |
| Protein modification | USP33 | Protein deubiquitination and involved in slit-dependent cell migration and beta-2 adrenergic receptor signaling |
| Protein modification | NUP58 | A component of the nuclear pore complex playing a role of nucleocytoplasmic transporter activity |
| Protein modification | PARP16 | NAD+ ADP-ribosyltransferase activity and protein serine/threonine kinase activator activity |
| Others | DNAJB14 | Hsp70 protein binding and chaperone cofactor-dependent protein refolding |
| Others | ZBTB8OS | tRNA splicing via endonucleolytic cleavage and ligation |
| Others | EHBP1 | Endocytosis and its mutation associated with prostate cancer |
| Others | ATP1B2 | ATP hydrolysis coupled transmembrane transport and cell adhesion |
| Others | FAM149A | Associated with acute mountain sickness |
| Others | TLN1 | Integrin-mediated signaling pathway, cell–cell and cell–substrate junction assembly such as actin |
| Others | SF3A1 | A component of the mature U2 snRNP playing a role of pre-mRNA splicing |
| Others | FAM168B | Myelin-associated neurite-outgrowth inhibitor |
| Others | CYB5D2 | Positive regulation of neuron differentiation |
| Others | KCNJ4 | A member of the inward rectifier potassium channel family |
| Others | ZCCHC3 | RNA binding |
| Others | MRPS24 | A structural constituent of ribosome related to mitochondrial translation |
| Others | swi5 | DNA repair protein swi5 homolog |
| Others | SERPINI1 | Serine-type endopeptidase inhibitor activity and association with central and peripheral nervous system development |
| Others | SVEP1 | A ligand for integrin α9β1 and involved in cell adhesion |
| Others | VPS28 | Endosomal transport, macroautophagy, negative regulation of protein ubiquitination and viral budding |
| Others | OGFOD3 | Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen |

Note: Some COPD-associated genes are highlighted in bold.

Abbreviations: DEGs, differentially expressed genes; GO annotation, Gene Ontology annotation.

Table S2

List of first-layer proteins interacting with the products of the DEGs between the two groups

Translational regulation: RPL12, RPL13A, RPL18, RPL18A, RPL8, RPS15, RPS3, RPS9, MRPS10, MRPS12, PAIP1, MRPS14, MRPS15, MRPS22, MRPS23, MRPS24, MRPS31, MRPS33, MRPS5, RPL10L, SEPSECS, CHCHD1

Protein modification and folding: RAE1, TMEM48, NUP107, NUP188, NUP205, NUP35, TSG101, NUP62, NUP93, NUPL1, VPS37A, VPS4A, SUMO1, SUMO2, SUMO3, SUMO4, RANGAP1, PARP1, ZNF451, ALG5, DOLPP1, DPM1, NFATC2IP, DPM2, PIAS3, DPM3, TP53, FAM125A, FAM125B, GNAI1, GNAI3, GNB1, GNB2, GNB3, GNG2, GNAO1, PPIA, BAG4, SIL1, MESDC2, HSPA8, HSPA9

Signaling pathway: TP53, PIK3CA, MAPK1, GRB2, SHC1, PDGFRB, ITGB1, ITGB3, ITGB5, VWF, GNAI1, GNB1, GNB2, GNB3, GNG2, GNAO1, GNB4, GNG10, GNG13, GNG3, GNG4, GNG5, GNG7, JAK2, GNGT2, GNA11, GNAQ, ADRBK1, IL8, CCR2, ABL1, P2RY14, GNA15, SYK, MYC, PHLPP1, PHLPP2, GPR65, LAT, TSHB, LPAR1, LPAR2, OXGR1, FPR1, FPR2, ADCY3, CRKL, CXCR2, GNAI3, ARRB1, ARRB2, LYN, PXN, STAT3, STAT5B

Matrix and cell adhesion: PIK3CA, MAPK1, FYN, CRKL, GRB2, SHC1, PXN, PDGFRB, ITGB1, ZYX, ITGB2, TLN1, ITGB3, ITGB5, VASP, VCL, VWF

Substance transport: RAE1, TMEM48, NUP107, NUP188, NUP205, NUP35, NUP62, NUP93, NUPL1, SUMO1, SUMO2, SUMO3, SUMO4, RANGAP1, PAIP1, STAT3, KPNB1, NUP54, UBE2I, NUTF2

DNA repair: SUMO1, PARP1, RPS3, RAD51, RAD51B, RAD51C, XRCC2, RAD51D, XRCC3, SWI5, PARP2, ATM, SFPQ, SFR1, CDC5L

Virus-associated genes: CHMP6, RAE1, ZC3HAV1, DDX1, DDX58, IRF3, IRF7, RPL12, IRF9, RPL13A, RPL18, RPL18A, RPL8, RPS15, RPS3, RPS9, TMEM48, ATG7, TNF, KPNB1, NUP107, NUP188, TRIM34, NUP205, NUP35, PML, TSG101, NUP54, HERC5, NUP62, NUP93, NUPL1, OAS1, PPIA, OAS2, OAS3, OASL, VPS28, VPS37A, VPS37B, VPS37C, VPS4A

Others: KCNJ2, KCNJ4, FOS, GART, IFT122, MKS1, P2RY13, WDR19, ACACA, CIDEA, GCC2, IFT140, MOCS1, WDR35, ACACB, COL14A1, GM2A, IFT172, MOCS2, P2RY4, ACE2, CPB1, GMPS, IFT52, P2RY8, CPS1, IFT57, SUPT3H, YEATS2, ADRA1, GNA14, IFT80, PARL, SUZ12, YIPF6, ADRB2, CTPS1, IFT81, RBBP4, SVEP1, ZBTB8OS, IFT88, PARP10, VPS36, CEPT1, HSPA6, CERS2, FXYD1, STAT5A, FXYD2, FXYD6, HUWE1, FXYD7, HSPA5, SPATA5, SH3GL1, C1orf177, FAM155B, HSD17B12, PRPF19, USP20, C22orf28, FAM168B, HSPA12B, PRPF6, USP22, C2orf49, RBBP7, AEBP2, CYB5D2, PARP14, RIPK4, ZCCHC3, AFP, IMPDH1, PARP16, RNF19B, ZFHX3, AGBL3, IMPDH2, RNF213, TADA2A, ZFP106, AGXT2L1, DGAT2, IQCB1, PCYT2, RPE65, TADA3, ZNF322, AKAP9, DMRT1, IRF2, RPGRIP1L, TAF1, ZNF385D, ALG1, DNAJB14, MSRB1, PFAS, TAF9, ALG3, DOCK8, PHF19, TCEB2, ZNF768, DOLK, NBEAL1, PHF6, TCTN1, MIB2, OS9, SLTM, FCGR1B, HSPA2, MICAL1, SNF8, VHL, CDKL5, HSPA4, MKI67IP, SPATA2, LATS1, SF3B1, AWAT1, ETNK2, HERC4, LCP2, POTEE, SF3B2, B4GALNT1, EVL, LIAS, HSPA1A, ZNF91, AP4E1, NEURL, TEX10, AP4M1, DPAGT1, NEURL1B, APBB1IP, PI15, TLX1, ARC, NMRAL1, TLX3, ARHGEF6, ITPA, NNMT, PIGM, TMEM171, ARHGEF7, DR1, NOL10, PIGV, TMEM218, DTX3L, JARID2, NOL12, PIK3C2B, SASH1, TMEM222, EED, GOLT1A, KAT2A, NOL3, SCD, TMEM247, ATG3, EHBP1, GPCPD1, KAT2B, NPHP1, PIP5K1A, SDCCAG8, EHD1, CEP290, OGFOD3, SKAP2, UXT, CCP110, FAM98C, HSPA1B, MIB1, OPHN1, SLC20A2, FASN, HSPA1L, LLPH, POTEI, SF3B4, UBA7, HIST1H1A, POTEJ, BCR, HMG20A, POU1F1, UBR2, C14orf166, FAM71E1, GPR37L1, NPHP4, PIP5K1C, SEL1L, TMEM67, ATIC, EHD2, NPLOC4, PISD, SELV, TMEM80, EHD3, GRAP2, KIF21A, NPSR1, PLAT, ATP1A1, ELANE, KNG1, NUDT11, PLAU, SERPINI1, TNFRSF1A, ATP1A2, ELAVL4, GRM6, PLEKHG7, SF3A1, ATP1A4, ELOVL3, H2AFV, LACE, PLG, SF3A2, ATP1B1, ENAH, H2AFZ, PLRG1, SF3A3, TRMT10C, ATP1B3, ENG, HERC1, SIRPA, USP33, CAD, FAM98A, HSPA14, MBIP, OBSCN, PVR, SKAP1, USP48, CCDC101, FAM98B, B9D2, EZH2, HERC6, HSPA13, MAX, PTPRF, FAM149A, HPRT1, LRRK2, POTEF, SF3B3, TTC21B

Note: The first-layer interacting proteins were roughly classified according to Gene Ontology BP terms and KEGG pathways, using the Functional Annotation tool in the DAVID database.

Abbreviations: BP, biological process; DAVID, the Database for Annotation, Visualization and Integrated Discovery; DEGs, differentially expressed genes; KEGG, Kyoto Encyclopedia of Genes and Genomes.

Table S3

Top 20 topological network nodes in the first-layer PPi

Degree | Name | GO annotation
86 | PIK3CA | Protein serine/threonine kinase activity and signaling pathway
74 | TP53 | Transcription factor activity
72 | MAPK1 | Protein serine/threonine kinase activity and signaling pathway
68 | HSPA8 | Chaperone and protein folding
64 | ACACA | Acetyl-CoA carboxylase activity
63 | CAD | Aspartate carbamoyltransferase activity
62 | ACACB | Acetyl-CoA carboxylase activity
61 | POTEF | Retina homeostasis
60 | LRRK2 | Protein serine/threonine kinase activity
58 | IL8 | Neutrophil chemotaxis
54 | POTEE | Retina homeostasis
53 | POTEI | Retina homeostasis
53 | POTEJ | Retina homeostasis
53 | PHLPP1 | Protein dephosphorylation and signaling pathway
53 | RIPK4 | Protein serine/threonine kinase activity
53 | PHLPP2 | Protein dephosphorylation
52 | GART | Purine nucleobase biosynthetic process
50 | MYC | Transcription factor activity
48 | GMPS | Purine nucleobase biosynthetic process
47 | OAS2 | Purine nucleobase biosynthetic process

Abbreviations: GO, Gene Ontology; PPi, protein–protein interaction.
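
For readers who want to reproduce a ranking like the one in Table S3 from their own network export, the following is a minimal sketch. It assumes that "degree" means the number of distinct interaction partners of a node in an undirected network, which is the usual definition but is not spelled out in the table. The function names `node_degrees` and `top_nodes` and the toy edge list are hypothetical and serve only as illustration; the pairs below are not taken from the study's network.

```python
from collections import defaultdict


def node_degrees(edges):
    """Return {node: degree} for an undirected edge list.

    Self-loops are ignored and duplicate edges are counted once,
    so the degree equals the number of distinct neighbours.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def top_nodes(edges, n=20):
    """Rank nodes by degree (descending); ties are broken alphabetically."""
    degrees = node_degrees(edges)
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))[:n]


# Toy edge list for illustration only (assumed, NOT the study's data).
toy_edges = [
    ("PIK3CA", "TP53"),
    ("PIK3CA", "MAPK1"),
    ("PIK3CA", "IL8"),
    ("TP53", "MAPK1"),
    ("TP53", "PIK3CA"),  # same edge as the first pair, counted once
    ("MYC", "TP53"),
]

ranking = top_nodes(toy_edges, n=3)
for name, degree in ranking:
    print(degree, name)
```

With a real export (for example, a two-column interaction table), the same functions could be applied to the list of protein pairs, with `n=20` to obtain a top-20 list.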

Table S4

Transcription factors of common key genes screened by indirect PPi

TP53: AHR, AML1, AP1, AP2GAMMA, BEN, BRCA, CACD, CDX1, CDX2, CEBP, CEBPA, CETS1, CHCH, CJUN, CMAF, CMYB, CPBP, CPEB1, DR4, E2F3, EBF1, EGR1, EHF, ELF1, ELF5, ELK1, ETS2, ETV7, FLI1, FOXA1, FOXJ3, FOXL1, FOXM1, FOXO1, FOXO1A, FOXO3A, FOXP3, FRA1, GABPA, GATA1, GATA2, GATA3, GATA4, GKLF, HDAC1, HMGIY, HNF3A, HNF3G, HOXA13, HSF1, HSF4, IK, ING4, IRF1, IRF7, KID3, KLF, KLF17, LHX2, LKLF, MEIS1, MTF1, MYB, MYOGENIN, MZF1, NEUROD, NFAT1, NFAT3, NFAT4, NKX25, NKX2B, NMYC, NR1B2, OSX, PARP, PAX4, PEA3, PITX2, PLAGL2, PRRX2, PU1, PUR1, RELA, RORBETA, SALL2, SATB1, SMAD, SMAD2, SMAD5, SOX10, SOX11, SOX17, SOX18, SOX30, SOX4, SOX9, SPI1, SPIB, SREBP1, SRY, STAT, STAT3, TCF4, TEL2, THAP1, TORC2, USF, USF2, VMYB, WT1, YY1, ZBTB44, ZEB1, ZFP532, ZFP536, ZIC1, ZIC3, ZNF333, ZNF367, ZNF515

IL-8: ALX3, AP1, BCL6, CDX1, CDX2, CDXA, CETS1, CFOS, CMYB, CPBP, CREB, ELF1, ELK1, EMX1, ETS2, ETV7, EVX1, EVX2, FLI1, FOSL2, FOXC1, FOXD2, FOXD3, FOXG1, FOXI1, FOXK1, FOXL1, FOXM1, FOXO1, FOXO1A, FOXO3, FOXO4, FOXO6, FOXP3, FRA1, FXR, GATA1, GATA3, GATA4, GATA5, GSX1, GSX2, HMGIY, HMX3, HOXA1, HOXA13, HOXA2, HOXB13, HOXB5, HOXC13, HOXD13, JUNB, KID3, LBX2, LMX1A, LRH1, MEF2D, MEOX2, MIXL1, MSX2, MYB, NF1C, NFE2, NKX61, NMYC, NURR1, PARP, PEA3, PLZFB, PMX1, PRRX2, RAX, RORBETA, SATB1, SF1, SOX17, SOX4, SPI1, SPIB, SREBP1, TAF1, TATA, TBP, TEF1, TTF1, ZNF333

PIK3CA: AP2GAMMA, BCL6, CBF, CDX, CDX2, CDXA, CHCH, CPBP, EGR3, ETF, ETS1, FOS, FOSL1, FOXM1, FOXO1A, GABPA, GATA3, GKLF, HFH8, HNF3A, HOXA13, ISL2, KID3, KLF, LKLF, MAZR, MITF, NFAT1, NFAT4, NFE4, NR2E1, PLZFB, PMX1, RXRA, SALL2, SATB1, SMAD5, SOX4, SOX5, SPIB, SREBP2, SRY, TCF, TFE, WT1, ZAC, ZFP641, ZNF333, ZNF641

ELANE: SPIB, PARP, NR2C2

HSPA1L: KID3, EGR1, GABPA

HSPA1A: CHCH, CPBP, E2F3, EGR1, KLF7, KLF8, LKLF, MOVOB, PLAGL2, SP2, SP3, SP6, VMYB, ZBTB2

CCR2: ALX3, AML1, AP1, AP2GAMMA, BRCA, CEBP, CEBPB, CEBPD, CMYB, CPBP, CPEB1, DLX3, E2A, E2F1, EGR3, EMX1, EVX1, EVX2, FOXK1, FOXP3, GATA1, GKLF, GR, HMGIY, HOXA10, HOXA13, HOXA2, HOXD3, HSF2, IK, IK2, KID3, KLF, LHX2, LKLF, MAX, MEOX2, MYB, MYCMAX, NF1A, NFATC2, NMYC, OCT2, PARP, PAX5, PEBP2B, PMX1, PR, PRRX2, RELA, SALL2, SATB1, SMAD2, SOX10, SOX17, SOX18, SPIB, SREBP2, TCF1, TCF11, TEF1, THAP1, TORC2, USF, USF2, ZBTB2, ZEB1, ZNF333

CXCR2: AHRHIF, AML1, AP2REP, BARHL1, BARHL2, BEN, BRCA, BRN1, CDX1, CEBPA, CEBPB, CHCH, CP2, CPBP, DMBX1, E2A, E2F1, E2F3, ELF1, ETS, ETV7, FOXC1, FOXO1A, GATA2, GKLF, GMEB2, GR, HOXA13, HSF4, IK, ING4, IPF1, IRF7, ISL2, KID3, LBP1, LMX1, LPOLYA, MEF2C, MOVOB, MTF1, MYC, MYOD, MYOGENIN, MZF1, NANOG, NEUROD, NKX32, NMYC, NR1B2, NR2C2, OCT1, OG2, OTX, P300, PARP, PAX5, PEA3, PEBP2B, PIT1, PMX1, POU4F3, PRRX2, RORA1, RPC155, RXRA, SATB1, SMAD2, SOX10, SOX17, SOX18, SOX4, SOX5, SPI1, SPIB, SREBP2, SRY, STAT3, TAL1, TCF1, TCF11, TCFE2A, TFAP2C, TORC2, USF2, VDRRXRALPHA, ZBTB2, ZFP770, ZNF333, ZNF536

Note: Key genes are highlighted in bold.

Abbreviation: PPi, protein–protein interaction.

