Xiaochuang Liu1, Yanyan Zhang2, Hui Jiang1, Nannan Jiang3, Jiarong Gao1. 1. Department of Pharmacy, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, Anhui 230031, P.R. China. 2. Department of Pharmacy, Anhui Medical College, Hefei, Anhui 230601, P.R. China. 3. Department of Pharmacy, Anhui University of Chinese Medicine, Hefei, Anhui 230012, P.R. China.
Abstract
Asthma, a common but poorly controlled disease, is one of the most serious health problems worldwide; however, the mechanisms underlying the development of asthma remain unknown. Long non‑coding RNAs (lncRNAs) and mRNAs serve important roles in the initiation and progression of various diseases. The present study aimed to investigate the role of differentially expressed lncRNAs and mRNAs associated with asthma. Differentially expressed lncRNAs and mRNAs were screened between the expression data of 62 patients with asthma and 43 healthy controls. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed to investigate the biological functions and pathways associated with the lncRNAs and mRNAs identified. Protein‑protein interaction (PPI) networks were subsequently generated. In addition, lncRNA‑mRNA weighted co‑expression networks were obtained. In total, 159 differentially expressed lncRNAs and 1,261 mRNAs were identified. GO and KEGG analyses revealed that differentially expressed mRNAs regulated asthma by participating in the 'vascular endothelial (VEGF) signaling pathway', 'oxidative phosphorylation', 'Fc ε RI signaling pathway', 'amino sugar and nucleotide sugar metabolism', 'histidine metabolism', 'β‑alanine metabolism' and 'extracellular matrix‑receptor interaction' (P<0.05). Furthermore, protein kinase B 1 had the highest connectivity degree in the PPI network, and was significantly enriched in the 'VEGF signaling pathway' and 'Fc ε RI signaling pathway'. A total of 8 lncRNAs in the lncRNA‑mRNA co‑expression network were reported to interact with 52 differentially expressed genes, which were enriched in asthma‑associated GO and KEGG pathways. The results obtained in the present study may provide insight into the profile of differentially expressed lncRNAs associated with asthma. The identification of a cluster of dysregulated lncRNAs and mRNAs may serve as a potential therapeutic strategy to reverse the progression of asthma.
Asthma, a common but poorly controlled disease, is one of the most serious health problems worldwide; however, the mechanisms underlying the development of asthma remain unknown. Long non‑coding RNAs (lncRNAs) and mRNAs serve important roles in the initiation and progression of various diseases. The present study aimed to investigate the role of differentially expressed lncRNAs and mRNAs associated with asthma. Differentially expressed lncRNAs and mRNAs were screened between the expression data of 62 patients with asthma and 43 healthy controls. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed to investigate the biological functions and pathways associated with the lncRNAs and mRNAs identified. Protein‑protein interaction (PPI) networks were subsequently generated. In addition, lncRNA‑mRNA weighted co‑expression networks were obtained. In total, 159 differentially expressed lncRNAs and 1,261 mRNAs were identified. GO and KEGG analyses revealed that differentially expressed mRNAs regulated asthma by participating in the 'vascular endothelial (VEGF) signaling pathway', 'oxidative phosphorylation', 'Fc ε RI signaling pathway', 'amino sugar and nucleotide sugar metabolism', 'histidine metabolism', 'β‑alanine metabolism' and 'extracellular matrix‑receptor interaction' (P<0.05). Furthermore, protein kinase B 1 had the highest connectivity degree in the PPI network, and was significantly enriched in the 'VEGF signaling pathway' and 'Fc ε RI signaling pathway'. A total of 8 lncRNAs in the lncRNA‑mRNA co‑expression network were reported to interact with 52 differentially expressed genes, which were enriched in asthma‑associated GO and KEGG pathways. The results obtained in the present study may provide insight into the profile of differentially expressed lncRNAs associated with asthma. The identification of a cluster of dysregulated lncRNAs and mRNAs may serve as a potential therapeutic strategy to reverse the progression of asthma.
Asthma is a chronic inflammatory disease of the airways, characterized by airway hyperresponsiveness, chronic lung inflammation and airway remodeling (1,2). Asthma affects an estimated 358 million people worldwide, leading to a mortality rate of 0.4 million in 2015. Inhaled corticosteroids are the main method of pharmaceutical treatment; however, their effects are not satisfactory. Smoking and occupational asthmagens are the primary risk factors for asthma (3). Asthma affects people of all ages and there are no effective clinical treatments due to the complex pathophysiology of the condition (4). Therefore, understanding the molecular basis underlying the pathophysiology of asthma, and the screening of molecular markers for the development of therapeutic targets remains critical.Long non-coding RNAs (lncRNAs) are RNA transcripts >200 nucleotides in length (5). The various functions of lncRNAs have been reported with advances in sequencing and bioinformatics analysis (6). Previous studies have suggested that lncRNAs regulate gene expression by interacting with DNA, RNA or proteins (5,6). For example, the lncRNA MAR1 acts as a sponge for miR-487b, promoting skeletal muscle differentiation and regeneration (7). Thus, these transcripts may be considered as potential biomarkers and therapeutic targets (8–12); however, the expression profiles of lncRNAs and mRNAs in asthma remain unclear.In the present study, microarray data was downloaded from the National Center of Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database for the analysis of differentially expressed mRNA and lncRNA profiles in asthma. In addition, the database was used to investigate asthma-associated mRNAs and lncRNAs in airway epithelial brushings. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), protein-protein interaction (PPI) network and weighted correlation network analyses (WGCNAs) were applied to investigate the role of the differentially expressed lncRNAs and mRNAs in asthma. To the best of our knowledge, the present study is the first to comprehensively analyze mRNAs and lncRNAs in asthma. The findings may provide in-depth molecular insight into the pathophysiology of this condition.
Materials and methods
Tissue samples and data acquisition
The gene expression data of the GSE67472 dataset was downloaded from the NCBI GEO (www.ncbi.nlm.nih.gov/geo). The GEO is the largest database for high-throughput molecular data, primarily gene expression data (13), and contains links to ~20,000 published studies comprising 800,000 samples derived from >1,600 organisms (14). Following screening, the GSE67472 dataset with a large sample size was selected for analysis to ensure the stability and reliability of the data (15). Analysis was conducted via the GPL1355 platform of Affymetrix Human Genome U133 Plus 2.0 (Thermo Fisher Scientific, Inc.). The GSE67472 dataset containing the data of airway epithelial gene expression in 62 patients with asthma and 43 healthy control samples from airway epithelial brushings was included in the microarray analysis (16).
GeneChip probe re-annotation
Numerous lncRNAs were identified via the Affymetrix microarray according to the lncRNA classification pipeline developed in a previous study (17). The latest version of NetAffx Annotation File (release 36; HG-U133_Plus_2 Annotations; CSV format; 36 MB; accessed on 7th December 2016) was obtained from Affymetrix (Thermo Fisher Scientific, Inc.). This annotation file was mapped to the identifications (IDs) of the HG-U133_Plus_2 probe sets. For the probe sets from the RefSeq database (RefSeq Release 89; www.ncbi.nlm.nih.gov/refseq/), probes labeled with protein mixed (NP) were removed, but those with an ID beginning with non-coding RNAs (NR) were included. For the probe sets from the Ensembl database (ensemble 96; www.ensembl.org/index.html), the online software BioMart (GRCh38.p11; http://asia.ensembl.org/biomart/martview/f0b2ccb5ee23510bf3f1e71d87ba7122) was applied to convert Affymetrix microarray IDs to Ensembl IDs with the corresponding gene type. Probe sets from NONCODE were retained. Furthermore, genes annotated as ‘lincRNA’, ‘sense_intronic’, ‘processed_transcript’, ‘antisense’, ‘sense_overlapping’, ‘3prime_overlapping_ncRNA’ or ‘misc_RNA’ were retained. Finally, probe set IDs annotated as ‘microRNA’ or ‘snoRNAs’, and other small RNAs were removed.
Data preprocessing
Affymetrix Expression Console (v1.4; http://www.affymetrix.com/support/technical/byproduct.affx?product=expressionconsole) with Ro-bust microarray was applied to normalize the raw files. Limma package (http://bioinf.wehi.edu.au/limma) in R (version 3.5.1; http://cran.r-project.org/) was used to identify differentially expressed lncRNAs and mRNAs among the asthma and control groups via a t-test (18). Fold change (FC)>1.2 and P<0.05 were set as the criteria for differential expression (19). Hierarchical clustering was conducted in R scripts.
GO and pathway enrichment analysis
GO (geneontology.org) analysis, frequently used in functional enrichment studies of large-scale genes (20), and KEGG (www.genome.jp/kegg) enrichment analysis were performed to investigate the biological pathways that involve differentially expressed mRNAs. In the present study, clusterProfiler (v3.12.0; guangchuangyu.github.io/software/clusterProfiler) and Database for Annotation, Visualization and Integrated Discovery tools (v6.8; http://david.ncifcrf.gov/) were used to analyze the functional enrichment conditions for dysregulated mRNAs (21–23). The false discovery rate (FDR) was calculated to correct the P-value and FDR<0.05 was selected as the threshold for a statistically significant difference.
PPI network construction
Protein interactions were analyzed using the online Search Tool for the Retrieval of Interacting Genes (v 10.5; string-db.org) tool and a combined score of >0.7 was used as the cut-off criterion (24–26). PPI networks were subsequently generated using Cytoscape software (v3.2.8; cytoscapeweb.cytoscape.org) (27,28).
Construction of the lncRNA-mRNA WGCNA
The WGCNA package (29) in R was used to construct the co-expression network of lncRNAs and mRNAs as follows: i) Network construction, the weighted co-expression network of lncRNAs and mRNAs was specified in its adjacency matrix (amn), which encodes the network connection strength between nodes m and n. In order to calculate the amn, the default approach was employed, which defines the co-expression similarity (Smn) as the absolute value of the correlation coefficient between the nodes of m and n, and Smn=|cor (am, an)|. The weighted adjacency amn between two genes is proportional to their similarity on a logarithmic scale with absolute value of the correlation to a power β≥0.8, log (amn)=β × log (Smn). Adjacency functions were obtained by using the approximate scale-free topology criterion (30). The amn was converted into a topology matrix; and ii) module detection: The Dynamic Tree Cut and Static Tree Cut methods were applied to detect modules with ≥30 lncRNAs/genes (31). The cluster dendrogram was visualization using the WGCNA package.
Results
Identification of differentially expressed lncRNAs and mRNAs
Based on the screening criteria, FC>1.2 and P<0.05, differentially expressed lncRNAs and mRNAs in the GSE67472 dataset were identified. A total of 159 differentially expressed lncRNAs and 1,261 mRNAs were identified. In total, 48 lncRNAs were found to be upregulated and 111 lncRNAs were found to be downregulated while 744 mRNAs were found to be upregulated and 517 mRNAs were found to be downregulated. Hierarchical clustering was conducted to evaluate the altered expression of lncRNAs and mRNAs in the samples (Fig. 1).
Figure 1.
Hierarchical clustering of the differentially expressed lncRNAs and mRNAs. Upregulated expression is presented in red and downregulated expression is presented in green. Each row represents an lncRNA/mRNA probe, and each column represents a tissue sample. 1 (yellow) indicates samples from the healthy control group, 2 (green) indicates samples from the asthma group. lncRNA, long non-coding RNA.
Functional annotation and pathway analysis for differentially expressed mRNAs
GO and KEGG pathway enrichment analyses were performed to determine the functions of the identified differentially expressed mRNAs. GO analysis revealed that the top 30 terms of the 2,184 and 1,605 GO terms were enriched in the upregulated (Fig. 2A) and downregulated (Fig. 2C) genes, respectively. In the pathway analysis, the top 30 of the 245 and 219 pathways were enriched in the upregulated (Fig. 2B) and downregulated (Fig. 2D) genes, respectively.
Figure 2.
Function annotation and pathway analysis for differentially expressed mRNAs. The top 30 GO enrichment terms of the (A) up- and (C) downregulated genes are presented. The top 30 KEGG pathway terms from enrichment analysis of the (B) up- and (D) downregulated genes are presented. The enrichment factor refers to the ratio of the number of genes in the GO/KEGG entry following enrichment to the total number of genes in the GO/KEGG entry. An increased enrichment factor indicates greater enrichment. Lower P-values indicate higher significance. GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
PPI network
The PPI network of genes with significantly altered expression (FC>1.2 and P<0.05) was delineated using the STRING database. The PPI network contained 421 nodes and 1,232 edges (Fig. 3). Genes with the top 15 highest degrees are presented in Table I. Genes with a high degree may be potential targets for clinical treatment (32).
Figure 3.
PPI network of the differentially expressed genes constructed with Cytoscape software. Red nodes represent upregulated genes, while green nodes represent downregulated genes. The edges indicated the association between genes. The degree of one gene represents the number of interactions with other genes. The larger the node, the higher the connectivity. A high connectivity indicates the importance of a gene in the PPI network. PPI, protein-protein interaction.
Table I.
Top 15 genes with the connectivity in the protein-protein interaction network.
Gene
Node degree
Fold change (asthma/control)
P-value
Trend
AKT1
55
1.24
3.13×10−4
Up
ACACB
49
1.27
1.71×10−3
Down
DNAJC10
37
1.36
2.09×10−2
Up
FOS
35
1.37
6.04×10−3
Down
GART
35
1.43
3.91×10−4
Down
LRGUK
33
1.25
3.20×10−3
Down
INSR
31
1.44
1.00×10−5
Down
KIT
28
1.62
2.25×10−7
Up
ENO2
26
1.22
1.75×10−3
Down
HSPA5
25
1.33
1.33×10−2
Up
CD44
21
1.62
4.30×10−5
Up
DYNC2H1
21
1.33
4.00×10−5
Down
EGR1
21
1.31
4.19×10−2
Down
HDAC9
21
1.66
1.59×10−7
Up
HPGDS
21
1.48
2.00×10−5
Up
Up, upregulated; down, downregulated.
The lncRNA-mRNA co-expression network was established to investigate the association between differentially expressed lncRNAs and mRNAs. In the present study, a cluster dendrogram was obtained using the WGCNA package in R (Fig. 4A); two weighted co-expression subnetworks were identified. lncRNA-mRNA weighted co-expression networks were generated based on genes with the top 30 degrees in the two modules, which comprised 8 lncRNAs and 52 mRNAs. The two modules included a number of lncRNA-mRNA co-expression interactions, which indicated the mRNAs that could be regulated by certain lncRNAs. The lncRNAs in the blue and turquoise modules are listed in Table II. As presented in Fig. 4B and C, the round and arrow-shaped nodes represented mRNAs and lncRNAs, respectively; red indicated upregulation, while green indicated downregulation expression.
Figure 4.
Construction of the lncRNA-mRNA weighted co-expression network. (A) Cluster dendrogram. The subnetwork of genes with the top 30 connectivity degrees in (B) the blue module and (C) the turquoise module. The gene dendrogram indicated the co-expression modules defined by the weighted correlation network analysis. Red and green arrow heads represent up- and downregulated lncRNAs, respectively. Up- and downregulated mRNAs were represented by red and green circles, respectively. lncRNA, long non-coding RNA.
Table II.
lncRNAs in the blue module and the turquoise module.
lncRNA
Module
Trend
P-value
Fold change
AC124067.4
Blue
Down
1.99×10−2
1.31
ZNF667-AS1
Blue
Down
4.80×10−2
1.24
AC005906.2
Blue
Down
2.28×10−2
1.26
AL357568.2
Blue
Down
6.92×10−4
1.22
AC130650.2
Turquoise
Down
9.64×10−7
1.31
STX18-AS1
Turquoise
Down
8.91×10−4
1.24
LINC02363
Turquoise
Down
1.34×10−2
1.22
LINC02145
Turquoise
Down
2.00×10−6
1.37
lncRNA, long non-coding RNA; down, downregulated.
Functional annotation and pathway analysis for the differentially expressed mRNAs in the blue and turquoise modules
GO analysis revealed a total of 756 and 1,882 enriched terms in the blue and the turquoise modules, respectively. The top 30 terms are presented in Fig. 5A and C. KEGG analysis revealed that 116 and 164 pathways were enriched in the two modules, respectively. The top 30 pathways are presented in Fig. 5B and D.
Figure 5.
Functional annotation and pathway analysis for the differentially expressed mRNAs in the blue and turquoise modules. The top 30 GO annotations of the differentially expressed mRNAs in (A) the blue module and (C) the turquoise module are presented. The top 30 KEGG pathways from enrichment analysis in (B) the blue module and (D) the turquoise module are presented. The enrichment factor refers to the ratio of the number of genes located in the GO/KEGG entry following enrichment to the total number of genes in the GO/KEGG entry. An increased rich factor indicates greater enrichment. Lower P-values indicate higher significance. GO, gene ontolgy; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Discussion
Asthma is a common health issue, which poses an economic and social burden to patients (2); however, the pathogenesis of this condition remains poorly understood. To investigate the pathogenesis of asthma and to identify potential biomarkers for clinical treatment in the present study, lncRNA and mRNA expression data in the GSE67472 dataset were downloaded. A total of 159 and 1,261 dysregulated lncRNAs and mRNAs, respectively, were identified in bronchial mucosa samples obtained from patients with asthma compared with samples obtained from normal controls.Genes and their protein products are the basic components of cells, and can be assembled into functional modules. Gene co-expression networks are used to investigate the associations between gene transcripts (33). WGCNA is a biological application for screening clusters (modules) of highly associated genes and demonstrates the link between genes across microarray samples. This analytic tool has been utilized in numerous studies (12,18,29,34). Using WGCNA and bioinformatics analysis, the major biological functions of asthma-associated lncRNAs, and the potential underlying molecular mechanisms in which these lncRNAs may be involved, were investigated in the present study. WGCNA analysis identified 8 key lncRNAs and 52 genes in two modules; genes in the blue module were enriched in the GOs of several metabolic and catabolic processes. Previous studies have indicated that lncRNAs are involved in asthma (35,36). ZNF667 antisense RNA 1 (ZNF667-AS1) inhibits the inflammatory response of spinal cord injury by suppressing the Janus kinase-STAT pathway (37–41). The present study observed that the level of lncRNA ZNF667-AS1 in asthma was downregulated in comparison with healthy controls. Therefore, lncRNA ZNF667-AS1 may play a role in the pathogenesis of asthma.Airway remodeling plays an important role in the progressive worsening of asthma and is irreversible (1,2). Vascular endothelial growth factor (42,43), oxidative stress (44,45), the Fc ε RI signaling pathway (46,47), and the dysregulation of amino acids, sugar and nucleotide metabolism (48,49) have been implicated in asthma. Functional gene analysis of the 159 lncRNAs and 1,261 mRNAs identified in the present study demonstrated that the aforementioned pathways are involved in the pathogenesis of asthma. In order to investigate the role of the 8 key lncRNAs and 52 genes in two modules, functional analysis using GO and KEGG was conducted. Certain pathways, including ‘Toll-like receptors’ (50,51), ‘activation of PPAR signaling pathway’ (52) and ‘eosinophil apoptosis’ (53), are important for elucidating the mechanisms underlying the pathogenesis of asthma. Genes in the turquoise module were enriched in the following GO terms: ‘respiratory chain’, ‘regulation of T cell differentiation in thymus’ and ‘cilium movement involved in cell motility’. KEGG analysis revealed that genes in the turquoise module were enriched in the ‘Fc ε RI signaling pathway’ and ‘ECM-receptor interaction’. These results indicated that genes in the two modules may serve key roles in the initiation and development of asthma.In the present study, upregulated protein kinase B (Akt) exhibited the highest degree in the PPI network and was significantly enriched in the ‘VEGF signaling pathway’ and ‘Fc ε RI signaling pathway’. Akt is also involved in the phosphoinositide 3-kinase/Akt signaling pathway, which served a key role in lung inflammation and airway remodeling in a rat model of ovalbumin (OVA)-induced asthma (54). The expression of phosphorylated-Akt was increased in the lung tissues of rats with OVA-induced asthma compared with controls.lncRNAs regulate the transcription and expression of mRNAs, and participate in the initiation and development of various diseases (7,55). Therefore, the lncRNAs identified in the present study and their interacting genes may serve important roles in the onset and development of asthma. There are some limitations to the present study. The data was only extracted from one dataset that was not confirmed in independent studies. Additionally, cell, animal and clinical experiments are required to corroborate the results obtained in the present study as the mRNA expression level may not always be consistent with the protein level (56). Furthermore, the interactions between the identified lncRNA and mRNA requires experimental verification. Such future research may elucidate the mechanisms underlying lncRNAs and mRNAs in the development of asthma.