Yufeng Du1, Xiaoyan Hao1, Xuejun Liu1. 1. Department of Geriatric Diseases, The First Affiliated Hospital of Shanxi Medical University, Taiyuan, Shanxi 030001, P.R. China.
Abstract
The present study aimed to investigate the expression of the long non-coding RNA (lncRNA) cyclin dependent kinase inhibitor 2B-antisense RNA 1 (CDKN2B-AS1) in the peripheral blood of patients with idiopathic pulmonary fibrosis (IPF). A total of 24 patients with IPF and 24 healthy controls were included in the study; four patients with IPF and four healthy controls were selected randomly for RNA extraction. Neither group had other diseases such as hypertension or diabetes. RNA extracted from peripheral blood was subjected to high-throughput sequencing, and bioinformatics analysis was performed. Based on the selected differentially expressed lncRNAs and mRNAs, gene ontology analysis was performed to screen out the tumor-associated mRNAs. To reduce variance due to individual differences, a further 20 patients with IPF and 20 controls were studied; RNA extracted from their peripheral blood was used to verify the lncRNA and mRNA levels. According to the differential expression screening, 440 lncRNAs were upregulated and 1,376 were downregulated. High-throughput sequencing and bioinformatics analysis demonstrated that the expression of CDKN2B-AS1 was significantly decreased in patients with IPF compared with healthy controls. The gene adjacent to CDKN2B-AS1 was identified as CDKN2A, an important anti-oncogene, which is enriched in the p53-signaling pathway according to the Kyoto Encyclopedia of Genes and Genomes database. The expression of CDKN2B-AS1 and CDKN2A mRNA was significantly lower in the IPF group compared with the control group (P<0.05). The results suggest that the expression of CDKN2B-AS1 and its adjacent gene, CDKN2A, is downregulated in the peripheral blood of patients with IPF, which may act on the p53-signaling pathway to promote lung cancer formation.
Idiopathic pulmonary fibrosis (IPF) is a progressive and usually fatal lung disease characterized by fibroblast proliferation and extracellular matrix remodeling (1,2). It is a common disease in the elderly population, particularly in those aged 50–70 years. IPF may be associated with additional comorbidities, which, in addition to progressive exertional dyspnea, have an impact on the quality of life and survival of patients. IPF has few treatment options. As there are no effective therapies, timely detection and reduction of complications are important for improving the quality of life of patients. It has been reported that IPF is an independent risk factor of lung cancer (1,2), with non-small cell lung carcinoma (NSCLC) being the main pathological type. However, the underlying mechanism remains poorly understood.
Epigenetic alterations are involved in the pathogenesis of IPF and lung cancer. Long non-coding RNA (lncRNA) is a class of RNA with a length of >200 bases. With the development of gene sequencing and bioinformatics technology, increasing evidence has demonstrated that changes in lncRNA expression levels are associated with numerous diseases, including lung disease and neurological disease (3,4). The p53 gene has been revealed to have the highest number of genetic correlations with human tumor types and is an important tumor suppressor gene (5). The p53-mediated cell-signaling pathway serves an important role in the regulation of normal cellular activities.
Studies have demonstrated that lncRNA alterations and the p53-signaling pathway are involved in the process of IPF and lung cancer formation. Therefore, it was hypothesized that there are certain changes in lncRNAs related to the p53 gene, further associating IPF with lung cancer.
Thus, the present study aimed to investigate the differential expression of lncRNAs through high-throughput sequencing and bioinformatics analysis.
Materials and methods
Study population
A total of 24 patients with IPF, diagnosed according to the criteria established by the American Thoracic Society (6), and 24 healthy controls were involved in the study (all males; aged 67±3.2 vs. 64±2.8 years, respectively; P>0.05). Based on the uniformity of background including age and gender, four patients with IPF and four healthy controls were selected for RNA extraction. RNA extracted from peripheral blood was subjected to high-throughput sequencing, and bioinformatics analysis of lncRNA expression was performed. The remaining 20 patients with IPF and 20 healthy controls were further studied; RNA extracted from peripheral blood was used to verify the lncRNA and mRNA. Blood samples (3 ml) were stored at −70°C for further study. The present study was approved by the Shanxi Medical University Ethics Committee (Taiyuan, China). Written informed consent was obtained from each patient and healthy individual.
RNA extraction
Total RNA was isolated using TRIzol (Invitrogen; Thermo Fisher Scientific, Inc.) according to the manufacturer's protocol. RNA concentration and quality were assessed using a NanoDrop ND-1000 Spectrophotometer (NanoDrop Technologies; Thermo Fisher Scientific, Inc., Waltham, MA, USA). The A260/A280 ratio was between 1.8 and 2.1.
RNA sequencing
Ribosomal (r)RNA was removed from the total RNA (1 µg) samples using the Ribo-Zero Gold kit (Illumina, Inc., San Diego, CA, USA) according to the manufacturer's protocol. The rRNA-depleted samples were used for library construction using the NEBNext® Ultra™ RNA Library Prep kit (New England Biolabs, Inc., Ipswich, MA, USA) according to the manufacturer's protocol. Libraries were sequenced with the Illumina HiSeq Sequencer according to the manufacturer's protocol (Illumina, San Diego, CA, USA). Reads were trimmed and cleaned of Illumina adaptors and low-quality sequences using Cutadapt software (version 1.9.2; https://github.com/marcelm/cutadapt). Clean reads were mapped to the human genome [University of California Santa Cruz (UCSC) hg19; http://genome.ucsc.edu/cgi-bin/hgGateway] using TopHat2 software (version 2.1.2; http://ccb.jhu.edu/software/tophat/index.shtml) and unmapped reads were discarded. Cuffdiff software (version 2.1.2; part of cufflinks; https://github.com/cole-trapnell-lab/cufflinks) was used to perform expression analysis and differential expression analysis. LncRNAs were considered to be differentially expressed based on fragments per kilobase of transcript per million mapped reads (FPKM) >0.5 and fold change >2.0.
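The selection thresholds stated above (FPKM >0.5 and fold change >2.0) may be illustrated with a minimal Python sketch. It is not the Cuffdiff implementation and covers only the two cut-offs, not the statistical testing. The function names and the example FPKM values are hypothetical, and the sketch assumes that the FPKM threshold must be met in at least one group and that fold change is taken in either direction; the text does not specify these details.

```python
def fold_change(fpkm_ipf, fpkm_control, floor=1e-9):
    """Return (fold, direction): fold is >= 1, direction is 'up' or 'down' in IPF."""
    high = max(fpkm_ipf, fpkm_control)
    # The floor is an assumption to avoid division by zero for unexpressed transcripts.
    low = max(min(fpkm_ipf, fpkm_control), floor)
    direction = "up" if fpkm_ipf > fpkm_control else "down"
    return high / low, direction


def is_differentially_expressed(fpkm_ipf, fpkm_control, min_fpkm=0.5, min_fold=2.0):
    """Apply the two thresholds from the text (FPKM >0.5, fold change >2.0)."""
    # Assumption: the FPKM threshold applies to the more highly expressed group.
    if max(fpkm_ipf, fpkm_control) <= min_fpkm:
        return False
    fold, _ = fold_change(fpkm_ipf, fpkm_control)
    return fold > min_fold


# Hypothetical mean FPKM values as (IPF, control); these are not data from the study.
example = {
    "lnc_A": (1.0, 3.78),
    "lnc_B": (0.2, 0.3),
    "lnc_C": (4.0, 1.5),
    "lnc_D": (2.0, 2.5),
}

# Keep only transcripts passing both thresholds, recording their direction in IPF.
selected = {
    name: fold_change(*values)[1]
    for name, values in example.items()
    if is_differentially_expressed(*values)
}
print(selected)
```

In this illustration, lnc_B fails the expression threshold and lnc_D fails the fold-change threshold, so only lnc_A (down) and lnc_C (up) are retained.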
Gene ontology (GO) analysis and pathway analysis
GO enrichment analysis was performed for functional analysis of lncRNA-associated genes using the R package ‘topGO’ from Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/topGO.html). The significant pathways for predicting target genes were identified according to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/). Fisher's exact test was used to select the significant GO terms and pathways, and the threshold of significance was defined as P<0.05. RNA sequencing reads may be mapped to the reference genome, and any gene of interest may be directly visualized in the Integrative Genomics Viewer (http://www.broadinstitute.org/igv/).
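The enrichment test described above may be sketched as follows. A one-sided Fisher's exact test for over-representation is equivalent to the upper tail of the hypergeometric distribution, which is what this minimal Python example computes. The function name and the counts are hypothetical; whether the original analysis used a one-sided or two-sided test is not stated in the text, so the one-sided form is an assumption.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """One-sided Fisher's exact test P-value for pathway over-representation.

    n_background: all annotated genes considered
    n_pathway:    background genes annotated to the pathway
    n_selected:   differentially expressed genes
    n_overlap:    differentially expressed genes annotated to the pathway
    Returns P(X >= n_overlap) under the hypergeometric distribution.
    """
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only; not data from the study.
p = enrichment_p_value(n_background=10, n_pathway=4, n_selected=3, n_overlap=3)
print(p < 0.05)  # compare against the P<0.05 threshold used in the text
```

With these toy counts, the P-value is 4/120, since all three selected genes must fall among the four pathway genes.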
cDNA synthesis and reverse transcription-quantitative polymerase chain reaction (RT-qPCR)
Total RNA was used to make cDNA using PowerScript RT-PCR kit (Cloud-Seq Inc., Shanghai, China) according to the manufacturer's protocol with random primers supplied with the kit. The expression levels of lncRNAs and mRNAs were determined using SYBR Green I-based qPCR. The qPCR thermocycling conditions were maintained as follows: 95°C for 10 min; followed by 40 cycles of 95°C for 10 sec and 60°C for 60 sec. Data were analyzed using the comparative ∆Cq method (7) with β-actin as an endogenous reference gene. Primers were designed using Primer (version 5.0; https://wheat.pw.usda.gov/demos/BatchPrimer3/). The primers used were as follows: Cyclin dependent kinase inhibitor (CDKN)2B-antisense RNA 1 (AS1) forward, AACCGGGGAGATCTATTTGG and reverse, GGTGTGGTGTCTCACACCTG; CDKN2A forward, GGCTGTTCCTGGTCATGAT and reverse, TGTCCAGGAAGCCCTCC.
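The comparative Cq calculation with β-actin as the reference gene may be sketched as below. This is a minimal illustration of the widely used 2^(−ΔΔCq) form of the comparative method, assuming approximately 100% amplification efficiency for both target and reference; the function names and Cq values are hypothetical and are not taken from the study.

```python
def delta_cq(cq_target, cq_reference):
    """Normalize the target Cq to the endogenous reference (here, beta-actin)."""
    return cq_target - cq_reference


def relative_expression(sample_delta_cq, calibrator_delta_cq):
    """Relative expression as 2^-(ddCq), assuming ~100% amplification efficiency."""
    return 2.0 ** -(sample_delta_cq - calibrator_delta_cq)


# Hypothetical Cq values for illustration only.
control_dcq = delta_cq(cq_target=25.0, cq_reference=18.0)  # calibrator (control group)
ipf_dcq = delta_cq(cq_target=27.0, cq_reference=18.0)      # sample (IPF group)

ratio = relative_expression(ipf_dcq, control_dcq)
print(ratio)  # 0.25, i.e. a fourfold lower level than the calibrator
```

A higher ΔCq in the sample than in the calibrator therefore corresponds to lower relative expression.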
Statistical analysis
Results are expressed as the mean ± standard deviation. Differences between groups were analyzed by one-way analysis of variance (ANOVA). The Bonferroni post hoc test was used to identify which comparisons were significantly different following ANOVA. Statistical analysis was performed using SPSS software (version 17.0; SPSS, Inc., Chicago, IL, USA). P<0.05 was considered to indicate a statistically significant difference.
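The analysis was run in SPSS; the following minimal Python sketch only illustrates the quantities involved. It computes the one-way ANOVA F statistic with its degrees of freedom and applies a Bonferroni adjustment to a list of P-values. The function names and the numbers are hypothetical, and converting F to a P-value (via the F distribution) is deliberately left out to keep the example dependency-free.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_values):
    """Bonferroni-adjusted P-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical relative expression values for two groups; not data from the study.
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
print(f_stat, df_b, df_w)
print(bonferroni([0.01, 0.04, 0.5]))
```

With only two groups, as in the IPF versus control comparison, the one-way ANOVA F statistic equals the square of the pooled-variance two-sample t statistic.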
Results
lncRNA expression
A total of 1,816 differentially expressed lncRNAs were identified via screening, including 440 upregulated and 1,376 downregulated lncRNAs (Fig. 1). In addition, 1,124 differentially expressed mRNAs were identified (Fig. 2). Notably, downregulated lncRNAs were more common compared with upregulated lncRNAs. Among the lncRNAs, CDKN2B-AS1 (chr9:21802541-22121096) was identified to be the most significantly downregulated lncRNA.
Figure 1.
Differentially expressed lncRNA. An lncRNA expression signature of IPF. A total of 1,816 differentially expressed lncRNAs (rows) from hierarchical clustering were identified between IPF samples and normal samples (columns). Patient ID numbers are shown below the columns. The expression level of each lncRNA is represented by the number of standard deviations above (blue) or below (green) the average value for that gene across all samples. lncRNA, long noncoding RNA.
Figure 2.
Differentially expressed mRNA. An mRNA expression signature of IPF. A total of 1,485 differentially expressed mRNAs (rows) from hierarchical clustering were identified between IPF samples and normal samples (columns). Patient ID numbers are shown below the columns. The expression level of each mRNA is represented by the number of standard deviations above (blue) or below (green) the average value for that gene across all samples.
Pathway analysis
The significant pathways for predicting the target gene were identified according to the KEGG database using Fisher's exact test. The p53-signaling pathway was significantly associated with lung cancer (Fisher's exact test; P<0.013) and target gene-related pathways (Fisher's exact test; P<0.0039; Fig. 3; Table I).
Figure 3.
Target gene-related enriched pathways. According to the pathway analysis performed as part of the bioinformatics analysis, the p53-signaling pathway was among the most significant pathways for the differentially expressed genes. DE, differentially expressed; TNF, tumor necrosis factor; NF-κB, nuclear factor-κB; HIF-1, hypoxia inducible factor-1; ErbB, epidermal growth factor family of receptor tyrosine kinases.
CDKN2A is in the p53-signaling pathway. The Fisher P-values were 0.013 and 0.0039. CDKN2A was identified to be concentrated on the p53-signaling pathway according to the high-throughput sequencing results. TNF, tumor necrosis factor; CDKN2A, cyclin dependent kinase inhibitor 2A; NOD, nucleotide-binding oligomerization domain.
Target lncRNA
CDKN2A is an important component of the p53-signaling pathway. Using bioinformatics analysis, it was revealed that the gene adjacent to CDKN2A was CDKN2B-AS1. Among the aforementioned differentially expressed lncRNAs and mRNAs, the key lncRNA was CDKN2B-AS1 (chr9:21802541-22121096), with a 3.78-fold change; its expression was significantly decreased in patients with IPF. The gene ID of CDKN2B-AS1 is ENSG00000240498. According to the UCSC database, the CDKN2B-AS1 gene length is 631 bp. CDKN2B-AS1 was identified in the ‘Homo_sapiens_HG19.sorted.gtf’ database.
From the Ensembl database (http://www.ensembl.org/Homo_sapiens/Info/Index) and CNKI gene database, the transcript of CDKN2A was demonstrated to contain an alternate open reading frame (ARF) that encodes a protein which is structurally unrelated to the products of the other variants. This ARF product functions as a stabilizer of the tumor suppressor protein p53, with which it interacts (8). CDKN2A is known as an important tumor suppressor gene and is associated with IPF. The positional association between the two genes is antisense (Fig. 4), and the adjacent gene of CDKN2B-AS1 is CDKN2A. This further suggests that the two genes, CDKN2B-AS1 and CDKN2A, are simultaneously transcribed. CDKN2B-AS1 is located on chromosome 9 at approximately chr9:21802541-22121096 (Fig. 4).
Figure 4.
Association between CDKN2A and CDKN2B-AS1. This long noncoding RNA is located on chromosome 9 at approximate locations (chr9:21802541-22121096), and the adjacent gene is CDKN2A. CDKN2B-AS1, cyclin dependent kinase inhibitor 2B-antisense RNA 1; Chr, chromosome.
From the aforementioned results, the CDKN2B-AS1 and CDKN2A expression levels were determined in the remaining IPF and control group cases (n=20 each) using RT-qPCR technology. The results revealed that CDKN2B-AS1 and CDKN2A expression levels were significantly decreased in the IPF group compared with the control group (*P<0.05; Fig. 5).
Figure 5.
Validation of long noncoding RNA and mRNA microarray data by reverse transcription-quantitative PCR. The relative expression level of each lncRNA and mRNA was normalized. Data are presented as the mean ± standard deviation, *P<0.05. PCR, polymerase chain reaction; IPF, idiopathic pulmonary fibrosis.
Discussion
IPF is a chronic, progressive and fatal diffuse interstitial lung disease. It has re-emerged as a focus of scientific study due to its increasing incidence, the progressive dyspnea it causes and the lack of effective treatments (9). The diagnosis of IPF often requires a multidisciplinary approach, involving pulmonologists, radiologists and pathologists experienced in the field of interstitial lung diseases (6,10). As early as 1957, the incidence of lung cancer in patients with IPF was identified to be higher compared with healthy controls (11), and it has since been confirmed that the incidence of lung cancer is increased in patients with IPF compared with the general population (12–16). A previous study identified that IPF is an independent risk factor of lung cancer (14). Recently, the view that lung cancer occurs as a late complication of IPF rather than being incidental has been supported (17). Studies have consistently demonstrated that elderly male patients with IPF and a history of smoking are more likely to develop lung cancer (18–20).
Typical high-resolution computed tomography (HRCT) findings for lung cancer with IPF are well-defined nodules with lobulation in the peripheral subpleural areas inside or adjacent to the fibrosis (21). Kim et al (22) reported that the highest proportion of IPF with lung cancer cases is adenocarcinoma, but Lee et al (23) reported that the highest proportion was squamous cell carcinoma. This difference may be associated with the different sample populations that were chosen; however, patients with IPF are prone to having non-small cell lung cancer (24). The reported incidence of lung cancer associated with IPF is inconsistent, which may be due to the different diagnostic criteria used. Certain reports have noted the incidence of lung cancer in patients with IPF as 4.8–48% and in those without pulmonary fibrosis as 2.0–6.4% (1); thus, IPF has been considered a precancerous condition.
IPF has been associated with lung cancer; however, the underlying mechanism remains to be elucidated. There are a number of genes and signaling pathways involved in this process, including the p53-signaling pathway.
Of all human cancers, ~50% lack a wild-type p53 allele and thus fail to produce a normal version of the p53 protein (5). The presence of multiple mutations in the p53 gene may explain the high incidence of IPF complicated by lung carcinoma (25). Kawasaki et al (26) have reported that tumor suppressor p53 is altered in squamous metaplasia, and in dysplastic bronchial and alveolar epithelia, in patients with IPF. Exposure to carcinogens, tobacco and aging may cause the inactivation of tumor suppressor genes and lead to lung cancer in patients with IPF (1). Reduced levels of p53 have been identified in cancer of the colon, lung, esophagus, breast, liver, brain, reticuloendothelial tissues and hemopoietic tissues. The incidence of positive anti-p53 antibody in IPF, irrespective of the existence of lung cancer, was as high as that in lung cancer (27). These findings suggest that the p53-signaling pathway is associated with lung cancer (28–30) and with IPF (31). Thus, we hypothesized that mutation or downregulation of p53 may be associated with a high incidence of lung cancer in patients with IPF.
Previously, lncRNA molecules were generally considered to be byproducts formed during the transcription of the genome, and were defined as transcriptome ‘background noise’ without biological functions. It has now been reported that lncRNAs are involved in numerous important biological processes, including gene imprinting, cell proliferation and differentiation, immune responses, and chromosome structure.
LncRNA may serve a role in a variety of mechanisms, including the following: Rupturing of small RNAs (short RNAs); specific binding to chromosomes in Hox gene loci; regulating epigenetic activity by trans-acting; forming DNA-DNA-RNA triple helix structures to inhibit promoter activity; and coding mRNA antisense transcripts to regulate gene activity (32). Previous research confirmed that lncRNA is associated with numerous chronic pulmonary disease types and serves important roles in the biological processes of lung cancer and IPF (33).
Through microarray analysis of bleomycin-induced pulmonary fibrosis in a mouse model, the differential expression of lncRNA was confirmed (34). Furthermore, lncRNA imbalance is a characteristic of numerous types of cancer, which may be involved in promoting tumor progression, invasion and metastasis (35–37). lncRNA has been demonstrated to be involved in cell proliferation, apoptosis, epithelial-mesenchymal transition and other biological processes, which regulate tumorigenesis and metastasis (38–40). lncRNAs may either facilitate or inhibit the progression of lung cancer and the various pathways involved (41–43).
In the present study, differentially expressed lncRNAs were screened in IPF; lncRNA CDKN2B-AS1 was identified, and a fold-change of 3.78 in FPKM was observed (P<0.05). The gene ID for CDKN2B-AS1 was ENSG00000240498. According to the UCSC database, the CDKN2B-AS1 gene length was 631 bp. However, there is currently little research on CDKN2B-AS1. Certain reports have noted that CDKN2B-AS1 is associated with hypertension and myocardial injury (44). In genome-wide association studies of individual cancer types, genetic polymorphisms in the CDKN2A/2B-AS1/2B/methylthioadenosine phosphorylase gene cluster were associated with melanoma (45).
lncRNAs act through several mechanisms; they are able to inhibit the expression of certain genes and may enhance the expression of neighboring genes (46,47).
A previous study demonstrated that lncRNAs serve an important role in the process of transcription, particularly that of neighboring genes (33). It has been reported that adjacent pairs of genes often exhibit correlated expression patterns throughout the cell cycle (48–50). lncRNAs may regulate the transcription of adjacent genes, thus affecting their biological roles. According to the bioinformatics analysis in the present study, the gene adjacent to CDKN2B-AS1 was CDKN2A. CDKN2A is a cyclin-dependent kinase inhibitor and an important tumor suppressor gene. It is involved in the regulation of cell proliferation and apoptosis, and encodes the proteins p16INK4a and p14ARF, which function via retinoblastoma protein and p53 protein, respectively (51). Altered expression levels of the CDKN2A gene have been reported in numerous tumor types, such as tumors of the lung, breast, brain, bone, skin, bladder, kidney, ovary and lymphocytes (52); thus, it is a focus of oncogenetic studies. CDKN2A is considered an exogenous marker, which may be detected at an early stage in the sputum and bronchoalveolar lavage fluid of patients with lung cancer (53). Busch et al (8) identified that induction of ARF is an early response in lung tumorigenesis that mounts a barrier against tumor growth and malignant progression. According to the GeneWays7.0 database, Li et al (54) reported that CDKN2A is associated with non-small cell lung cancer. In addition, Cisneros et al (55) demonstrated that many IPF fibroblasts exhibit decreased expression of the proapoptotic p14ARF attributable to promoter hypermethylation. The present study revealed that CDKN2A expression was decreased significantly in patients with IPF, which was consistent with a previous report (55). Furthermore, CDKN2A was enriched in the p53-signaling pathway according to the high-throughput sequencing results.
It is involved in the regulation of the p53 gene and may therefore be a key factor in patients with IPF and lung cancer.
In conclusion, the present study demonstrated that CDKN2B-AS1 expression is decreased significantly in patients with IPF, while the expression of its adjacent gene CDKN2A is reduced simultaneously. Thus, it may promote the occurrence of lung cancer by regulating p53-signaling pathways.