A group of long noncoding RNAs identified by data mining can predict the prognosis of lung adenocarcinoma.

Meijian Liao1,2, Qing Liu1,2, Bing Li1,2, Weijie Liao1,2, Weidong Xie2,3, Yaou Zhang2,3.   

Abstract

Long noncoding RNAs (lncRNAs) are reported to be potential cancer biomarkers. This study aims to find new lncRNA biomarkers relevant to lung adenocarcinoma. Gene expression profiles and clinical data of lung adenocarcinoma and lung squamous cell carcinoma patients were downloaded from the UCSC Xena database. These data were analyzed to identify potential lncRNA prognostic biomarkers, and the candidate lncRNAs were analyzed and verified with association analysis, meta-analysis, survival analysis, gene ontology analysis, gene set enrichment analysis, and other statistical methods. A group of 5 lncRNAs was identified from the 1965 differentially expressed (fold-change >2) genes. Four of these 5 lncRNAs were expressed at a lower level in lung adenocarcinoma tissues and the other one at a higher level (P < .0001). A risk score model was constructed using a linear combination of the expression levels of these lncRNAs. High-risk patients showed poorer overall survival (hazard ratio [HR] = 2.14; 95% confidence interval [CI], 1.67-3.06, P < .0001), disease-free survival (HR = 1.84; 95% CI, 1.26-2.35, P = .0007), and recurrence-free survival (HR = 1.51; 95% CI, 1.02-2.40, P = .04). The 5-fold cross-validation and subsequent meta-analysis further verified that patients in the low-risk group had better survival (95% CI, 0.74-1.79, Z = 4.72, P < .00001). Furthermore, both univariate and multivariate Cox regression analyses revealed that the prognostic value of these 5 lncRNAs was independent of other clinical prognostic factors. Further analysis indicated that these 5 lncRNAs might be associated with tumor metastasis. Taken together, our study suggests new prognostic lncRNA biomarkers for lung adenocarcinoma.
© 2018 The Authors. Cancer Science published by John Wiley & Sons Australia, Ltd on behalf of Japanese Cancer Association.

Keywords:  adenocarcinoma of lung; long noncoding RNA; prognosis; survival; tumor biomarker

Year:  2018        PMID: 30290038      PMCID: PMC6272079          DOI: 10.1111/cas.13822

Source DB:  PubMed          Journal:  Cancer Sci        ISSN: 1347-9032            Impact factor:   6.716


Abbreviations: CI, confidence interval; GO, gene ontology; GSEA, gene set enrichment analysis; HR, hazard ratio; lncRNA, long noncoding RNA; ncRNA, noncoding RNA; TCGA, The Cancer Genome Atlas

INTRODUCTION

Lung cancer is one of the most common and life‐threatening cancers worldwide.1 In fact, the 5‐year survival rate of lung cancer patients is only 10%‐15% due to late diagnosis and the limitations of conventional treatments.2, 3 The molecular characterization of lung cancer is becoming essential for pathological diagnosis, treatment decisions, and prognosis estimation. Approximately 85% of lung cancer is non‐small‐cell lung cancer (NSCLC), and approximately 50% of NSCLC is lung adenocarcinoma. Therefore, we focused on lung adenocarcinoma in this study. Some lung adenocarcinoma patients show EML4‐ALK rearrangement, KRAS (KRAS proto‐oncogene, GTPase) mutations, and epidermal growth factor receptor (EGFR) overexpression or mutations,4, 5, 6, 7 and these alterations have been used as biomarkers in lung cancer patients. However, only a small percentage of patients show these abnormalities. Thus, more lung adenocarcinoma biomarkers are needed. Protein molecules are common biomarkers; however, protein‐coding genes constitute <2% of the mammalian genome, and more than 80% of genes produce ncRNAs.8 Long ncRNAs (lncRNAs) are a class of ncRNAs longer than 200 nucleotides.9 Although the biological functions of most lncRNAs have not been characterized, there is increasing evidence that they play important roles in physiological and pathological processes, such as regulating cancer metastasis.10, 11, 12, 13, 14, 15 Long ncRNAs have been reported to act as potential biomarkers that have predictive value for the survival of cancer patients. For example, prostate cancer antigen 3 (PCA3) is considered to be an important biomarker in prostate cancer.16, 17 Additionally, metastasis‐associated lung adenocarcinoma transcript 1 (MALAT‐1) and colon cancer‐associated transcript 2 (CCAT2) have been reported to act as biomarkers in lung cancer patients.18, 19, 20 In this study, we aimed to find and validate new lncRNAs that can serve as prognostic biomarkers in lung adenocarcinoma patients.

MATERIALS AND METHODS

Datasets

The gene expression profile data of lung adenocarcinoma and lung squamous cell carcinoma patients were downloaded from the UCSC Xena database (http://xena.ucsc.edu/). The corresponding clinical information was retrieved from TCGA database.21 Tissues without expression or clinical survival information were removed from the analysis. The UCSC Xena website offers tools for the visualization and exploration of TCGA genomic data.

Hierarchical clustering

Information regarding LOC723809 (LHFPL3‐AS2), LOC150622 (LINC01105), NCRNA00092 (LINC00092), LOC284276 (LINC00908), and LOC100131726 (FAM83A‐AS1) expression in lung adenocarcinoma tissues was downloaded and normalized using a Z score analysis. Hierarchical clustering was carried out using R package gplots.22
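The Z score normalization described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the expression values are invented, and the use of the sample standard deviation (n − 1 denominator) is an assumption.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator); an assumption
    if s == 0:
        raise ValueError("constant expression cannot be standardized")
    return [(v - m) / s for v in values]

# Hypothetical expression values of one lncRNA across five samples
expression = [2.0, 4.0, 6.0, 8.0, 10.0]
normalized = z_scores(expression)
```

After this step every lncRNA is on the same scale, so the clustering is driven by expression pattern rather than by absolute abundance.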

Gene ontology analysis

Gene co‐expression with these 5 lncRNAs was defined by Pearson's correlation coefficient for the correlation between the expression of genes and these 5 lncRNAs. Pearson's correlation coefficient was calculated using the cor function in R. Genes with absolute coefficients higher than 0.3 were selected for a functional enrichment analysis using the DAVID Bioinformatics Tool (https://david.ncifcrf.gov/).23 Gene ontology functional clusters with P < .05 were considered to indicate potential biological functions of these lncRNAs.
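As a hedged illustration of the co‐expression filter described above, the Python sketch below computes Pearson's correlation coefficient and keeps genes whose absolute coefficient exceeds 0.3. The gene names and expression profiles are hypothetical; the original analysis used the cor function in R.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def coexpressed(lncrna_profile, gene_profiles, threshold=0.3):
    """Return genes whose |r| with the lncRNA exceeds the threshold."""
    selected = {}
    for name, profile in gene_profiles.items():
        r = pearson(lncrna_profile, profile)
        if abs(r) > threshold:
            selected[name] = r
    return selected

# Hypothetical profiles over five samples
lncrna = [1.0, 2.0, 3.0, 4.0, 5.0]
genes = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with the lncRNA
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the lncRNA rises
    "GENE_C": [3.0, 1.0, 4.0, 1.0, 3.0],   # no linear relationship
}
selected = coexpressed(lncrna, genes)
```

Both positively and negatively correlated genes pass the filter because the threshold is applied to the absolute value of the coefficient.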

Gene co‐expression network

Gene co‐expression networks were established to study the relationships between these 5 lncRNAs. Pearson's correlation coefficients of the lncRNA expression profiles were calculated. The network was completed using Cytoscape software.24 In the gene co‐expression networks, genes were connected by edges.

Association analysis

High and low lncRNA expression was determined based on the median patient expression level. Associations were analyzed using the apriori function in the arules package in R.25, 26 The subset function was used to select rules connected to survival status or lymph node status. The results of the association analysis were visualized by the arulesViz package in R.27
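The quantities behind an association rule can be sketched in a few lines of Python. The example below dichotomizes expression at the median and computes support, confidence, and lift for a single rule of the form "high expression implies deceased". The patient data are invented, and treating values equal to the median as "low" is an assumption; the apriori function in arules additionally searches over many candidate rules, which this sketch does not do.

```python
from statistics import median

def is_high(values):
    """Flag values above the median (ties count as low; an assumption)."""
    cut = median(values)
    return [v > cut for v in values]

def rule_metrics(antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    n = len(antecedent)
    both = sum(1 for a, c in zip(antecedent, consequent) if a and c)
    n_antecedent = sum(antecedent)
    n_consequent = sum(consequent)
    support = both / n
    confidence = both / n_antecedent
    lift = confidence / (n_consequent / n)
    return support, confidence, lift

# Hypothetical expression of one lncRNA and survival status in 8 patients
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
deceased = [False, False, False, True, False, True, True, True]

support, confidence, lift = rule_metrics(is_high(expression), deceased)
```

A lift above 1 means the outcome is more frequent among patients matching the rule than in the cohort overall.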

Meta‐analysis of survival datasets

The meta‐analysis was carried out using Review Manager Version 5.3 (2014; The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, Denmark). The HR with a 95% CI in a fixed‐effects model was used to analyze the correlation between survival and risk score level. The significance of the pooled HR was determined through a Z test with a threshold of P < .05. A heterogeneity analysis was carried out using the I² statistic and χ² test, and the combination of I² > 50% plus a χ² test P value < .1 was defined as heterogeneity across the studies. No heterogeneity was observed in our study; therefore, the pooled HR estimates were calculated using the fixed‐effects model.
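A fixed‐effects, inverse‐variance pooling of hazard ratios, together with Cochran's Q and the I² statistic, can be sketched as below. This is an illustrative Python version of the standard formulas, not a reproduction of Review Manager; the per‐group HRs and standard errors are invented.

```python
from math import exp, log, sqrt

def fixed_effect_pool(hazard_ratios, standard_errors):
    """Pool HRs on the log scale with inverse-variance weights."""
    log_hrs = [log(hr) for hr in hazard_ratios]
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_hrs)) / total
    se_pooled = sqrt(1.0 / total)
    z = pooled / se_pooled
    # Cochran's Q and I^2 (percentage of variation due to heterogeneity)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "hr": exp(pooled),
        "ci": (exp(pooled - 1.96 * se_pooled), exp(pooled + 1.96 * se_pooled)),
        "z": z,
        "q": q,
        "i2": i2,
    }

# Hypothetical HRs and standard errors of log(HR) from three groups
result = fixed_effect_pool([1.5, 2.0, 2.5], [0.5, 0.5, 0.5])
```

With equal standard errors the pooled HR reduces to the geometric mean of the individual HRs, which is a quick way to check the arithmetic.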

Survival analysis

The relationship between lncRNA expression and patient survival was assessed by Cox regression analysis using the coxph function of the R statistical software. A risk score model was built using a linear combination of the expression levels of the 5 lncRNAs with weighted coefficients. The patients were divided into low‐risk and high‐risk groups according to the best cut‐off value of the risk score. Patients with risk scores equal to or less than the best cut‐off value were defined as low‐risk patients, while those with risk scores higher than the best cut‐off value were defined as high‐risk patients. Kaplan‐Meier survival and log‐rank tests were undertaken to assess the differences between these two groups.
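The risk score and the grouping rule described above can be sketched as follows. The coefficients and expression values are hypothetical placeholders, since the fitted Cox coefficients are not listed in the text; the cut‐off of 0.258 is the value reported in the Results.

```python
def risk_score(expression, coefficients):
    """Linear combination of expression levels weighted by coefficients."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def risk_group(score, cutoff=0.258):
    """Scores at or below the cut-off are low risk; higher scores are high risk."""
    return "low" if score <= cutoff else "high"

# Hypothetical weights (NOT the fitted values from the study)
coefficients = {
    "FAM83A-AS1": 0.5,
    "LHFPL3-AS2": -0.25,
    "LINC01105": -0.25,
    "LINC00092": -0.25,
    "LINC00908": -0.25,
}

# Hypothetical normalized expression for two patients
patient_a = {"FAM83A-AS1": 2.0, "LHFPL3-AS2": 0.5, "LINC01105": 0.5,
             "LINC00092": 0.5, "LINC00908": 0.5}
patient_b = {"FAM83A-AS1": 1.0, "LHFPL3-AS2": 1.0, "LINC01105": 1.0,
             "LINC00092": 1.0, "LINC00908": 1.0}

score_a = risk_score(patient_a, coefficients)
score_b = risk_score(patient_b, coefficients)
groups = (risk_group(score_a), risk_group(score_b))
```

A patient whose score equals the cut‐off exactly falls in the low‐risk group, matching the "equal to or less than" wording of the definition.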

Gene set enrichment analysis

The potential biological pathways of the identified lncRNAs were analyzed using GSEA version 2.2.0 software.28 All patient risk scores were calculated according to the expression pattern of the lncRNAs. The patients were then divided into two groups based on the median risk score. Patients with a risk score above the median formed the high‐risk group (N = 127), and those with a risk score equal to or less than the median were defined as the low‐risk group (N = 128). The gene sets were analyzed using h.all.v5.1.symbols.gmt downloaded from MSigDB (http://software.broadinstitute.org/gsea/msigdb/download_file.jsp?filePath=/resources/msigdb/5.1/h.all.v5.1.symbols.gmt). One thousand permutations of each gene set were used.
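The group sizes quoted above follow directly from a median split of 255 patients when ties with the median go to the low‐risk group. The short Python sketch below shows this with placeholder scores; the patient identifiers and score values are invented.

```python
from statistics import median

def median_split(scores):
    """Split patients into high (> median) and low (<= median) groups."""
    cut = median(scores.values())
    high = [pid for pid, s in scores.items() if s > cut]
    low = [pid for pid, s in scores.items() if s <= cut]
    return high, low

# 255 hypothetical patients with distinct placeholder risk scores
scores = {f"P{i:03d}": i / 255 for i in range(255)}
high, low = median_split(scores)
```

With an odd number of distinct scores, the median patient lands in the low group, which gives 127 high‐risk and 128 low‐risk patients.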

Statistical analyses

A Mann‐Whitney U analysis was applied to compare the expression levels of lncRNAs between normal and adenocarcinoma lung tissues. The log‐rank test was used to compare the survival rate between two groups. The χ² test was used to compare the death status, survival time, and tumor stage between two groups. A P value <0.05 was considered to indicate statistical significance.
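For readers unfamiliar with the Mann‐Whitney U statistic, the following Python sketch computes it by direct pairwise comparison. The expression values are invented, and the sketch stops at the U statistic; obtaining a P value would require the exact distribution or a normal approximation, which is not shown here.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b): pairwise wins for each group, ties counted as 0.5."""
    u_a = 0.0
    for x in group_a:
        for y in group_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

# Hypothetical expression of one lncRNA in normal and tumor tissue
normal = [5.1, 4.8, 5.5]
tumor = [2.0, 2.3, 1.9, 5.0]
u_normal, u_tumor = mann_whitney_u(normal, tumor)
```

The two statistics always sum to the number of pairs, and a value close to zero for one group indicates that its values are almost always lower than those of the other group.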

RESULTS

Identification of a group of lncRNAs associated with survival of lung adenocarcinoma patients

To identify potential lncRNA biomarkers, we analyzed the lung adenocarcinoma patients in TCGA cohort. We first compared gene expression between normal (N = 58) and adenocarcinoma (N = 513) lung tissues and identified 1,965 genes (fold‐change >2) showing differential expression between the two groups. To identify a group of associated lncRNAs, we analyzed the relationships between the lncRNAs within these 1,965 genes. A Pearson correlation coefficient with an absolute value larger than 0.3 was considered to indicate a correlation. This analysis identified 5 lncRNAs, and we further investigated the relationships between these genes by constructing a gene co‐expression network. The expression of LOC100131726 (FAM83A‐AS1) was negatively correlated with that of LOC723809 (LHFPL3‐AS2), whereas the expression levels of LOC723809 (LHFPL3‐AS2), LOC150622 (LINC01105), LOC284276 (LINC00908), and NCRNA00092 (LINC00092) were positively correlated with each other (Figure 1A). An association analysis was performed to confirm this result, and the results showed that the expression of these 5 lncRNAs formed 2 independent clusters (Figure 1B). Four of the lncRNAs (LOC723809 [LHFPL3‐AS2], LOC150622 [LINC01105], NCRNA00092 [LINC00092], and LOC284276 [LINC00908]) were expressed at a lower level and one (LOC100131726 [FAM83A‐AS1]) was overexpressed in adenocarcinoma tissues (P < .0001; Figure 1C). To further confirm our results, hierarchical clustering was used to analyze the systematic variations of these 5 lncRNAs in the same samples. It is clear from Figure 1D that the expression pattern of LOC100131726 (FAM83A‐AS1) is different from the other 4 lncRNAs. Finally, the alterations in their DNA copy number were investigated in 7,589 adenocarcinoma samples.29 The LOC723809 (LHFPL3‐AS2) and LOC150622 (LINC01105) genomic loci were not frequently lost. The NCRNA00092 (LINC00092) locus was deleted in 10%‐15% of the patients, whereas the LOC284276 (LINC00908) locus was deleted in 30%‐45% of the samples, and LOC100131726 (FAM83A‐AS1) was amplified in 30%‐40% of the patients (Figure 1E).
Figure 1

Association of long noncoding RNAs (lncRNAs) with lung adenocarcinoma. A, Gene expression network of 5 lncRNAs in lung adenocarcinoma tissues. Green lines indicate genes that are positively correlated with each other; red lines indicate negative correlations. B, Association analysis of the expression of these 5 lncRNAs. The lift value is shown by the color intensity, and the size of each circle indicates the confidence value. C, Mann‐Whitney U analysis comparing the expression levels of the 5 lncRNAs between normal lung (blue squares; N = 58) and lung adenocarcinoma (red circles; N = 513) tissues. D, Heat map of the lncRNA expression levels in normal lung and lung adenocarcinoma tissues. E, DNA copy number alterations across all chromosomes in 7,589 adenocarcinoma samples (Progenetix histoplot). Blue indicates DNA deletion; yellow indicates DNA amplification

Analysis of the prognostic value of these lncRNAs in lung adenocarcinoma patients

After identifying a group of lncRNAs showing differential expression in lung adenocarcinoma, we examined whether their expression was associated with prognosis in lung adenocarcinoma patients. A risk score model was constructed using a linear combination of the expression levels of these 5 lncRNAs with weighted coefficients. A time‐dependent receiver operating characteristic curve was determined to evaluate the optimal cut‐off value. Patients with a risk score equal to or less than 0.258 were defined as low‐risk patients, whereas those with a score >0.258 were defined as high‐risk patients. A Kaplan‐Meier survival curve was plotted to compare the overall survival difference between these 2 groups (Figure 2A, N = 502). The same method was used to analyze the relationships between this group of lncRNAs and disease‐free (Figure 2B, N = 428) or recurrence‐free survival (Figure 2C, N = 351). High‐risk patients showed poor overall survival (HR = 2.14; 95% CI, 1.67‐3.06, P < .0001), disease‐free survival (HR = 1.84; 95% CI, 1.26‐2.35, P = .0007), and recurrence‐free survival (HR = 1.51; 95% CI, 1.02‐2.40, P = .04). We then investigated the relationship between the risk score and clinicopathological factors in the same cohort and found that the lymph node status (P < .0001), tumor grade (P = .016), tumor stage (P < .0001), and smoking status (P = .008), but not gender or tumor size, were correlated with the risk score (Table 1).
Figure 2

Overall survival, recurrence‐free survival, and disease‐free survival in relation to long non‐coding RNA (lncRNA) expression levels. A‐C, Kaplan‐Meier survival curves comparing overall survival (A, N = 502), disease‐free survival (B, N = 428), and recurrence‐free survival (C, N = 351) between low‐ and high‐risk lung adenocarcinoma patients. D,E, Association between the expression of these 5 lncRNAs and a survival status of living (D) or deceased (E). HR, hazard ratio

Table 1

Associations of risk score with clinicopathological factors of patients with lung adenocarcinoma or lung squamous cell carcinoma

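| Patient features | Category | Sample size | High risk, N (%) | Low risk, N (%) | P value |
|---|---|---|---|---|---|
| Lymph node | Negative | 327 | 145 (29.06) | 182 (36.47) | <.0001 |
| | Positive | 172 | 107 (21.44) | 65 (13.03) | |
| | NA | 12 | | | |
| Tumor grade | T1 | 168 | 67 (13.19) | 101 (19.88) | .016 |
| | T2 | 275 | 151 (29.72) | 124 (24.41) | |
| | T3 | 46 | 25 (4.92) | 21 (4.13) | |
| | T4 | 19 | 11 (2.17) | 8 (1.57) | |
| | NA | 3 | | | |
| Age, years | ≤65 | 235 | 126 (25.66) | 109 (22.20) | .114 |
| | >65 | 256 | 119 (24.24) | 137 (27.90) | |
| | NA | 20 | | | |
| Smoking | Non‐smoking | 75 | 27 (5.44) | 48 (9.68) | .008 |
| | Smoking | 421 | 221 (44.56) | 200 (40.32) | |
| | NA | 15 | | | |
| Gender | Female | 274 | 137 (26.86) | 137 (26.86) | 1.000 |
| | Male | 236 | 118 (23.14) | 118 (23.14) | |
| | NA | 1 | | | |
| Tumor size | ≤1 cm | 171 | 81 (22.13) | 90 (24.59) | .722 |
| | >1 cm | 195 | 96 (26.23) | 99 (27.05) | |
| | NA | 145 | | | |
| Tumor stage | Stage I | 278 | 120 (23.58) | 158 (31.04) | <.0001 |
| | Stage II | 121 | 64 (12.57) | 57 (11.20) | |
| | Stage III | 84 | 59 (11.59) | 25 (4.91) | |
| | Stage IV | 26 | 12 (2.36) | 14 (2.75) | |
| | NA | 2 | | | |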

NA, not available.

To further confirm our results, an association analysis was carried out to examine the correlation between survival status and lncRNA expression, using the arules package of R. Twenty rules were identified in the surviving patients. Here, “rules” means the association relationships between the expression of lncRNAs and survival status. Low LOC100131726 (FAM83A‐AS1) expression and high LOC723809 (LHFPL3‐AS2), LOC150622 (LINC01105), NCRNA00092 (LINC00092), and LOC284276 (LINC00908) expression were associated with survival (Figure 2D). Fourteen rules were found in the deceased patients. High LOC100131726 (FAM83A‐AS1) expression and low LOC723809 (LHFPL3‐AS2), LOC150622 (LINC01105), NCRNA00092 (LINC00092), and LOC284276 (LINC00908) expression were associated with death (Figure 2E).

Validation of the prognostic value of these lncRNAs in lung adenocarcinoma

We used 5‐fold cross‐validation to validate the prognostic value of these 5 lncRNAs. The same cohort of lung adenocarcinoma patients as in the previous section (N = 502) was randomly divided into 5 groups of approximately equal size (N1 = N2 = 101, N3 = N4 = N5 = 100). One of the 5 groups was used as the validation data and the remaining 4 groups as training data. This process was repeated 5 times, with each of the 5 groups used exactly once as the validation data. We then used the same method as in the previous section to generate a risk model for comparing overall survival between low‐risk and high‐risk patients. Three of the 5 groups of patients showed a significantly different overall survival rate between the two risk groups (Figure 3A). A fixed‐effects meta‐analysis was undertaken to study the comprehensive HR of these 5 groups, and an aggregated HR of 1.26 (95% CI, 0.74‐1.79, Z = 4.72, P < .00001) suggested that low risk was better for survival (Figure 3B).
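The partition used for 5‐fold cross‐validation can be sketched as below. This is a generic Python illustration with an arbitrary random seed, not the authors' procedure; it only shows how 502 patients divide into folds of 101, 101, 100, 100, and 100 and how each fold serves once as validation data.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and cut them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seed is arbitrary
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(indices[start:start + size])
        start += size
    return folds

folds = k_fold_indices(502, k=5)

# Each fold is the validation set once; the other four form the training set
splits = []
for i, validation in enumerate(folds):
    training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
    splits.append((training, validation))
```

In the study, a risk model would be fitted on each training set and the low‐ and high‐risk groups compared in the matching validation set.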
Figure 3

Validation of the prognostic value of this group of long non‐coding RNAs (lncRNAs) in lung adenocarcinoma. A, Kaplan‐Meier survival curves comparing overall survival between low‐ and high‐risk patients in different groups. B, Meta‐analysis estimating the association between risk score levels and prognosis in 5 groups of patients. The series ID, combined hazard ratio (HR) with 95% confidence interval, and SE of the HR are shown. The generic inverse variance data type, inverse variance method, and fixed‐effects model were used to perform this estimation

To further confirm the prognostic value of these 5 lncRNAs, we investigated the relationship between the expression of each lncRNA and cancer risk in 255 lung adenocarcinoma patients (ID: Lung Adenocarcinoma TCGA) included in the SurvExpress database and found that high LOC100131726 (FAM83A‐AS1) expression and low LOC723809 (LHFPL3‐AS2), LOC150622 (LINC01105), NCRNA00092 (LINC00092), and LOC284276 (LINC00908) expression were correlated with poor survival (Figure 4A). The distribution of risk scores, death status, survival time, tumor stage, and expression pattern of the 5 lncRNAs is shown in Figure 4B. High‐ and low‐risk scores were found to be highly correlated with patient status (P = .002, χ² test), survival time (P = .002, χ² test), and tumor stage (P = .023, χ² test). Most of the advanced stage patients were in the high‐risk group. A hierarchical clustering analysis revealed that the expression pattern of this group of lncRNAs was significantly correlated with tumor risk. Moreover, all of the patients in the high‐risk group showed poor survival outcomes, with an HR of 3.01 (95% CI, 1.85‐4.88, P = 8.25e‐06) (Figure 4C). Even in an analysis of the 80 patients who died, the high‐risk group showed poorer survival outcomes than the low‐risk group, with an HR of 3.02 (Figure 4D).
Figure 4

Prognostic value of long non‐coding RNAs (lncRNAs) in lung adenocarcinoma patients. A, Top panels: Kaplan‐Meier survival curves of lung adenocarcinoma patients. The patients were stratified by risk group using the SurvExpress database. Bottom panels: expression of lncRNAs in the two groups of patients, high‐risk (red lines) and low‐risk (green lines). B, Distribution of risk scores, survival status, survival times, tumor stage, and expression patterns of this group of lncRNAs in lung adenocarcinoma (SurvExpress). C,D, Kaplan‐Meier survival curves between all lung adenocarcinoma patients (C) and 80 deceased patients (D) at low and high risk according to the expression pattern of this group of lncRNAs (SurvExpress). CI, confidence interval; NA, not available

Independence of the prognostic value of these lncRNAs

To investigate whether the predictive capacity of this group of lncRNAs was independent of other clinical factors, such as age, gender, tumor grade, smoking, and lymph node status, we undertook univariate and multivariate Cox regression analyses. The univariate analysis showed that these 5 lncRNAs (HR = 2.718, 95% CI, 1.810‐4.081, P = 1.4e‐06), lymph node status (HR = 1.754, 95% CI, 1.450‐2.121, P = 5.1e‐08), and tumor stage (HR = 1.717, 95% CI, 1.451‐2.031, P = 3e‐10) were significantly associated with survival. The multivariate analysis revealed that these 5 lncRNAs (HR = 2.662, 95% CI, 1.716‐4.128, P = 1.2e‐05), age (HR = 1.024, 95% CI, 1.006‐1.043, P = .0092), and tumor stage (HR = 1.551, 95% CI, 1.222‐1.967, P = .0003) were independent prognostic factors (Table 2).
Table 2

Univariable and multivariable Cox regression analysis of overall survival in lung adenocarcinoma patients (N = 366)

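| Variable | Univariable HR | Univariable 95% CI of HR | Univariable P value | Multivariable HR | Multivariable 95% CI of HR | Multivariable P value |
|---|---|---|---|---|---|---|
| Five lncRNAs | 2.718 | 1.810‐4.081 | 1.4e‐06 | 2.662 | 1.716‐4.128 | 1.2e‐05 |
| Lymph node | 1.754 | 1.450‐2.121 | 5.1e‐08 | 1.164 | 0.890‐1.522 | .2672 |
| Age | 1.012 | 0.994‐1.030 | 0.18 | 1.024 | 1.006‐1.043 | .0092 |
| Smoking | 1.010 | 0.846‐1.206 | 0.91 | 1.021 | 0.847‐1.232 | .8249 |
| Gender | 1.093 | 0.774‐1.542 | 0.61 | 1.076 | 0.753‐1.537 | .6880 |
| Tumor stage | 1.717 | 1.451‐2.031 | 3e‐10 | 1.551 | 1.222‐1.967 | .0003 |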

CI, confidence interval; HR, hazard ratio; lncRNA, long noncoding RNA.
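The relationship between a Cox coefficient, its hazard ratio, the 95% CI, and the P value can be illustrated with a short Python sketch. It assumes a Wald test with a normal approximation, which is the default reported by coxph but is not stated explicitly in the text, and it works backwards from rounded published values, so it is a consistency check rather than a re‐analysis.

```python
from math import erfc, exp, log, sqrt

Z95 = 1.96  # normal quantile for a two-sided 95% interval

def hr_with_ci(beta, se):
    """Hazard ratio and 95% CI from a Cox coefficient and its standard error."""
    return exp(beta), exp(beta - Z95 * se), exp(beta + Z95 * se)

def wald_from_reported(hr, lower, upper):
    """Recover coefficient, standard error, Z, and two-sided P from HR and CI."""
    beta = log(hr)
    se = (log(upper) - log(lower)) / (2 * Z95)
    z = beta / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail probability of a standard normal
    return beta, se, z, p

# Univariable result reported for the 5-lncRNA signature
beta, se, z, p = wald_from_reported(2.718, 1.810, 4.081)
hr, ci_low, ci_high = hr_with_ci(beta, se)
```

Applied to the univariable result for the 5 lncRNAs (HR = 2.718; 95% CI, 1.810‐4.081), the recovered P value is on the order of 1e‐06, in line with the reported 1.4e‐06.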

We further classified the patients into subgroups according to their tumor stage, tumor size, smoking history, and lymph node status. Patients at tumor stages I and II were defined as early stage, and those at stages III and IV were classified as advanced stage. The patients in the early and advanced stage group were further stratified into low‐risk and high‐risk subgroups based on their risk score. Patients in low‐ and high‐risk groups showed significantly different overall survival (P < .0001) (Figure 5A). The high‐risk patients in the advanced‐stage group showed poor survival, with an HR of 1.88 (Figure 5B). In patients with tumors in which the longest dimension was longer or shorter than 1 cm, lymph node‐negative or lymph node‐positive patients, and smoking or non‐smoking patients, this group of lncRNAs showed similar prognostic value (P < .05; Figure 5C‐H).
Figure 5

Survival curves of patients with different risk scores classified by clinical factors. Kaplan‐Meier survival curves of patients with early (A) and advanced (B) tumor stage, patients with tumors in which the longest dimension was less (C) and greater (D) than 1 cm, patients without (E) and with (F) lymph node metastasis, and non‐smoking (G) and smoking (H) patients. HR, hazard ratio

Evaluation of the prognostic value of these lncRNAs in lung squamous cell carcinoma patients

We wondered whether this group of lncRNAs, which were identified as a valuable prognostic marker in adenocarcinoma patients, would also have prognostic value in other types of lung cancer. Thus, we assessed lung squamous cell carcinoma patients using the SurvExpress database (ID: Lung Squamous Cell Carcinoma TCGA). The relationship between the expression of each lncRNA and survival time was examined. In contrast to the finding in lung adenocarcinoma patients, LOC723809 (LHFPL3‐AS2) and LOC284276 (LINC00908) were not associated with survival in lung squamous cell carcinoma patients (P > .05; Figure 6A). However, low LOC150622 (LINC01105) and NCRNA00092 (LINC00092) expression increased the risk of death, and low LOC100131726 (FAM83A‐AS1) expression was associated with a low risk of death. Although LOC723809 (LHFPL3‐AS2) and LOC284276 (LINC00908) showed no differences in expression between low‐ and high‐risk patients, a Cox regression analysis indicated that the overall expression pattern of all 5 lncRNAs as a group (Figure 6B) is still a better prognostic marker of lung squamous cell carcinoma than the expression pattern of only the three lncRNAs that showed differential expression between the different risk groups (Figure 6C). It is possible that we did not observe differential LOC723809 (LHFPL3‐AS2) and LOC284276 (LINC00908) expression between the low‐ and high‐risk patients because the number of patients was too low. However, the use of the combination of these 5 lncRNAs as a prognostic marker in lung squamous cell carcinoma requires further analysis.
Figure 6

Relationship between long non‐coding RNAs (lncRNAs) and survival in lung squamous cell carcinoma. Upper panels: Kaplan‐Meier survival curves of lung squamous cell carcinoma patients. The patients were stratified by risk group based on each lncRNA (A), all 5 lncRNAs (B), and three lncRNAs (C) using the SurvExpress database. Lower panels: gene expression stratified by risk group using SurvExpress. Red lines, patients at high risk; green lines, patients at low risk. CI, confidence interval; HR, hazard ratio

Association of these lncRNAs with tumor metastasis

To study the biological pathways of these lncRNAs, each patient's risk score was calculated, and the patients were then stratified into high‐ and low‐risk groups according to their median risk score. A GSEA revealed that the genes involved in the epithelial‐mesenchymal transition pathway were enriched in the high‐risk group (Figure 7A), which suggested that these lncRNAs might be involved in metastasis‐related pathways. We undertook a GO functional enrichment analysis to confirm this potential function. Pearson's correlation coefficients between the expression of various genes and these 5 lncRNAs were calculated. The genes with an absolute correlation coefficient value higher than 0.3 were selected for GO analysis. Genes involved in cell‐cell adherens junctions were enriched (Figure 7B), and we then compared the expression profiles of these genes between patients with positive and negative lymph node statuses. Four of the lncRNAs (LOC723809 [LHFPL3‐AS2], LOC150622 [LINC01105], NCRNA00092 [LINC00092], and LOC284276 [LINC00908]) were expressed at significantly lower levels in the lymph node‐positive group than in the lymph node‐negative group (Figure 7C). An association analysis was also undertaken to confirm that the expression of these lncRNAs was associated with lymph node metastasis status. Twenty‐two rules demonstrated that high LOC100131726 (FAM83A‐AS1) expression and low LOC723809 (LHFPL3‐AS2), LOC150622 (LINC01105), NCRNA00092 (LINC00092), and LOC284276 (LINC00908) expression were associated with the occurrence of lymph node metastasis (Figure 7D). Thirteen rules revealed that the opposite expression patterns were associated with a lymph node‐negative status (Figure 7E). Here, “rules” means the association relationships between the expression of lncRNAs and the status of lymph node metastasis that was learned by the association rule‐learning algorithm.
Figure 7

Association between long non‐coding RNAs (lncRNAs) and tumor metastasis in lung adenocarcinoma patients. A, Gene set enrichment analysis enrichment score curves showing the relationship between the epithelial‐mesenchymal transition (EMT) pathway and the risk of lung adenocarcinoma (N = 255). Top panel: X‐axis indicates genes with high expression in the high‐risk (left end) and low‐risk (right end) patients. The green curve indicates the enrichment score. The positive enrichment score at the high‐risk end indicates upregulation of the EMT pathway in the high‐risk samples. Middle panel: black lines indicate genes expressed in the EMT pathway. Bottom panel: Gene list ordered by ranking metric. Positive value indicates correlation with high risk and negative value indicates correlation with low risk. B, Gene Ontology function cluster analysis of the genes coexpressed with these 5 lncRNAs. C, Expression profiles of these 5 lncRNAs in lymph node‐positive and lymph node‐negative lung adenocarcinoma patients. D,E, Connection between the expression of these 5 lncRNAs and a positive (red circles) (D) or negative (blue squares) (E) lymph node metastasis status. The lift value is shown by the color intensity and the size of the circle indicates the confidence value

DISCUSSION

Lung adenocarcinoma is often triggered by a class of aberrant genes. However, 30%‐50% of lung adenocarcinoma patients lack aberrations of the biomarker genes. Therefore, more sensitive biomarkers of lung adenocarcinoma are needed. Multigene expression signatures focusing on lncRNAs, miRNAs, and protein‐coding genes have been used for predicting risk and survival.30, 31, 32, 33, 34, 35, 36 In this study, we report the prognostic value of 5 lncRNAs (LOC723809 [LHFPL3‐AS2], LOC150622 [LINC01105], NCRNA00092 [LINC00092], LOC284276 [LINC00908], and LOC100131726 [FAM83A‐AS1]) in lung adenocarcinoma.

LOC150622 (LINC01105) is a stage‐specific biomarker in lung adenocarcinoma. It is also highly expressed in neuroblastoma tissue, where it affects cellular proliferation and apoptosis.37, 38 Methylation of the LOC284276 (LINC00908) gene is negatively associated with birth weight.39 NCRNA00092 (LINC00092) acts in cancer‐associated fibroblasts to drive glycolysis and progression of ovarian cancer.40 No studies have investigated the biological functions of LOC723809 (LHFPL3‐AS2) or LOC100131726 (FAM83A‐AS1).

The results of this study indicate that the expression levels of these lncRNAs are correlated with one another. In addition, 4 of these lncRNAs (LOC723809 [LHFPL3‐AS2], LOC150622 [LINC01105], NCRNA00092 [LINC00092], and LOC284276 [LINC00908]) are expressed at low levels, whereas LOC100131726 (FAM83A‐AS1) is expressed at a high level in lung adenocarcinoma tissue. The abnormal expression of these 5 lncRNAs is related to patient survival and tumor metastasis. Moreover, their expression signature might independently predict survival in lung adenocarcinoma patients. In lung adenocarcinoma, abnormal expression of this group of lncRNAs was found to be associated with poor prognosis. Hierarchical clustering also revealed that the expression pattern of this group of lncRNAs was significantly correlated with survival. 
The risk score model also revealed a correlation between the expression of this group of lncRNAs and overall survival, disease‐free survival, and recurrence‐free survival. Moreover, we found that the expression levels of these lncRNAs were associated with each other in lung adenocarcinoma. The expression of the 4 positively associated lncRNAs might be regulated by the same mechanism, or they might positively regulate each other's expression. In addition, LOC723809 (LHFPL3‐AS2) and LOC100131726 (FAM83A‐AS1) might negatively regulate each other. Finally, both univariate and multivariate Cox regression analyses indicated that the prognostic value of this group of lncRNAs was independent of other clinicopathological risk factors. Overall, multiple lines of evidence showed the prognostic value of this group of lncRNAs in assessing the risk of lung adenocarcinoma.

A Cox regression analysis indicated that tumor stage was an independent clinicopathological factor for predicting the risk of lung adenocarcinoma. This finding is consistent with those of previous studies, which reported that approximately 70% of lung adenocarcinoma patients show locally advanced (stage IIIB) or metastatic (stage IV) disease. The 5‐year survival rates of stage IIIB and stage IV patients are 7% and 2%, respectively.41, 42 However, a survival rate higher than 80% is achieved with lung resection at an early stage of disease.43 Although cigarette smoking is the major cause of lung cancer, both univariate and multivariate Cox regression analyses revealed that cigarette smoking was not correlated with survival, which agrees with previous reports.7, 44, 45

Both GSEA and GO function cluster analyses found that genes involved in the epithelial‐mesenchymal transition and cell‐cell adherens junction were associated with this group of lncRNAs. 
A correlation analysis between the expression of these lncRNAs and lymph node metastasis status also revealed that high LOC100131726 (FAM83A‐AS1) expression and low LOC723809 (LHFPL3‐AS2), LOC150622 (LINC01105), NCRNA00092 (LINC00092), and LOC284276 (LINC00908) expression were associated with lymph node positivity. This result is consistent with the association between this expression pattern and poor prognosis. Thus, the regulation of tumor metastasis might be a mechanism through which this group of lncRNAs affects survival.

In conclusion, we have established the prognostic value of a group of lncRNAs showing abnormal expression levels in lung adenocarcinoma. These lncRNAs might not only predict prognosis but also provide a theoretical basis for molecularly targeted therapy in the future.

This study identified, by data mining, a group of lncRNAs that can act as a prognostic biomarker for lung adenocarcinoma patients, but it has limitations. All the statistical and bioinformatic analyses in this study were carried out in silico. We did not undertake any wet laboratory experiments. We know through statistical methods that these 5 lncRNAs are associated with the prognosis of lung adenocarcinoma patients, but we do not know the exact biological mechanism underlying this association. Whether and how these lncRNAs are tied to lung cancer proliferation, progression, or invasion needs to be investigated in carefully designed wet laboratory experiments in the future.

Another limitation of this study is that the risk score model was only validated with cross‐validation. Ideally, a predictive model should always be validated with independent data to guard against overfitting. Unfortunately, it is currently difficult to find another independent lung adenocarcinoma cohort of comparable size within TCGA that has the necessary clinical data, so we had to use the same lung adenocarcinoma cohort from TCGA to both build and validate the risk score model. 
This is where cross‐validation comes in. Dividing the cohort into subgroups and using different subgroups to build and validate the model makes it possible to assess how well the model generalizes to independent data and to mitigate the overfitting problem.
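
The cross‐validation idea described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the risk score is a linear combination of expression levels as stated in the paper, but the `fit` callback that supplies the weights, the fixed example weights, the use of the training‐set median as the high/low cutoff, and the random expression values are all assumptions made for the example.

```python
import random
from statistics import median


def risk_score(expr, coefs):
    """Linear combination of one patient's lncRNA expression levels."""
    return sum(c * x for c, x in zip(coefs, expr))


def k_fold_indices(n, k, seed=0):
    """Shuffle patient indices and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(expr_rows, fit, k=5, seed=0):
    """Label each patient high/low risk using a model built without that patient.

    fit: hypothetical callback taking training indices and returning weights.
    """
    folds = k_fold_indices(len(expr_rows), k, seed)
    labels = {}
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(expr_rows)) if i not in held]
        coefs = fit(train)  # weights estimated from training patients only
        # Assumed cutoff: median training score separates high from low risk.
        cutoff = median(risk_score(expr_rows[i], coefs) for i in train)
        for i in held_out:
            score = risk_score(expr_rows[i], coefs)
            labels[i] = "high" if score > cutoff else "low"
    return folds, labels


# Hypothetical data: 20 patients x 5 lncRNAs with random expression values.
rng = random.Random(1)
expr_rows = [[rng.random() for _ in range(5)] for _ in range(20)]


def fixed_fit(train):
    """Stand-in for model fitting; returns made-up weights, ignores train."""
    return [-1.0, -1.0, -1.0, -1.0, 1.0]


folds, labels = cross_validate(expr_rows, fixed_fit, k=5)
print(sum(1 for v in labels.values() if v == "high"), "patients labeled high risk")
```

In a real analysis, `fit` would re‐estimate the weights from the training patients in each fold, and survival in the held‐out high‐risk and low‐risk groups would then be compared fold by fold.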

CONFLICT OF INTEREST

The authors have no conflict of interest.