
A 15-lncRNA signature predicts survival and functions as a ceRNA in patients with colorectal cancer.

Xuning Wang1, Jianguo Zhou2, Maolin Xu1, Yongfeng Yan3, Liang Huang1, Yanshen Kuang1, Yuansheng Liu4, Peng Li3, Wei Zheng3, Hongyi Liu3, Baoqing Jia3.   

Abstract

PURPOSE: Colorectal cancer (CRC) is one of the most common malignant tumors worldwide. This study aimed to explore the prognostic value of lncRNAs in CRC.
MATERIAL AND METHODS: We performed gene expression profiling to identify differentially expressed lncRNAs between 51 normal and 646 tumor tissues from The Cancer Genome Atlas database. Cox regression and robust likelihood-based survival models were used to find prognosis-related lncRNAs. A lncRNA signature was developed to predict the overall survival of patients with CRC. In addition, a receiver operating characteristic curve analysis was performed to identify the optimal cutoff with the best Youden index to divide patients into different groups based on risk level.
RESULTS: Eighty survival-related lncRNAs were identified and a 15-lncRNA signature was developed on the basis of a risk score to comprehensively predict the overall survival of patients with CRC. The prognostic value of the 15-lncRNA risk score was validated using the internal testing set and total set. The risk indicator was shown to be an independent prognostic factor (hazard ratio =2.92; 95% CI: 1.73-4.94; P<0.001). Notably, all 15 lncRNAs (AC024581.1, FOXD3-AS1, AC012531.1, AC003101.2, LINC01219, AC083967.1, AL590483.1, AC105118.1, AC010789.1, AC067930.5, AC105219.2, LINC01354, LINC02474, LINC02257, and AC079612.1) were newly found to correlate with the prognosis of patients with CRC. Furthermore, the function of 15 lncRNAs was explored through the ceRNA network. These lncRNAs regulated coding genes that were involved in many key cancer pathways.
CONCLUSION: A 15-lncRNA expression signature was identified as a prognostic indicator for patients with CRC. These lncRNAs may act as competing endogenous RNAs (ceRNAs) and play a crucial role in the modulation of cancer-related pathways. These findings may allow a better understanding of the prognostic value of lncRNAs.


Keywords:  biomarker; ceRNA; colorectal cancer; competing endogenous RNA; long noncoding RNA; survival

Year:  2018        PMID: 30510449      PMCID: PMC6248371          DOI: 10.2147/CMAR.S178732

Source DB:  PubMed          Journal:  Cancer Manag Res        ISSN: 1179-1322            Impact factor:   3.989


Introduction

Colorectal cancer (CRC) is one of the most common malignant tumors of the gastrointestinal tract worldwide, as well as the fourth leading cause of cancer-related death owing to its prevalence and mortality.1 Studies have shown that CRC is caused by several genetic factors, including changes in chromosomal copy number, aberrant gene methylation, and dysregulated gene expression.2,3 Considerable progress has been made in the diagnosis and treatment of CRC in the last several decades. However, the current prognostic factors for patients with CRC do not meet clinical needs, making it necessary to identify novel biomarkers in a sensitive and accurate way to better predict overall survival. lncRNAs, usually >200 nucleotides in length, are a class of RNAs that do not code for proteins.4 lncRNAs used to be considered “transcript junk,” but have recently emerged as key molecules in multiple complex biological processes (BP),4,5 including proliferation, cell cycle progression, and survival.6 Several reports have shown that lncRNAs serve as modulators of carcinogenesis and affect the rates of invasion and metastasis in several types of cancer.6 However, the biological function and prognostic value of many lncRNAs remain unknown. Interestingly, it has been shown that numerous lncRNAs can act as competing endogenous RNAs (ceRNAs) to regulate the expression of coding genes7 that have common miRNA response elements (MREs). In this study, the predictive value of lncRNAs in patients with CRC was explored. Furthermore, the function of these lncRNAs was investigated using the ceRNA network.

Materials and methods

Data processing and computational analysis

Figure 1 shows the overall workflow of this study. A total of 697 RNA expression profiles (level 3), comprising 51 normal tissues and 646 tumor tissues, were downloaded from The Cancer Genome Atlas (TCGA) data portal (data dated September 18, 2017). This study met the publication guidelines provided by TCGA (http://cancergenome.nih.gov/publications/publicationguidelines). According to TCGA guidelines, RNA expression profiles can be studied in three forms: HT-seq raw read count, Fragments per Kilobase of transcript per Million mapped reads (FPKM), and FPKM-UQ (upper quartile normalization). Here, the HT-seq raw read count was chosen. The GENCODE v27 lncRNA general feature format file was used as the lncRNA annotation reference.8 The expression profiles of lncRNAs were analyzed with edgeR.9,10 Differentially expressed lncRNAs were selected according to P-value (≤0.01) and absolute fold change (≥2).
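The selection rule above (P ≤ 0.01 and absolute fold change ≥ 2) can be sketched as a simple filter. This is a minimal illustration and not the authors' edgeR pipeline: the record layout, the lncRNA names, and the values are hypothetical, and the fold change is assumed to be on the log2 scale, so a twofold change corresponds to |log2FC| ≥ 1.

```python
import math

def is_differentially_expressed(log2_fc, p_value, min_fold=2.0, max_p=0.01):
    """Return True when a lncRNA passes both thresholds stated in the text."""
    return abs(log2_fc) >= math.log2(min_fold) and p_value <= max_p

# Hypothetical rows: (lncRNA id, log2 fold change, P-value).
rows = [
    ("lnc_A", 1.8, 0.0004),   # up-regulated and significant
    ("lnc_B", -2.3, 0.0090),  # down-regulated and significant
    ("lnc_C", 0.4, 0.0001),   # significant, but below a twofold change
    ("lnc_D", 3.1, 0.2000),   # large change, but not significant
]

selected = [name for name, lfc, p in rows if is_differentially_expressed(lfc, p)]
print(selected)
```

Both conditions must hold at once, which is why lnc_C and lnc_D are dropped in this toy input.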
Figure 1

Main workflow for the identification of cancer-related lncRNAs.

Identification of lncRNAs related to patient prognosis

Samples were filtered by removing cases without complete survival data, yielding 616 samples that were included in our analysis. All samples were randomly divided into a training set (308 samples) and a validation set (308 samples). The clinical and demographic characteristics of the study population are shown in Table 1. There was no statistically significant difference between the two sets. To determine the feasibility and reliability of survival-associated lncRNAs as prognostic markers in CRC, univariate Cox proportional hazards regression was applied to identify lncRNAs related to overall survival. The robust likelihood-based survival model, implemented in the R package rbsurv, was then applied to further identify prognosis-related lncRNAs.11 The protocol of this method was as follows: first, the model randomly assigned N(1 − p) samples to the training set and Np samples to the validation set. Here, we chose p=1/3. Second, the model fitted a gene to the training set and obtained the parameter estimate for that gene. The log-likelihood was evaluated for each parameter estimate and validated within the internal validation samples. The procedure was repeated 1,000 times to select the best prognosis-related lncRNAs with the smallest mean negative log-likelihood. Next, the Akaike information criterion (AIC) was computed and used as an estimator of the relative quality of statistical models for a given set of data, and the optimal model was chosen as the one with the smallest AIC. P<0.05 was considered statistically significant.
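Two steps of the protocol above lend themselves to a short sketch: the random N(1 − p) / Np split and the choice of the model with the smallest AIC. The code below is an illustration only and does not reproduce the rbsurv package. The log-likelihood values are hypothetical, the rounding of Np is an assumption, and the AIC is taken in its standard form, 2k − 2·log-likelihood, with k the number of lncRNAs in the model.

```python
import random

def split_indices(n, p, seed=0):
    """Randomly split n sample indices into training (about N(1-p)) and validation (about Np)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_val = round(n * p)  # assumption: Np is rounded to the nearest integer
    return idx[n_val:], idx[:n_val]

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood

# Internal split with p = 1/3, as in the text, applied to 308 training samples.
train, val = split_indices(308, 1 / 3)

# Hypothetical log-likelihoods of nested models, keyed by number of lncRNAs.
candidates = {1: -520.4, 2: -515.1, 3: -512.9, 4: -512.6, 5: -512.5}
scores = {k: aic(ll, k) for k, ll in candidates.items()}
best_k = min(scores, key=scores.get)
print(len(train), len(val), best_k)
```

In this toy input the log-likelihood keeps improving as lncRNAs are added, but the gain after the third one is too small to offset the penalty of 2 per parameter, so the three-lncRNA model has the smallest AIC.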
Table 1

Clinical covariates for TCGA colorectal cancer

Covariate               Total set (n=616)   Training set (n=308)   Validation set (n=308)   P-value (a)

Age (years), n                                                                              0.162
  ≥65                   368                 175                    193
  <65                   248                 133                    115
Gender                                                                                      1.000
  Male                  329                 164                    165
  Female                287                 144                    143
Pathological stage, n                                                                       0.805
  I + II                330                 166                    164
  III + IV              267                 131                    136
  Not reported          19                  11                     8

Note: (a) χ2 test.

Abbreviation: TCGA, The Cancer Genome Atlas.
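As a worked check of the χ2 comparison in Table 1, the sketch below recomputes the P-values for age and gender from the training and validation counts. It assumes a 2×2 χ2 test with Yates' continuity correction; the source states only "χ2 test", so the correction is an assumption. Under that assumption the results come out close to the reported 0.162 and 1.000.

```python
import math

def chi2_2x2_yates(a, b, c, d):
    """Chi-square test for the 2x2 table [[a, b], [c, d]] with Yates' correction.

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    stat = 0.0
    for o, r, col in zip(observed, row_totals, col_totals):
        expected = r * col / n
        diff = max(0.0, abs(o - expected) - 0.5)  # continuity correction
        stat += diff * diff / expected
    # Upper tail of the chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Counts from Table 1, given as (training, validation) for each row.
_, p_age = chi2_2x2_yates(175, 193, 133, 115)     # >=65 vs <65
_, p_gender = chi2_2x2_yates(164, 165, 144, 143)  # male vs female
print(round(p_age, 2), round(p_gender, 2))
```

The pathological stage row is left out because the source does not say how the unreported cases were handled in the test.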

Establishment and validation of the risk formula

The lncRNAs chosen in the previous step were entered into a multivariable Cox proportional hazards model to calculate the coefficients in the training set, thereby establishing the risk formula. A risk score was calculated for each sample using this formula. All patients were classified into either the high-risk or the low-risk group on the basis of the median risk score. The Kaplan–Meier method and the log-rank test were applied to analyze the overall survival of the two groups using the R package survival.12,13 A time-dependent receiver operating characteristic (ROC) curve was constructed to evaluate the predictive value of the model (version 1.0.3),14 and the figures were plotted with ggplot2 (version 2.2.1)15 and ggfortify (version 0.4.1).16,17 All data were processed and analyzed with Perl 5 (version 24), Excel 2010, and R (version 3.4.1).
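The Kaplan–Meier estimate used for the survival curves can be sketched in a few lines. This is a plain-Python illustration of the estimator, not the R survival package the study used, and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time of each patient
    events : 1 if death was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)  # deaths plus censored at t
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve

# Hypothetical follow-up (months) and status for six patients.
curve = kaplan_meier([2, 3, 3, 5, 8, 9], [1, 1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients lower the number at risk without lowering the survival estimate, which is what separates this estimator from a simple proportion of survivors.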

Determination of lncRNA function

The function of the lncRNAs was explored using the triple ceRNA (lncRNA–miRNA–mRNA) network. The sequences of the identified lncRNAs were obtained from Ensembl18 and inputted into the miRDB19,20 database to predict their miRNA targets. The corresponding coding genes were then identified using miRDB,19,20 miRTarBase,21 and TargetScan.22 The triple ceRNA network was visualized and constructed by Cytoscape v3.5.1.23 The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis coding genes were annotated by the R package of clusterProfiler.24 The cutoff P-value was 0.05.

Results

Differential expression of lncRNAs

A total of 1,103 differentially expressed lncRNAs were identified in patients with CRC. These lncRNAs are listed in Table S1. Eighty lncRNAs that were associated with overall survival were identified through our univariate Cox regression analysis in the total set (Table S2).

Identification of a 15-lncRNA signature

The 20 lncRNAs with the lowest P-values were selected (Table 2) and analyzed with the robust likelihood-based survival model. The 15-lncRNA model with the lowest AIC value was selected. The risk coefficients for these lncRNAs were calculated using the multivariable Cox proportional hazards model. The risk formula used to calculate the risk score was as follows: Risk score = (0.238 × AC024581.1) + (0.053 × FOXD3-AS1) + (0.067 × AC012531.1) + (0.221 × AC003101.2) + (0.357 × LINC01219) + (0.082 × AC083967.1) + (−0.113 × AL590483.1) + (0.060 × AC105118.1) + (0.031 × AC010789.1) + (0.126 × AC067930.5) + (0.161 × AC105219.2) + (0.317 × LINC01354) + (0.139 × LINC02474) + (−0.131 × LINC02257) + (−0.269 × AC079612.1). The risk score was then calculated for each patient in the training set. The patients were divided into two groups on the basis of the median risk score (Figure 2A). Figure 2B shows the distribution of patient survival status and survival time. Survival analysis with the Kaplan–Meier method and log-rank test indicated that patients with a high risk score had a shorter survival time (P<0.001) (Figure 2C). In our analysis, survival time was negatively correlated with risk score.
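The risk formula is a weighted sum of lncRNA expression values, followed by a median split. The sketch below uses the 15 coefficients given in the text; the patient identifiers and expression values are hypothetical, the expression scale is not specified in the source, and treating an lncRNA absent from the toy input as 0 is a simplification made only for this example.

```python
from statistics import median

# Coefficients of the 15-lncRNA risk formula, as given in the text.
COEFFICIENTS = {
    "AC024581.1": 0.238, "FOXD3-AS1": 0.053, "AC012531.1": 0.067,
    "AC003101.2": 0.221, "LINC01219": 0.357, "AC083967.1": 0.082,
    "AL590483.1": -0.113, "AC105118.1": 0.060, "AC010789.1": 0.031,
    "AC067930.5": 0.126, "AC105219.2": 0.161, "LINC01354": 0.317,
    "LINC02474": 0.139, "LINC02257": -0.131, "AC079612.1": -0.269,
}

def risk_score(expression):
    """Weighted sum of expression values (missing lncRNAs count as 0 here)."""
    return sum(c * expression.get(name, 0.0) for name, c in COEFFICIENTS.items())

def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical expression values for four patients.
patients = {
    "P1": {"LINC01219": 2.0, "AC079612.1": 1.0},
    "P2": {"LINC01354": 1.0},
    "P3": {"AL590483.1": 3.0},
    "P4": {"AC024581.1": 4.0, "LINC02257": 1.0},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(groups)
```

Positive coefficients raise the score as expression increases, while the three negative coefficients (AL590483.1, LINC02257, and AC079612.1) lower it.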
Table 2

Top 20 survival-related lncRNAs

lncRNA        HR         P-value

AC093895.1    1.207063   0.000129
AC012531.1    1.201873   0.000225
AC020891.2    1.27243    0.000854
AC002076.1    1.34785    0.001103
AC016027.1    0.682998   0.001165
AC105118.1    1.329465   0.001222
LINC02474     1.10204    0.001489
AC079612.1    0.743541   0.00256
AC083967.1    1.213143   0.002645
AC067930.5    1.20642    0.002924
AC010789.1    1.158292   0.003233
AL590483.3    0.832073   0.004144
LINC01219     1.258821   0.004814
AL590483.1    0.832461   0.004836
AC003101.2    1.265178   0.005871
FOXD3-AS1     1.188858   0.006167
AC105219.2    1.182066   0.006402
LINC02257     1.159131   0.006745
AC024581.1    1.289277   0.00682
LINC01354     1.199976   0.009076
Figure 2

Risk score of lncRNAs in the training set.

Notes: (A) The risk score of patients in the training set based on risk formula. (B) The distribution of patient survival status and survival time. (C) Survival curve of the low-risk and high-risk groups based on median risk score using the Kaplan–Meier method.

Validation of the prognostic value of the lncRNAs

To assess prognostic value, a ROC analysis was conducted for the 15-lncRNA signature (Figure 3A). The area under the curve (AUC) was 0.708. A risk score of 2.027 was chosen as the optimal cutoff because it maximized the combined sensitivity and specificity of our survival prediction. Patients in the validation (testing) set and the total set were then divided into high-risk and low-risk groups using this cutoff. Figures 3B and 3C show the Kaplan–Meier survival curves for the testing set and the total set, respectively; both results were consistent with our model.
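The cutoff selection described above relies on the Youden index, J = sensitivity + specificity − 1, evaluated at each candidate threshold. The sketch below shows the idea on hypothetical risk scores and outcomes. It uses a plain binary outcome and ignores censoring, so it is a simplification of the time-dependent ROC analysis used in the study.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 = event (death), 0 = no event.
    A score >= cutoff is treated as a predicted event.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Hypothetical risk scores and outcomes for six patients.
scores = [0.5, 1.2, 1.9, 2.1, 2.4, 3.0]
labels = [0, 0, 0, 1, 0, 1]
cutoff, j = youden_cutoff(scores, labels)
print(cutoff, j)
```

Unlike a median split, this cutoff depends on the outcomes as well as on the score distribution, so the two thresholds generally differ.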
Figure 3

Clinical significance of the 15-lncRNA signature.

Notes: (A) The ROC curve of the 15 lncRNA model. (B) The survival curve of the low-risk and high-risk groups based on the optimal cutoff in the testing set. (C) The survival curve of the low-risk and high-risk groups based on the optimal cutoff in the complete set.

Abbreviations: AUC, area under curve; ROC, receiver operating characteristic curve.

The 15 lncRNAs identified in our study were entered into the miRDB database to predict their miRNA targets (yielding a total of 222 miRNAs), and the coding genes targeted by these miRNAs were then predicted (yielding 1,179 genes). Figure 4A shows an overview of the triple ceRNA (lncRNA–miRNA–mRNA) network. The detailed interactions of the ceRNA network are shown in Table S3. The functional enrichment analysis identified 691 GO terms in BP, 46 GO terms in cellular components, 81 GO terms in molecular function (Table S4), and 46 pathways (Table S5). It also showed that these genes are involved in multiple BP, such as regulation of cell morphogenesis and Wnt-mediated cell signaling. The top ten GO results are shown in Figure 4B. The top 20 KEGG pathways are shown in Figure 4C. The KEGG analysis showed enrichment in several cancer-related pathways, including the p53 and Wnt signaling pathways. lncRNA AC012531.1 was related to the mTOR signaling pathway through hsa-mir-424-5p, hsa-mir-16-5p, and hsa-mir-410-3p, which targeted AKT3, SEH1L, and GSK3B, respectively, and it also took part in the MAPK signaling pathway. lncRNA LINC01354 participated in the TP53 signaling pathway through hsa-mir-107 and hsa-mir-497-5p, which regulated CDK6 and CCNE1, respectively. lncRNA LINC02257, which indirectly regulated ROCK2 through hsa-mir-138-5p, played an important role in the Wnt signaling pathway. lncRNA AC079612.1 interacted with hsa-mir-760, which targets PPIP5K1, and thereby took part in phosphatidylinositol signaling. Furthermore, these four lncRNAs were also involved in other pathways. However, the remaining lncRNAs in this study were not found to be involved in pathways through interaction with miRNAs.
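The triple network is built by chaining two lookups: lncRNA to predicted miRNA, then miRNA to predicted coding gene. The sketch below joins a few of the interactions named in the text for three of the lncRNAs. It is an illustration of the join only; the dictionaries are hand-entered here and do not reflect the miRDB, miRTarBase, or TargetScan query interfaces.

```python
# lncRNA -> miRNA pairs named in the text (subset).
lncrna_to_mirna = {
    "LINC01354": ["hsa-mir-107", "hsa-mir-497-5p"],
    "LINC02257": ["hsa-mir-138-5p"],
    "AC079612.1": ["hsa-mir-760"],
}

# miRNA -> mRNA pairs named in the text (subset).
mirna_to_mrna = {
    "hsa-mir-107": ["CDK6"],
    "hsa-mir-497-5p": ["CCNE1"],
    "hsa-mir-138-5p": ["ROCK2"],
    "hsa-mir-760": ["PPIP5K1"],
}

def build_triples(lnc_map, mir_map):
    """Join the two maps into (lncRNA, miRNA, mRNA) triples."""
    return [
        (lnc, mir, gene)
        for lnc, mirs in lnc_map.items()
        for mir in mirs
        for gene in mir_map.get(mir, [])
    ]

triples = build_triples(lncrna_to_mirna, mirna_to_mrna)
for lnc, mir, gene in triples:
    print(f"{lnc} -> {mir} -> {gene}")
```

Each triple corresponds to one lncRNA–miRNA–mRNA path in the network; a miRNA with no predicted coding gene contributes no triple.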
Figure 4

ceRNA network of 15 lncRNAs.

Notes: (A) The overall ceRNA network of 15 lncRNAs. Red rhombuses represent lncRNAs, green hexagons represent miRNAs, and yellow hexagons represent mRNAs. (B) Top ten GO enrichment results. (C) Top 20 KEGG pathways.

Abbreviations: GO, gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.

Discussion

Recently, much attention has been given to the clinical significance of lncRNAs, which account for the majority of transcriptional products in the cell.25,26 Many lncRNAs have tissue-specific expression patterns and play crucial roles in the progression of diseases,27 such as gastric cancer28 and breast cancer.29 In this study, the lncRNAs expressed in CRC were comprehensively analyzed, and 1,103 differentially expressed lncRNAs were identified. Then, 80 lncRNAs that were correlated with the overall survival of patients with CRC were selected using the univariate Cox regression model. The 20 lncRNAs with the lowest P-values were then analyzed with the robust likelihood-based survival model to identify a 15-lncRNA signature that predicts the 5-year overall survival of patients with CRC. This model performed consistently across the training set, testing set, and total set. These results imply that the 15-lncRNA signature identified in our study may be used as a biomarker to predict patient prognosis in clinical practice. A literature search in PubMed and Google Scholar indicates that this is the first time these 15 lncRNAs have been reported to correlate with CRC. Previous studies have shown that there is signaling “crosstalk” between different transcriptional products.30,31 Many cancer-related phenotypes are driven by lncRNAs,25 either directly or indirectly, through modulation of the stability of various molecules, including DNA, proteins, and miRNAs. The ceRNA hypothesis holds that transcriptional products that share common MREs with target genes communicate with those genes through miRNAs.7 Furthermore, any transcriptional product that has MREs can act as a ceRNA. These transcriptional products, which include lncRNAs, circular RNAs, and pseudogenes, regulate the corresponding genes through miRNAs, which mediate posttranscriptional silencing by binding the 3′-untranslated region and influencing transcript stability.
Thus, lncRNAs may act as ceRNAs to indirectly regulate coding genes through miRNAs. It is therefore necessary to explore the role of lncRNAs as ceRNAs. In this study, a triple ceRNA (lncRNA–miRNA–mRNA) network was constructed. Bioinformatics analyses of this ceRNA network revealed that the 15 lncRNAs may function as ceRNAs to regulate genes that participate in cancer-associated signaling, including p53 and Wnt signaling.32 Furthermore, this ceRNA network may be involved in other types of cancer, because the KEGG analysis showed that the network was associated with many cancer-related pathways. For example, the TP53 signaling pathway participates in tumorigenesis in multiple cancers.33–35 Taken together, these results suggest that the 15 differentially expressed lncRNAs play an important role in oncogenesis and may be used as prognostic biomarkers in clinical practice. However, our study has some limitations. Our results are based on a bioinformatics analysis and were not validated using in vitro or in vivo experimentation. In addition, as the binding affinity between miRNAs and their RNA targets is influenced by the match between the MRE and the seed region (as well as other factors), we could not adequately assess the exact function of each ceRNA. Future studies will assess the biological functions of these lncRNAs by measuring their effects on cell proliferation and apoptosis and will further evaluate these lncRNAs as prognostic biomarkers.

Conclusion

In summary, we identified 1,103 lncRNAs that were differentially expressed in CRC. Using a robust likelihood-based survival model, we developed a 15-lncRNA risk formula that correlated with the overall survival of patients with CRC, and we explored the function of these newly identified survival-associated lncRNAs. Our results justify further study of the transcriptional regulatory network of lncRNAs in CRC and provide a new resource for discovering novel prognostic biomarkers.
  29 in total

1.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

2.  A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers.

Authors:  Ashton C Berger; Anil Korkut; Rupa S Kanchi; Apurva M Hegde; Walter Lenoir; Wenbin Liu; Yuexin Liu; Huihui Fan; Hui Shen; Visweswaran Ravikumar; Arvind Rao; Andre Schultz; Xubin Li; Pavel Sumazin; Cecilia Williams; Pieter Mestdagh; Preethi H Gunaratne; Christina Yau; Reanne Bowlby; A Gordon Robertson; Daniel G Tiezzi; Chen Wang; Andrew D Cherniack; Andrew K Godwin; Nicole M Kuderer; Janet S Rader; Rosemary E Zuna; Anil K Sood; Alexander J Lazar; Akinyemi I Ojesina; Clement Adebamowo; Sally N Adebamowo; Keith A Baggerly; Ting-Wen Chen; Hua-Sheng Chiu; Steve Lefever; Liang Liu; Karen MacKenzie; Sandra Orsulic; Jason Roszik; Carl Simon Shelley; Qianqian Song; Christopher P Vellano; Nicolas Wentzensen; John N Weinstein; Gordon B Mills; Douglas A Levine; Rehan Akbani
Journal:  Cancer Cell       Date:  2018-04-02       Impact factor: 31.743

3.  lncRNA Epigenetic Landscape Analysis Identifies EPIC1 as an Oncogenic lncRNA that Interacts with MYC and Promotes Cell-Cycle Progression in Cancer.

Authors:  Zehua Wang; Bo Yang; Min Zhang; Weiwei Guo; Zhiyuan Wu; Yue Wang; Lin Jia; Song Li; Wen Xie; Da Yang
Journal:  Cancer Cell       Date:  2018-04-02       Impact factor: 31.743

Review 4.  Genetic Landscape and Biomarkers of Hepatocellular Carcinoma.

Authors:  Jessica Zucman-Rossi; Augusto Villanueva; Jean-Charles Nault; Josep M Llovet
Journal:  Gastroenterology       Date:  2015-06-20       Impact factor: 22.682

Review 5.  Long noncoding RNAs: cellular address codes in development and disease.

Authors:  Pedro J Batista; Howard Y Chang
Journal:  Cell       Date:  2013-03-14       Impact factor: 41.582

Review 6.  Genetics and Genetic Biomarkers in Sporadic Colorectal Cancer.

Authors:  John M Carethers; Barbara H Jung
Journal:  Gastroenterology       Date:  2015-07-26       Impact factor: 22.682

Review 7.  Colorectal cancer.

Authors:  Hermann Brenner; Matthias Kloor; Christian Peter Pox
Journal:  Lancet       Date:  2013-11-11       Impact factor: 79.321

8.  The lncRNA H19 promotes epithelial to mesenchymal transition by functioning as miRNA sponges in colorectal cancer.

Authors:  Wei-Cheng Liang; Wei-Ming Fu; Cheuk-Wa Wong; Yan Wang; Wei-Mao Wang; Guo-Xin Hu; Li Zhang; Li-Jia Xiao; David Chi-Cheong Wan; Jin-Fang Zhang; Mary Miu-Yee Waye
Journal:  Oncotarget       Date:  2015-09-08

9.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

10.  The Ensembl gene annotation system.

Authors:  Bronwen L Aken; Sarah Ayling; Daniel Barrell; Laura Clarke; Valery Curwen; Susan Fairley; Julio Fernandez Banet; Konstantinos Billis; Carlos García Girón; Thibaut Hourlier; Kevin Howe; Andreas Kähäri; Felix Kokocinski; Fergal J Martin; Daniel N Murphy; Rishi Nag; Magali Ruffier; Michael Schuster; Y Amy Tang; Jan-Hinnerk Vogel; Simon White; Amonida Zadissa; Paul Flicek; Stephen M J Searle
Journal:  Database (Oxford)       Date:  2016-06-23       Impact factor: 3.451
