Literature DB >> 28187432

Comprehensive investigation of a novel differentially expressed lncRNA expression profile signature to assess the survival of patients with colorectal adenocarcinoma.

Jiang-Hui Zeng1, Liang Liang2, Rong-Quan He3, Rui-Xue Tang1, Xiao-Yong Cai2, Jun-Qiang Chen4, Dian-Zhong Luo1, Gang Chen1.   

Abstract

Growing evidence has shown that long non-coding RNAs (lncRNAs) can serve as prospective markers for survival in patients with colorectal adenocarcinoma. However, most studies have explored a limited number of lncRNAs in a small number of cases. The objective of this study is to identify a panel of lncRNA signature that could evaluate the prognosis in colorectal adenocarcinoma based on the data from The Cancer Genome Atlas (TCGA). Altogether, 371 colon adenocarcinoma (COAD) patients with complete clinical data were included in our study as the test cohort. A total of 578 differentially expressed lncRNAs (DELs) were observed, among which 20 lncRNAs closely related to overall survival (OS) in COAD patients were identified using a Cox proportional regression model. A risk score formula was developed to assess the prognostic value of the lncRNA signature in COAD with four lncRNAs (LINC01555, RP11-610P16.1, RP11-108K3.1 and LINC01207), which were identified to possess the most remarkable correlation with OS in COAD patients. COAD patients with a high-risk score had poorer OS than those with a low-risk score. The multivariate Cox regression analyses confirmed that the four-lncRNA signature could function as an independent prognostic indicator for COAD patients, which was largely mirrored in the validating cohort with rectal adenocarcinoma (READ) containing 158 cases. In addition, the correlative genes of LINC01555 and LINC01207 were enriched in the cAMP signaling and mucin type O-Glycan biosynthesis pathways. With further validation in the future, our study indicates that the four-lncRNA signature could serve as an independent biomarker for survival of colorectal adenocarcinoma.

Entities:  

Keywords:  COAD; READ; lncRNA; prognostic biomarker; survival

Mesh:

Substances:

Year:  2017        PMID: 28187432      PMCID: PMC5370003          DOI: 10.18632/oncotarget.15161

Source DB:  PubMed          Journal:  Oncotarget        ISSN: 1949-2553


INTRODUCTION

Colon adenocarcinoma (COAD) is one of the most frequently diagnosed cancers and a top cause of cancer death globally. Approximately 1.2 million new cases are diagnosed, causing 0.6 million deaths per year all over the world. Cancer metastasis remains the key cause of COAD death [1-6]. The five-year overall survival (OS) rate of patients with primary COAD can be up to 80–90%, but it is reduced to 5–10% in patients with metastatic tumor [7-12]. Rectal adenocarcinoma (READ), which shares similar molecular mechanism with COAD, has a comparable high incidence and poor prognosis [1–4, 6–8, 10]. Therefore, the assessment of prognostic factors is pivotal for the management of unresectable colorectal cancer patients. Several underlying mechanisms have been described in the past decades, including multiple molecular alterations, which confer the tumorigenesis and progression of colorectal cancer [13-18]. However, the exact underlying molecular markers for survival assessment of colorectal cancer remain largely unknown. Long non-coding RNAs (lncRNAs) are a class of non-protein coding RNAs of more than 200 nucleotides, which are broadly distributed in the genome and can modulate gene expression [19-22]. Recent accumulating evidence has demonstrated that lncRNA expression profiles are frequently changed in tumors compared to that of the adjacent non-tumorous tissues in numerous cancers [23-27]. The altered lncRNA expression profile has been proposed to correlate with the progression and survival in patients with various cancers, including colorectal cancer, which reveals the potential of lncRNAs to act as cancer biomarkers [28-35]. However, most of the previous studies explored a limited number of lncRNAs in a small number of cases [36-39]. Previous studies have verified that a lncRNA expression signature could be obtained from the database of The Cancer Genome Atlas (TCGA), which offers a platform for researchers to download and assess free public datasets [40-45]. Towards this, in the current study, the TCGA database was first applied to gather lncRNA gene expression profiles in COAD. By performing a comprehensive lncRNA expression profile assessment, we identified a lncRNA signature in COAD with four lncRNAs (LINC01555, RP11-610P16.1, RP11-108K3.1, and LINC01207), as a new candidate indicator with the potential to predict the OS in COAD patients. Furthermore, the prognostic value of this four-lncRNA-signature was validated in a READ cohort (Supplementary Figure 1).

RESULTS

Differentially expressed lncRNAs (DELs)

We initially performed differential expression analysis by comparing the expression of 7589 lncRNAs in COAD and normal colon tissue. The edgeR package identified 1430 differentially expressed lncRNAs (DELs, Figure 1A) and the DEseq package identified 584 DELs (Figure 1B). We combined these two groups of DELs together and 578 DELs showed a consistent direction of differential expression across the two methods (Figure 1C, Figure 2). Next, we excluded those cases without sufficient survival data, leaving 224 DELs that were selected for further survival analysis (Supplementary Table 1).
Figure 1

Differentially expressed lncRNAs (DELs) analysis

(A) DELs identified using the edgeR package; (B) DELs identified using the DESeq package; (C) Overlapping DELs.

Figure 2

The 578 differentially expressed lncRNAs (DELs) in COAD

A heatmap was drawn to show the DELs.

Differentially expressed lncRNAs (DELs) analysis

(A) DELs identified using the edgeR package; (B) DELs identified using the DESeq package; (C) Overlapping DELs.

The 578 differentially expressed lncRNAs (DELs) in COAD

A heatmap was drawn to show the DELs.

Construction of the DEL-based prognostic signature

First, a univariate Cox proportional hazards regression showed that a total of 20 among the 224 DELs were identified to maintain a significant prognostic value (Table 1). Second, a multivariate Cox proportional hazards regression analysis indicated that only four DELs exhibited a significant prognostic value for COAD, including LINC01555, RP11-610P16.1, RP11-108K3.1 and LINC01207 (Figure 3). Afterwards, the risk score for predicting the OS was constructed with the formula: Risk score = exp LINC01555*(-0.191) + exp RP11-610P16.1*(-0.338) + exp RP11-108K3.1*(0.318) + exp LINC01207*(-0.163).
Table 1

Prognostic value of the DELs by univariate cox regression analysis

EstimateStdErrChiSqProbChiSq*HazardRatio
AC016831.7–0.2090.06510.5210.0010.811
LINC01555–0.1770.0569.9850.0020.838
RP11-610P16.1–0.3060.1048.5920.0030.736
AC006273.50.2390.0838.4120.0041.271
RP11-108K3.10.2520.0917.7600.0051.287
RP1-193H18.2–0.2170.0827.0780.0080.805
LINC00675–0.1790.0696.7530.0090.836
CTD-2619J13.17–0.2260.0935.9250.0150.797
RP11-449D8.1–0.1540.0645.8380.0160.857
AF064858.6–0.1540.0645.7910.0160.857
RP11-150O12.3–0.1380.0595.5040.0190.871
TP53TG1–0.2340.1015.4190.0200.791
LINC00959–0.2480.1114.9830.0260.780
SUCLG2-AS1–0.1890.0884.6470.0310.827
LINC01315–0.1800.0864.4080.0360.836
RP11-474D1.3–0.0660.0324.3400.0370.936
MAFTRR–0.1840.0884.3310.0370.832
KBTBD11-OT1–0.1720.0844.1740.0410.842
LINC01207–0.1200.0594.0990.0430.887
LINC01132–0.2140.1064.0550.0440.808

* ProbChiSq equal P-value.

Figure 3

Clinical significance of four differentially expressed lncRNAs (DELs)

(A) The expression levels of four DELs in the tumor group compared with that in the normal group; (B) ROC curves of the four DELs to distinguish COAD tissue from normal colon tissue; (C) The Kaplan-Meier curves showing the relationship between the four DELs and the overall survival. The cases were divided into a high and low expression group by the mean DEL level. *P < 0.001.

* ProbChiSq equal P-value.

Clinical significance of four differentially expressed lncRNAs (DELs)

(A) The expression levels of four DELs in the tumor group compared with that in the normal group; (B) ROC curves of the four DELs to distinguish COAD tissue from normal colon tissue; (C) The Kaplan-Meier curves showing the relationship between the four DELs and the overall survival. The cases were divided into a high and low expression group by the mean DEL level. *P < 0.001. The COAD patients were divided into two groups of low-risk and high-risk based on the individual inflection point of the prognostic risk score (Figure 4). The risk score could largely predict the 5-year survival of COAD patients, as the area under ROC curve (AUC) was 0.706 (Figure 5A). Furthermore, K-M curves confirmed that the survival time of patients in the high-risk group was 72.935 ± 7.398 months, predominantly shorter than that of the low-risk group (103.402 ± 8.679 months, P < 0.001, Figure 5B).
Figure 4

Risk score analysis of the differentially expressed lncRNA (DEL) signature of COAD

(A) Patient survival status and time distributed by risk score; (B) Risk score curve of the four-DEL-signature; (C) Heatmap of four DELs from COAD patients. Color from blue to red indicates the expression level from low to high. The dotted line represents the individual inflection point of the risk score curve, by which the COAD patients were classified in the low-risk or high-risk group.

Figure 5

The prognostic performance of the four-differentially expressed lncRNA (DEL) signature of COAD

(A) The prognostic performance of the risk score showed by the time-dependent receiver operating characteristic (ROC) curve for predicting the 5-years survival. (B) The Kaplan-Meier test of the risk score for the overall survival.

Risk score analysis of the differentially expressed lncRNA (DEL) signature of COAD

(A) Patient survival status and time distributed by risk score; (B) Risk score curve of the four-DEL-signature; (C) Heatmap of four DELs from COAD patients. Color from blue to red indicates the expression level from low to high. The dotted line represents the individual inflection point of the risk score curve, by which the COAD patients were classified in the low-risk or high-risk group.

The prognostic performance of the four-differentially expressed lncRNA (DEL) signature of COAD

(A) The prognostic performance of the risk score showed by the time-dependent receiver operating characteristic (ROC) curve for predicting the 5-years survival. (B) The Kaplan-Meier test of the risk score for the overall survival. Meanwhile, the prognostic value of different clinical parameters was also compared to that of the risk score. The univariate Cox proportional hazards regression showed that a number of parameters could predict poorer survival of COAD (Table 2). However, when analyzed by multivariate Cox proportional hazards regression test, only neoplasm recurrence together with the risk score from the DELs, was independent prognostic indictor of COAD (Table 2). The K-M curves of the above clinical features are shown in Figure 6.
Table 2

The predictive values of related clinical parameters and risk score

VariablesPatient n = 371Univariate analysisMultivariate analysisP
HR (95% CI)PHR (95% CI)
SexFemale1701 (reference)
Male2011.180 (0.761–1.831)0.459
Age<= 65 years1621 (reference)
> 65 years2091.284 (0.816–2.022)0.280
Disease stageI671 (reference)1 (reference)
II1462.187 (0.759–6.304)0.1471.225E4 (0.000–1.976E86)0.922
III993.904 (1.354–11.257)0.0128.180E4 (0.000–1.328E87)0.907
IV529.645 (3.351–27.758)< 0.0012.394E5 (0.000–3.875E87)0.989
T stageT1101 (reference)
T2680.452 (0.087–2.352)0.345
T32491.052 (0.256–4.332)< 0.944
T4444.078 (0.937–17.747)< 0.061
N stageN02271 (reference)1 (reference)
N1821.827 (1.059–3.151)0.0300.624 (0.096–4.049)0.621
N2-N3614.348 (2.577–7.337)< 0.0010.868 (0.136–5.555)0.881
M stageM02691 (reference)
M1514.601 (2.773–7.636)< 0.001
Lymphatic invasionNO2111 (reference)1 (reference)
YES1201.960 (1.219–3.150)0.0050.380 (0.121–1.189)0.096
Venous invasionNO2481 (reference)1 (reference)
YES742.490 (1.547–4.006)< 0.0011.331 (0.438–4.048)0.614
Treatment outcomeCR+PR1231 (reference)
SD+PD297.320 (3.491–15.347)<0.001
RadiotherapyNO3081 (reference)
YES80.740 (0.102–5.355)0.765
Neoplasm recurrenceNO2561 (reference)1 (reference)
YES602.990 (1.855–4.819)< 0.0013.030 (1.234–7.440)0.016
Residual tumorR02631 (reference)1 (reference)
R1+R2203.953 (1.933–8.082)< 0.0011.660 (0.607–4.537)0.323
Dimession<=10 mm1371 (reference)
> 10 mm1240.944 (0.537–1.659)0.840
Risk scoreLow1701 (reference)1 (reference)
High2012.948 (1.779 – 4.886)< 0.0019.389 (2.737–32.213)<0.001

HR: hazard ratio; CI: confidence interval; SD: stable disease; PD: progressive disease; CR: complete response; PR: partial response.

Figure 6

The prognostic value of different parameters for survival of COAD patients

Kaplan-Meier curves of nine independent prognostic indictors, including pathologic tumor stage, pathologic node stage, pathologic metastasis, pathologic stage, lymphatic invasion, venous invasion, residual tumor, tumor recurrence and treatment outcome. CR, complete response; PR, partial response; SD, stable disease; PD, progressive disease.

HR: hazard ratio; CI: confidence interval; SD: stable disease; PD: progressive disease; CR: complete response; PR: partial response.

The prognostic value of different parameters for survival of COAD patients

Kaplan-Meier curves of nine independent prognostic indictors, including pathologic tumor stage, pathologic node stage, pathologic metastasis, pathologic stage, lymphatic invasion, venous invasion, residual tumor, tumor recurrence and treatment outcome. CR, complete response; PR, partial response; SD, stable disease; PD, progressive disease. We also assessed the relationship between the risk score based on the DELs signature and various clinical features, and the risk score showed moderate prognostic value for predicting the status of tumor stage, metastasis and lymphatic invasion (Figure 7). The expression pattern of these four DELs in the low- and high-risk group is also presented in Figure 8.
Figure 7

The predictive value of the risk score for the clinical status

characteristic (ROC) curve predicting the different clinical parameters: pathologic tumor stage, pathologic node stage, pathologic metastasis, pathologic stage, lymphatic invasion, venous invasion, residual tumor, tumor recurrence and treatment outcome.

Figure 8

The expression level of the four lncRNAs in the low- and high-risk groups

The difference in the expression level of LINC01555, RP11-610P16.1, RP11-108K3.1 and LINC01207 between the low-risk group and high-risk group. *P < 0.001.

The predictive value of the risk score for the clinical status

characteristic (ROC) curve predicting the different clinical parameters: pathologic tumor stage, pathologic node stage, pathologic metastasis, pathologic stage, lymphatic invasion, venous invasion, residual tumor, tumor recurrence and treatment outcome.

The expression level of the four lncRNAs in the low- and high-risk groups

The difference in the expression level of LINC01555, RP11-610P16.1, RP11-108K3.1 and LINC01207 between the low-risk group and high-risk group. *P < 0.001.

Validation of the four-DEL-signature in READ

For validation, the READ patients were also divided into low-risk and high-risk groups on the basis of the prognostic risk score (Figure 9A). Though the AUC of ROC to predict 5-year survival was slightly less than that in COAD patients (Figure 9B), K-M curves did show a close relationship between the four-DEL-signature and survival (P = 0.014, Figure 9C). Additionally, both univariate (HR: 3.006, 95% CI: 1.192 - 7.586, P = 0.020) and multivariate Cox proportional hazards regression tests (HR: 8.602, 95% CI: 1.159–63.839, P = 0.035) revealed that the risk score of four-DEL-signature was an independent prognostic indictor of READ (Supplementary Table 2).
Figure 9

Validation of the four-DEL-signature in READ

(A) READ patients were divided into high- and low-risk groups using the risk score generated from the four-DEL-signature. The relationship between survival and risk score is presented (top); The risk score curve is drawn to classify the READ patients into low- and high- risk groups (middle); The risk scores of the READ patients are shown in a heatmap (bottom). (B) ROC curve for predicting 5-year survival in READ patients by the risk score. (C) Kaplan-Meier curves of the four-lncRNA signature for READ patients.

Validation of the four-DEL-signature in READ

(A) READ patients were divided into high- and low-risk groups using the risk score generated from the four-DEL-signature. The relationship between survival and risk score is presented (top); The risk score curve is drawn to classify the READ patients into low- and high- risk groups (middle); The risk scores of the READ patients are shown in a heatmap (bottom). (B) ROC curve for predicting 5-year survival in READ patients by the risk score. (C) Kaplan-Meier curves of the four-lncRNA signature for READ patients.

Functional assessment of the prognostic DELs

Only correlative genes for LINC01555 and LINC01207 could be determined using the Multi-Experiment Matrix (MEM, Figure 10 and Figure 11). Forty genes were collected for LINC01555 and 78 genes for LINC01207. Using kyoto encyclopedia of genes and genomes (KEGG) pathway analysis, the correlative genes of LINC01555 were found to be enriched in the cAMP signaling pathway and the neuroactive ligand-receptor interaction pathways, while genes related to LINC01207 were enriched in mucin type O-Glycan biosynthesis pathway, the - lacto/neolacto-series glycosphingolipid biosynthesis pathway and metabolic pathway. Similarly, some Gene Ontology (GO) terms were also enriched (Table 3, Figure 12). For construction of the protein-protein interaction (PPI) network, there were 3 genes with more than 3 nodes for LINC01207 (GALNT4, GALNT7, MUC13), which were regarded as hub genes. However, we failed to observe similar hub genes for LINC01555.
Figure 10

Gene network of the correlative genes of LINC01207

The network of genes co-expressed with LINC01207. LncRNAs are shown as green nodes and mRNAs as red nodes. We measured LINC01207 co-expression using the Multi Experiment Matrix (MEM) software. The Affymetrix GeneChip Human Genome U133 Plus 2.0 platform type was selected for further analysis. The top 100 genes were used to draw the network schematic. The bottom box displays the protein-protein interaction (PPI) network of related genes.

Figure 11

Gene network of the correlative genes of LINC01555

The network of genes co-expressed with LINC01555. LncRNAs are shown as green nodes and mRNAs are shown as red nodes. We detected the co-expression of LINC01555 using Multi Experiment Matrix (MEM) software. The Affymetrix GeneChip Human Genome U133 Plus 2.0 platform type was applied for further examination. The top 100 genes were used to draw the network schematic. The bottom box displays the protein-protein interaction (PPI) network of related genes.

Table 3

GO terms and pathway analyses with the correlative genes of LINC01555 and LINC01207

CategoryTermP valueGenes
LINC01555
KEGG pathwaycAMP signaling pathway0.016GIPR, ADCY10, GLP1R
Neuroactive ligand-receptor interaction0.029SSTR3, GIPR, GLP1R
Biological processMeiotic nuclear division0.034TUBGCP6, FKBP6
Regulation of heart contraction0.043TNNT2, GLP1R
Cellular response to estradiol stimulus0.049SSTR3, MYOG
Cellular componentBasal part of cell0.021OTOF, ADCY10
Molecular functionStructural molecule activity0.047KRT74, SNTG1, KRT35
LINC01207
KEGG pathwayMucin type O-Glycan biosynthesis< 0.001GCNT3, GALNT7, POC1B-GALNT4, ST6GALNAC1
Glycosphingolipid biosynthesis - lacto and neolacto series0.003FUT6, FUT3, FUT2
Metabolic pathways0.007GCNT3, CYP2J2, GALNT7, GMDS, POC1B-GALNT4, FUT6, FUT3, HGD, FUT2, ST6GALNAC1
Biological processProtein glycosylation< 0.001GALNT7, POC1B-GALNT4, FUT6, FUT3, FUT2, ST6GALNAC1
L-fucose catabolic process0.001FUT6, FUT3, FUT2
Fucosylation0.001FUT6, FUT3, FUT2
Maintenance of gastrointestinal epithelium0.001TFF1, MUC13, PBLD
O-glycan processing0.002GCNT3, GALNT7, POC1B-GALNT4, MUC13
Carbohydrate metabolic process0.005GCNT3, GALNT7, POC1B-GALNT4, TFF1, FUT2
Positive regulation of establishment of protein localization to plasma membrane0.006PLS1, AGR2, CIB1
Lung goblet cell differentiation0.012SPDEF, AGR2
Oligosaccharide biosynthetic process0.042FUT3, ST6GALNAC1
Cellular componentExtracellular exosome< 0.001GCNT3, TSPAN1, CYP2J2, GALNT7, CLDN3, SLC44A4, POC1B-GALNT4, FUT6, KIAA1324, TSPAN8, PBLD, CANT1, ZG16B, ANG, PLS1, FUT3, TFF3, CEACAM5, FUT2, MUC13, GOLM1, CIB1, BCAS1, TMPRSS2, CAPN5, SLC12A2, GMDS, TMC5, TMC4, HGD, GPA33
Golgi cisterna membrane0.003FUT6, FUT3, FUT2, CANT1
Golgi apparatus0.006SH3RF1, POC1B-GALNT4, FUT6, KIAA1324, ATP8B1, FUT3, FUT2, GOLM1, ST6GALNAC1, CIB1
Integral component of plasma membrane0.008TMPRSS2, ATP2C2, TNFRSF11A, TSPAN1, SLC12A2, CLDN3, SLC22A23, KIAA1324, ATP8B1, GPA33, TSPAN8, CEACAM5, GOLM1
Integral component of membrane0.009GCNT3, GPR160, FAM3D, GALNT7, CLDN3, SLC44A4, POC1B-GALNT4, FUT6, KIAA1324, CLDN12, CANT1, TMEM144, STX19, ATP8B1, FUT3, CREB3L1, FUT2, MUC13, TMPRSS2, TMEM45B, SLC12A2, FA2H, TMC5, SLC22A23, TMC4, ST6GALNAC1, STYK1, ATP2C2, CDH17, SMIM22, MGST2
Bicellular tight junction0.010CLDN3, POF1B, CLDN12, TJP3
Golgi membrane0.028ATP2C2, GCNT3, GALNT7, POC1B-GALNT4, FUT6, FUT3, ST6GALNAC1
Apical plasma membrane0.030SHROOM3, SLC12A2, ATP8B1, MUC13, CIB1
Perinuclear region of cytoplasm0.040SH3RF1, ATP2C2, TSPAN1, RASEF, POC1B-GALNT4, RHPN2, CIB1
Organelle membrane0.044CYP2J2, FA2H, MGST2
Molecular functionFucosyltransferase activity< 0.001FUT6, FUT3, FUT2
3-galactosyl-N-acetylglucosaminide 4-alpha-L-fucosyltransferase activity0.011FUT6, FUT3
Alpha-(1->3)-fucosyltransferase activity0.030FUT6, FUT3
Dystroglycan binding0.034AGR3, AGR2
carbohydrate binding0.040ZG16B, GALNT7, LGALS4, POC1B-GALNT4
Figure 12

GO and KEGG term analysis of potential genes related to LINC01555 and LINC01207

The Rich factor shows the degree of enrichment, which was calculated by the formula: (the number of selected genes in a term/total number of selected genes)/(the total number of genes in a term of the database/ the total number of genes in the database). The Node size represents the number of selected genes, and color represents the p-value of the enrichment analysis. CC, cellular component; MF, molecular function; BP, biological process.

Gene network of the correlative genes of LINC01207

The network of genes co-expressed with LINC01207. LncRNAs are shown as green nodes and mRNAs as red nodes. We measured LINC01207 co-expression using the Multi Experiment Matrix (MEM) software. The Affymetrix GeneChip Human Genome U133 Plus 2.0 platform type was selected for further analysis. The top 100 genes were used to draw the network schematic. The bottom box displays the protein-protein interaction (PPI) network of related genes.

Gene network of the correlative genes of LINC01555

The network of genes co-expressed with LINC01555. LncRNAs are shown as green nodes and mRNAs are shown as red nodes. We detected the co-expression of LINC01555 using Multi Experiment Matrix (MEM) software. The Affymetrix GeneChip Human Genome U133 Plus 2.0 platform type was applied for further examination. The top 100 genes were used to draw the network schematic. The bottom box displays the protein-protein interaction (PPI) network of related genes.

GO and KEGG term analysis of potential genes related to LINC01555 and LINC01207

The Rich factor shows the degree of enrichment, which was calculated by the formula: (the number of selected genes in a term/total number of selected genes)/(the total number of genes in a term of the database/ the total number of genes in the database). The Node size represents the number of selected genes, and color represents the p-value of the enrichment analysis. CC, cellular component; MF, molecular function; BP, biological process.

DISCUSSION

In the current study, to define lncRNAs significantly related to OS, a univariate Cox proportional hazards regression with the significance level set at 0.05 was first performed on 7589 lncRNAs from 371 COAD patients according to the defined criterion in a large number of COAD patients based on the data downloaded from the TCGA database. A total of four lncRNAs were identified. We then developed a risk score by combining the four lncRNAs and found that this four-lncRNA signature could independently predict OS in COAD patients, which was further validated in READ patients. As far as we know, this is the first study to construct a risk score by mining TCGA data for the survival assessment of colorectal cancer patients. Colorectal cancer is one of the deadliest solid malignancies, and the involvement of noncoding RNAs in the development, diagnosis, and prognosis of colorectal cancer has been widely investigated [33-35]. In addition to the aggressive properties of colorectal cancer, the lack of specific biomarkers for its diagnosis, therapeutic effect monitoring and prognosis might also be responsible for the low survival rate. Hence, there is a critical need for reliable prognostic factors pinpointing a poor outcome. Recent large-scale genomic analyses have made it possible to reveal a catalogue of molecular characteristics associated with colorectal cancer outcome [44, 46–53]. However, most of the existing studies have focused on mRNA and microRNA expression [54-56]. Knowledge is now rapidly emerging on the functional roles of lncRNAs in cancer initiation and progression, representing a significant untapped molecular resource for cancer prognosis as well. The relationship between aberrant lncRNAs and survival of colorectal cancer has been studied in small samples using distinct approaches. Li et al. [57] analyzed the prognostic value of 21 lncRNAs by PCR array in 30 colorectal cancer patients and reported that higher levels of AFAP1-AS1, BCAR4, H19, HOXA-AS2, MALAT1 or PVT1 and a lower level of ADAMTS9-AS2 could predict a poor prognosis of colorectal cancer patients. Similarly, Wang et al. [58] studied lncRNA expression profiling using microarray in six cases of colorectal cancer patients. Multivariate Cox analysis revealed that lncRNA NR_029373 and NR_034119 were both independently related to the disease-specific survival rate. Furthermore, Sun et al. [25] searched in GEO datasets and achieved five studies: GSE8671, GSE22598, GSE23878, GSE9348, and GSE37364, that studied lncRNAs in 150 cases of colorectal cancer patients. They found that one lncRNA, AK098081 could be considered an independent risk factor for colorectal cancer patients (HR = 1.896, 95% CI = 1.393–2.579, P < 0.001). Surprisingly, no consistent lncRNA has been verified by different groups, potentially due, at least in part, to the limited sample size and differing detection methods. Compared with previous studies, our study uses data from the TCGA database with high-throughput analysis of lncRNAs from a larger sample size. Herein, we report that expression of four novel lncRNAs (LINC01555, RP11-610P16.1, RP11-108K3.1 and LINC01207) could also become a new independent risk factor for colorectal cancer patients. Moreover, the risk score constructed from these four lncRNAs could be an indicator for the colorectal cancer patients in the clinical setting. No study as of yet has investigated the function of the aforementioned four lncRNAs. Here, we performed MEM to gather the correlative genes of these four lncRNAs. However, the correlative genes were only found for LINC01555 and LINC01207 in this step. Interestingly, the correlative genes of LINC01555 were enriched in cAMP signaling pathway and neuroactive ligand-receptor interaction pathway, whereas the genes correlative to LINC01207 were enriched in mucin type O-Glycan biosynthesis pathway, – the lacto/neolacto-series glycosphingolipid biosynthesis pathway and metabolic pathway, which are all classical signaling pathways closely related to the tumorigenesis and progression of malignancies. For instance, intracellular cAMP has been proposed to impact the biological behavior, namely to suppress the growth of colorectal cancer cells [59, 60]. Most likely, LINC01555 could play substantial roles in the tumorigenesis and development of colorectal cancer via influencing cAMP signaling pathway. Therefore, the functional enrichment analysis may offer a clue for elucidating the role of LINC01555 and LINC01207 in carcinogenesis of colorectal cancer and the specific underlying molecular mechanisms. However, since the research on the clinical and biological function of LINC01555, RP11-610P16.1, RP11-108K3.1, and LINC01207 is still nonexistent in colorectal cancer patients, there is a lot of research that needs to be accomplished. The findings of the current study may have substantial clinical significance or implications; however, some limitations should be considered. First, the mean time of follow-up was 29.375 months for COAD and 26.965 months for READ patients, and a study including a longer follow-up time is warranted to validate our findings in the future. Second, the data from TCGA were based on the RNA-seq technique; other experimental methods are needed to verify the current finding. Third, the roles of LINC01555, RP11-610P16.1, RP11-108K3.1, and LINC01207 in colorectal cancer are unknown; in vitro and in vivo experiments are expected to answer this question. In conclusion, by analyzing the genome-wide lncRNA expression profiles in a large cohort from TCGA, we identified a four-lncRNA signature, which could act as an indicator for patient outcome and could be a potential independent biomarker for prognosis prediction of colorectal cancer. We will gather clinical samples and validated these findings experimentally in our future work.

MATERIALS AND METHODS

Differentially expressed lncRNAs

RNA sequencing (RNA-Seq) data from 521 individuals with COAD were obtained from TCGA data portal (https://tcga-data.nci.nih.gov/docs/publications/tcga/?), including data from 480 COAD tissue samples and 41 non-tumorous adjacent-normal colon tissue samples up to November 9, 2016. Since the data were provided by TCGA, additional approval by an ethics committee was not needed. Data processing was performed in line with the TCGA human subject protection and data access policies. This dataset consisted of called gene counts for 60,244 mRNAs, which were assessed on the IlluminaHiSeq RNA-Seq platform. In the current paper, only lncRNAs with description from NCBI or Ensemble were selected for further study. Finally, we obtained the expression profiles of 7581 lncRNAs. We then filtered the differentially expressed lncRNAs (DELs) using two individual R packages: edgeR [61, 62] and DEseq [63, 64], with Padj < 0.05 and logFC > 1 of expression level between comparison of tumor and adjacent normal colon tissue. Since a strategy that combines edgeR and DESeq is proposed for large sample sizes from TCGA [65], here in the current study, this combination of edgeR and DESeq was adopted. The overlapping DELs obtained using both edgeR and DEseq were sent for further survival analysis. For validation, the relevant data including lncRNA levels and clinicopathological parameters were also downloaded for 158 READ tissues and 10 non-tumorous controls.

Construction of the DEL-based prognostic signature and statistical analysis

The DELs that were 0 in greater than 10% of all subjects were eliminated. The expression level of each DEL was log2-transformed for further analysis. Meanwhile, clinicopathological parameters and survival data were also downloaded from TCGA. The subjects without clinical data were excluded, which resulted in a 371-sample cohort with 224 DELs enrolled in the survival analysis (Supplementary Table 1). The end-point in our study was OS. The average follow-up time was 29.375 months in this COAD cohort and 26.965 months in READ cases. The univariate Cox proportional hazards regression with a significance level set at 0.05 was performed to obtain the DELs that are closely correlated with the OS. A total of four DELs were identified. The multivariate cox regression model was further performed to test the prognostic value of the DELs. A prognosis risk score for predicting OS was established on the basis of a linear combination of the expression level multiplied by the regression coefficient derived from the multivariate cox regression model (β) with the following formula as previously reported: Risk score = expDEL1*βDEL1 + expDEL2*βDEL2 + … expDELn*βDELn. COAD and READ patients were divided into two groups: high-score and low-score, with the cut-off of the individual inflection point of the prognostic risk score [66]. Univariate Cox proportional hazards regression analyses were further conducted to investigate the effects of various clinical characteristics and the risk score on the OS of COAD and READ patients. Each predictor identified via the univariate analysis was further evaluated by a multivariate Cox proportional hazards regression analysis to determine whether the lncRNA prognostic model was independent of other clinical variables, adjusting for age, tumor stage, grade, surgical debulking status, and risk scores. Hazard ratio (HR) and 95% confidence intervals (CIs) were assessed. The time-dependent receiver operating characteristic (ROC) curve analysis within 5 years as the defining point was also performed using the R package “survivalROC”, to evaluate the predictive accuracy of the prognostic model for time-dependent disease outcomes [67]. Kaplan-Meier survival curves were used to estimate the OS time for COAD and READ patients with predicted high- or low-risk scores, and the survival differences between the high-risk group and low-risk group were assessed by a two-sided log-rank test using SPSS 22.0 software (SPSS Inc., Chicago, IL, USA) [68]. The relationship between the DEL signature and clinical features were examined using a Chi-square test. An ROC curve was drawn to assess the predictive significance of the risk score for the patient outcome after the first course of treatment. If a two-sided P-value was less than 0.05, statistical significance was determined. The statistical analyses were conducted with SPSS22.0 software (SPSS Inc., Chicago, IL, USA).

Functional assessment of prognostic DELs

The correlative genes of the DELs were collected using the Multi-Experiment Matrix (MEM) (http://biit.cs.ut.ee/mem/index.cgi) [69]. Given a gene as an input, the MEM ranks other genes by similarity in each separable data set. In essence, this analysis is a new rank aggregation method that takes the individual rankings and determines a score of significance, and then, a ranking across all datasets at the same time. Functional enrichment analysis at the GO and KEGG pathway levels and PPI assessments were employed to infer the lncRNA function with the DAVID Bioinformatics Tool (https://david.ncifcrf.gov/, version 6.7) and STRING database. The enriched GO terms and the KEGG pathways with p-value < 0.05 were regarded as potential function of the prognostic lncRNAs. The DELs and the correlative genes were visualized as a network using Cytoscape. A selection of protein pairs from the PPI assessment with an association score greater than 0.4 and a number of nodes beyond 3 were the final products of the correlative genes.
  68 in total

Review 1.  Use of NCCN Guidelines, Other Guidelines, and Biomarkers for Colorectal Cancer Screening.

Authors:  Christina D Williams; William M Grady; Leah L Zullig
Journal:  J Natl Compr Canc Netw       Date:  2016-11       Impact factor: 11.908

2.  Prognostic significance of the preoperative serum C-reactive protein level in patients with stage IV colorectal cancer.

Authors:  Masatsune Shibutani; Kiyoshi Maeda; Hisashi Nagahara; Eiji Noda; Hiroshi Ohtani; Yukio Nishiguchi; Kosei Hirakawa
Journal:  Surg Today       Date:  2014-05-10       Impact factor: 2.549

3.  Association of long non-coding RNA HOTTIP with progression and prognosis in colorectal cancer.

Authors:  Ying-Kun Ren; Yan Xiao; Xiang-Bin Wan; Yu-Zhou Zhao; Jian Li; Yin Li; Guang-Sen Han; Xiao-Bing Chen; Qi-Yun Zou; Gang-Cheng Wang; Chao-Min Lu; Yong-Chao Xu; You-Cai Wang
Journal:  Int J Clin Exp Pathol       Date:  2015-09-01

4.  The Impact of Postoperative Complications on Long-term Oncologic Outcomes After Laparoscopic Low Anterior Resection for Rectal Cancer.

Authors:  Eun Jung Park; Seung Hyuk Baik; Jeonghyun Kang; Hyuk Hur; Byung Soh Min; Kang Young Lee; Nam Kyu Kim
Journal:  Medicine (Baltimore)       Date:  2016-04       Impact factor: 1.889

5.  Long noncoding RNA H19 indicates a poor prognosis of colorectal cancer and promotes tumor growth by recruiting and binding to eIF4A3.

Authors:  Dong Han; Xu Gao; Meng Wang; Yu Qiao; Ya Xu; Jing Yang; Nazhen Dong; Jun He; Qian Sun; Guixiang Lv; Changqing Xu; Ji Tao; Ning Ma
Journal:  Oncotarget       Date:  2016-04-19

6.  A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data.

Authors:  Sheng Yang; Li Guo; Fang Shao; Yang Zhao; Feng Chen
Journal:  Comput Math Methods Med       Date:  2015-10-05       Impact factor: 2.238

7.  Cyclic phosphatidic acid stimulates cAMP production and inhibits growth in human colon cancer cells.

Authors:  Tamotsu Tsukahara; Yoshikazu Matsuda; Hisao Haniu
Journal:  PLoS One       Date:  2013-11-25       Impact factor: 3.240

8.  Elevated Resistin Gene Expression in African American Estrogen and Progesterone Receptor Negative Breast Cancer.

Authors:  Karin A Vallega; NingNing Liu; Jennifer S Myers; Kaixian Yu; Qing-Xiang Amy Sang
Journal:  PLoS One       Date:  2016-06-17       Impact factor: 3.240

9.  Increased expression of the long noncoding RNA CRNDE-h indicates a poor prognosis in colorectal cancer, and is positively correlated with IRX5 mRNA expression.

Authors:  Tong Liu; Xin Zhang; Yong-Mei Yang; Lu-Tao Du; Chuan-Xin Wang
Journal:  Onco Targets Ther       Date:  2016-03-11       Impact factor: 4.147

10.  Mutational Profiling Can Establish Clonal or Independent Origin in Synchronous Bilateral Breast and Other Tumors.

Authors:  Lei Bao; Karen Messer; Richard Schwab; Olivier Harismendy; Minya Pu; Brian Crain; Shawn Yost; Kelly A Frazer; Brinda Rana; Farnaz Hasteh; Anne Wallace; Barbara A Parker
Journal:  PLoS One       Date:  2015-11-10       Impact factor: 3.240

View more
  53 in total

1.  Single-cell cloning of human T-cell lines reveals clonal variation in cell death responses to chemotherapeutics.

Authors:  Kathleen Hanlon; Alex Thompson; Lorena Pantano; John N Hutchinson; Arshed Al-Obeidi; Shu Wang; Meghan Bliss-Moreau; Jennifer Helble; Gabriela Alexe; Kimberly Stegmaier; Daniel E Bauer; Ben A Croker
Journal:  Cancer Genet       Date:  2019-06-12

2.  Pancreatic adenocarcinoma associated immune-gene signature as a novo risk factor for clinical prognosis prediction in hepatocellular carcinoma.

Authors:  Lei Dai; Joseph Mugaanyi; Xingchen Cai; Caide Lu; Changjiang Lu
Journal:  Sci Rep       Date:  2022-07-13       Impact factor: 4.996

3.  Integrative Proteo-Genomic Analysis for Recurrent Survival Prognosis in Colon Adenocarcinoma.

Authors:  FeiYan Ai; Wenhao Wang; Shaojun Liu; Decai Zhang; Zhenyu Yang; Fen Liu
Journal:  Front Oncol       Date:  2022-06-30       Impact factor: 5.738

4.  Clinical value and potential pathways of miR-183-5p in bladder cancer: A study based on miRNA-seq data and bioinformatics analysis.

Authors:  Jia-Min Gao; Lin-Zhen Huang; Zhi-Guang Huang; Rong-Quan He
Journal:  Oncol Lett       Date:  2018-02-06       Impact factor: 2.967

5.  Long non-coding RNA LINC00959 predicts colorectal cancer patient prognosis and inhibits tumor progression.

Authors:  Zhen-Qiang Sun; Chen Chen; Quan-Bo Zhou; Jin-Bo Liu; Shuai-Xi Yang; Zhen Li; Chun-Lin Ou; Xian-Tao Sun; Gui-Xian Wang; Jun-Min Song; Zhi-Yong Zhang; Wei-Tang Yuan
Journal:  Oncotarget       Date:  2017-09-21

6.  Comprehensive analysis of a novel four-lncRNA signature as a prognostic biomarker for human gastric cancer.

Authors:  Yan Miao; Jing Sui; Si-Yi Xu; Ge-Yu Liang; Yue-Pu Pu; Li-Hong Yin
Journal:  Oncotarget       Date:  2017-08-24

7.  Diagnosis, prognosis and bioinformatics analysis of lncRNAs in hepatocellular carcinoma.

Authors:  Meiyu Dai; Siyuan Chen; Xiaomou Wei; Xuan Zhu; Fang Lan; Shengming Dai; Xue Qin
Journal:  Oncotarget       Date:  2017-09-28

8.  Survival prediction of kidney renal papillary cell carcinoma by comprehensive LncRNA characterization.

Authors:  Huihua Lan; Jianghui Zeng; Gang Chen; Huayi Huang
Journal:  Oncotarget       Date:  2017-11-28

9.  Expression profiles analysis of long non-coding RNAs identified novel lncRNA biomarkers with predictive value in outcome of cutaneous melanoma.

Authors:  Xu Ma; Zhijuan He; Ling Li; Daping Yang; Guofeng Liu
Journal:  Oncotarget       Date:  2017-09-08

10.  Identification of potential long non-coding RNA biomarkers associated with the progression of colon cancer.

Authors:  Jingwen Li; Weinan Xue; Junli Lv; Peng Han; Yanlong Liu; Binbin Cui
Journal:  Oncotarget       Date:  2017-05-17
View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.