Chen-Rui Guo, Yan Mao, Feng Jiang, Chen-Xia Juan, Guo-Ping Zhou, Ning Li.
Abstract
Evidence is emerging that long non-coding RNAs (lncRNAs) play an important role in genome instability. However, no study has established how to classify lncRNAs linked to genomic instability, or whether that link has therapeutic significance. Here, we established a computational framework derived from the mutator hypothesis by combining lncRNA expression profiles with somatic mutation profiles in tumor genomes, and identified 185 candidate lncRNAs associated with genomic instability in lung adenocarcinoma (LUAD). Through further analysis, we established a six-lncRNA signature that assigned patients to high- and low-risk groups with different prognoses. The signature was further validated in several independent cohorts of LUAD patients. In addition, the signature was closely linked to genomic mutation rates in patients, indicating that it could be a useful way to quantify genomic instability. In summary, this research offers a novel method through which further studies may explore the function of lncRNAs, and presents a possible new way of detecting biomarkers associated with genomic instability in cancers.
Keywords: genome instability; long non-coding RNAs; lung adenocarcinoma; mutator phenotype
Year: 2021 PMID: 34866362 PMCID: PMC8817082 DOI: 10.1002/cam4.4471
Source DB: PubMed Journal: Cancer Med ISSN: 2045-7634 Impact factor: 4.452
Clinical features of the three LUAD sets from TCGA
| Covariates | Type | TCGA set (n = 490) | Testing set (n = 244) | Training set (n = 246) | p-value |
|---|---|---|---|---|---|
| Age, no. (%) | <=65 | 231 (47.14%) | 119 (48.77%) | 112 (45.53%) | 0.3682 |
| | >65 | 249 (50.82%) | 117 (47.95%) | 132 (53.66%) | |
| | Unknown | 10 (2.04%) | 8 (3.28%) | 2 (0.81%) | |
| Gender, no. (%) | Female | 262 (53.47%) | 129 (52.87%) | 133 (54.07%) | 0.8612 |
| | Male | 228 (46.53%) | 115 (47.13%) | 113 (45.93%) | |
| Stage, no. (%) | Stage I–II | 378 (77.14%) | 184 (75.41%) | 194 (78.86%) | 0.179 |
| | Stage III–IV | 104 (21.22%) | 59 (24.18%) | 45 (18.29%) | |
| | Unknown | 8 (1.63%) | 1 (0.41%) | 7 (2.85%) | |
| T, no. (%) | T1‐2 | 426 (86.94%) | 210 (86.07%) | 216 (87.8%) | 0.745 |
| | T3‐4 | 61 (12.45%) | 32 (13.11%) | 29 (11.79%) | |
| | Unknown | 3 (0.61%) | 2 (0.82%) | 1 (0.41%) | |
| M, no. (%) | M0 | 324 (66.12%) | 159 (65.16%) | 165 (67.07%) | 0.0649 |
| | M1 | 24 (4.9%) | 17 (6.97%) | 7 (2.85%) | |
| | Unknown | 142 (28.98%) | 68 (27.87%) | 74 (30.08%) | |
| N, no. (%) | N0 | 317 (64.69%) | 160 (65.57%) | 157 (63.82%) | 0.7003 |
| | N1‐3 | 162 (33.06%) | 78 (31.97%) | 84 (34.15%) | |
| | Unknown | 11 (2.24%) | 6 (2.46%) | 5 (2.03%) | |
P-values were calculated with the chi‐square test.
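The p-values in the table above come from a chi-square test comparing the testing and training sets. As a minimal sketch of how such a comparison could be computed for a two-category covariate, the Python below implements a 2x2 chi-square test using only the standard library. The function name `chi_square_2x2` is hypothetical, and the use of Yates' continuity correction is an assumption, since the source does not state which variant of the test was applied.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test of independence for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    `yates=True` applies Yates' continuity correction (an assumption here;
    the source does not say whether a correction was used).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]

    stat = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            diff = max(diff - 0.5, 0.0)
        stat += diff * diff / e

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Gender counts from the table: testing set (female, male), training set (female, male).
stat, p_value = chi_square_2x2(129, 115, 133, 113)
print(f"chi2 = {stat:.4f}, p = {p_value:.4f}")
```

Under the continuity-correction assumption, the gender row yields a p-value close to the reported 0.8612. Covariates with an "Unknown" category would need a decision on whether to include that category, which the source does not specify.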
FIGURE 1 Computational workflow for identifying lncRNAs linked to genomic instability. A somatic mutation profile was constructed, and the total number of somatic mutations was calculated for each patient. Patients were then sorted by mutation count in decreasing order. Next, on the basis of the mutator phenotype, LUAD patients in the top 25 percent were assigned to the genomically unstable (GU)‐like group and those in the bottom 25 percent to the genomically stable (GS)‐like group. LncRNAs significantly associated with genomic instability were identified by comparing lncRNA expression profiles between the two groups.
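The grouping step described in the Figure 1 caption can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `split_by_mutation_count`, the dictionary input format, and the rounding-down of the group size are all hypothetical choices, and the subsequent differential expression comparison is not covered.

```python
import math


def split_by_mutation_count(mutation_counts, fraction=0.25):
    """Split patients into GU-like and GS-like groups by mutation burden.

    mutation_counts: dict mapping patient ID -> total somatic mutation count.
    fraction: share of patients taken from each end of the ranking
              (0.25 matches the top/bottom 25 percent in the caption).
    Returns (gu_like, gs_like) as lists of patient IDs.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")

    # Sort patients by cumulative somatic mutation count, highest first.
    ranked = sorted(mutation_counts, key=mutation_counts.get, reverse=True)

    # Group size is rounded down (an assumption; the source does not say).
    k = math.floor(len(ranked) * fraction)
    gu_like = ranked[:k]                 # top of the ranking: most mutations
    gs_like = ranked[len(ranked) - k:]   # bottom of the ranking: fewest mutations
    return gu_like, gs_like


# Made-up counts for eight hypothetical patients.
example_counts = {
    "P1": 520, "P2": 35, "P3": 210, "P4": 12,
    "P5": 98, "P6": 310, "P7": 64, "P8": 150,
}
gu_like, gs_like = split_by_mutation_count(example_counts)
print("GU-like:", gu_like)
print("GS-like:", gs_like)
```

Patients in the middle of the ranking belong to neither group, which is why the two groups together cover only half of the cohort.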
FIGURE 2 Identification of lncRNAs linked to genomic instability in LUAD patients and functional enrichment analysis. (A) Unsupervised clustering of 490 LUAD patients based on the expression patterns of the 185 selected genomic instability‐related lncRNAs. The red cluster on the left represents the GU‐like group, and the blue cluster on the right represents the GS‐like group. (B) Boxplots of somatic mutation counts. The total number of somatic mutations differed markedly between the two groups. The Mann‐Whitney U test was used for statistical comparison. Horizontal lines indicate median values. (C) Boxplots of the UBQLN4 level in both groups. The UBQLN4 level was markedly lower in the GS‐like group than in the GU‐like group. (D) Co‐expression network of genomic instability‐associated lncRNAs and mRNAs based on Pearson correlation coefficients. LncRNAs are represented by red circles and mRNAs by blue circles. (E) GO and KEGG functional enrichment analysis of co‐expressed mRNAs.
Univariate Cox regression analysis of the 10 of 185 genomic instability‐associated lncRNAs linked with overall survival in LUAD
| Ensembl ID | Gene symbol | Genomic location | HR | 95% CI | p-value |
|---|---|---|---|---|---|
| ENSG00000258545 | RHOXF1‐AS1 | chrX:120,036,236‐120,146,855 | 0.767 | 0.608–0.967 | 0.025 |
| ENSG00000273877 | AC236972.3 | chrX:153,225,649‐153,230,357 | 0.715 | 0.516–0.989 | 0.0427 |
| ENSG00000280109 | PLAC4 | chr21:41,175,231‐41,186,788 | 1.033 | 1.011–1.056 | 0.003 |
| ENSG00000163364 | LINC01116 | chr2:176,625,118‐176,638,186 | 1.152 | 1.101–1.206 | <0.001 |
| ENSG00000265415 | AC099850.4 | chr17:59,202,677‐59,203,829 | 1.064 | 1.026–1.103 | 0.001 |
| ENSG00000251026 | LINC02163 | chr5:104,079,847‐104,406,121 | 1.234 | 1.010–1.508 | 0.039 |
| ENSG00000225431 | LINC01671 | chr21:42,579,759‐42,615,095 | 1.039 | 1.011–1.068 | 0.005 |
| ENSG00000262454 | MIR193BHG | chr16:14,301,364‐14,336,038 | 1.134 | 1.007–1.277 | 0.038 |
| ENSG00000232415 | ELN‐AS1 | chr7:74,048,744‐74,062,301 | 0.909 | 0.834–0.990 | 0.029 |
| ENSG00000204949 | FAM83A‐AS1 | chr8:123,193,507‐123,202,744 | 1.036 | 1.006–1.066 | 0.016 |
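The hazard ratios, confidence intervals, and p-values in the table above are linked through the Cox regression coefficient. As a rough sketch, assuming the intervals are Wald-type (symmetric on the log scale) and the p-values come from a normal approximation, the coefficient, its standard error, and an approximate p-value can be recovered from a reported row. The function name `wald_from_hr` is hypothetical, and the source does not confirm which test produced the reported p-values.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover log-hazard statistics from a reported HR and 95% CI.

    Assumes a Wald-type interval: exp(beta +/- z_crit * se).
    Returns (beta, se, z, p_value), where p_value is two-sided.
    """
    beta = math.log(hr)  # Cox coefficient on the log-hazard scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# RHOXF1-AS1 row from the table: HR 0.767, 95% CI 0.608-0.967.
beta, se, z, p_value = wald_from_hr(0.767, 0.608, 0.967)
print(f"beta = {beta:.3f}, se = {se:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

Applied to the RHOXF1‐AS1 row, this approximation gives a p-value close to the reported 0.025. Because the published values are rounded, small discrepancies are expected for other rows.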
FIGURE 3 Identification of the genomic instability‐derived lncRNA signature for predicting clinical outcomes of LUAD patients. (A) Kaplan‐Meier estimates of overall survival (OS) for LUAD patients in the training set. All 246 LUAD patients in this set were assigned to the high‐risk or the low‐risk group. Univariate Cox analysis and the log‐rank test were used for statistical analysis. (B) Time‐dependent ROC curve analysis of the signature at 3 years in the training set. (C) LncRNA expression patterns and the distributions of somatic mutation counts and UBQLN4 expression with increasing GILncSig score. (D) Distribution of somatic mutations for LUAD patients in the two risk groups. (E) Expression of UBQLN4 in the two risk groups. Horizontal lines indicate median values. The Mann‐Whitney U test was used for statistical comparison.
FIGURE 4 Evaluation of GILncSig performance in the testing set and the TCGA set. (A) Kaplan‐Meier estimates of OS for low‐ and high‐risk patients in the testing set. (B) Time‐dependent ROC curve analysis and AUC of the GILncSig for 3‐year OS in the testing set. (C) LncRNA expression patterns and the distributions of somatic mutation counts and UBQLN4 expression for LUAD samples in the testing set. (D) Distribution of somatic mutations for LUAD patients in the two risk groups of the testing set. (E) UBQLN4 expression levels in the two risk groups of the testing set. (F) Kaplan–Meier estimates of OS for the two risk groups in the TCGA set. Univariate Cox regression and the log‐rank test were used for statistical analysis. (G) Time‐dependent ROC curve analysis and AUC for 3‐year OS in the whole TCGA set. (H) LncRNA expression patterns and the distributions of somatic mutation counts and UBQLN4 expression for samples in the TCGA set. (I) Distribution of somatic mutations in the two risk groups of the TCGA set. (J) UBQLN4 levels in the two risk groups of the TCGA set. Horizontal lines indicate median values. The Mann‐Whitney U test was used for statistical comparison.
FIGURE 5 Evaluation of GILncSig performance in three other independent GEO data sets. Violin plots of UBQLN4 expression levels among patients with low and high PLAC4 expression. The Mann‐Whitney U test was used to compare expression between the two risk groups.
FIGURE 6 Time‐dependent ROC curve analysis of 3‐year OS for the GILncSig, YulncSig, and ZenglncSig.
Univariate and multivariate Cox regression analyses of the GILncSig in the three LUAD sets from TCGA
| Variables | Comparison | Univariable HR | Univariable 95% CI | Univariable p-value | Multivariable HR | Multivariable 95% CI | Multivariable p-value |
|---|---|---|---|---|---|---|---|
| Training set (n = 246) | | | | | | | |
| GILncSig | High/low | 1.037 | 1.019–1.055 | <0.001 | 1.033 | 1.015–1.051 | <0.001 |
| Age | >65/<=65 | 1.001 | 0.981–1.022 | 0.906 | | | |
| Gender | Male/female | 1.268 | 0.826–1.946 | 0.277 | | | |
| Stage | (III+IV)/(I+II) | 1.816 | 1.473–2.239 | <0.001 | 1.778 | 1.439–2.197 | <0.001 |
| Testing set (n = 244) | | | | | | | |
| GILncSig | High/low | 1.073 | 1.019–1.128 | 0.007 | 1.057 | 1.004–1.112 | 0.034 |
| Age | >65/<=65 | 1.007 | 0.984–1.031 | 0.544 | | | |
| Gender | Male/female | 0.941 | 0.616–1.439 | 0.780 | | | |
| Stage | (III+IV)/(I+II) | 1.500 | 1.238–1.817 | <0.001 | 1.481 | 1.219–1.799 | <0.001 |
| TCGA set (n = 490) | | | | | | | |
| GILncSig | High/low | 1.029 | 1.019–1.039 | <0.001 | 1.026 | 1.016–1.036 | <0.001 |
| Age | >65/<=65 | 1.005 | 0.989–1.020 | 0.553 | | | |
| Gender | Male/female | 1.113 | 0.825–1.500 | 0.484 | | | |
| Stage | (III+IV)/(I+II) | 1.641 | 1.425–1.890 | <0.001 | 1.623 | 1.408–1.871 | <0.001 |
FIGURE 7 Stratification analysis by stage. Kaplan–Meier analyses of OS for the high‐ and low‐risk groups in early‐stage (A) and late‐stage (B) patients. The log‐rank test and univariate Cox analysis were used for statistical analysis.