Jian Zhang1, Nan Ding1, Yongxing He2, Chengbin Tao1, Zhongzhen Liang1, Wenhu Xin1, Qianyun Zhang1, Fang Wang3.
Abstract
This study analyzes the association between genomic instability-associated long non-coding RNAs (lncRNAs) and the prognosis of cervical cancer patients. We built a prognostic model and explored the features of the resulting risk groups. Clinical data and gene expression profiles of 307 patients were downloaded from The Cancer Genome Atlas database. Combining somatic mutation profiles with lncRNA expression profiles in the tumor genome, we identified 35 genomic instability-associated lncRNAs in cervical cancer as a case study. We then stratified patients into low-risk and high-risk groups, and the stratification was further checked in multiple independent patient cohorts. Patients were split into a training set and a testing set. The prognostic model was built from three genomic instability-associated lncRNAs (AC107464.2, MIR100HG, and AP001527.2). In the training set, patients were divided into a high-risk group with shorter overall survival and a low-risk group with longer overall survival (p < 0.001); comparable results were found in the testing set (p = 0.046) and the whole set (p < 0.001). Survival also differed significantly between risk groups when patients were stratified by histological grade, FIGO stage, and age (p < 0.05). A prognostic model based on genomic instability-associated lncRNAs could predict the prognosis of cervical cancer patients, paving the way for further research into the function and resource of lncRNAs and offering an approach to individualized care decision-making.
Year: 2021 PMID: 34686717 PMCID: PMC8536663 DOI: 10.1038/s41598-021-00384-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Clinical information for the three cervical cancer patient sets in this study.
| Characteristics | Testing set (n = 152) | Training set (n = 152) | Whole set (n = 304) | p value* |
|---|---|---|---|---|
| Young (≤ 46) | 76 (50) | 78 (51.32) | 154 (50.66) | 0.9087 |
| Old (> 46) | 76 (50) | 74 (48.68) | 150 (49.34) | |
| G1–2 | 70 (46.05) | 83 (54.61) | 153 (50.33) | 0.1087 |
| G3 | 66 (43.42) | 52 (34.21) | 118 (38.82) | |
| Grade unknown | 16 (10.52) | 17 (11.18) | 33 (10.86) | |
| Stage I–IIA | 97 (63.81) | 91 (59.87) | 188 (61.84) | 0.3421 |
| Stage IIB–IVB | 50 (32.89) | 59 (34.21) | 109 (35.86) | |
| Stage unknown | 5 (3.29) | 2 (1.32) | 7 (2.30) | |
| T1–2 | 104 (68.42) | 107 (70.39) | 211 (69.41) | 0.1492 |
| T3–4 | 10 (6.58) | 20 (13.16) | 30 (9.87) | |
| T unknown | 38 (25) | 25 (16.45) | 63 (20.72) | |
| M0 | 57 (37.5) | 59 (38.82) | 116 (38.16) | 0.1494 |
| M1 | 2 (1.32) | 8 (5.26) | 10 (3.29) | |
| M unknown | 93 (61.18) | 85 (55.92) | 178 (58.55) | |
| N0 | 70 (46.05) | 63 (41.45) | 133 (43.75) | 0.2982 |
| N1 | 26 (17.11) | 34 (22.37) | 60 (19.74) | |
| N unknown | 56 (36.84) | 55 (36.18) | 111 (36.51) | |
| Alive | 124 (81.58) | 110 (72.37) | 234 (76.97) | 0.0766 |
| Dead | 28 (18.42) | 42 (27.63) | 70 (23.03) | |
Values are n (%). *Testing set compared with training set using the chi-square test.
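The values in the last column of the table can be checked from the counts alone. The Python sketch below computes a chi-square test for a 2 × 2 table with Yates' continuity correction. The correction is an assumption (the footnote only names the chi-square test), although with it the age and survival-status rows come out close to the tabulated values. The function name is illustrative and not taken from the study.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Chi-square test with Yates' correction for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Yates: shrink each |observed - expected| by 0.5, floored at 0.
            diff = max(abs(observed[i][j] - expected) - 0.5, 0.0)
            stat += diff * diff / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Survival status: rows = alive/dead, columns = testing/training set.
stat, p = chi2_yates_2x2(124, 110, 28, 42)
print(f"chi2 = {stat:.4f}, p = {p:.4f}")
```

Rows with three categories (for example grade with an unknown class) would need a general r × c test, which this sketch does not cover.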
Figure 1. Computational process for detecting genomic instability-related lncRNAs. The cumulative number of somatic mutations per sample was calculated, and samples were ranked in decreasing order. A somatic mutation profile was then built, in which columns represent cervical cancer samples, rows represent genes, and each value is the number of altered sites for that gene in that sample. According to their mutator phenotype, samples were divided into a GU-like (genomically unstable-like) group, ranked in the top 25%, and a GS-like (genomically stable-like) group, ranked in the last 25%. Genomic instability-related lncRNAs were detected by comparing lncRNA expression profiles between the GU-like and GS-like groups; the differentially expressed lncRNAs were defined as genomic instability-associated lncRNAs.
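The grouping step in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, assuming that group sizes are obtained by rounding down and that ties are broken by sample ID; neither detail is stated in the caption, and the sample IDs and counts are made up.

```python
def split_gu_gs(mutation_counts, fraction=0.25):
    """Split samples into GU-like (top fraction) and GS-like (bottom fraction).

    mutation_counts: dict mapping sample ID -> cumulative somatic mutation count.
    Returns (gu_like, gs_like) as lists of sample IDs.
    """
    # Rank by mutation count, highest first; ties broken by sample ID (assumption).
    ranked = sorted(mutation_counts, key=lambda s: (-mutation_counts[s], s))
    k = int(len(ranked) * fraction)  # group size, rounded down (assumption)
    gu_like = ranked[:k]
    gs_like = ranked[-k:] if k > 0 else []
    return gu_like, gs_like

# Toy data: hypothetical sample IDs and mutation counts.
counts = {"S1": 412, "S2": 35, "S3": 120, "S4": 18,
          "S5": 96, "S6": 250, "S7": 61, "S8": 77}
gu, gs = split_gu_gs(counts)
print("GU-like:", gu)
print("GS-like:", gs)
```

The middle 50% of samples is left out of both groups, so the later differential expression comparison contrasts only the two extremes.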
Figure 2. Identification and functional annotation of genomic instability-related lncRNAs in patients with cervical cancer. (A) Clustering of 147 cervical cancer patients based on the expression pattern of 35 candidate genomic instability-related lncRNAs. The left blue cluster is the GS-like group, and the right red cluster is the GU-like group. (B) Boxplots of somatic mutations in the GU-like and GS-like groups. Cumulative somatic mutations in the GU-like group are significantly higher than those in the GS-like group (p < 0.001). (C) Boxplots of KRAS, PIK3CA, ARID1A and UBQLN4 expression levels in the GU-like and GS-like groups. The expression levels of these genes in the GU-like group are significantly higher than those in the GS-like group (p < 0.001). (D) Co-expression network of genomic instability-related lncRNAs and mRNAs based on the Pearson correlation coefficient. Blue circles represent lncRNAs, and red circles represent mRNAs. (E) GO and KEGG functional enrichment analysis of mRNAs co-expressed with the lncRNAs.
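The co-expression network in panel (D) rests on Pearson correlation between lncRNA and mRNA expression across samples. The Python sketch below shows the coefficient and a simple ranking of partner genes. The ranking by absolute correlation and the cutoff `k` are assumptions for illustration, since the caption does not state how partners were selected; gene names and values are placeholders.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

def top_coexpressed(lnc_expr, mrna_expr, k):
    """Return the k mRNAs with the largest |r| against one lncRNA (assumed rule)."""
    scores = {gene: pearson(lnc_expr, values) for gene, values in mrna_expr.items()}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)[:k]

# Toy expression values across four samples; gene names are placeholders.
lnc = [1.0, 2.0, 3.0, 4.0]
mrnas = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],
    "GENE_B": [1.0, 3.0, 2.0, 4.0],
    "GENE_C": [5.0, 1.0, 4.0, 2.0],
}
print(top_coexpressed(lnc, mrnas, k=2))
```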
Differentially expressed lncRNAs and their related mRNAs.
| LncRNA | logFC | p value | FDR | Related mRNAs |
|---|---|---|---|---|
| KCNMB2-AS1 | 1.633484 | 9.22E−05 | 0.018886 | BOLA2B, LAMTOR4, UQCC3, BCL7C, FAAP20, NUDT1, IZUMO4, POLR2J, NCBP2AS2, COPS9, ELOB, NT5C, SMUG1, MRPL47, TNNC2 |
| AC093895.2 | 1.544702 | 4.76E−05 | 0.014907 | LPAR6, HEBP2, ARL11, C12orf54, CRABP2, HSPB1, NRN1, XCL1, SLC44A5, SP140L, KRT15, TXNDC17, PERM1, C1orf21, CLEC2A |
| AL162413.1 | 2.309286 | 0.001041 | 0.047346 | KCNQ4, APRT, EXOSC4, CITED4, DPM2, PTGES2, RPL36, BOP1, RPL13, NDUFAF8, CCDC167, MRPL27, MRPL14, RPL8, MVD |
| FIRRE | 0.974686 | 0.000931 | 0.046215 | PAPOLG, TFAP4, KMT5A, KCTD15, KANSL2, KDM3A, METTL8, DKC1, TAF4B, ZC3H8, VANGL2, C21orf91, EFS, METAP1, FANCE |
| LINC00944 | 1.795497 | 0.000303 | 0.028837 | SLC25A22, IL2RG, RELB, GNGT2, ICOS, IL21R, MARCO, FAM24B, APOBEC3G, APOBEC3H, CTLA4, CXCL10, VCAM1, CLIC2, CD2 |
| AC005993.1 | 1.09773 | 0.000443 | 0.037799 | TESMIN, ALPG, IGF1R, CCNA1, DDX17, CCDC3, ANO1, PREX1, DGKZ, KMT5B, SERHL2, TMEM184B, BRMS1L, MBIP, DNAL4 |
| LINC02542 | 1.075473 | 0.000305 | 0.028837 | A1CF, XYLB, GGCX, AGMAT, ACOX2, SLC25A13, ACADSB, SERPIND1, PLG, ITIH1, SLC6A1, AGMO, SLCO1B1, SNTB1, HNF4A |
| LINC00649 | 0.909561 | 1.81E−05 | 0.007423 | IFI16, SNX30, N4BP1, GJB5, RARG, KCTD1, NECTIN1, MAP3K6, TRIM29, GM2A, KLF8, TRERF1, DEF6, NECTIN4, LRRC1 |
| AL023803.2 | 0.553819 | 0.000561 | 0.037799 | PAX9, CALB2, CCNO, FAM83D, MCIDAS, ITGA2B, UBE2C, LIN7B, FOXA1, PCED1A, AC011479.2, TFAP2C, MXD3, ACTR5, KMT5C |
| MAN1B1-DT | − 0.56543 | 0.000984 | 0.047346 | EXOSC6, HIRIP3, DDX28, MDP1, CHAF1B, THAP11, TTC32, C4orf36, TLX2, C9orf78, CTF1, CFDP1, EXOSC2, PIGW, UTP4 |
| AC025580.1 | − 1.87066 | 0.000609 | 0.038385 | TCTE3, SCIN, ZG16B, GFPT1, ZNF585B, FGFBP3, TTC39A, SLC44A4, ZNF345, MYO6, PDXDC1, ZFP14, ZNF529, ARFGEF3, ZNF518A |
| TRAM2-AS1 | − 0.43872 | 0.000128 | 0.02102 | ALDH5A1, BPHL, SIRT5, TPMT, KLC4, CAP2, ACOT13, MMUT, MOCS1, DHTKD1, HIBADH, YIPF3, SLC17A4, FAM8A1, EHHADH |
| RARA-AS1 | − 0.35817 | 0.000119 | 0.02102 | FKBP2, RARA, KRT18, TMEM205, RPS27L, NTHL1, G3BP1, REPS2, FUCA1, CEBPB, BLOC1S1, FAM167B, RAB17, COX14, CD63 |
| LINC01836 | − 0.99887 | 0.000317 | 0.028837 | TMC4, MSLN, WWC1, MISP, RAB20, TMPRSS3, LAMA5, ALDH3B1, TSPAN15, DOCK5, RBMS2, CRIM1, IQCE, PIWIL4, CCL28 |
| SERTAD4-AS1 | − 0.88678 | 6.68E−06 | 0.005235 | SERTAD4, DOK7, TMEM125, SIX1, CRIP2, HOXB6, HOXB5, CCDC160, MRAP2, TSPAN3, SLFN13, CRIP1, COL9A2, IFT172, SCX |
| AC132938.4 | − 0.6378 | 7.33E−05 | 0.017157 | PNPO, PIGV, CPT2, HLF, PDK2, TOM1L1, PCTP, SLC38A10, FBXO31, ACOX1, MTMR4, UGT1A3, SCP2, ZMYND12, CRYZ |
| AC107464.2 | − 0.75617 | 0.001019 | 0.047346 | PDE6B, UCP2, PRAF2, FUZ, DTX3, ZNF232, DOK1, AC005041.1, COL9A2, NAT14, CRIP1, UBXN11, C2orf15, C11orf49, CLUAP1 |
| MIR100HG | − 0.8595 | 0.000184 | 0.025064 | SPRY2, SPRY1, DLG4, KIF26B, MFGE8, ZNF853, FGF18, SPRY4, MFAP4, EFEMP2, REV3L, ETV5, VCAN, KCNH3, LRIG1 |
| AC083964.1 | − 0.47203 | 0.000799 | 0.042653 | TDRP, CCDC28B, FABP6, MARCO, NPPC, KREMEN2, TNNI2, IL11RA, COL16A1, LIFR, FAM71E1, PARM1, CD200, TRAF2, SOCS1 |
| IRAIN | − 0.1848 | 0.000699 | 0.042435 | TESMIN, IGF1R, ALPG, CCNA1, CCDC3, ANO1, CSPG5, PREX1, TMEM184B, SLC39A8, RGS10, DNAL4, KMT5B, RNF32, DDX17 |
| AP001527.2 | − 1.66464 | 0.000317 | 0.028837 | YAP1, BIRC2, CEP126, TMEM123, CFAP300, SYDE1, SLC1A6, DYNC2H1, DCUN1D5, FADS3, BIRC3, IKBIP, HMGB3, ELOVL3, GPAT2 |
| BMPR1B-DT | − 2.95927 | 0.000575 | 0.037799 | BMPR1B, SOX17, FBLN1, PAK1IP1, FAM189A2, MAP2K6, HOXA10, TUBA3D, RBBP7, AADAT, LHX2, ELP3, ASRGL1, IGF1, ALKAL2 |
| AC096733.3 | − 0.41954 | 0.000561 | 0.037799 | TBC1D9, WDFY3, HELQ, USP53, ELF2, SMARCAD1, NEK1, KIAA1109, THUMPD1, SETD1B, KDM6A, KIDINS220, DNAJB14, ARID2, EIF2AK3 |
| AC097359.2 | − 0.38644 | 0.000777 | 0.042653 | TCTA, SLC25A20, QPRT, TK2, FN3K, ABHD6, CMTM8, MYRIP, SLC26A1, ALDH4A1, CPN2, SYPL2, HNF1A, IQSEC1, OAF |
| ADCY6-DT | − 0.96724 | 0.000577 | 0.037799 | JSRP1, PLA2G10, SMIM22, TRIM54, GCNT3, ASPHD1, PDE4C, METTL27, TNNC2, PRR13, RNASEH2C, PGP, RASSF7, ELOB, TMEM238 |
| IQCH-AS1 | − 0.40448 | 0.000238 | 0.027863 | NEK8, NEIL1, C2orf15, COA5, DIS3L, P4HTM, SNAPC5, ZNF33B, BBS4, MYO5C, LZTS3, FAM81A, ARPIN, LRTOMT, CCDC57 |
| AC129510.1 | − 0.47954 | 0.000522 | 0.037799 | CCDC14, AHI1, WDR90, NKTR, PHF12, PNISR, CFAP44, SREK1, MSANTD2, EFHC1, KIF27, VEZF1, PASK, DNAL1, KIAA0753 |
| LINC02875 | − 0.69396 | 0.000465 | 0.037799 | PIGP, RAB6B, SOX2, C6orf226, CDKAL1, TNRC6C, TBX2, TMEM251, CHAF1B, CHST7, ADRA2B, TP53I13, BFSP1, CD200, THAP7 |
| LOXL1-AS1 | − 0.59163 | 9.59E−06 | 0.005235 | LOXL1, ADPGK, CHSY1, LARP6, SLC35E4, RCN2, THAP10, KIAA0753, NCBP3, VCL, CHD3, DTX3, PTPN9, CNTROB, MYO9A |
| FGF14-AS2 | − 1.20708 | 3.01E−06 | 0.004925 | CMBL, ACAA2, TMEM205, BTD, CYP2B6, ZG16, CYP2A6, CYB5A, SERPINA4, HAO1, ACBD4, CLYBL, SLC10A1, CYP2A13, PCK2 |
| AL391422.4 | − 0.56864 | 0.000766 | 0.042653 | PXDC1, TMEM14C, SAA2, CUTA, YIPF3, TRIM27, RNF5, C6orf89, MOCS1, SAA1, NMT2, SLC39A7, SIRT5, C9, MRPL2 |
| AC025265.1 | − 0.56836 | 5.46E−05 | 0.014907 | NT5DC3, MTERF2, OVGP1, GOLGA8B, RPL9, SLC25A16, KLHL23, NR2C1, NSUN6, MPST, CENPV, C12orf73, ZNF577, ABCA5, CHKA |
| ATP1A1-AS1 | − 0.42986 | 0.000822 | 0.042653 | ABCD3, PRKAA2, NBR1, TOM1L1, CNNM3, C16orf58, C1orf56, SPATA25, DDAH1, USP30, CRYZ, ST3GAL3, PARD3B, REPIN1, COX11 |
| EIF3J-DT | − 0.40662 | 0.000201 | 0.025372 | C2orf15, ZBTB26, VPS39, ZNF512, POLR2M, ETAA1, ZBTB14, HNRNPA1L2, ZNF33B, ICE2, MKS1, ZNF248, KAT8, INTS14, CTDSPL2 |
| AC114956.2 | − 0.51268 | 0.000145 | 0.021664 | C5orf34, NIPBL, ZNF131, RAD1, DROSHA, C5orf51, RICTOR, C5orf22, NUP155, TMEM267, DNAJC21, CPLANE1, ICE1, MARCHF6, PAIP1 |
Figure 3. Establishment of the prognostic model and validation of the genomic instability-derived lncRNA signature (LncSig) for outcome prediction in the training set. (A) Five lncRNAs for establishment of the prognostic model. (B) Kaplan–Meier estimates of overall survival of patients with low or high risk predicted by the LncSig in the training set (p < 0.001). (C) Time-dependent ROC curve analysis of the LncSig at 3 years (AUC = 0.783). (D) LncRNA expression patterns, distribution of somatic mutations, and KRAS expression with increasing LncSig score. (E) Distribution of cumulative somatic mutations in the high- and low-risk groups. (F) KRAS, PIK3CA, ARID1A and UBQLN4 expression in the high- and low-risk groups. Red represents the high-risk group, and blue represents the low-risk group.
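A signature of this kind is commonly scored as a weighted sum of lncRNA expression values, with patients split into risk groups at a cutoff. The Python sketch below illustrates that general pattern only. The coefficients are placeholders rather than the fitted values from the study, and the median cutoff is an assumption; neither is given in the text here.

```python
from statistics import median

# Placeholder weights for illustration; NOT the coefficients fitted in the study.
HYPOTHETICAL_COEFS = {"AC107464.2": -0.5, "MIR100HG": 0.3, "AP001527.2": -0.2}

def risk_score(expression, coefs):
    """Weighted sum of expression values over the signature lncRNAs."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)

def assign_risk_groups(patients, coefs, cutoff=None):
    """Score each patient and label as 'high' or 'low' risk.

    If no cutoff is given, the median score of these patients is used
    (an assumed rule). Scores equal to the cutoff are labelled 'low'.
    """
    scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
    if cutoff is None:
        cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, groups, cutoff

# Toy expression values for four hypothetical patients.
training = {
    "P1": {"AC107464.2": 1.0, "MIR100HG": 2.0, "AP001527.2": 1.0},
    "P2": {"AC107464.2": 0.0, "MIR100HG": 3.0, "AP001527.2": 0.0},
    "P3": {"AC107464.2": 2.0, "MIR100HG": 0.0, "AP001527.2": 1.0},
    "P4": {"AC107464.2": 0.0, "MIR100HG": 1.0, "AP001527.2": 0.0},
}
scores, groups, cutoff = assign_risk_groups(training, HYPOTHETICAL_COEFS)
print(groups)
```

Passing the training-set cutoff as `cutoff` when scoring a testing set keeps the two sets on the same threshold, which is one common way to validate such a model.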
Figure 4. Performance evaluation of the LncSig in the testing set and whole set. Kaplan–Meier estimates of overall survival of patients with low or high risk predicted by the LncSig in the testing set (A) and whole set (B). Time-dependent ROC curve analysis of the LncSig at 3 years in the testing set (C) and whole set (D). LncRNA expression patterns, somatic mutation count distribution, and KRAS expression for patients in the high- and low-risk groups in the testing set (E) and whole set (F). Distribution of somatic mutations and KRAS expression in patients of the high- and low-risk groups in the testing set (G) and whole set (H).
Figure 5. Stratification analyses by age, histological grade and FIGO stage. Kaplan–Meier curve analysis of overall survival in high-risk and low-risk groups for younger patients (age ≤ 46) (A) and older patients (age > 46) (B), for early-grade patients (histological grade 1–2) (C) and late-grade patients (histological grade 3) (D), and for early-stage patients (FIGO stage I–IIA) (E) and late-stage patients (FIGO stage IIB–IVB) (F). Statistical analysis was performed using the log-rank test and univariate Cox analysis.
Figure 6. Combined survival analysis of genotyping and mutation. (A) Kaplan–Meier curve analysis of overall survival for patients classified according to KRAS mutation status and GU/GS group. (B) Kaplan–Meier curve analysis of overall survival for patients classified according to KRAS mutation status and the LncSig.
Figure 7. Survival analysis for model comparison. ROC analysis of overall survival at 3 years for the LncSig, AalijahanLncSig and MiguelLncSig.