Yi Zuo1,2, Shaoqiu Chen2,3, Lingling Yan1, Ling Hu1, Scott Bowler2,3, Emory Zitello2,3, Gang Huang4, Youping Deng2.
Abstract
Liver cancer presents divergent clinical behaviors. There remain opportunities for molecular markers to improve liver cancer diagnosis and prognosis, especially since tRNA-derived small RNAs (tsRNA) have rarely been studied. In this study, a random forests (RF) diagnostic model was built based upon tsRNA profiling of paired tumor and adjacent normal samples and validated by independent validation (IV). A LASSO model was used to develop a seven-tsRNA-based risk score signature for liver cancer prognosis. Model performance was evaluated by the receiver operating characteristic curve (ROC curve) and the precision-recall curve (PR curve). The five-tsRNA-based RF diagnostic model achieved an area under the receiver operating characteristic curve (AUROC) of 88% and an area under the precision-recall curve (AUPR) of 87% in the discovery cohort, and an IV-AUROC of 87% and an IV-AUPR of 86% in independent validation. The seven-tsRNA-based prognostic model predicts the overall survival of liver cancer patients (hazard ratio 2.02, 95% CI 1.36-3.00, P < 0.001), independent of standard clinicopathological prognostic factors. Moreover, the model successfully categorizes patients into high- and low-risk groups. Diagnostic and prognostic models based upon tsRNA characterization can be reliably used for the diagnosis of liver cancer and for classifying patients into high- and low-risk groups.
Keywords: Diagnosis; Liver cancer; Prognosis; Random forests; tRNA-derived small RNAs
Year: 2021 PMID: 35224155 PMCID: PMC8843861 DOI: 10.1016/j.gendis.2021.01.006
Source DB: PubMed Journal: Genes Dis ISSN: 2352-3042
Demographic and clinical characteristics of patients with liver cancer in the discovery set Cohort 1 (TCGA) and the independent validation set Cohort 2 (GSE76903).
| Cohort | TCGA | GSE76903 |
|---|---|---|
| | 379 | 20 |
| Adjacent Normal | 57 | 20 |
| Primary Tumor | 57 | 20 |
| | 61 (51–59.14) | 50 (45.25–60.25) |
| Male, count (%) | 251 (66.2) | 17 (85) |
| Asian | 161 | 20 |
| White | 192 | |
| Black or African American | 15 | |
| American Indian or Alaska Native | 2 | |
| Not Reported | 9 | |
| Radiation Therapy | 181 | Not Reported |
| Pharmaceutical Therapy | 198 | |
| Stage I | 189 | Not Reported |
| Stage II | 94 | |
| Stage III | 86 | |
| Stage IV | 10 | |
| Hepatocellular carcinoma | 331 | 20 |
| Cholangiocarcinoma | 48 | |
| Death | 130 | Not Reported |
| Alive | 249 | |
| | 575 (308.5–1106.5) | Not Reported |
Figure 1. tsRNA expression analysis for liver cancer diagnosis. Unsupervised hierarchical clustering of all significant tsRNA markers selected for use in the diagnostic model. Each row is a tsRNA, and each column is a patient sample.
The top 16 tsRNAs with a p-value below 0.05 in the paired Student's t-test, with their tRF sequences.
| ID | Mintbase ID | tRF Sequences (5′–3′) | P value | Fold Change (C/N) | FDR |
|---|---|---|---|---|---|
| ts-N7 | NA | GCCCGGATGATCCTCAGTGGTCTGGGGTGCAGGCTTC | 2.91321E-09 | 0.351073565 | 1.19E-07 |
| ts-N63 | tRF-22-RKVP4P9LL | GGGGGTATAGCTCAGTGGTAGA | 2.98038E-07 | 0.420801509 | 6.11E-06 |
| ts-N144 | tRF-18-897PVP04 | TCCTCGTTAGTATAGTGG | 1.58555E-05 | 0.458798884 | 0.000217 |
| ts-N53 | tRF-19-6S7P4PK4 | GGCCGGTTAGCTCAGTTGG | 4.83256E-05 | 0.407564036 | 0.000495 |
| ts-N102 | tRF-18-8R1546D2 | TCCCCAGTACCTCCACCA | 7.6764E-05 | 2.046345387 | 0.000629 |
| ts-N42 | tRF-19-QR18LOJ4 | GCTCCAGTGGCGCAATCGG | 0.000199338 | 0.317559469 | 0.001248 |
| ts-N94 | tRF-18-07QSNHD2 | ACCCTGCTCGCTGCGCCA | 0.000213077 | 0.657910809 | 0.001248 |
| ts-N34 | tRF-20-79MP9P9M | GTTTCCGTAGTGTAGTGGTC | 0.00031426 | 0.517865005 | 0.001611 |
| ts-N59 | tRF-20-HDK2RSI2 | ATAACCCAGAGGTCGATGGA | 0.000415514 | 2.079287758 | 0.001738 |
| ts-N84 | tRF-28-HJ83RPFQZD0M | ATAGCTCAGTGGTAGAGCATTTGACTGC | 0.000423901 | 0.636436975 | 0.001738 |
| ts-N52 | tRF-25-0P58309NDJ | ACCAGGATGGCCGAGTGGTTAAGGC | 0.002677032 | 0.634413418 | 0.009978 |
| ts-N115 | tRF-17-WSNKP92 | TCTCGCTGGGGCCTCCA | 0.006990814 | 0.775397813 | 0.023885 |
| ts-N37 | tRF-29-RKVP4P9L5FKP | GGGGGTATAGCTCAGTGGTAGAGCATTTG | 0.011555279 | 0.486538973 | 0.036444 |
| ts-N41 | tRF-20-6S7P4PZ3 | GGCCGGTTAGCTCAGTTGGT | 0.013231647 | 2.532122558 | 0.03875 |
| ts-N71 | tRF-20-73VL4YMY | GTGGTTAGTACTCTGCGTTG | 0.045561725 | 0.31115994 | 0.122153 |
| ts-N44 | tRF-24-S3M8309N0Y | GTAGTCGTGGCCGAGTGGTTAAGG | 0.04766928 | 0.711120108 | 0.122153 |
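The sixth column of the table above appears consistent with a Benjamini-Hochberg adjustment of the p-values in the fourth column. The sketch below shows that calculation in plain Python. The family size of 41 tests is an assumption inferred from the ratio of the first row's FDR to its p-value; the source does not state the number of tsRNAs tested or the correction method, and the function name `bh_adjust` is illustrative only.

```python
def bh_adjust(sorted_pvalues, n_tests):
    """Benjamini-Hochberg adjusted p-values for the smallest p-values of a family.

    sorted_pvalues: p-values in ascending order (the k smallest of the family).
    n_tests: total number of hypotheses tested (assumed, see note below).
    """
    # Step 1: scale each p-value by n_tests / rank, capped at 1.
    raw = [min(1.0, p * n_tests / rank)
           for rank, p in enumerate(sorted_pvalues, start=1)]
    # Step 2: enforce monotonicity by taking a running minimum from the
    # largest p-value downwards. With a truncated list, p-values beyond
    # the last one supplied are ignored, so results are upper bounds.
    adjusted = raw[:]
    for i in range(len(adjusted) - 2, -1, -1):
        adjusted[i] = min(adjusted[i], adjusted[i + 1])
    return adjusted


# P values copied from the table above, in table order (already ascending).
table_pvalues = [
    2.91321e-09, 2.98038e-07, 1.58555e-05, 4.83256e-05,
    7.6764e-05, 0.000199338, 0.000213077, 0.00031426,
    0.000415514, 0.000423901, 0.002677032, 0.006990814,
    0.011555279, 0.013231647, 0.045561725, 0.04766928,
]

# ASSUMPTION: 41 tests in total, inferred from the table, not stated in the source.
adjusted = bh_adjust(table_pvalues, n_tests=41)

for p, q in zip(table_pvalues, adjusted):
    print(f"p = {p:.6g}  adjusted = {q:.6g}")
```

Under that assumption, the adjusted values agree with the tabulated FDR column to the precision shown, including the tied values at ranks 6-7, 9-10, and 15-16 that arise from the monotonicity step.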
Figure 2. Random forest diagnostic model and LASSO selection. (A) ROC curves of the diagnostic prediction model with tsRNA markers in the discovery data (TCGA) and the independent validation data (GEO dataset). (B) PR curves in the discovery data (TCGA) and the independent validation data (GEO dataset). (C) LASSO coefficient profiles of the liver-cancer-associated tsRNAs. (D) Seven tsRNAs selected by LASSO Cox regression analysis.
The top 8 tsRNAs with a p-value below 0.1 in univariate Cox models.
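As a reminder of what the AUROC in panel (A) measures, the following minimal sketch computes it as the probability that a randomly chosen tumor sample receives a higher model score than a randomly chosen normal sample, with ties counted as one half. The scores and labels are made-up toy values, not data from the study, and the function name `auroc` is illustrative.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    scores: model outputs, higher meaning "more likely tumor".
    labels: 1 for tumor, 0 for adjacent normal.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one tumor and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # tumor ranked above normal
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores: two tumor samples followed by two normal samples.
toy_scores = [0.9, 0.4, 0.8, 0.3]
toy_labels = [1, 1, 0, 0]
print(auroc(toy_scores, toy_labels))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of this size; a rank-based formulation gives the same value for larger data.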
| ID | Mintbase ID | tRF Sequences (5′–3′) | P value | Hazard Ratio | 95% CI |
|---|---|---|---|---|---|
| ts-N20 | tRF-25-395P4PN3FJ | CCTTCGATAGCTCAGCTGGTAGAGC | 0.081739 | 1.3388 | 0.964–1.86 |
| ts-N21 | tRF-23-395P4PN3X | CCTTCGATAGCTCAGCTGGTAGA | 0.008897 | 1.5525 | 1.115–2.163 |
| ts-N22 | tRF-18-YSQSD2D2 | TTCCGGCTCGAAGGACCA | 7.23E-05 | 0.5108 | 0.364–0.716 |
| ts-N36 | tRF-23-HDK2RSI20K | ATAACCCAGAGGTCGATGGATCG | 0.021841 | 1.4735 | 1.056–2.057 |
| ts-N37 | tRF-29-RKVP4P9L5FKP | GGGGGTATAGCTCAGTGGTAGAGCATTTG | 0.016874 | 1.4965 | 1.073–2.087 |
| ts-N44 | tRF-24-S3M8309N0Y | GTAGTCGTGGCCGAGTGGTTAAGG | 0.008527 | 1.5573 | 1.118–2.168 |
| ts-N45 | tRF-30-87R8WP9N1EWJ | TCCCTGGTGGTCTAGTGGTTAGGATTCGGC | 0.085083 | 1.3346 | 0.96–1.855 |
| ts-N64 | tRF-18-8R6546D2 | TCCCCGGCACCTCCACCA | 0.007235 | 0.6342 | 0.453–0.888 |
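The hazard ratios, confidence intervals, and p-values in the table above are linked through the standard error of the log hazard ratio. The sketch below recovers an approximate Wald p-value from a hazard ratio and its 95% CI. It assumes a Wald-type interval of the form exp(log HR ± 1.96 × SE); the source does not say which test produced the tabulated p-values, so agreement is only approximate, and the function name is illustrative.

```python
import math


def wald_p_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided Wald p-value from a hazard ratio and its 95% CI.

    Assumes the interval is exp(log(hr) +/- z_crit * se).
    Returns (z, p).
    """
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Rows ts-N20 and ts-N21 from the table above.
z20, p20 = wald_p_from_hr_ci(1.3388, 0.964, 1.86)
z21, p21 = wald_p_from_hr_ci(1.5525, 1.115, 2.163)
print(f"ts-N20: z = {z20:.3f}, p = {p20:.4f}")
print(f"ts-N21: z = {z21:.3f}, p = {p21:.4f}")
```

For these two rows the recovered values are close to the tabulated 0.081739 and 0.008897. Rounding of the published interval limits the precision, and rows with very small p-values may differ more if a likelihood-ratio or score test was used.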
Figure 3. Risk score model based on the seven-tsRNA signature. Kaplan–Meier survival curves in the training set (A) and the internal validation set (B). P values were calculated by the log-rank test.
Figure 4. Correlation analysis of diagnostic tsRNAs and miRNAs. The lower-left corner shows the correlations between tsRNAs and miRNAs, and the upper-right corner shows the correlations among miRNAs. miRNA data were downloaded from the TCGA-LIHC and TCGA-CHOL projects.
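A signature-based risk score of this kind is commonly computed as a weighted sum of expression values, with patients then split into high- and low-risk groups at a cutoff. The sketch below illustrates that pattern. The coefficient values, the marker names `tsRNA_a` and `tsRNA_b`, and the median cutoff are all hypothetical placeholders; the source gives neither the LASSO coefficients nor the cutoff rule.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Linear risk score per patient: sum of coefficient * expression."""
    return [
        sum(coefficients[marker] * sample[marker] for marker in coefficients)
        for sample in expression
    ]


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median.

    ASSUMPTION: a median cutoff; the source does not state the cutoff used.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"tsRNA_a": 0.5, "tsRNA_b": -0.25}
expression = [
    {"tsRNA_a": 1.0, "tsRNA_b": 2.0},
    {"tsRNA_a": 3.0, "tsRNA_b": 0.5},
    {"tsRNA_a": 2.0, "tsRNA_b": 2.0},
]

scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
print(list(zip(scores, groups)))
```

The resulting group labels are what a Kaplan–Meier comparison with a log-rank test would then take as input.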