Shulei Ren1, Wook Lee1, Kyungsook Han2.
Abstract
BACKGROUND: Lymph node metastasis is usually detected from images obtained in clinical examinations. Detecting lymph node metastasis from clinical examinations is a direct way of diagnosing metastasis, but the diagnosis can be made only after lymph node metastasis has occurred.
Keywords: Competitive endogenous RNA; Lymph node metastasis; Prognosis; miRNA–mediated RNA interaction
Year: 2022 PMID: 35430805 PMCID: PMC9014599 DOI: 10.1186/s12920-022-01231-x
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.622
Performance of the prediction model with different types of features in the fivefold cross validation
| Cancer | Feature | #Features | #PCs | SN | SP | ACC | PPV | NPV | AUC |
|---|---|---|---|---|---|---|---|---|---|
| BRCA | EXP | 5119 | 430 | 0.674 | 0.709 | 0.692 | 0.694 | 0.689 | 0.691 |
| | | 1563 | 480 | | | | | | |
| COAD | EXP | 835 | 100 | 0.360 | 0.935 | 0.758 | 0.711 | 0.767 | 0.647 |
| | | 1969 | 80 | | | | | | |
| HNSC | EXP | 292 | 10 | 0.750 | 0.684 | 0.720 | 0.739 | 0.696 | 0.717 |
| | | 800 | 100 | | | | | | |
| LUAD | EXP | 6193 | 110 | 0.477 | 0.882 | 0.741 | 0.683 | 0.759 | 0.679 |
| | | 12,981 | 200 | | | | | | |
| LUSC | EXP | 1371 | 190 | 0.644 | 0.867 | 0.786 | 0.736 | 0.809 | 0.756 |
| | | 2436 | 200 | | | | | | |
| STAD | EXP | 476 | 120 | 0.905 | 0.472 | 0.763 | 0.778 | 0.708 | 0.688 |
| | | 17,445 | 60 | | | | | | |
| THCA | EXP | 4205 | 30 | 0.663 | 0.663 | 0.663 | 0.634 | 0.691 | 0.663 |
| | | 3397 | 150 | | | | | | |
In comparison of two types of features (RNA expression vs. deltaPCC), the better performances are shown in bold
In all cancer types, prediction with PCCs showed a better performance than that with RNA expression levels
PC, principal component; SN, sensitivity; SP, specificity; ACC, accuracy; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve; EXP, RNA expression level
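The threshold-based metrics abbreviated above follow the standard confusion-matrix definitions. The sketch below is illustrative only: the function name and the counts are hypothetical and are not taken from the study. AUC is omitted because it needs ranked prediction scores rather than counts.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return SN, SP, ACC, PPV and NPV from confusion-matrix counts.

    tp/fn: metastasis samples predicted correctly/incorrectly.
    tn/fp: non-metastasis samples predicted correctly/incorrectly.
    """
    return {
        "SN": tp / (tp + fn),                    # sensitivity (recall)
        "SP": tn / (tn + fp),                    # specificity
        "ACC": (tp + tn) / (tp + fn + tn + fp),  # accuracy
        "PPV": tp / (tp + fp),                   # positive predictive value
        "NPV": tn / (tn + fn),                   # negative predictive value
    }


# Hypothetical counts, chosen only to illustrate the formulas.
metrics = classification_metrics(tp=30, fn=10, tn=45, fp=15)
```

With imbalanced classes, as in several of the cancer types here, SN and SP can diverge sharply even when ACC looks reasonable, which is why the tables report all of them.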
Performance of the prediction model with different types of features in an independent testing
| Cancer | Feature | #Features | #PCs | SN | SP | ACC | PPV | NPV | AUC |
|---|---|---|---|---|---|---|---|---|---|
| BRCA | EXP | 5119 | 430 | 0.664 | 0.710 | 0.688 | 0.690 | 0.685 | 0.687 |
| | | 1563 | 480 | | | | | | |
| COAD | EXP | 835 | 100 | 0.563 | 0.932 | 0.819 | 0.783 | 0.829 | 0.747 |
| | | 1969 | 80 | | | | | | |
| HNSC | EXP | 292 | 10 | 0.867 | 0.792 | 0.833 | 0.839 | 0.826 | 0.829 |
| | | 800 | 100 | | | | | | |
| LUAD | EXP | 6193 | 110 | 0.622 | 0.943 | 0.832 | 0.852 | 0.825 | 0.782 |
| | | 12,981 | 200 | | | | | | |
| LUSC | EXP | 1371 | 190 | 0.533 | 0.808 | 0.707 | 0.615 | 0.750 | 0.671 |
| | | 2436 | 200 | | | | | | |
| STAD | EXP | 476 | 120 | 0.937 | 0.452 | 0.777 | 0.776 | 0.778 | 0.694 |
| | | 17,445 | 60 | | | | | | |
| THCA | EXP | 4205 | 30 | 0.796 | 0.768 | 0.757 | |||
| 3397 | 150 | 0.658 | 0.745 | 0.761 |
In comparison of two types of features (RNA expression vs. deltaPCC), the better performances are shown in bold
In all cancer types except thyroid cancer (THCA), prediction with PCCs showed a better performance than that with RNA expression levels
PC, principal component; SN, sensitivity; SP, specificity; ACC, accuracy; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve; EXP, RNA expression level
Comparison of the performance of our SVM model with that of Zhang’s SVM model [8]
| Cancer | Method_feature | Train_score | Test_score |
|---|---|---|---|
| BRCA | Our model_PCC | | |
| | Zhang_mRNA | 0.798 | 0.680 |
| | Zhang_miRNA | 0.764 | 0.737 |
| | Zhang_lncRNA | 0.793 | 0.696 |
| COAD | Our model_PCC | | |
| | Zhang_mRNA | 0.849 | 0.871 |
| | Zhang_miRNA | 0.902 | 0.886 |
| | Zhang_lncRNA | 0.869 | 0.871 |
| LUAD | Our model_PCC | | |
| | Zhang_mRNA | 0.808 | 0.849 |
| | Zhang_miRNA | 0.885 | 0.795 |
| | Zhang_lncRNA | 0.798 | 0.849 |
| LUSC | Our model_PCC | | |
| | Zhang_mRNA | 0.871 | 0.900 |
| | Zhang_miRNA | 0.939 | 0.847 |
| | Zhang_lncRNA | 0.861 | 0.900 |
In comparison of two types of features (RNA expression vs. deltaPCC), the better performances are shown in bold
Of the seven cancer types used in our study, the comparison covers four because they are the only cancer types common to both studies. The train_score and test_score were obtained using the scikit-learn package, which was also used in Zhang’s study. In all four cancer types, our model performed better in both training and testing. Our model_PCC: SVM model using PCCs as features. Zhang_X: SVM model using the expression levels of RNA type X as features
Comparison of p-values from the log-rank test with miRNA–RNA pair, and individual RNA and miRNA involved in the pair
| Cancer | miRNA–RNA pair | Type of RNA in the pair | p-value (miRNA–RNA pair) | p-value (RNA) | p-value (miRNA) |
|---|---|---|---|---|---|
| BRCA | miR-26b_AC079414.1 | lncRNA | 2.270E−05 | 9.203E−01 | 5.896E−01 |
| | miR-3192_PPDPFL | mRNA | 6.320E−05 | 1.351E−03 | 1.346E−02 |
| | miR-3192_AC013549.3 | lncRNA | .260E−04 | 5.028E−01 | 1.346E−02 |
| COAD | miR-604_AL162426.1 | lncRNA | 1.869E−04 | 4.365E−01 | 6.730E−01 |
| | miR-3679_RPL26P29 | Pseudogene | 3.122E−04 | 1.315E−02 | 8.171E−01 |
| | miR-6835_AC037459.2 | lncRNA | 7.746E−04 | 9.815E−01 | 2.938E−02 |
| HNSC | miR-4539_KRTAP10-2 | mRNA | 1.849E−04 | 3.033E−01 | 1.629E−03 |
| | miR-6730_LINC01435 | lncRNA | 9.783E−04 | 1.038E−02 | 3.211E−03 |
| | miR-5195_AL390067.1 | lncRNA | 1.070E−03 | 8.716E−02 | 3.435E−02 |
| LUAD | miR-581_LINC00628 | lncRNA | 4.719E−07 | 1.925E−02 | 8.736E−01 |
| | miR-7848_AC087588.2 | lncRNA | 2.220E−06 | 1.750E−05 | 7.506E−01 |
| | miR-3680-1_AL138789.1 | lncRNA | 1.300E−05 | 2.386E−02 | 5.371E−01 |
| LUSC | miR-548z_PNLIPRP2 | Pseudogene | 1.175E−04 | 3.178E−01 | 6.640E−04 |
| | miR-3972_CSAG4 | Pseudogene | 1.485E−04 | 5.168E−01 | 4.740E−01 |
| | miR-146b_PHETA2 | mRNA | 1.488E−04 | 4.779E−02 | 2.760E−01 |
| STAD | miR-604_OLFML3 | mRNA | 1.000E−05 | 4.787E−02 | 4.921E−01 |
| | miR-554_OR10A5 | mRNA | 4.040E−05 | 4.727E−03 | 5.852E−02 |
| | miR-149_OR10A5 | mRNA | 1.689E−04 | 4.727E−03 | 8.850E−01 |
| THCA | miR-5685_GADD45A | mRNA | 3.489E−03 | 7.915E−01 | 2.587E−01 |
| | miR-6784_AC093281.2 | lncRNA | 3.762E−03 | 5.934E−01 | 5.559E−02 |
| | miR-8071-2_CFB | mRNA | 3.991E−03 | 1.392E−02 | 9.494E−01 |
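For readers unfamiliar with the log-rank test behind these p-values, the following is a minimal pure-Python sketch of the standard two-group version. It is not the authors' code: the function name, the toy follow-up times, and the way patients are split into two groups (for example, high versus low PCC of a pair) are assumptions made for illustration.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: follow-up time per patient; events_*: 1 = death observed,
    0 = censored. Returns (chi-square statistic, p-value with 1 df).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Deaths occurring exactly at time t, per group.
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    stat = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Toy data (hypothetical): group A dies early, group B dies late.
stat, p_value = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

A small p-value indicates that the two survival curves differ more than expected by chance, which is the sense in which the table compares a miRNA–RNA pair against its individual RNA and miRNA.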
Fig. 1 Overall survival rates of patients with respect to PCCs of miRNA–RNA pairs in 7 cancer types. PCCs of miRNA–RNA pairs are predictive of the survival rates of patients in all 7 types of cancer
Fig. 2 Overall survival rates of patients with respect to expressions of individual RNAs in Fig. 1. In contrast to the miRNA–RNA pairs, none of the individual RNAs showed predictive power of the survival rates of cancer patients
The number of normal samples, tumor samples, tumor samples with lymph node metastasis, and tumor samples without lymph node metastasis in seven types of cancer
| Cancer | #Normal samples | #Tumor samples | #Lymph node metastasis | #Non-metastasis |
|---|---|---|---|---|
| BRCA | 113 | 1102 | 447 | 457 |
| COAD | 41 | 478 | 107 | 242 |
| HNSC | 44 | 500 | 98 | 81 |
| LUAD | 59 | 533 | 123 | 231 |
| LUSC | 49 | 502 | 149 | 259 |
| STAD | 32 | 375 | 210 | 103 |
| THCA | 58 | 502 | 127 | 145 |
Fig. 3 ceRNA network for breast invasive carcinoma (BRCA). The network is composed of 1563 miRNA–RNA interactions among 119 miRNAs, 423 lncRNAs, 380 mRNAs and 252 pseudogenes. The small network centered at miR-149 is a blowup of the subnetwork enclosed by a red box
Fig. 4 Subnetworks of patient-specific ceRNA networks for two LUAD patients. (A) LUAD patient (TCGA-44-7670) with a high PCC of the miR-581_LINC00628 pair. (B) LUAD patient (TCGA-NJ-A55O) with a low PCC of the miR-581_LINC00628 pair. The RNAs involved in the three miRNA–RNA pairs of Table 4 are marked by red boxes. For clarity, subnetworks of the three miRNA–RNA pairs are displayed
Fig. 5 Overview of the overall workflow. There are three types of samples: normal samples (gray), tumor samples without lymph node metastasis (sky blue) and tumor samples with lymph node metastasis (pink). In our prediction model, tumor samples with lymph node metastasis (pink) and tumor samples without lymph node metastasis (sky blue) are treated as positive and negative instances, respectively
The number of RNAs of four biotypes in each cancer type included in this study
| Cancer | #miRNAs | #mRNAs | #lncRNAs | #pseudogenes |
|---|---|---|---|---|
| BRCA | 165 | 18,084 | 8553 | 5528 |
| COAD | 157 | 17,573 | 7284 | 5304 |
| HNSC | 95 | 18,018 | 7427 | 4643 |
| LUAD | 197 | 18,054 | 8755 | 5954 |
| LUSC | 161 | 18,227 | 8706 | 5680 |
| STAD | 379 | 18,617 | 10,354 | 9039 |
| THCA | 153 | 17,568 | 7342 | 4753 |
The number of features left after each filtering process. miRNA–RNA pairs with MIC both in normal samples and tumor samples were removed by MIC filtering
| Cancer | #Features after MIC filtering | #Features after Wilcox test | #Features after PCA |
|---|---|---|---|
| BRCA | 90,837 | 1563 | 480 |
| COAD | 178,973 | 1969 | 80 |
| HNSC | 67,020 | 800 | 100 |
| LUAD | 341,146 | 12,981 | 200 |
| LUSC | 165,765 | 2436 | 200 |
| STAD | 976,763 | 17,445 | 60 |
| THCA | 38,077 | 3397 | 150 |
The miRNA–RNA pairs with a p-value 0.01 were removed by the Wilcox test. The number of features was further reduced after dimension reduction by PCA of PCCs. In both MIC filtering and the Wilcox test, each feature represents a miRNA–RNA pair. In PCA, the number of features indicates the dimension of a feature vector
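Each feature before PCA is a correlation value for one miRNA–RNA pair. Assuming PCC denotes the Pearson correlation coefficient (this excerpt does not spell the abbreviation out), a single value can be computed as in the sketch below. The function name and the expression values are hypothetical, and how the study derives patient-specific or delta PCCs is not described here.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical expression levels of one miRNA and one target RNA
# measured across the same four samples.
mirna_expr = [1.0, 2.0, 3.0, 4.0]
rna_expr = [8.0, 6.0, 4.0, 2.0]
pcc = pearson_cc(mirna_expr, rna_expr)
```

In this toy case the RNA level falls linearly as the miRNA level rises, so the coefficient sits at the negative extreme of its range.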