Xin-Fei Wang, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Wen-Zhun Huang, Yue-Chao Li, Zhong-Hao Ren, Yong-Jian Guan.
Abstract
Emerging evidence has revealed that circular RNA (circRNA) is widely distributed in mammalian cells and functions as a microRNA (miRNA) sponge involved in the transcriptional and posttranscriptional regulation of gene expression. Recognizing circRNA-miRNA interactions provides a new perspective for the detection and treatment of complex human diseases. Traditional biological experiments for identifying molecular associations are small in scale, time-consuming, and laborious, whereas computational models can provide a basis for biological experiments at low cost. Because few computational models have been proposed so far, it is necessary to develop an effective computational method to predict circRNA-miRNA interactions. This study therefore proposes a novel computational method, named KGDCMI, to predict the interactions between circRNA and miRNA based on multi-source information extraction and fusion. KGDCMI obtains RNA attribute information from sequence and similarity, and captures the behavior information in RNA associations through a graph-embedding algorithm. The resulting feature vectors are then further reduced by principal component analysis and fed to a deep neural network for information fusion and prediction. KGDCMI achieves an area under the curve (AUC) of 89.30% and an area under the precision-recall curve (AUPR) of 87.67%. On the same dataset, its AUC and AUPR are 2.37% and 3.08% higher, respectively, than those of the only existing model. We also conducted three groups of comparative experiments to determine the best classification strategy, feature extraction parameters, and dimensions. In addition, in the case study, 7 of the top 10 predicted interaction pairs were confirmed in PubMed. These results suggest that KGDCMI is a feasible and useful method for predicting circRNA-miRNA interactions and can provide reliable candidates for related RNA biological experiments.
Keywords: K-mer; circRNA; circRNA–miRNA interaction; deep neural network; graph embedding
Year: 2022 PMID: 36051691 PMCID: PMC9426772 DOI: 10.3389/fgene.2022.958096
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. The K-mer algorithm for sequence feature extraction.
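
As an illustration of the K-mer idea, the following minimal sketch slides a window of length k over an RNA sequence and records how often each possible k-mer occurs. The function name `kmer_frequencies`, the `ACGU` alphabet, the T-to-U conversion, and the normalization by window count are assumptions made for this example; the record does not state which variant KGDCMI uses.

```python
from itertools import product


def kmer_frequencies(sequence: str, k: int = 3, alphabet: str = "ACGU") -> list[float]:
    """Return normalized k-mer frequencies, ordered as itertools.product(alphabet, repeat=k)."""
    # Assumption: DNA-style input is mapped to RNA letters before counting.
    sequence = sequence.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            windows += 1
    if windows == 0:
        return [0.0] * len(kmers)
    return [counts[m] / windows for m in kmers]


# Hypothetical toy sequence; k=2 gives a 4**2 = 16-dimensional feature vector.
features = kmer_frequencies("AUGGCUAUGG", k=2)
print(len(features))
```

The vector length grows as 4^k, which is one reason a later dimensionality-reduction step such as principal component analysis can be useful.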
FIGURE 2. Flowchart of KGDCMI.
Five-fold cross-validation results performed by KGDCMI.
| Test Set | ACC. (%) | Sen. (%) | Spec. (%) | Prec. (%) | MCC (%) |
|---|---|---|---|---|---|
| 1 | 82.00 | 80.62 | 83.39 | 82.92 | 64.03 |
| 2 | 82.94 | 78.55 | 87.33 | 86.11 | 66.13 |
| 3 | 82.74 | 80.11 | 85.36 | 84.55 | 65.56 |
| 4 | 83.04 | 81.32 | 84.76 | 84.21 | 66.12 |
| 5 | 82.51 | 80.36 | 84.65 | 83.97 | 65.08 |
| Average | | | | | |
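
The per-fold values in the table above can be summarized by their arithmetic means. The sketch below recomputes those means from the five listed rows; the variable names are illustrative, and the results are a recomputation rather than figures taken from the source.

```python
from statistics import mean

# Per-fold metrics (%) copied from the five-fold cross-validation table.
folds = {
    "Acc": [82.00, 82.94, 82.74, 83.04, 82.51],
    "Sen": [80.62, 78.55, 80.11, 81.32, 80.36],
    "Spec": [83.39, 87.33, 85.36, 84.76, 84.65],
    "Prec": [82.92, 86.11, 84.55, 84.21, 83.97],
    "MCC": [64.03, 66.13, 65.56, 66.12, 65.08],
}

# Mean of each metric across folds, rounded to two decimals.
averages = {name: round(mean(values), 2) for name, values in folds.items()}

for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

Under this recomputation, the means come out at roughly 82.65 (Acc), 80.19 (Sen), 85.10 (Spec), 84.35 (Prec), and 65.38 (MCC).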
FIGURE 3. Receiver operating characteristic curves generated by KGDCMI.
FIGURE 4. Area under the precision–recall curves generated by KGDCMI.
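
For readers unfamiliar with the AUC metric behind these curves, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair, with ties counted as one half. The function `roc_auc` and the toy labels and scores are hypothetical; this is not the authors' evaluation code.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(toy_auc)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a sort-based implementation would be preferable on a full dataset.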
Performance comparison of KGDCMI and five other methods based on five-fold cross-validation.
| Methods | DMFCDA | AE–RF | DMFMDA | NTSHMDA | CMIVGSD | KGDCMI |
|---|---|---|---|---|---|---|
| AUC | 0.7321 | 0.7662 | 0.7922 | 0.8526 | 0.8804 | 0.9041 |
| AUPR | 0.7115 | 0.8239 | 0.8230 | 0.8772 | 0.8629 | 0.8937 |
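
The margins quoted in the abstract (2.37% for AUC and 3.08% for AUPR) can be checked against this table. The short sketch below does the arithmetic, assuming the "only existing model" refers to CMIVGSD; that reading is inferred from the table and is not stated explicitly in this record.

```python
# Values copied from the comparison table.
auc = {"CMIVGSD": 0.8804, "KGDCMI": 0.9041}
aupr = {"CMIVGSD": 0.8629, "KGDCMI": 0.8937}

# Differences expressed in percentage points, rounded to two decimals.
auc_gain = round((auc["KGDCMI"] - auc["CMIVGSD"]) * 100, 2)
aupr_gain = round((aupr["KGDCMI"] - aupr["CMIVGSD"]) * 100, 2)

print(auc_gain, aupr_gain)
```

Both differences are absolute percentage-point gaps between the two table columns, not relative improvements.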
FIGURE 5. Performance comparison of five traditional classifiers and the DNN in terms of prediction.
Performances of different K values.
| | Sen. (%) | Spec. (%) | Prec. (%) | Acc. (%) | AUC (%) |
|---|---|---|---|---|---|
| | 74.89 | 86.46 | 84.70 | 80.67 | 86.25 |
| | 77.20 | 85.29 | 84.00 | 81.25 | 86.77 |
| | 75.49 | 86.55 | 84.88 | 81.02 | 86.33 |
| | 76.77 | 86.15 | 84.72 | 81.46 | 87.21 |
| | 76.75 | 85.17 | 83.83 | 80.96 | 86.39 |
| | 77.27 | 85.29 | 84.04 | 81.27 | 86.63 |
| | 75.72 | 86.90 | 85.26 | 81.31 | 86.43 |
| | 77.52 | 85.56 | 84.31 | 81.54 | 86.97 |
| | 77.79 | 84.86 | 83.91 | 80.83 | 86.16 |
| | 78.44 | 85.04 | 84.00 | 80.73 | 86.17 |
| | 77.08 | 85.57 | 84.33 | 80.33 | 86.43 |
| | 77.67 | 84.42 | 84.21 | 80.74 | 86.61 |
| | 76.70 | 84.50 | 83.73 | 79.10 | 85.26 |
| | 76.58 | 84.93 | 83.68 | 80.25 | 85.89 |
| | 76.74 | 85.00 | 83.93 | 79.11 | 85.75 |
| | 76.87 | 85.52 | 83.66 | 79.70 | 85.92 |
FIGURE 6. Performances of different K values.
FIGURE 7. Performance of five-dimensional compression to extract features.
The top 10 interaction pairs predicted by our model on the dataset.
| Num | CircRNA | miRNA | Evidence |
|---|---|---|---|
| 1 | hsa_circ_0006916 | hsa-miR-522-3p | PMID:29726904 |
| 2 | hsa_circ_0002142 | hsa-miR-625-5p | PMID:30988674 |
| 3 | hsa_circ_0000977 | hsa-miR-874-3p | PMID:29454093 |
| 4 | hsa_circ_0041089 | hsa-miR-3192-5p | unconfirmed |
| 5 | hsa_circ_0041103 | hsa-miR-103a-3p | PMID:27484176 |
| 6 | hsa_circ_0007915 | hsa-miR-106a-3p | PMID:28727484 |
| 7 | hsa_circ_0000673 | hsa-miR-767-3p | unconfirmed |
| 8 | hsa_circ_100242 | hsa-miR-145-5p | PMID:32218853 |
| 9 | hsa_circ_0092306 | hsa-miR-197-3p | PMID:31689616 |
| 10 | hsa_circ_0089776 | hsa-miR-6752-5p | unconfirmed |
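
The case-study claim that 7 of the top 10 pairs were confirmed can be checked by counting the rows that carry a PubMed identifier. The sketch below copies the Evidence column from the table above; the variable names are illustrative only.

```python
# Evidence column of the top-10 table, in rank order.
evidence = [
    "PMID:29726904", "PMID:30988674", "PMID:29454093", "unconfirmed",
    "PMID:27484176", "PMID:28727484", "unconfirmed", "PMID:32218853",
    "PMID:31689616", "unconfirmed",
]

# A pair counts as confirmed when its evidence cell cites a PMID.
confirmed = sum(entry.startswith("PMID:") for entry in evidence)
print(f"{confirmed} of {len(evidence)} pairs confirmed")
```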