Yuanlin Ma, Zuguo Yu, Guosheng Han, Jinyan Li, Vo Anh.
Abstract
BACKGROUND: Distinguishing pre-microRNAs (precursor microRNAs) from length-similar pseudo pre-microRNAs can reveal more about the regulatory mechanisms of RNA biological processes. Machine learning techniques have been widely applied to this challenging problem. However, most of them focus mainly on the secondary structure information of pre-microRNAs and ignore sequence-order and sequence-evolution information.
Keywords: Hilbert-Huang transform; Network; PSI-BLAST profiles; Pre-microRNA; SVM; mRMR
Year: 2018 PMID: 30598066 PMCID: PMC6311913 DOI: 10.1186/s12859-018-2518-2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Six IMF components and the residual obtained by EMD of hydrophilicity2 time series of the pre-microRNA hsa-mir-6843
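
The decomposition in Fig. 1 follows the standard EMD identity: summing the IMF components and the residual recovers the original series exactly. A worked form for six IMFs is given here as an aid; the symbols x(t), c_i(t), and r(t) are notation chosen for this note and are not taken from the source.

```latex
x(t) = \sum_{i=1}^{6} c_i(t) + r(t)
```

Here x(t) stands for the hydrophilicity time series, c_i(t) for the i-th IMF, and r(t) for the residual.
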
The performance of different feature sets
| Method | Mcc | Accuracy | | |
|---|---|---|---|---|
| PSI-BLAST ( | 0.5129 | 0.7564 | 0.7681 | 0.7446 |
| HHT ( | 0.4887 | 0.7440 | 0.7731 | 0.7148 |
| Network ( | 0.7589 | 0.8785 | 0.9144 | 0.8425 |
| PSI-BLAST+Network ( | 0.7707 | 0.8853 | 0.8909 | 0.8797 |
| Network+HHT ( | 0.7212 | 0.8802 | 0.8783 | 0.8841 |
| PSI-BLAST+HHT+Network ( | 0.7850 | 0.8973 | 0.9028 | 0.8718 |
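
The Mcc and Accuracy columns above are standard binary-classification metrics. The following is a minimal sketch of how they can be computed from confusion-matrix counts, assuming the usual definitions. The function name `binary_metrics` and the example counts are hypothetical and not from the paper. Sensitivity and specificity are included because they are commonly reported alongside these two metrics; whether they correspond to the unlabeled columns of the table is an assumption.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (mcc, accuracy, sensitivity, specificity) from confusion counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # fraction of true pre-microRNAs recovered
    specificity = tn / (tn + fp)  # fraction of pseudo pre-microRNAs rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Mcc is undefined when any marginal total is zero; return 0.0 in that case
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return mcc, accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the paper)
mcc, acc, sn, sp = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"Mcc={mcc:.4f} Accuracy={acc:.4f} Sn={sn:.4f} Sp={sp:.4f}")
```

On a class-balanced dataset, accuracy equals the mean of sensitivity and specificity, which is a quick consistency check for rows reported in this format.
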
Fig. 2 Flow chart of the identification method in this study
The top 30 features by feature selection
| Feature | | Number | Feature | | Number |
|---|---|---|---|---|---|
| Efm | 0.84881 | 1 | CCA% | 0.1727 | 16 |
| A-degree | 0.48748 | 2 | hht125 | 0.17185 | 17 |
| A-Burts | 0.4461 | 3 | hht381 | 0.17139 | 18 |
| A-coreness | 0.44058 | 4 | hht93 | 0.17045 | 19 |
| A-cocitation | 0.31875 | 5 | hht445 | 0.16982 | 20 |
| A-bibliographic | 0.31875 | 6 | hht61 | 0.16125 | 21 |
| V-coreness | 0.31703 | 7 | hht285 | 0.16055 | 22 |
| V-coreness | 0.31703 | 8 | hht66 | 0.15998 | 23 |
| Density | 0.31196 | 9 | hht82 | 0.15998 | 24 |
| Modularity | 0.23591 | 10 | (G+C)% | 0.13625 | 25 |
| Ecs | 0.12031 | 11 | hht94 | 0.15456 | 26 |
| hht413 | 0.20155 | 12 | CC% | 0.15225 | 27 |
| hht253 | 0.1994 | 13 | hht157 | 0.15029 | 28 |
| N-articulation | 0.19644 | 14 | hht189 | 0.14989 | 29 |
| Var-Vbetweenness | 0.18213 | 15 | GAA% | 0.14786 | 30 |
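
Several selected features, such as CCA%, CC%, GAA%, and (G+C)%, look like sequence-composition percentages. The sketch below shows one way such values could be computed, assuming they are overlapping k-mer frequencies and G+C content expressed as percentages. That reading, the helper names `kmer_percent` and `gc_percent`, and the toy sequence are assumptions for illustration, not details confirmed by the source.

```python
from collections import Counter


def kmer_percent(seq, k):
    """Percentage frequency of each overlapping k-mer in an RNA sequence."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    total = len(seq) - k + 1             # number of sliding windows
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: 100.0 * n / total for kmer, n in counts.items()}


def gc_percent(seq):
    """Percentage of G and C nucleotides in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


toy = "GGCCAUCCAG"  # made-up sequence, not a real pre-microRNA
print(kmer_percent(toy, 3).get("CCA", 0.0))  # share of CCA among 3-mers
print(kmer_percent(toy, 2).get("CC", 0.0))   # share of CC among 2-mers
print(gc_percent(toy))                       # G+C content
```
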
The performance of different k-mers: (k=2,3)
| Predictors | Mcc | Accuracy | | |
|---|---|---|---|---|
| PSI-BLAST-K-mer | 0.5129 | 0.7205 | 0.7329 | 0.7132 |
| | 0.4582 | 0.6990 | 0.6780 | 0.7120 |
The performance of different features of secondary structure
| Predictors | Mcc | Accuracy | | |
|---|---|---|---|---|
| Network | 0.7589 | 0.8785 | 0.9144 | 0.8425 |
| Triplet-SVM [ | 0.64 | 0.8185 | 0.7847 | 0.8520 |
| IMcRNA-PseSSC [ | 0.72 | 0.8576 | 0.8836 | 0.8350 |
The performance of different methods on the same benchmark dataset
| Predictors | Mcc | Accuracy | | |
|---|---|---|---|---|
| Triplet-SVM [ | 0.64 | 0.8185 | 0.7847 | 0.8520 |
| MiPred [ | 0.75 | 0.8730 | 0.84 | 0.9060 |
| IMcRNA-EXPseSSC [ | 0.80 | 0.8986 | 0.8993 | 0.8978 |
| MicroR-Pred(SVM) [ | 0.88 | 0.9390 | 0.93 | |
| | | | | 0.9010 |
The boldface represents the maximum value of each column
The result of different methods on an independent test set
| Method | Accuracy | Pre-microRNAs which were not correctly identified |
|---|---|---|
| IMcRNA-EXPseSSC [ | 0.8590 (67/78) | hsa-mir-8069-2, hsa-mir-1843, hsa-mir-10393, hsa-mir-10394, hsa-mir-10395, hsa-mir-10400, hsa-mir-10527, hsa-mir-11401, hsa-mir-12115, hsa-mir-12128, hsa-mir-9500 |
| MicroR-Pred(SVM) [ | 0.9103 (71/78) | hsa-mir-10395, hsa-mir-9500, hsa-mir-8069-2, hsa-mir-12115, hsa-mir-10400, hsa-mir-11401, hsa-mir-12128 |
| | 0.9615 (75/78) | hsa-mir-1843, hsa-mir-12115, hsa-mir-11401 |
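
The reported accuracies can be cross-checked against the lists of misidentified pre-microRNAs, and the lists can be intersected to see which sequences every listed method missed. The sketch below does both with plain Python sets. The key `unnamed_method` is a placeholder for the row whose method name is missing in the table, and the variable names are illustrative choices.

```python
TEST_SET_SIZE = 78  # size of the independent test set, from the table

# Misidentified pre-microRNAs per method, copied from the table above
missed = {
    "IMcRNA-EXPseSSC": {
        "hsa-mir-8069-2", "hsa-mir-1843", "hsa-mir-10393", "hsa-mir-10394",
        "hsa-mir-10395", "hsa-mir-10400", "hsa-mir-10527", "hsa-mir-11401",
        "hsa-mir-12115", "hsa-mir-12128", "hsa-mir-9500",
    },
    "MicroR-Pred(SVM)": {
        "hsa-mir-10395", "hsa-mir-9500", "hsa-mir-8069-2", "hsa-mir-12115",
        "hsa-mir-10400", "hsa-mir-11401", "hsa-mir-12128",
    },
    # Placeholder key: the method name for this row is missing in the source
    "unnamed_method": {"hsa-mir-1843", "hsa-mir-12115", "hsa-mir-11401"},
}

# Accuracy = correctly identified / test set size, rounded to four decimals
accuracy = {
    name: round((TEST_SET_SIZE - len(ids)) / TEST_SET_SIZE, 4)
    for name, ids in missed.items()
}

# Sequences that appear in every method's list of misses
missed_by_all = set.intersection(*missed.values())

for name, acc in accuracy.items():
    print(f"{name}: {acc:.4f}")
print(sorted(missed_by_all))
```
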
Classification accuracy of different methods on independent test sets
| Test sets | Label | Test set size | | IMcRNA-EXPseSSC | MicroR-Pred(SVM) |
|---|---|---|---|---|---|
| hsa dataset | True | 78 | 0.9615 | 0.8590 | 0.9103 |
| ncRNA dataset | Pseudo | 410 | 0.9313 | 0.8976 | 0.9390 |
| Human negative dataset | Pseudo | 1000 | 0.9663 | 0.9197 | 0.9726 |