Cheng Yan1,2, Fang-Xiang Wu3, Jianxin Wang1, Guihua Duan4.
Abstract
BACKGROUND: MicroRNAs (miRNAs) are small noncoding RNA molecules that directly regulate their mRNA targets at the posttranscriptional level. Studies have indicated that miRNAs play key roles in complex diseases by taking part in many biological processes, such as cell growth and cell death. Therefore, to improve the effectiveness of disease diagnosis and treatment, it is appealing to develop advanced computational methods for predicting the essentiality of miRNAs. RESULT: In this study, we propose a method (PESM) to predict miRNA essentiality based on gradient boosting machines and miRNA sequences. First, PESM extracts the sequence and structural features of miRNAs. Then it uses gradient boosting machines to predict the essentiality of miRNAs. We conduct 5-fold cross-validation to assess the prediction performance of our method, using the area under the receiver operating characteristic curve (AUC), F-measure and accuracy (ACC) as evaluation metrics. We also compare PESM with three competing methods: miES, Gaussian Naive Bayes and Support Vector Machine.
Keywords: Essentiality; Gradient boosting machines; MiRNA
Year: 2020 PMID: 32183740 PMCID: PMC7079416 DOI: 10.1186/s12859-020-3426-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
The feature set description
| Category | Description | Number of features |
|---|---|---|
| Base content in pre-miRNAs | The content of base | 3 |
| mature-miRNAs length | The sequence length of mature-miRNAs | 1 |
| Base content in mature-miRNAs | The content of base | 3 |
| non-mature-miRNAs length | The sequence length of non-mature-miRNAs | 1 |
| Base content in non-mature-miRNAs | The content of base | 3 |
| MFE and nMFE | The minimum free energy (MFE) of the pre-miRNA secondary structure, and the MFE divided by the sequence length (nMFE) | 2 |
| Cleavage site base class | The cleavage sites are assigned into 3 classes, 1: all cleavage sites of mature-miRNAs from the same pre-miRNAs are | 1 |
| Dinucleotide pair frequency in pre-miRNAs | The frequency of dinucleotide pairs | 9 |
| Dinucleotide pair frequency in mature-miRNAs | The frequency of dinucleotide pairs | 9 |
| The structure feature of pre-miRNAs | Normalized base-pairing propensity ( | 6 |
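The sequence-derived rows of the feature table can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and it computes the content of all four bases and all 16 dinucleotides, whereas the table lists 3 and 9 features per group without stating, in this record, which subset is kept. The MFE is taken as an input here, since computing it requires an external RNA folding tool.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"


def _normalize(seq):
    """Upper-case the sequence and map DNA-style T to U (assumed convention)."""
    seq = seq.upper().replace("T", "U")
    if not seq or set(seq) - set(BASES):
        raise ValueError("sequence must be non-empty and contain only A, C, G, U/T")
    return seq


def base_content(seq):
    """Fraction of each base in the sequence (hypothetical helper)."""
    seq = _normalize(seq)
    counts = Counter(seq)
    return {b: counts[b] / len(seq) for b in BASES}


def dinucleotide_frequency(seq):
    """Fraction of each overlapping dinucleotide (hypothetical helper)."""
    seq = _normalize(seq)
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        raise ValueError("sequence must have at least two bases")
    counts = Counter(pairs)
    return {a + b: counts[a + b] / len(pairs) for a, b in product(BASES, repeat=2)}


def normalized_mfe(mfe, length):
    """nMFE as described in the table: the MFE divided by the sequence length."""
    return mfe / length
```

For example, `base_content("AACG")` gives 0.5 for A and 0.25 for each of C and G, and `normalized_mfe(-30.0, 60)` gives -0.5.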
Fig. 1 The ROC plot of the four computational methods under 5-fold cross-validation
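As a reminder of what the AUC summarizes, the following sketch computes it directly from its probabilistic interpretation: the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as one half. The function name and the 0/1 label encoding are assumptions for illustration, and this pairwise form is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = positive, assumed encoding)
    scores: iterable of predicted scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

With labels `[1, 1, 0, 0]` and scores `[0.9, 0.4, 0.5, 0.1]`, three of the four positive-negative pairs are ordered correctly, so the function returns 0.75.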
The ACC and F-measure values of the four computational methods under 5-fold cross-validation
| Method | ACC | F-measure |
|---|---|---|
| PESM | 0.8516 | 0.8572 |
| miES | 0.8263 | 0.8326 |
| GaussianNB | 0.8000 | 0.8093 |
| SVM | 0.8206 | 0.8271 |
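The two metrics in this table can be computed from a confusion matrix as sketched below. The record does not give the formulas, so the sketch assumes the usual definitions: ACC is the fraction of correct predictions, and F-measure is the harmonic mean of precision and recall (F1). The function names and the 0/1 label encoding are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 0/1 (assumed encoding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1); 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For instance, with true labels `[1, 1, 1, 0, 0, 0]` and predictions `[1, 1, 0, 0, 0, 1]`, the counts are 2 true positives, 1 false positive, 2 true negatives and 1 false negative, giving an ACC and an F-measure of 2/3 each.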
Fig. 2 The relative importance of all 38 features. pre-miR means pre-miRNA; MIR means mature miRNA; non-MIR means non-mature miRNA
The prediction performances of PESM with different settings of λ
| λ | 0.25 | 0.50 | 0.75 | 1.0 | 1.25 | 1.50 | 1.75 | 2.0 |
|---|---|---|---|---|---|---|---|---|
| AUC | 0.9116 | 0.9116 | 0.9116 | 0.9117 | 0.9083 | 0.9041 | 0.9025 | 0.9041 |
The prediction performances of PESM with different settings of K
| K | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|
| AUC | 0.9066 | 0.9067 | 0.9116 | 0.9117 | 0.9133 | 0.9058 | 0.9053 |
The prediction performances of PESM with different settings of T
| T | 100 | 500 | 1000 | 1500 | 2000 |
|---|---|---|---|---|---|
| AUC | 0.8958 | 0.9068 | 0.9117 | 0.9141 | 0.9113 |
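The three sensitivity tables can be read programmatically with the short sketch below, which uses only the AUC values reported above. The dictionary and function names are illustrative. Each table reports one parameter at a time, and this record does not state how the other two were set, so the per-parameter maxima need not form a jointly optimal configuration.

```python
# AUC values copied from the three tables above, keyed by parameter setting.
LAMBDA_AUC = {0.25: 0.9116, 0.50: 0.9116, 0.75: 0.9116, 1.0: 0.9117,
              1.25: 0.9083, 1.50: 0.9041, 1.75: 0.9025, 2.0: 0.9041}
K_AUC = {3: 0.9066, 4: 0.9067, 5: 0.9116, 6: 0.9117,
         7: 0.9133, 8: 0.9058, 9: 0.9053}
T_AUC = {100: 0.8958, 500: 0.9068, 1000: 0.9117, 1500: 0.9141, 2000: 0.9113}


def best_setting(auc_by_value):
    """Return the (setting, AUC) pair with the highest AUC in one table."""
    return max(auc_by_value.items(), key=lambda kv: kv[1])


def auc_range(auc_by_value):
    """Spread between the best and worst AUC, a rough sensitivity indicator."""
    values = auc_by_value.values()
    return max(values) - min(values)
```

On these values, the highest AUC is reached at λ = 1.0 (0.9117), K = 7 (0.9133) and T = 1500 (0.9141). The spread across settings is largest for T (about 0.018) and below 0.01 for λ and K.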