Hongyu Li1,2, Li Chen1, Zaoli Huang1, Xiaotong Luo1, Huiqin Li1, Jian Ren1, Yubin Xie1.
Abstract
2'-O-methylation (2'-O-Me or Nm) is one of the most important layers of regulatory control over gene expression. With increasing attention focused on the characteristics, mechanisms, and influences of 2'-O-Me, a revolutionary technique termed Nm-seq was established, allowing the identification of precise 2'-O-Me sites in RNA sequences with high sensitivity. However, because of the cost and complexity of this new method, large-scale detection and in-depth study of 2'-O-Me remain largely limited. Therefore, a novel computational method that identifies 2'-O-Me sites with adequate reliability is urgently needed. To address this issue, we propose a hybrid deep-learning algorithm named DeepOMe that combines Convolutional Neural Networks (CNN) and Bidirectional Long Short-term Memory (BLSTM) to accurately predict 2'-O-Me sites in the human transcriptome. Under 4-, 6-, 8-, and 10-fold cross-validation, the proposed model achieved high performance (AUC close to 0.998 and AUPR close to 0.880). When tested on the independent data set, DeepOMe was substantially superior to NmSEER V2.0. To facilitate the use of DeepOMe, a user-friendly web server was constructed, which can be freely accessed at http://deepome.renlab.org.
Keywords: 2′-O-methylation; BLSTM; CNN; RNA modification; web service
Year: 2021 PMID: 34055810 PMCID: PMC8160107 DOI: 10.3389/fcell.2021.686894
Source DB: PubMed Journal: Front Cell Dev Biol ISSN: 2296-634X
FIGURE 1. The workflow of predicting 2′-O-Me sites from primary mRNA sequences. To split the input mRNA sequence into blocks, DeepOMe uses a flanking sequence length of 120 and then predicts whether each position in the extracted blocks contains a 2′-O-methylation.
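The caption does not spell out the exact blocking scheme, so the following is only a minimal sketch of one plausible reading: each nucleotide is scored from a window that extends 120 positions on either side of it. The function name `extract_windows`, the per-position centering, and the use of `N` as a padding symbol at the transcript ends are illustrative assumptions, not details taken from DeepOMe.

```python
def extract_windows(seq, flank=120, pad="N"):
    """Return (position, window) pairs, one per nucleotide of seq.

    Assumption: every window is centered on its position and has length
    2 * flank + 1; positions near either end are padded with `pad`.
    """
    seq = seq.upper()
    padded = pad * flank + seq + pad * flank
    width = 2 * flank + 1
    # In `padded`, the nucleotide at original index i sits at index i + flank,
    # so the slice starting at i is centered on it.
    return [(i, padded[i:i + width]) for i in range(len(seq))]


# Small demonstration with a short flank so the windows are easy to read.
for position, window in extract_windows("ACGU", flank=2):
    print(position, window)
```

With the flank of 120 stated in the caption, every window would contain 241 symbols, and the candidate site is always the middle one.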
FIGURE 2. The construction of the prediction model in DeepOMe. (A) Network architecture of the DeepOMe prediction model. Flanking sequence selection under 4-fold cross-validation by AUPR (B) and AUROC (C).
FIGURE 3. Performance evaluation and comparison. The ROC (A) and PR (B) curves under 4-, 6-, 8-, and 10-fold cross-validation. The ROC (C) and PR (D) curves on the testing set for DeepOMe, NmSEER V2.0, and iRNA-2OM.
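For readers unfamiliar with the two summary metrics behind the ROC and PR curves, here is a minimal pure-Python sketch. It is not the authors' evaluation code: `roc_auc` uses the pairwise-ranking definition of the area under the ROC curve, and `average_precision` is one common step-wise estimator of the area under the PR curve (tied scores are not treated specially). The toy labels and scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at each true positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_positives += 1
            total += true_positives / rank
    return total / sum(labels)


# Toy imbalanced case: 1 modified site among 100 positions, ranked second.
labels = [0, 1] + [0] * 98
scores = [1.0 - 0.01 * i for i in range(100)]
print(roc_auc(labels, scores), average_precision(labels, scores))
```

In the toy case a single false positive ranked above the only true site barely moves the ROC AUC but halves the average precision. This is why, on data where modified sites are rare, a PR-based figure can sit well below the ROC-based one.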
Comparison of Top-k Accuracy between DeepOMe, NmSEER V2.0, and iRNA-2OM in the testing set.
| Top-k Accuracy | DeepOMe | NmSEER V2.0 | iRNA-2OM |
| --- | --- | --- | --- |
| Top-1 Accuracy | 0.8602 | 0.0 | 0.0 |
| Top-3 Accuracy | 0.9039 | 0.0131 | 0.0 |
| Top-5 Accuracy | 0.9082 | 0.0175 | 0.0 |
| Top-10 Accuracy | 0.9126 | 0.0175 | 0.0043 |
| Top-20 Accuracy | 0.9257 | 0.0306 | 0.0130 |
| Top-30 Accuracy | 0.9344 | 0.0611 | 0.0130 |
| Top-40 Accuracy | 0.9476 | 0.0699 | 0.0261 |
| Top-50 Accuracy | 0.9520 | 0.0699 | 0.0478 |
| Top-60 Accuracy | 0.9520 | 0.0830 | 0.0696 |
| Top-70 Accuracy | 0.9520 | 0.0917 | 0.0739 |
| Top-80 Accuracy | 0.9563 | 0.0961 | 0.0826 |
| Top-90 Accuracy | 0.9563 | 0.1004 | 0.0870 |
| Top-100 Accuracy | 0.9563 | 0.1004 | 0.1087 |
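The table does not define top-k accuracy. The sketch below assumes one common reading: a test sequence counts as a hit if at least one of its known 2′-O-Me sites is among its k highest-scoring positions, and the accuracy is the fraction of sequences that are hits. The function name and the toy inputs are hypothetical, and the definition used in the original study may differ.

```python
def top_k_accuracy(scores_per_seq, true_sites_per_seq, k):
    """Fraction of sequences with a known site among their top-k positions.

    scores_per_seq: one list of per-position prediction scores per sequence.
    true_sites_per_seq: one collection of known site indices per sequence.
    """
    hits = 0
    for scores, true_sites in zip(scores_per_seq, true_sites_per_seq):
        # Rank position indices from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if set(ranked[:k]) & set(true_sites):
            hits += 1
    return hits / len(scores_per_seq)


# Two toy sequences of three positions each (invented values).
toy_scores = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.5]]
toy_sites = [[1], [2]]
print(top_k_accuracy(toy_scores, toy_sites, k=1))
print(top_k_accuracy(toy_scores, toy_sites, k=2))
```

Under this reading the value can only stay flat or rise as k grows, which matches the monotone columns in the table.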
FIGURE 4. (A) The main interface of DeepOMe. mRNA sequences can be input into the text area or uploaded as a single FASTA file. Thresholds with high, medium, and low stringencies are provided in the options panel. (B) The submitted task can be checked in the interactive table. (C) The result page of DeepOMe. Detailed information for the predicted modification sites, such as modified position, flanking sequence, prediction score, and prediction threshold, is listed in the table. (D,E) The visualization results of RNA secondary structure and protein domain organization for the input mRNA sequence using ViennaRNA, IBS, and InterProScan.