Wei Chen, Xiaoming Song, Hao Lv, Hao Lin.
Abstract
RNA N2-methylguanosine (m2G) is a posttranscriptional modification that plays crucial roles in the control and stabilization of tRNA. However, knowledge of the biological functions of m2G remains limited. A key step toward revealing new functions is to identify the m2G sites in the transcriptome. Because no effective method for detecting m2G sites is available, new identification methods are needed. In this study, a computational predictor called iRNA-m2G was proposed to identify m2G sites in eukaryotic transcriptomes. In iRNA-m2G, RNA sequences were encoded using nucleotide chemical properties and accumulated nucleotide frequency. iRNA-m2G was validated by the rigorous jackknife test on the benchmark dataset, examined through cross-species validation, and tested on an independent dataset. The accuracies obtained in all of these tests were promising, indicating that the proposed method could become a powerful tool for identifying m2G sites. A user-friendly web server for iRNA-m2G is freely accessible at http://lin-group.cn/server/iRNA-m2G.php.
Keywords: N(2)-methylguanosine; accumulated nucleotide frequency; nucleotide chemical property; web server
Year: 2019 PMID: 31581049 PMCID: PMC6796771 DOI: 10.1016/j.omtn.2019.08.023
Source DB: PubMed Journal: Mol Ther Nucleic Acids ISSN: 2162-2531 Impact factor: 8.886
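The encoding named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used three-bit chemical property coding (ring structure, functional group, hydrogen bonding) and defines the accumulated frequency at position i as the number of occurrences of that position's nucleotide within the first i positions, divided by i. The exact mapping and the function name `encode_sequence` are assumptions for illustration and are not taken from this record.

```python
# Assumed chemical property coding: (ring structure, functional group, hydrogen bonding).
# purine = 1 / pyrimidine = 0; amino = 1 / keto = 0; weak H-bond = 1 / strong = 0.
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode_sequence(seq):
    """Return a flat feature list with 4 values per nucleotide:
    three chemical property bits plus the accumulated nucleotide frequency."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    counts = {}
    features = []
    for i, nt in enumerate(seq, start=1):
        if nt not in NCP:
            raise ValueError(f"unsupported nucleotide {nt!r} at position {i}")
        counts[nt] = counts.get(nt, 0) + 1
        features.extend(NCP[nt])
        # Frequency of this nucleotide in the prefix seq[:i]
        features.append(counts[nt] / i)
    return features


# Hypothetical 4-nt fragment, used only to show the shape of the output.
example = encode_sequence("GGAU")
print(len(example), example)
```

Under these assumptions a sequence of length L becomes a vector of length 4L, which can be passed to any standard classifier.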
Figure 1. Nucleotide Composition Preferences of m2G-Site- and Non-m2G-Site-Containing Sequences
The m2G (or non-m2G) site is at position 0 in the top and bottom levels of each panel. (A) Based on the sequences from dataset S1. (B–D) Based on the sequences from (B) H. sapiens, (C) M. musculus, and (D) S. cerevisiae, respectively.
Figure 2. Predictive Performance of the Models Based on Different Window Sizes
Results of the Species-Specific Models for Identifying m2G Sites in Different Species
| Species | Parameters | Sn (%) | Sp (%) | Acc (%) | MCC | auROC |
|---|---|---|---|---|---|---|
| | | 89.13 | 100.00 | 94.56 | 0.90 | 0.950 |
| | | 100.00 | 100.00 | 100.00 | 1.00 | 0.999 |
| | | 92.53 | 100.00 | 96.27 | 0.93 | 0.964 |
auROC, area under the receiver operating characteristic curve; Sn, sensitivity; Sp, specificity; Acc, accuracy; MCC, Matthews correlation coefficient.
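The threshold-based metrics in the table follow from the four confusion-matrix counts. The sketch below uses the standard definitions of Sn, Sp, Acc, and MCC; the function name `classification_metrics` and the counts in the usage line are hypothetical and do not come from the paper's datasets. auROC is not covered here because it requires ranked prediction scores rather than counts.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Compute Sn, Sp, Acc, and MCC from confusion-matrix counts.

    Assumes at least one positive and one negative sample.
    """
    sn = tp / (tp + fn)                      # sensitivity: recall on true m2G sites
    sp = tn / (tn + fp)                      # specificity: recall on non-m2G sites
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the confusion matrix is empty;
    # returning 0.0 in that case is a convention, not part of the definition.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=45, fn=5, tn=48, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

When the positive and negative sets are the same size, Acc reduces to the mean of Sn and Sp, which is the relationship the rows above exhibit.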
Figure 3. Heatmap Showing the Cross-Species Prediction Accuracies
Once a species-specific model was established on its own training dataset, it was tested on the data from the other seven species.