Wei Chen, Pengmian Feng, Hui Yang, Hui Ding, Hao Lin, Kuo-Chen Chou.
Abstract
RNA modifications are additions of chemical groups to nucleotides or local structural changes in them. Knowing where these modifications occur is essential for an in-depth understanding of their biological functions and mechanisms, and for treating some genomic diseases. With the avalanche of RNA sequences generated in the post-genomic age, many computational methods have been proposed for identifying individual types of RNA modification, one type at a time. So far, however, no method has been developed for identifying several different types of RNA modification simultaneously. To address this challenge, we developed a predictor called "iRNA-3typeA," which simultaneously identifies the occurrence sites of the following three most frequently observed modifications in RNA: (1) N1-methyladenosine (m1A), (2) N6-methyladenosine (m6A), and (3) adenosine to inosine (A-to-I). Rigorous cross-validations on RNA sequences from the Homo sapiens and Mus musculus transcriptomes show that the new predictor achieves high success rates (overall accuracy between 88.39% and 99.13% in jackknife tests). For the convenience of experimental scientists, a user-friendly web server for iRNA-3typeA has been established at http://lin-group.cn/server/iRNA-3typeA/. It is anticipated that iRNA-3typeA may become a useful high-throughput tool for genome analysis.
Keywords: N(1)-methyladenosine; N(6)-methyladenosine; RNA modification; adenosine to inosine editing; five-step rules; web server
Year: 2018 PMID: 29858081 PMCID: PMC5992483 DOI: 10.1016/j.omtn.2018.03.012
Source DB: PubMed Journal: Mol Ther Nucleic Acids ISSN: 2162-2531 Impact factor: 8.886
Figure 1. The Three Common Types of Modifications in RNA
(1) N1-methyladenosine (m1A), (2) N6-methyladenosine (m6A), and (3) adenosine to inosine (A-to-I).
The Success Rates Achieved by iRNA-3typeA via Jackknife Tests on the Benchmark Datasets for H. sapiens and M. musculus, Respectively
| Species | Type of Modification | Sn (%) | Sp (%) | Acc (%) | MCC |
|---|---|---|---|---|---|
| H. sapiens | m1A | 98.38 | 99.89 | 99.13 | 0.98 |
| H. sapiens | m6A | 81.68 | 99.11 | 90.38 | 0.82 |
| H. sapiens | A-to-I | 86.18 | 95.23 | 90.71 | 0.82 |
| M. musculus | m1A | 97.46 | 100.00 | 98.73 | 0.97 |
| M. musculus | m6A | 77.79 | 100.00 | 88.39 | 0.80 |
| M. musculus | A-to-I | 96.75 | 100.00 | 98.38 | 0.96 |
The SVM parameters reported in the six table footnotes are, in footnote order, γ = 0.0078125, γ = 3.05158e-5, γ = 0.0078125, γ = 0.0078125, γ = 0.00012207, and γ = 0.000488281; the value of the accompanying parameter (C) is missing from the source.
Comparative Results of the Proposed Predictor When SVM Was Replaced by Other Classifiers as Its Operating Algorithm
| Classifier | Species | Modification Type | Sn (%) | Sp (%) | Acc (%) | MCC |
|---|---|---|---|---|---|---|
| BayesNet | H. sapiens | m1A | 98.81 | 98.85 | 98.83 | 0.98 |
| BayesNet | H. sapiens | m6A | 82.04 | 100.00 | 91.02 | 0.83 |
| BayesNet | H. sapiens | A-to-I | 88.50 | 89.57 | 89.03 | 0.78 |
| BayesNet | M. musculus | m1A | 97.18 | 98.78 | 97.98 | 0.96 |
| BayesNet | M. musculus | m6A | 77.79 | 100.00 | 88.90 | 0.80 |
| BayesNet | M. musculus | A-to-I | 96.51 | 99.88 | 98.20 | 0.96 |
| Naive Bayes | H. sapiens | m1A | 98.16 | 98.30 | 98.23 | 0.96 |
| Naive Bayes | H. sapiens | m6A | 82.04 | 99.73 | 90.88 | 0.83 |
| Naive Bayes | H. sapiens | A-to-I | 89.40 | 87.04 | 88.22 | 0.76 |
| Naive Bayes | M. musculus | m1A | 96.43 | 97.75 | 97.09 | 0.94 |
| Naive Bayes | M. musculus | m6A | 77.79 | 98.62 | 88.22 | 0.78 |
| Naive Bayes | M. musculus | A-to-I | 95.91 | 97.95 | 96.93 | 0.94 |
| J48 Tree | H. sapiens | m1A | 98.77 | 99.40 | 99.09 | 0.98 |
| J48 Tree | H. sapiens | m6A | 82.48 | 84.35 | 83.41 | 0.67 |
| J48 Tree | H. sapiens | A-to-I | 88.18 | 89.04 | 88.60 | 0.77 |
| J48 Tree | M. musculus | m1A | 96.71 | 98.68 | 97.70 | 0.95 |
| J48 Tree | M. musculus | m6A | 83.03 | 82.21 | 82.62 | 0.65 |
| J48 Tree | M. musculus | A-to-I | 96.27 | 99.04 | 97.65 | 0.95 |
| SVM | H. sapiens | m1A | 98.46 | 99.89 | 99.18 | 0.98 |
| SVM | H. sapiens | m6A | 80.44 | 100.00 | 90.23 | 0.82 |
| SVM | H. sapiens | A-to-I | 86.73 | 95.40 | 91.07 | 0.82 |
| SVM | M. musculus | m1A | 97.46 | 100.00 | 98.73 | 0.97 |
| SVM | M. musculus | m6A | 77.79 | 100.00 | 88.90 | 0.80 |
| SVM | M. musculus | A-to-I | 97.35 | 100.00 | 98.67 | 0.97 |
All rates in the table above were obtained by 10-fold cross-validation on the same benchmark datasets (Supplemental Information S1 and Supplemental Information S2, available at http://lin-group.cn/server/iRNA3typeA/data.htm).
The BayesNet, Naive Bayes, and J48 Tree classifiers were taken from the WEKA package.
The SVM-based predictor is the one proposed in this paper.
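The two evaluation protocols mentioned in this record, 10-fold cross-validation and the jackknife test, differ only in how the benchmark samples are partitioned: the jackknife (leave-one-out) test is the special case in which the number of folds equals the number of samples. The sketch below illustrates that partitioning under stated assumptions. The source does not describe how the folds were drawn, so the random shuffling, the seed, and the function name are all hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Samples are shuffled once with a fixed seed (an assumption made here for
    reproducibility), then dealt round-robin into k folds of near-equal size.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# 10-fold cross-validation on a toy set of 50 samples.
ten_fold = list(k_fold_indices(n_samples=50, k=10))

# Jackknife (leave-one-out): k equals the number of samples.
jackknife = list(k_fold_indices(n_samples=5, k=5))
```

Each sample is used for testing exactly once in both protocols. The jackknife gives a unique result for a given dataset because no random partition is involved in the outcome, at the cost of training one model per sample.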
Figure 2. Semi-screenshot of the Top Page of the iRNA-3typeA Web Server (top panel) and the Prediction Results for the Two Example Query Sequences (bottom panel)
A Breakdown of the Benchmark Dataset
| Species | Attribute | m1A Samples | m6A Samples | A-to-I Samples |
|---|---|---|---|---|
| H. sapiens | positive | 6,366 | 1,130 | 3,000 |
| H. sapiens | negative | 6,366 | 1,130 | 3,000 |
| M. musculus | positive | 1,064 | 725 | 831 |
| M. musculus | negative | 1,064 | 725 | 831 |
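Because every benchmark subset above contains as many negative samples as positive ones, the overall accuracy is simply the mean of sensitivity and specificity. The short sketch below shows this, assuming the standard definitions of Sn, Sp, and Acc. The function name is hypothetical, and the 6,366/6,366 counts are paired with the first m1A row of the jackknife table on the assumption that both describe the same subset.

```python
def accuracy_from_rates(sn, sp, n_pos, n_neg):
    """Overall accuracy (%) implied by Sn and Sp (both in %) and the class sizes."""
    correct = sn / 100 * n_pos + sp / 100 * n_neg  # expected number of correct calls
    return 100 * correct / (n_pos + n_neg)


# Sn = 98.38 and Sp = 99.89 come from the first m1A row of the jackknife table;
# 6,366 positive and 6,366 negative samples come from the breakdown table.
acc = accuracy_from_rates(98.38, 99.89, 6366, 6366)
print(round(acc, 3))
```

For balanced classes the result does not depend on the subset size: it equals (Sn + Sp) / 2, here about 99.135, in line with the reported Acc of 99.13 for that row.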
Figure 3. A Flowchart Showing How the iRNA-3typeA Predictor Works