Cong Pian, Zhixin Yang, Yuqian Yang, Liangyun Zhang, Yuanyuan Chen.
Abstract
N6-methyladenosine (m6A), the most common posttranscriptional modification in eukaryotic mRNAs, plays an important role in mRNA splicing, editing, stability, and degradation, among other processes. Because the methylation state is dynamic, methylation sequencing must be carried out over different time periods, which makes it difficult to identify m6A sites. It is therefore necessary to develop a fast and accurate method to identify m6A sites in the transcriptome. In this study, we use first-order and second-order Markov models to identify m6A sites in three species (Saccharomyces cerevisiae, mouse, and Homo sapiens). Both models account for the correlation between adjacent nucleotides. The results show that our method performs better than other existing methods. Furthermore, codons, each encoded by three nucleotides, are used with bias in mRNA, and a second-order Markov model, which conditions each nucleotide on the two preceding ones, can capture this information. This may be the main reason why the second-order Markov model outperforms the first-order Markov model in m6A prediction. In addition, we provide a corresponding web tool called MM-m6APred.
Keywords: RNA N6-methyladenine sites; codon biases; second-order Markov model; transfer probability matrix; web tool
Year: 2021 PMID: 33815484 PMCID: PMC8017269 DOI: 10.3389/fgene.2021.650803
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Details of benchmark datasets.
| Type | Positive | Negative | Total | Length |
| Yeast cells | 1,300 | 1,300 | 2,600 | 51 nt |
| Mouse | 725 | 725 | 1,450 | 41 nt |
| Homo sapiens | 1,130 | 1,130 | 2,260 | 41 nt |
FIGURE 1. Flow chart of m6A site prediction. (A) Construction of the second-order Markov models (M and M) from m6A sequences and non-m6A sequences. (B) Prediction for a test sequence. The sequence “GUAUAUAACUUUUUUCUUCAAGGAGCAGGUGUCUGCCUAA” is used as an example to explain the prediction process.
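The two steps in Figure 1 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in this record: a position-independent (homogeneous) Markov chain, add-one smoothing, a decision threshold of 0 on the log ratio, and omission of the initial-dinucleotide term. The function names `train_markov` and `log_ratio` and the toy training sequences are hypothetical; only the test sequence comes from the figure caption.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGU"  # sequences are assumed to contain only these four bases


def train_markov(seqs, order=2, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) from training sequences.

    Assumption: one transition table shared by all positions, smoothed with
    a pseudocount so that unseen transitions keep a nonzero probability.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    model = {}
    for ctx in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[ctx].values()) + pseudocount * len(ALPHABET)
        model[ctx] = {b: (counts[ctx][b] + pseudocount) / total for b in ALPHABET}
    return model


def log_ratio(seq, pos_model, neg_model, order=2):
    """Sum of ln[P+(base | context) / P-(base | context)] over the sequence.

    The probability of the first `order` bases is left out for brevity.
    """
    score = 0.0
    for i in range(order, len(seq)):
        ctx, base = seq[i - order:i], seq[i]
        score += math.log(pos_model[ctx][base] / neg_model[ctx][base])
    return score


# Toy training data (hypothetical, NOT from the benchmark datasets).
pos_train = ["GGACUGGACUGGACU", "AGGACAUGGACUAGG"]
neg_train = ["UUUUAUUUUAUUUUA", "AUAUAUAUUUUAUAU"]

m_pos = train_markov(pos_train, order=2)
m_neg = train_markov(neg_train, order=2)

# Example sequence quoted in the Figure 1 caption.
test_seq = "GUAUAUAACUUUUUUCUUCAAGGAGCAGGUGUCUGCCUAA"
score = log_ratio(test_seq, m_pos, m_neg)
# Assumed decision rule: a positive log ratio favors the m6A model.
print("m6A" if score > 0 else "non-m6A", round(score, 3))
```

Setting `order=1` gives the first-order variant, in which each base depends only on its immediate predecessor.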
FIGURE 2. Heat maps of the standardized quotient of the transfer probabilities for the three species. (A) Saccharomyces cerevisiae, (B) mouse, and (C) Homo sapiens.
FIGURE 3. Probability density maps of ln(Ratio) values for the three species. Panels (A), (B), and (C) correspond to S. cerevisiae, mouse, and H. sapiens, respectively. Red denotes negative samples and blue denotes positive samples.
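One plausible reading of the ln(Ratio) quantity, under a second-order Markov assumption, is the log-likelihood ratio of a sequence under the two models. The notation here is assumed for illustration and is not taken from the paper: $S = s_1 s_2 \dots s_L$ is a sequence of length $L$, and $P^{+}$ and $P^{-}$ are the probabilities estimated from m6A and non-m6A training sequences, respectively. The paper's exact definition may differ, for example in how the first two nucleotides are handled.

```latex
\ln(\mathrm{Ratio})
  = \ln\frac{P(S \mid M^{+})}{P(S \mid M^{-})}
  = \ln\frac{P^{+}(s_1 s_2)}{P^{-}(s_1 s_2)}
  + \sum_{i=3}^{L} \ln\frac{P^{+}(s_i \mid s_{i-2}\, s_{i-1})}{P^{-}(s_i \mid s_{i-2}\, s_{i-1})}
```

Under this reading, positive samples would be expected to concentrate at ln(Ratio) > 0 and negative samples at ln(Ratio) < 0, so the separation between the two densities indicates how well the models discriminate.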
Comparison of evaluation metrics for the six methods in (A) S. cerevisiae, (B) mouse, and (C) H. sapiens.
| (A) S. cerevisiae | Sn (%) | Sp (%) | ACC (%) | MCC |
| M6APred-EL | 72.00 | 72.69 | 72.34 | 44.68 |
| SRAMP | 71.92 | 71.38 | 71.65 | 43.31 |
| iRNA-Methyl | 71.69 | 73.45 | 72.57 | 45.15 |
| M6AMRFS | 73.45 | 72.84 | 73.14 | 46.29 |
| First order-MM | 73.85 | 71.69 | 72.30 | 49.23 |
| Second order-MM | 88.46 | 98.46 | 93.46 | 87.36 |
| (B) Mouse | Sn (%) | Sp (%) | ACC (%) | MCC |
| M6APred-EL | 77.79 | 100 | 88.90 | 79.79 |
| SRAMP | 77.79 | 100 | 88.90 | 79.79 |
| iRNA-Methyl | 77.66 | 99.31 | 88.48 | 78.84 |
| M6AMRFS | 77.79 | 100 | 88.90 | 79.79 |
| First order-MM | 79.98 | 88.88 | 83.55 | 74.85 |
| Second order-MM | 87.50 | 88.88 | 88.29 | 77.45 |
| (C) H. sapiens | Sn (%) | Sp (%) | ACC (%) | MCC |
| M6APred-EL | 82.04 | 99.73 | 90.89 | 83.08 |
| SRAMP | 79.65 | 100 | 89.82 | 81.35 |
| iRNA-Methyl | 80.35 | 100 | 90.18 | 81.95 |
| M6AMRFS | 81.95 | 99.82 | 90.89 | 83.11 |
| First order-MM | 84.60 | 87.50 | 85.00 | 73.85 |
| Second order-MM | 86.46 | 94.69 | 90.58 | 81.43 |
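The metrics in this table can be computed from a confusion matrix as in the following sketch. It assumes that the two columns preceding ACC are sensitivity (Sn) and specificity (Sp), which is consistent with ACC being their average on these balanced datasets. The function name `binary_metrics` is hypothetical, and the counts in the example are back-calculated from the reported percentages for the yeast Second order-MM row (1,300 positives and 1,300 negatives); they are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return Sn, Sp, ACC, and MCC (as fractions) from confusion-matrix counts."""
    sn = tp / (tp + fn)                      # sensitivity: recall on positives
    sp = tn / (tn + fp)                      # specificity: recall on negatives
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "ACC": acc, "MCC": mcc}


# Assumed counts, chosen so that Sn and Sp match the reported 88.46% and 98.46%.
m = binary_metrics(tp=1150, fn=150, tn=1280, fp=20)
print({name: round(100 * value, 2) for name, value in m.items()})
```

With these assumed counts, the ACC and MCC values also agree with the reported 93.46 and 87.36, which suggests that the table lists MCC multiplied by 100.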
FIGURE 4. Comparison of the prediction performance of the six methods in (A) S. cerevisiae, (B) mouse, and (C) H. sapiens.
Comparison of m6A prediction performance in mouse based on the m6Avar database.
| Method | | | |
| M6APred-EL | 76.42 | 77.35 | 75.49 |
| SRAMP | 72.03 | 72.29 | 71.77 |
| iRNA-Methyl | 73.45 | 74.72 | 72.18 |
| M6AMRFS | 76.58 | 76.89 | 76.27 |
| First order-MM | 78.15 | 80.01 | 80.14 |
| Second order-MM |
FIGURE 5. Comparison of m6A prediction performance in mouse based on the m6Avar database.