
WITMSG: Large-scale Prediction of Human Intronic m⁶A RNA Methylation Sites from Sequence and Genomic Features.

Lian Liu¹, Xiujuan Lei¹, Jia Meng¹, Zhen Wei¹.

Abstract

INTRODUCTION: N6-methyladenosine (m6A) is one of the most widely studied epigenetic modifications. It plays important roles in various biological processes, such as splicing, RNA localization and degradation, many of which are related to the functions of introns. Although a number of computational approaches have been proposed to predict m6A sites in different species, none of them was optimized for intronic m6A sites. Because existing experimental data overwhelmingly rely on polyA selection in sample preparation, intronic RNAs are usually underrepresented in the captured RNA library, and the accuracy of general m6A site prediction approaches is therefore limited for intronic m6A sites.
METHODOLOGY: A computational framework, WITMSG, dedicated to the large-scale prediction of intronic m6A RNA methylation sites in humans is proposed here for the first time. Based on the random forest algorithm and using only known intronic m6A sites as the training data, WITMSG takes advantage of both conventional sequence features and a variety of genomic characteristics to improve the prediction of intron-specific m6A sites.
RESULTS AND CONCLUSION: WITMSG outperformed competing approaches (trained with all the m6A sites or intronic m6A sites only) in 10-fold cross-validation (AUC: 0.940) and when tested on independent datasets (AUC: 0.946). WITMSG was also applied intronome-wide in humans to predict all possible intronic m6A sites, and the prediction results are freely accessible at http://rnamd.com/intron/.


Keywords: RNA methylation; genomic features; intron; m6A; sequence features; site prediction

Year: 2020 PMID: 32655300 PMCID: PMC7324894 DOI: 10.2174/1389202921666200211104140

Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.236

INTRODUCTION

Recent advances have shown that, among 150 known RNA modifications, N6-methyladenosine (m6A) has attracted the most extensive attention due to its prevalence and various biological functions [1-4]. The m6A RNA methylation usually occurs in the conserved sequence DRACH (D = A, G or U; R = A or G; H = A, C or U) or GGAC [5]. Studies showed that m6A appears in almost all RNA transcripts, including coding and non-coding transcripts [6, 7], and is enriched near the stop codon, in 3' untranslated regions and in the last exon of mRNA [8, 9]. Moreover, increasing evidence suggests that pre-mRNA contains a large number of m6A modification sites, and more than 2,000 m6A sites were detected in introns, which may have important functions [10]. Recent studies have found that, as a common molecular tag, m6A modification is involved in many important biological processes [11], including RNA localization and degradation [12, 13], RNA structural dynamics [11], alternative splicing [12], primary microRNA processing [14, 15], cell differentiation and adaptation, and clock regulation [16]. It is also associated with protein translation, obesity, abnormal brain development and other diseases [17]. Therefore, accurate localization of m6A is particularly important for understanding the function of RNA methylation in biology. In addition, there is evidence that methylation in introns can affect alternative splicing in three ways. First, RNA modification in introns can affect the interaction between snRNA and pre-mRNA. Second, the modification sites in introns can directly regulate the binding of RNA-binding proteins by strengthening the relationship between binding factors and their binding proteins, thus affecting alternative splicing. Third, RNA modification indirectly affects splicing sites by altering the secondary structure of RNA [18].
With the rapid development of high-throughput sequencing technology, the appearance of MeRIP-Seq in 2012 opened the way to the global and unbiased analysis of RNA methylation [5]. MeRIP-Seq is the first technique to profile m6A across the whole transcriptome: the RNA fragments containing m6A are immunoprecipitated, purified, sequenced and then further analyzed. More m6A-containing RNA fragments are expected to be enriched near true m6A sites in the immunoprecipitation samples (IP samples) than in the input control samples (control samples), and exomePeak [19] or other detection methods can be used to detect the m6A peak (site) with a resolution of around 100 nt. As MeRIP-Seq analysis depends on both IP and input control samples simultaneously, the process is similar to the peak calling procedure widely used in ChIP-Seq [20] to predict histone modification or transcription factor binding sites. It is possible to further determine the exact location of m6A sites by searching for DRACH motifs within the peaks detected by exomePeak and other methods. However, the main disadvantage of this method is that it is often difficult to distinguish random DRACH motifs from the real m6A-containing motifs nearby. If all the DRACH motifs (including random ones) located within an m6A peak are reported as m6A sites, false positive predictions will be made. Currently, both the MeTDB [21] and RMBase [22] databases report a large number (more than 300,000) of m6A sites in the transcriptome, many of which are likely to be false positives due to random DRACH motifs located within m6A peaks. Besides MeRIP-Seq, base-resolution techniques such as miCLIP-Seq [23] have been proposed for the identification of precise m6A sites.
However, owing to their technical difficulty and cost, base-resolution experiments have not been widely used to study the m6A epitranscriptome under different biological contexts; instead, they provide the necessary information for computational prediction of methylation sites. A number of computational methods have been developed so far for m6A site prediction. iRNA-Methyl [24] combined the dinucleotide components with the enthalpy, entropy and free energy to form a pseudo dinucleotide composition (PseDNC) that represents the RNA sequence, and then used an SVM classifier to predict the m6A methylation sites of Saccharomyces cerevisiae. Zhou et al. [25] proposed an m6A predictor called SRAMP, which fed a sequence coding feature, a K-nearest base-pair similarity feature and a base-pair frequency feature into separate random forest (RF) classifiers, and then integrated the classification results by a weighted sum for mammalian m6A site prediction. MethyRNA [26] encoded RNA sequences using the chemical characteristics of nucleotides and the accumulated frequency information of nucleotides, and predicted the m6A modification sites in Saccharomyces cerevisiae with an SVM classifier. PRNAm-PC [27] extracted 10 physicochemical characteristics of dinucleotides and combined them with their autocovariance and cross-covariance transformation features to form the PseDNC feature representing the RNA sequence, which was input into an SVM classifier to predict the m6A methylation sites of Saccharomyces cerevisiae. RAM-ESVM [28] used PseDNC, Transcription Starting Sites (TSS) and Transcription Factor Binding Sites (TFBS) and their k-mer features to build three SVM classifiers, and then integrated the classification results with voting rules to detect the m6A methylation sites of Saccharomyces cerevisiae. BERMP [29] used two classifiers to predict m6A sites: the base coding and the frequency of each base in a sliding window of a certain length were input into a Gated Recurrent Unit (GRU) classifier and a random forest classifier, respectively, and the final prediction was obtained by a logistic regression model based on the results of the two classifiers. Gene2vec [30] represented mRNA sequences with word embedding and employed a Convolutional Neural Network (CNN) for m6A prediction. Deep-m6A [31] took the product of the one-hot coding of sequence characteristics and the read counts of sites in the IP samples as input to predict m6A sites with a CNN. In addition, AthMethPre [32] and other methods [33-39] also extract features based on sequence information for the prediction of RNA methylation sites. WHISTLE [40] combined sequence features and 35 genomic features to predict m6A sites, and drafted the entire predicted m6A epitranscriptome.
Although many methods have already been proposed for predicting RNA methylation sites, to the best of our knowledge, all of them focus on the prediction of methylation sites in full transcripts (including both exons and introns) or mature mRNA (including only exons), and none is dedicated to predicting methylation sites in introns. None of them considered the impact of polyA selection in RNA library preparation and the resulting under-representation of intronic RNAs in the data and the detected results, which is likely to induce a strong bias when these approaches are applied to m6A sites located in introns. In this paper, we present WITMSG, a framework for whole-intronome m6A methylation site prediction that combines sequence features with genomic features and is dedicated to the prediction of m6A sites in introns. WITMSG extracts the physicochemical characteristics and frequency accumulation characteristics of bases from the sequence information, together with multiple genomic features, and predicts whole-intronome m6A methylation sites with the random forest classifier.

MATERIALS AND METHODS

Datasets Construction

The single-base m6A sites used are the same as the raw data in the WHISTLE project [40], comprising six single-base resolution m6A datasets from five cell types (Table 1): HEK293T, MOLM13, A549, CD8T and HeLa, where HEK293T has two samples. The positive m6A sites are defined as m6A sites conforming to the DRACH consensus motif and supported by at least 2 of the 6 datasets. The negative m6A sites were randomly selected from the same transcripts containing the positive sites. Each training dataset contains an equal number of negative and positive sites, and the underlying motifs are restricted to DRACH. The exons and introns are defined by the primary transcript (longest transcript) of each gene. The regions that can be mapped to multiple genes are masked, and no sites are reported from those regions. In the end, 5258 intronic sites were collected, including 2629 positive m6A sites and 2629 negative sites. For testing purposes, a total of 108952 sites in exons were also collected, with 54476 positive m6A sites and 54476 negative sites. Among the total 57105 positive m6A sites, 95.4% (54476) were from exons, which reflects the bias induced by the RNA library protocol. Four-fifths of the sites were retained for training and the remaining one-fifth were retained for testing. We also combined the intronic and exonic sites to mimic the real scenario, in which both the exonic and intronic m6A sites are considered simultaneously for both training and testing (Fig. ).
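
The selection rules above can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' pipeline: the function names, the (transcript, position) site representation, the one-negative-per-positive sampling scheme and the random seed are all assumptions introduced here.

```python
import random
from collections import Counter

def select_positives(datasets, min_support=2):
    """Keep sites reported by at least `min_support` of the datasets."""
    counts = Counter(site for ds in datasets for site in set(ds))
    return sorted(s for s, c in counts.items() if c >= min_support)

def sample_negatives(positives, drach_by_transcript, rng):
    """Draw one non-positive DRACH site per positive from the same transcript.

    Assumption: `drach_by_transcript` maps a transcript ID to the positions
    of all DRACH motifs found on it (hypothetical input format).
    """
    excluded = set(positives)
    negatives = []
    for transcript, _pos in positives:
        pool = [(transcript, p) for p in drach_by_transcript[transcript]
                if (transcript, p) not in excluded]
        if pool:
            choice = rng.choice(pool)
            negatives.append(choice)
            excluded.add(choice)  # never reuse a negative site
    return negatives

def split_train_test(sites, rng, train_fraction=0.8):
    """Shuffle and keep four-fifths for training, one-fifth for testing."""
    shuffled = list(sites)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Sampling the negatives from the transcripts that carry the positives keeps the two classes comparable in expression context, which is the rationale given in the text; how many negatives are drawn per transcript is not specified there, so the per-positive draw is only one possible reading.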

Feature Representation

In this work, an m6A site is represented by two groups of features, i.e., the sequence features and the genomic features.

Sequence Features

A 21 nt sequence around the DRACH motif was encoded using the same method as MethyRNA [26]. There are four kinds of nucleotides in RNA: adenine (A), guanine (G), cytosine (C) and uracil (U). According to their structural properties, a nucleotide was described by three features: ring number, chemical function and hydrogen bonding. Cytosine and uracil have only one ring, while adenine and guanine have two rings; adenine and cytosine both carry an amino group and are assigned to the amino class, while guanine and uracil both carry a keto group and are assigned to the keto class. In addition, guanine and cytosine form strong hydrogen bonds in secondary structure, while adenine and uracil form weak hydrogen bonds. Therefore, the nucleotide $n_i$ at position $i$ can be encoded by a three-dimensional vector $(x_i, y_i, z_i)$:

\[
x_i = \begin{cases} 1, & n_i \in \{A, G\} \\ 0, & n_i \in \{C, U\} \end{cases}, \quad
y_i = \begin{cases} 1, & n_i \in \{A, C\} \\ 0, & n_i \in \{G, U\} \end{cases}, \quad
z_i = \begin{cases} 1, & n_i \in \{A, U\} \\ 0, & n_i \in \{C, G\} \end{cases}
\]

Thus, based on the chemical properties defined, A is encoded by the vector (1,1,1), C by (0,1,0), G by (1,0,0) and U by (0,0,1). In addition, the base frequency information and the distribution of each base in the sequence were also considered through the cumulative frequency (density) of each base. The density $d_i$ of the $i$-th base is defined as the frequency of that base among the first $i$ positions of the sequence, that is,

\[
d_i = \frac{1}{i} \sum_{j=1}^{i} f(n_j), \quad
f(n_j) = \begin{cases} 1, & n_j = n_i \\ 0, & \text{otherwise} \end{cases}
\]

where the sum counts the occurrences of the $i$-th base in the first $i$ bases. For example, for the sequence “CUGGAUCGUU”, cytosine appears at the first and seventh positions with cumulative frequencies of 1.00 (1/1) and 0.29 (2/7), respectively, while uracil appears at the second, sixth, ninth and tenth positions with cumulative frequencies of 0.50 (1/2), 0.33 (2/6), 0.33 (3/9) and 0.40 (4/10), respectively.
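
A minimal Python sketch of this encoding is given below. It reproduces the chemical-property vectors and the worked “CUGGAUCGUU” example; the function names and the choice to interleave the three chemical bits with the density value at each position are assumptions made here for illustration.

```python
# (ring structure, functional group, hydrogen bonding) for each nucleotide,
# following the vectors listed in the text.
CHEMICAL_CODE = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}

def cumulative_frequency(seq):
    """Density of the i-th base: its count in the first i bases, divided by i."""
    counts = {}
    densities = []
    for i, base in enumerate(seq, start=1):
        counts[base] = counts.get(base, 0) + 1
        densities.append(counts[base] / i)
    return densities

def encode_sequence(seq):
    """Return four values per position: three chemical bits plus the density."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for base, density in zip(seq, cumulative_frequency(seq)):
        features.extend(CHEMICAL_CODE[base])
        features.append(density)
    return features

densities = cumulative_frequency("CUGGAUCGUU")
print([round(d, 2) for d in densities])
```

Under this layout, a 21 nt window would yield 21 × 4 = 84 sequence feature values per site.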

Genomic Features

Sequence features are often used alone in current RNA methylation site prediction methods, but sequence features cannot represent the topological information of RNA methylation sites; therefore, 57 additional genomic features that may contribute to RNA methylation site prediction were generated. Specifically, genomic features 1-4 are dummy variables that indicate whether the site overlaps a given topological region on the major RNA transcript. To avoid the influence of transcript isoforms, the primary transcripts (longest transcripts) were selected to extract genomic features. All features were extracted using the transcript annotations of the hg19 TxDb package [22]. Features 5-6 represent the length of the transcript region containing the methylation site; if the region does not contain the methylation site, the value is set to 0. Features 7-24 indicate which consensus motif the adenosine site belongs to. Features 25-28 capture the distances to the splicing junctions or to the nearest neighboring sites. Features 29-34 are clustering indicators (site clustering or motif clustering) that measure the clustering effect of RNA methylation sites. Features 35-38 are scores related to evolutionary conservation, including two PhastCons scores and two fitness consequence scores. Features 39 and 40 indicate the RNA secondary structures predicted by RNAfold [43]. Features 41-52 are RNA annotations related to m6A biology. The Supplementary Table contains detailed information on the genomic features considered in the prediction.
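
To make the distance and clustering features more concrete, the following Python sketch computes the distance to the nearest neighboring site and the number of neighbors within fixed flanking regions. It is a hedged illustration only: the function name, the input format, the default flank sizes and the interpretation of a "peaked" distance as a capped distance are assumptions, and the authors' actual feature definitions are those in their Supplementary Table.

```python
from bisect import bisect_left, bisect_right

def neighbor_features(position, site_positions, flanks=(100, 1000), cap=2000):
    """Distance to the nearest other site (capped) and neighbor counts per flank.

    `site_positions` holds positions of sites on the same transcript; the query
    position itself is ignored. `cap` is an assumed ceiling on the distance,
    also returned when the transcript has no other site.
    """
    sites = sorted(p for p in site_positions if p != position)
    idx = bisect_left(sites, position)
    distances = []
    if idx > 0:
        distances.append(position - sites[idx - 1])  # closest upstream site
    if idx < len(sites):
        distances.append(sites[idx] - position)      # closest downstream site
    nearest = min(distances) if distances else cap
    nearest = min(nearest, cap)
    counts = {
        flank: bisect_right(sites, position + flank)
        - bisect_left(sites, position - flank)
        for flank in flanks
    }
    return nearest, counts
```

Keeping the positions sorted and using binary search makes each query logarithmic in the number of sites, which matters when the same computation is repeated for millions of candidate motifs.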

RESULTS AND DISCUSSION

Evaluation Metrics

To measure the performance of the model, we used four performance metrics: Sn (sensitivity), Sp (specificity), ACC (accuracy) and MCC (Matthews correlation coefficient). The sensitivity reflects the success rate of positive sample prediction. The specificity reflects the success rate of negative sample prediction. MCC is a comprehensive performance index that remains informative on unbalanced datasets. The four indicators are defined as follows:

\[
Sn = \frac{TP}{TP + FN}, \quad
Sp = \frac{TN}{TN + FP}, \quad
ACC = \frac{TP + TN}{TP + TN + FP + FN}
\]

\[
MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\]

where TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives. In addition, Receiver Operating Characteristic (ROC) curves were plotted, and the area under the curve (AUC) was calculated and used as the primary evaluation metric.
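
The metrics can be computed directly from the confusion-matrix counts, as in the Python sketch below. The function names are chosen here for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than by integrating a plotted ROC curve; the two are equivalent, but which implementation the authors used is not stated.

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Sn, Sp, ACC and MCC from confusion-matrix counts (both classes non-empty)."""
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "ACC": acc, "MCC": mcc}

def auc_from_scores(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This quadratic loop is fine for a sketch;
    a rank-based version would be preferable for large datasets.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

On a balanced dataset such as the ones used here, ACC equals the mean of Sn and Sp, which is a quick consistency check for reported values.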

Comparison of RF and Other Classifiers in Cross-validation

Four classifiers were used for m6A site prediction: random forest (RF) [44, 45], support vector machine (SVM) [46], K-nearest neighbor (KNN) [47] and logistic regression (LR) [48]. RF is one of the most widely used machine learning algorithms for biological data, and SRAMP was developed on its basis for predicting mammalian m6A sites. SVM is also one of the most widely used machine learning algorithms in computational biology; iRNA-Methyl and RAM-ESVM predicted m6A sites using SVM. KNN is one of the simplest classification methods in data mining, and LR is a simple and efficient classification model. To compare the four classifiers, 10-fold cross-validation was employed on the training datasets, and the classifier with the best results was used on the independent test data. The data from introns, from exons, and from introns merged with exons (referred to as “combined”) were tested separately. The performance of the different classifiers is summarized in Table 2. RF, SVM and LR achieved very similar performance under all 3 modes tested. Notably, the performance achieved on exons (AUC = 0.9133) or introns (AUC = 0.9403) is better than that on the combined data (AUC = 0.9095). Although more training sites were available under the combined mode, mixing intronic and exonic m6A sites actually reduced the prediction performance, suggesting that the exonic and intronic m6A sites exhibit different characteristics in our data.
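
For readers unfamiliar with the procedure, a bare-bones 10-fold cross-validation loop might look like the following Python sketch. The helper names and the `fit_and_score` callback are hypothetical, and averaging the per-fold scores is an assumption; the text does not say whether fold scores were averaged or predictions were pooled before computing the AUC.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Average the score returned by `fit_and_score(train, test)` over k folds.

    `fit_and_score` is a user-supplied callback (hypothetical here) that trains
    a classifier on the training indices and returns, e.g., the test AUC.
    """
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

Passing the classifier in as a callback keeps the loop identical for RF, SVM, KNN and LR, so that the four models are compared on exactly the same folds.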

Independent Tests

We considered using sites from different regions for training and testing. Specifically, we trained on m6A sites from intronic, exonic and combined regions and then tested on sites from each of the 3 categories (Fig. ). The results are summarized in Table 3 (in the original table, the maximum AUC in each category is marked in red). Interestingly, the best prediction performance was achieved when the testing data and training data were from the same category. For intronic m6A site prediction, substantially better performance was achieved when intronic sites were used for training (AUC = 0.9458) than when exonic sites (AUC = 0.9021) or combined sites (AUC = 0.9253) were used. A similar trend is also observed for exonic or general (combined) m6A site prediction. In addition, RF achieved the best performance among the four methods tested in intronic site prediction in both cross-validation and the independent test. Therefore, RF was chosen as the classifier for the subsequent whole-intronome prediction of RNA methylation sites. The ROC curves of these 9 tests are shown in Fig. . The highest AUC for intronic m6A site prediction was achieved when intronic sites were used for training. There is little difference in the performance of exonic or general m6A site prediction between using exonic or general sites as the training data. This is because of the over-representation of exonic sites (95.4%) in the gold standard data from the WHISTLE project, which is likely due to the widely adopted polyA selection RNA library preparation protocol.

Feature Selection for Intronic and Exonic m6A Sites Prediction

To further optimize the prediction performance and differentiate the characteristics of intronic and exonic m6A sites, we used feature selection to identify the most effective features for m6A site prediction in introns and exons, respectively. First, the features were ranked by their importance, which was calculated with the random forest R package from their respective AUCs in ten-fold cross-validation. Then, features were added to the training set one at a time in ranked order, and the prediction performance was evaluated after each addition using ten-fold cross-validation. The feature set that returned the highest AUC was selected as the optimal feature subset. As shown in Fig. ( and ), the top 5 most important predictive features for exons are the distance to the nearest neighbors (peaked at 2000bp), the distance to the nearest neighbors (peaked at 200bp), the number of neighbors within 100bp flanking regions, the number of neighbors within 1000bp flanking regions and the TREW data of METTL3 RNA binding sites, while the top 5 features for introns are the distance to the nearest neighbors (peaked at 2000bp), the fitness consequences scores 1bp z score, the number of neighbors within 1000bp flanking regions, the distance to the nearest neighbors (peaked at 200bp) and the full transcript length. While this clearly indicates that RNA methylation sites exhibit certain clustering characteristics, the difference in the top features also suggests an intrinsic difference between m6A sites located in exons and those located in introns. Fig. ( and ) shows the results of the feature selection. In the prediction of methylation sites in exons and introns, the highest AUC was obtained from the top 77 and top 120 features, respectively. Therefore, the top 77 features make up the optimal subset for predicting m6A methylation sites in exons, while the top 120 features make up the optimal subset for intronic site prediction.
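
The incremental selection step described above can be sketched as follows. This is a generic illustration, not the authors' R code: `evaluate` is a hypothetical callback that would run ten-fold cross-validation on the given feature subset and return its AUC, and ties are resolved in favor of the smaller subset as an assumption.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Grow the feature set in ranked order and keep the best-scoring prefix.

    ranked_features: feature names sorted from most to least important.
    evaluate: callback returning a score (e.g., cross-validated AUC) for a
              list of feature names.
    Returns the selected prefix and the full score curve.
    """
    best_score = float("-inf")
    best_size = 0
    curve = []
    for size in range(1, len(ranked_features) + 1):
        score = evaluate(ranked_features[:size])
        curve.append(score)
        if score > best_score:  # strict '>' keeps the smaller set on ties
            best_score, best_size = score, size
    return ranked_features[:best_size], curve
```

The returned curve corresponds to the AUC-versus-number-of-features plot from which the optimal subset sizes (77 for exons, 120 for introns) are read off.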

Comparison with Existing Methods

To further verify the effectiveness of the proposed algorithm in predicting m6A RNA methylation sites in introns, we compared WITMSG with SRAMP, MethyRNA and M6AMRFS. The results are summarized in Table 4 and the ROC curves are shown in Fig. (). The proposed approach substantially outperformed the competing approaches in intron-specific m6A site prediction.

Intronome-wide m6A Sites Prediction

To generate a complete map of all the intronic m6A sites in humans, we searched the entire intronome (collection of all the introns) for the m6A DRACH motifs as the candidate m6A sites and then used the proposed approach to evaluate the probability of m6A methylation at these locations. In the end, 1,841,962 out of the total 20,156,510 intronic DRACH motifs were predicted to contain m6A RNA methylation sites, and the complete prediction results are freely accessible at http://rnamd.com/intron/.
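
The candidate-site search can be illustrated with a short regular-expression scan in Python. This sketch assumes the standard IUPAC meaning of the motif letters (D = A, G or U; R = A or G; H = A, C or U) and reports the 0-based index of the central adenosine; the function name is introduced here, and the scoring of each candidate by the trained classifier is not shown.

```python
import re

# Lookahead so that overlapping DRACH occurrences are all reported.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")

def find_drach_sites(sequence):
    """Return 0-based positions of the central A of every DRACH motif."""
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input
    return [match.start() + 2 for match in DRACH.finditer(rna)]

print(find_drach_sites("GGACUAAACA"))  # [2, 7]
```

Each reported position would then be passed, together with its sequence and genomic features, to the classifier to obtain a methylation probability.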

CONCLUSION

With the rapid development of high-throughput sequencing technology and bioinformatics efforts, transcriptome m6A RNA modification sites can now be predicted with reasonable accuracy. However, to date, no effort has been made to address the bias induced in RNA library preparation, which has led to limited accuracy in intron-specific m6A site prediction. We showed, for the first time, that intronic and exonic m6A sites exhibit different characteristics, and then presented WITMSG, a method to predict the m6A epitranscriptome in introns. Unlike most other methods, which rely on sequence information only, WITMSG extracts the physicochemical and frequency accumulation characteristics of bases, together with 57 additional genomic characteristics, to predict m6A methylation sites in introns with a random forest. To the best of our knowledge, WITMSG is the first m6A site predictor optimized for introns. By using only intronic m6A sites as the training data and integrating multiple genomic features besides conventional sequence features, WITMSG substantially outperformed competing approaches (SRAMP, M6AMRFS and MethyRNA) in intronic m6A site prediction. Finally, we scanned the entire intronome for possible intronic m6A sites and obtained 1,841,962 predicted sites. The predicted intronic m6A sites were made publicly accessible to share with researchers in the field, especially those who are interested in the function of m6A related to pre-mRNA. Notably, the proposed WITMSG framework can be easily extended to study intronic sites of other RNA modification types, such as pseudouridine (Ψ), m1A and m5C, as well as in other species, such as mouse and yeast.

Table 1

Single-base resolution datasets in intronic m6A prediction.

ID	Cell	Note	Source
1	HEK293T	Abcam antibody	[23]
2	HEK293T	SySy antibody	[23]
3	MOLM13	-	[41]
4	A549	-	[42]
5	CD8T	-	[42]
6	HeLa	-	[10]

Table 2

Performance under cross-validation.

Data	Method	Sn	Sp	ACC	MCC	AUC
Introns	RF	0.8184	0.9334	0.8759	0.7573	0.9403
	SVM	0.8242	0.8949	0.8595	0.7217	0.9292
	KNN	0.4988	0.5021	0.5005	0.0010	0.8142
	LR	0.7809	0.9496	0.8652	0.7413	0.9352
Exons	RF	0.8600	0.813	0.8396	0.6798	0.9133
	SVM	0.8383	0.8385	0.8384	0.6769	0.9131
	KNN	0.4993	0.5011	0.5002	0.0004	0.7984
	LR	0.7486	0.8922	0.8204	0.6476	0.9073
Combined	RF	0.8462	0.8253	0.8357	0.6716	0.9095
	SVM	0.8291	0.8341	0.8316	0.6632	0.9065
	KNN	0.4995	0.5010	0.5003	0.0005	0.7954
	LR	0.7250	0.8938	0.8094	0.6279	0.8977

Table 3

Results on independent tests.

Testing	Training	Method	Sn	Sp	ACC	MCC	AUC
Intron	intron	RF	0.8229	0.9544	0.8887	0.7841	0.9458
		SVM	0.8362	0.9297	0.8830	0.7693	0.9333
		KNN	0.4981	0.5067	0.5024	0.0047	0.8268
		LR	0.7752	0.9562	0.8658	0.7439	0.9366
	exon	RF	0.4667	0.9924	0.7298	0.5398	0.8794
		SVM	0.4133	0.9962	0.7050	0.5042	0.9021
		KNN	0.4971	0.5010	0.4990	-0.0019	0.6256
		LR	0.2514	1	0.6261	0.3794	0.8934
	combine	RF	0.6019	0.9848	0.7935	0.6353	0.9253
		SVM	0.8398	0.8328	0.8363	0.6726	0.9096
		KNN	0.4981	0.5067	0.5024	0.0047	0.7977
		LR	0.3505	1	0.6755	0.4611	0.8886
Exon	intron	RF	1	0.0012	0.5006	0.0244	0.8412
		SVM	0.9990	0.0310	0.5150	0.1195	0.6938
		KNN	0.4983	0.5027	0.5005	0.0010	0.5459
		LR	0.9989	0.0258	0.5123	0.1072	0.8309
	exon	RF	0.8584	0.8245	0.8414	0.6833	0.9165
		SVM	0.8401	0.8419	0.8410	0.6820	0.9151
		KNN	0.4992	0.5014	0.5003	0.0006	0.8001
		LR	0.7421	0.8951	0.8186	0.6448	0.9081
	combine	RF	0.8568	0.8247	0.8407	0.6819	0.9141
		SVM	0.8398	0.8328	0.8363	0.6726	0.9096
		KNN	0.4994	0.5009	0.5001	0.0003	0.7980
		LR	0.7349	0.8906	0.8128	0.6333	0.9015
Combine	intron	RF	0.9921	0.0444	0.5182	0.1144	0.8270
		SVM	0.4983	0.5028	0.5006	0.0012	0.5555
		KNN	0.9884	0.0686	0.5285	0.1455	0.8165
		LR	0.9914	0.0723	0.5318	0.1618	0.6955
	exon	RF	0.8320	0.8390	0.8355	0.6710	0.9110
		SVM	0.8142	0.8478	0.8310	0.6624	0.9055
		KNN	0.4993	0.5009	0.5009	0.0002	0.7908
		LR	0.7045	0.9024	0.8034	0.6191	0.8968
	combine	RF	0.8463	0.8326	0.8395	0.6790	0.9126
		SVM	0.8294	0.8394	0.8344	0.6689	0.9077
		KNN	0.4994	0.5011	0.5004	0.0005	0.7979
		LR	0.7173	0.8956	0.8065	0.6229	0.8979

Table 4

Performance comparison for intronic m6A sites prediction.

Method	Sn	Sp	ACC	MCC	AUC
SRAMP	0.7333	0.8213	0.7774	0.5568	0.8425
MethyRNA	0.6419	0.6996	0.6708	0.3421	0.7249
M6AMRFS	0.5352	0.2281	0.3815	-0.2487	0.6171
WITMSG	0.8152	0.9506	0.8830	0.7730	0.9458

46 in total

1. RNA-methylation-dependent RNA processing controls the speed of the circadian clock.

Authors: Jean-Michel Fustin; Masao Doi; Yoshiaki Yamaguchi; Hayashi Hida; Shinichi Nishimura; Minoru Yoshida; Takayuki Isagawa; Masaki Suimye Morioka; Hideaki Kakeya; Ichiro Manabe; Hitoshi Okamura
Journal: Cell Date: 2013-11-07 Impact factor: 41.582

2. m(6)A RNA methylation is regulated by microRNAs and promotes reprogramming to pluripotency.

Authors: Tong Chen; Ya-Juan Hao; Ying Zhang; Miao-Miao Li; Meng Wang; Weifang Han; Yongsheng Wu; Ying Lv; Jie Hao; Libin Wang; Ang Li; Ying Yang; Kang-Xuan Jin; Xu Zhao; Yuhuan Li; Xiao-Li Ping; Wei-Yi Lai; Li-Gang Wu; Guibin Jiang; Hai-Lin Wang; Lisi Sang; Xiu-Jie Wang; Yun-Gui Yang; Qi Zhou
Journal: Cell Stem Cell Date: 2015-02-12 Impact factor: 24.633

3. iRNA-2methyl: Identify RNA 2'-O-methylation Sites by Incorporating Sequence-Coupled Effects into General PseKNC and Ensemble Classifier.

Authors: Wang-Ren Qiu; Shi-Yu Jiang; Bi-Qian Sun; Xuan Xiao; Xiang Cheng; Kuo-Chen Chou
Journal: Med Chem Date: 2017 Impact factor: 2.745

Review 4. Gene expression regulation mediated through reversible m⁶A RNA methylation.

Authors: Ye Fu; Dan Dominissini; Gideon Rechavi; Chuan He
Journal: Nat Rev Genet Date: 2014-03-25 Impact factor: 53.242

5. Comprehensive analysis of mRNA methylation reveals enrichment in 3' UTRs and near stop codons.

Authors: Kate D Meyer; Yogesh Saletore; Paul Zumbo; Olivier Elemento; Christopher E Mason; Samie R Jaffrey
Journal: Cell Date: 2012-05-17 Impact factor: 41.582

6. Structure and thermodynamics of N6-methyladenosine in RNA: a spring-loaded base modification.

Authors: Caroline Roost; Stephen R Lynch; Pedro J Batista; Kun Qu; Howard Y Chang; Eric T Kool
Journal: J Am Chem Soc Date: 2015-02-02 Impact factor: 15.419

7. Detecting N⁶-methyladenosine sites from RNA transcriptomes using ensemble Support Vector Machines.

Authors: Wei Chen; Pengwei Xing; Quan Zou
Journal: Sci Rep Date: 2017-01-12 Impact factor: 4.379

8. RMBase v2.0: deciphering the map of RNA modifications from epitranscriptome sequencing data.

Authors: Jia-Jia Xuan; Wen-Ju Sun; Peng-Hui Lin; Ke-Ren Zhou; Shun Liu; Ling-Ling Zheng; Liang-Hu Qu; Jian-Hua Yang
Journal: Nucleic Acids Res Date: 2018-01-04 Impact factor: 16.971

Review 9. iProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites.

Authors: Jiangning Song; Yanan Wang; Fuyi Li; Tatsuya Akutsu; Neil D Rawlings; Geoffrey I Webb; Kuo-Chen Chou
Journal: Brief Bioinform Date: 2019-03-25 Impact factor: 11.622

10. PIANO: A Web Server for Pseudouridine-Site (Ψ) Identification and Functional Annotation.

Authors: Bowen Song; Yujiao Tang; Zhen Wei; Gang Liu; Jionglong Su; Jia Meng; Kunqi Chen
Journal: Front Genet Date: 2020-03-12 Impact factor: 4.599
