Prediction of N7-methylguanosine sites in human RNA based on optimal sequence features.

Yu-He Yang, Chi Ma, Jia-Shu Wang, Hui Yang, Hui Ding, Shu-Guang Han, Yan-Wen Li.

Abstract

N-7 methylguanosine (m7G) is a ubiquitous post-transcriptional RNA modification that is vital for maintaining RNA function and protein translation. Computational tools can make it easier to predict m7G sites in RNA sequences. In this work, we designed a sequence-based method to identify m7G sites in human RNA sequences. First, several kinds of sequence features were extracted to encode m7G and non-m7G samples. Subsequently, we used mRMR (minimum redundancy maximum relevance), F-score, and Relief to obtain the optimal feature subset, that is, the subset producing the maximum prediction accuracy. In 10-fold cross-validation, a support vector machine (SVM) achieved the highest accuracy, 94.67%, for identifying m7G sites in the human genome. In addition, we examined the performance of other algorithms and found that the SVM-based model outperformed them. The results indicate that the predictor could be a useful tool for studying m7G. A prediction model is available at https://github.com/MapFM/m7g_model.git.
Copyright © 2019. Published by Elsevier Inc.
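
The F-score ranking step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which sequence features were used, so k-mer frequencies are assumed here as the encoding, the function names (`kmer_frequencies`, `f_scores`, `rank_features`) are hypothetical, and the sequences in the usage note are toy data. mRMR, Relief, and the SVM classifier are left out.

```python
from itertools import product
from statistics import mean, variance


def kmer_frequencies(seq, k=2, alphabet="ACGU"):
    """Encode a sequence as normalized k-mer frequencies (assumed encoding)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    for i in range(windows):
        sub = seq[i:i + k]
        if sub in counts:  # skip windows with characters outside the alphabet
            counts[sub] += 1
    total = max(windows, 1)
    return [counts[km] / total for km in kmers]


def f_scores(pos, neg):
    """F-score per feature column: between-class spread of the class means
    divided by the sum of the within-class sample variances.
    Each class needs at least two samples."""
    scores = []
    for j in range(len(pos[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        overall = mean(p + n)
        between = (mean(p) - overall) ** 2 + (mean(n) - overall) ** 2
        within = variance(p) + variance(n)  # (n - 1) denominators
        if within > 0:
            scores.append(between / within)
        else:
            # constant within both classes: perfectly separating or useless
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def rank_features(pos_seqs, neg_seqs, k=2):
    """Return feature indices sorted from highest to lowest F-score."""
    pos = [kmer_frequencies(s, k) for s in pos_seqs]
    neg = [kmer_frequencies(s, k) for s in neg_seqs]
    scores = f_scores(pos, neg)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

For example, `rank_features(["GGGGA", "GGGAA"], ["ACUAC", "CUACU"], k=1)` ranks the four nucleotide frequencies for these toy sequences. In a full pipeline, the top-ranked features would be added to a classifier one at a time, keeping the subset with the best cross-validated accuracy.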

Keywords:  Feature analysis; Feature extraction; Feature selection; N-7 methylguanosine; Softpackage

Year:  2020        PMID: 32721444     DOI: 10.1016/j.ygeno.2020.07.035

Source DB:  PubMed          Journal:  Genomics        ISSN: 0888-7543            Impact factor:   5.736


  4 in total

1.  i5hmCVec: Identifying 5-Hydroxymethylcytosine Sites of Drosophila RNA Using Sequence Feature Embeddings.

Authors:  Hang-Yu Liu; Pu-Feng Du
Journal:  Front Genet       Date:  2022-05-03       Impact factor: 4.772

2.  The Regulation of RNA Modification Systems: The Next Frontier in Epitranscriptomics? (Review)

Authors:  Matthias R Schaefer
Journal:  Genes (Basel)       Date:  2021-02-26       Impact factor: 4.096

3.  Identification and Classification of Enhancers Using Dimension Reduction Technique and Recurrent Neural Network.

Authors:  Qingwen Li; Lei Xu; Qingyuan Li; Lichao Zhang
Journal:  Comput Math Methods Med       Date:  2020-10-18       Impact factor: 2.238

4.  Identifying Heat Shock Protein Families from Imbalanced Data by Using Combined Features.

Authors:  Xiao-Yang Jing; Feng-Min Li
Journal:  Comput Math Methods Med       Date:  2020-09-23       Impact factor: 2.238
