Comprehensive review and assessment of computational methods for predicting RNA post-transcriptional modification sites from RNA sequences.

Zhen Chen, Pei Zhao, Fuyi Li, Yanan Wang, A Ian Smith, Geoffrey I Webb, Tatsuya Akutsu, Abdelkader Baggag, Halima Bensmail, Jiangning Song.

Abstract

RNA post-transcriptional modifications play a crucial role in a myriad of biological processes and cellular functions. To date, more than 160 RNA modifications have been discovered; therefore, accurate identification of RNA-modification sites is fundamental for a better understanding of RNA-mediated biological functions and mechanisms. However, due to limitations in experimental methods, systematic identification of different types of RNA-modification sites remains a major challenge. Recently, more than 20 computational methods have been developed to identify RNA-modification sites in tandem with high-throughput experimental methods, with most of these capable of predicting only a single type of RNA-modification site. These methods differ widely in dataset size, data quality, core algorithms, extracted features, feature selection techniques and evaluation strategies. Therefore, there is an urgent need to revisit these methods and summarize their methodologies, in order to improve and further develop computational techniques to identify and characterize RNA-modification sites from the large amounts of sequence data. With this goal in mind, first, we provide a comprehensive survey of a large collection of 27 state-of-the-art approaches for predicting N1-methyladenosine and N6-methyladenosine sites. We cover a variety of aspects that are crucial for the development of successful predictors, including dataset quality, operating algorithms, sequence and genomic features, feature selection, model performance evaluation and software utility. In addition, we provide our thoughts on potential strategies to improve model performance. Second, we propose a computational approach called DeepPromise, based on deep learning techniques, for simultaneous prediction of N1-methyladenosine and N6-methyladenosine.
To extract the sequence context surrounding the modification sites, three feature encodings, including enhanced nucleic acid composition, one-hot encoding, and RNA embedding, were used as the input to seven consecutive layers of convolutional neural networks (CNNs), respectively. Moreover, DeepPromise further combined the prediction score of the CNN-based models and achieved around 43% higher area under receiver-operating curve (AUROC) for m1A site prediction and 2-6% higher AUROC for m6A site prediction, respectively, when compared with several existing state-of-the-art approaches on the independent test. In-depth analyses of characteristic sequence motifs identified from the convolution-layer filters indicated that nucleotide presentation at proximal positions surrounding the modification sites contributed most to the classification, whereas those at distal positions also affected classification but to different extents. To maximize user convenience, a web server was developed as an implementation of DeepPromise and made publicly available at http://DeepPromise.erc.monash.edu/, with the server accepting both RNA sequences and genomic sequences to allow prediction of two types of putative RNA-modification sites.
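Two of the feature encodings named above, one-hot encoding and enhanced nucleic acid composition (ENAC), can be illustrated with a minimal sketch. This is not the DeepPromise implementation: the function names, the `-` padding symbol, the flank length and the ENAC sub-window size of 5 are all assumptions made for illustration, and the RNA embedding and CNN layers are not shown.

```python
from typing import List

ALPHABET = "ACGU"  # RNA nucleotides; any other symbol is treated as unknown


def extract_window(seq: str, site: int, flank: int, pad: str = "-") -> str:
    """Return the 2*flank+1 nucleotides centred on the 0-based position `site`.

    Positions that fall outside the sequence are filled with `pad`
    (the padding symbol is an assumption of this sketch).
    """
    left = seq[max(0, site - flank):site]
    right = seq[site + 1:site + 1 + flank]
    return pad * (flank - len(left)) + left + seq[site] + right + pad * (flank - len(right))


def one_hot(window: str) -> List[List[int]]:
    """Encode each nucleotide as a 4-dimensional indicator vector (A, C, G, U).

    Unknown symbols, including padding, become all-zero vectors.
    """
    return [[1 if base == ref else 0 for ref in ALPHABET] for base in window.upper()]


def enac(window: str, k: int = 5) -> List[List[float]]:
    """Nucleotide frequencies inside each length-k sub-window slid along `window`.

    k = 5 is an assumed default, not a value taken from the abstract.
    """
    window = window.upper()
    rows = []
    for start in range(len(window) - k + 1):
        sub = window[start:start + k]
        rows.append([sub.count(ref) / k for ref in ALPHABET])
    return rows


if __name__ == "__main__":
    w = extract_window("GGACU", site=2, flank=3)
    print(w)           # window centred on the candidate adenosine
    print(one_hot(w))  # one row per position, one column per nucleotide
    print(enac(w))     # one row per sliding sub-window
```

Both encodings yield a position-by-channel matrix, which is the shape a one-dimensional convolutional layer typically takes as input.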
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Keywords: RNA post-transcriptional modification; bioinformatics; deep learning; predictor; sequence analysis

Year: 2019    PMID: 31714956    DOI: 10.1093/bib/bbz112

Source DB: PubMed    Journal: Brief Bioinform    ISSN: 1467-5463    Impact factor: 11.622

