
Geographic encoding of transcripts enabled high-accuracy and isoform-aware deep learning of RNA methylation.

Daiyun Huang, Kunqi Chen, Bowen Song, Zhen Wei, Jionglong Su, Frans Coenen, João Pedro de Magalhães, Daniel J Rigden, Jia Meng.

Abstract

As the most pervasive epigenetic mark present on mRNA and lncRNA, N6-methyladenosine (m6A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years; nevertheless, their potential remains underexploited. One reason is that existing models usually consider only the sequence of transcripts and ignore the regions (or geography) of transcripts, such as the 3'UTR and introns, where the epigenetic mark forms and functions. Here, we developed three simple yet powerful encoding schemes for transcripts that capture the submolecular geographic information of RNA, which is largely independent of sequence. We show that m6A prediction models based on geographic information alone can achieve performance comparable to that of classic sequence-based methods. Importantly, geographic information substantially enhances the accuracy of sequence-based models, enables isoform- and tissue-specific prediction of m6A sites, and improves m6A signal detection from direct RNA sequencing data. The geographic encoding schemes we developed exhibit strong interpretability, apply not only to m6A but also to N1-methyladenosine (m1A), and can serve as a general and effective complement to the widely used sequence encoding schemes in deep learning applications concerning RNA transcripts.
© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.
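
The abstract does not describe the three encoding schemes in detail, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical region vocabulary (`REGIONS`), a hypothetical annotation format of `(region, start, end)` tuples with 0-based, half-open coordinates, and a hypothetical function name `encode_geography`. For each position in a window around a candidate site, it emits a one-hot vector saying which transcript region that position falls in, independent of the nucleotide sequence.

```python
from typing import List, Tuple

# Hypothetical region vocabulary; the categories used in the paper may differ.
REGIONS = ["5UTR", "CDS", "3UTR", "intron"]


def encode_geography(
    annotation: List[Tuple[str, int, int]], site: int, flank: int
) -> List[List[int]]:
    """One-hot encode the region of each position in a window around `site`.

    `annotation` holds (region, start, end) tuples, 0-based and half-open.
    Positions not covered by any annotated region get an all-zero row.
    """
    rows = []
    for pos in range(site - flank, site + flank + 1):
        row = [0] * len(REGIONS)
        for region, start, end in annotation:
            if start <= pos < end:
                row[REGIONS.index(region)] = 1
                break  # assume regions do not overlap
        rows.append(row)
    return rows


# Toy transcript: 3 nt of 5'UTR, 6 nt of CDS, 3 nt of 3'UTR.
annotation = [("5UTR", 0, 3), ("CDS", 3, 9), ("3UTR", 9, 12)]
encoded = encode_geography(annotation, site=9, flank=2)
for row in encoded:
    print(row)
```

A matrix like this has one row per window position and could, under these assumptions, be concatenated with a one-hot sequence encoding of the same window before being passed to a model.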

Year:  2022        PMID: 36155798      PMCID: PMC9561283          DOI: 10.1093/nar/gkac830

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   19.160


