Penguin: A tool for predicting pseudouridine sites in direct RNA nanopore sequencing data.

Doaa Hassan, Daniel Acevedo, Swapna Vidhur Daulatabad, Quoseena Mir, Sarath Chandra Janga.

Abstract

Pseudouridine is one of the most abundant RNA modifications, formed when pseudouridine synthase proteins catalyze the isomerization of uridine. It plays an important role in many biological processes and has been reported to have applications in drug development. Recently, single-molecule sequencing techniques such as the direct RNA sequencing platform offered by Oxford Nanopore Technologies have enabled direct detection of RNA modifications on the molecule being sequenced. In this study, we introduce Penguin, a tool that integrates several machine learning (ML) models to identify RNA pseudouridine sites on Nanopore direct RNA sequencing reads. Pseudouridine sites were identified in single-molecule data from direct RNA sequencing runs that yielded 723 K reads in the HEK293 cell line and 500 K reads in the HeLa cell line. Penguin extracts a set of features from the raw signal measured by the Oxford Nanopore device and from the corresponding basecalled k-mer. These features are used to train the predictors included in Penguin, which in the testing phase predict whether a signal is modified by the presence of a pseudouridine site. Penguin includes several predictors: support vector machine (SVM), random forest (RF), and neural network (NN). Results on the two benchmark datasets, for the HEK293 and HeLa cell lines, show that Penguin performs well in both random-split testing and independent validation testing. In random-split testing, Penguin identified pseudouridine sites with an accuracy of 93.38% by applying SVM to the HEK293 benchmark dataset. In independent validation testing, Penguin achieved an accuracy of 92.61% when SVM was trained on the HEK293 benchmark dataset and tested on the HeLa benchmark dataset. Penguin thus achieves 16% higher accuracy than existing pseudouridine predictors in the literature under independent validation testing.
Employing Penguin to predict pseudouridine sites revealed a significant enrichment of "regulation of mRNA 3'-end processing" in the HEK293 cell line and of "positive regulation of transcription from RNA polymerase II promoter involved in cellular response to chemical stimulus" in the HeLa cell line. Penguin software and models are available on GitHub at https://github.com/Janga-Lab/Penguin and can be readily employed for predicting Ψ sites from Nanopore direct RNA-sequencing datasets.
Copyright © 2022 Elsevier Inc. All rights reserved.
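
The workflow summarized in the abstract (features from the raw signal and the basecalled k-mer, followed by random-split and independent validation testing) can be illustrated with a minimal Python sketch. This is not Penguin's code. The specific features (signal mean, standard deviation, median, length, and a one-hot encoded k-mer), the `ACGU` alphabet, the synthetic data, and every function name are assumptions made for illustration, and a simple nearest-centroid classifier stands in for the SVM, RF, and NN predictors named in the abstract.

```python
import random
import statistics

BASES = "ACGU"  # assumed alphabet for the k-mer encoding


def extract_features(signal, kmer):
    """Hypothetical feature vector: signal statistics plus a one-hot encoded k-mer."""
    stats = [
        statistics.mean(signal),
        statistics.pstdev(signal),
        statistics.median(signal),
        float(len(signal)),
    ]
    one_hot = [1.0 if base == b else 0.0 for base in kmer for b in BASES]
    return stats + one_hot


def train_centroids(X, y):
    """Stand-in classifier (not SVM/RF/NN): one mean feature vector per label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, X, y):
    """Fraction of examples whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == lab for x, lab in zip(X, y))
    return hits / len(y)


def random_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle one dataset and hold out a fraction of it for testing."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def toy_dataset(n, seed):
    """Synthetic events only: label-1 signals are shifted upward by 10 units."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        signal = [rng.gauss(100.0 + 10.0 * label, 1.0) for _ in range(20)]
        kmer = "".join(rng.choice(BASES) for _ in range(5))
        X.append(extract_features(signal, kmer))
        y.append(label)
    return X, y


if __name__ == "__main__":
    X_a, y_a = toy_dataset(200, seed=1)  # plays the role of the training cell line
    X_b, y_b = toy_dataset(200, seed=2)  # plays the role of the independent cell line

    # Random-split testing: train and test on disjoint parts of one dataset.
    X_tr, y_tr, X_te, y_te = random_split(X_a, y_a)
    print("random split:", accuracy(train_centroids(X_tr, y_tr), X_te, y_te))

    # Independent validation: train on all of one dataset, test on the other.
    print("independent:", accuracy(train_centroids(X_a, y_a), X_b, y_b))
```

The two evaluation modes differ only in where the test examples come from: random-split testing holds out part of the training dataset, whereas independent validation tests on a dataset the model never saw, which in the abstract corresponds to training on HEK293 and testing on HeLa.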

Keywords:  Nanopore; Pseudouridine; RNA modifications

Year:  2022        PMID: 35182749      PMCID: PMC9232934          DOI: 10.1016/j.ymeth.2022.02.005

Source DB:  PubMed          Journal:  Methods        ISSN: 1046-2023            Impact factor:   4.647


