Porpoise: a new approach for accurate prediction of RNA pseudouridine sites.

Fuyi Li, Xudong Guo, Peipei Jin, Jinxiang Chen, Dongxu Xiang, Jiangning Song, Lachlan J M Coin.

Abstract

Pseudouridine is a ubiquitous RNA modification present in both eukaryotes and prokaryotes, and it plays a vital role in various biological processes. Almost all kinds of RNA are subject to this modification. However, identifying pseudouridine sites experimentally remains a great challenge because the required experiments are expensive and time-consuming. Computational approaches that can accurately identify pseudouridine sites in silico from the large amount of available RNA sequence data are therefore highly desirable and can aid the functional elucidation of this critical modification. Here, we propose a new computational approach, termed Porpoise, to accurately identify pseudouridine sites from RNA sequence data. Porpoise builds upon a comprehensive evaluation of 18 frequently used feature encoding schemes, from which four types of features were selected: binary features, pseudo k-tuple composition, nucleotide chemical property and position-specific trinucleotide propensity based on single-strand (PSTNPss). The selected features are fed into a stacked ensemble learning framework to construct an effective stacked model. Both cross-validation tests on the benchmark dataset and independent tests show that Porpoise achieves better predictive performance than several state-of-the-art approaches. The application of model interpretation tools demonstrates the importance of PSTNPss for the performance of the trained models. This new method is anticipated to facilitate community-wide efforts to identify putative pseudouridine sites and formulate novel testable biological hypotheses.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
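
Two of the feature types named in the abstract, binary features and nucleotide chemical property (NCP), can be illustrated with a minimal sketch. This is not the authors' implementation. The column order of the binary encoding, the handling of unknown bases, and the function names `encode`, `binary_encode` and `ncp_encode` are assumptions made for illustration. The NCP triplets follow the convention commonly used in RNA modification predictors (ring structure, functional group, hydrogen-bond strength).

```python
# Illustrative sketch only; not the Porpoise source code.

# Binary (one-hot) encoding. The A, C, G, U column order is an assumption.
BINARY = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}

# Nucleotide chemical property (NCP) triplets:
# (ring structure: purine = 1, functional group: amino = 1,
#  hydrogen bonding: weak = 1)
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode(seq, table):
    """Concatenate the per-nucleotide vectors from `table` for every base.

    DNA-style T is read as U. Bases missing from the table (for example N)
    are encoded as all zeros, which is an assumption of this sketch.
    """
    width = len(next(iter(table.values())))
    features = []
    for base in seq.upper().replace("T", "U"):
        features.extend(table.get(base, (0,) * width))
    return features


def binary_encode(seq):
    return encode(seq, BINARY)


def ncp_encode(seq):
    return encode(seq, NCP)


if __name__ == "__main__":
    window = "GAUCU"  # hypothetical sequence window around a candidate site
    print(binary_encode(window))
    print(ncp_encode(window))
```

Each function returns a flat list whose length is the window length times the per-base vector width (4 for binary, 3 for NCP), so the two outputs can be concatenated into one feature vector before being passed to a classifier.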

Keywords:  RNA pseudouridine site; bioinformatics; machine learning; sequence analysis; stacking ensemble learning

Year:  2021        PMID: 34226915      PMCID: PMC8575008          DOI: 10.1093/bib/bbab245

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   13.994


  34 in total

1.  GlycoMine: a machine learning-based approach for predicting N-, C- and O-linked glycosylation in the human proteome.

Authors:  Fuyi Li; Chen Li; Mingjun Wang; Geoffrey I Webb; Yang Zhang; James C Whisstock; Jiangning Song
Journal:  Bioinformatics       Date:  2015-01-06       Impact factor: 6.937

2.  Computational prediction and interpretation of both general and specific types of promoters in Escherichia coli by exploiting a stacked ensemble-learning framework.

Authors:  Fuyi Li; Jinxiang Chen; Zongyuan Ge; Ya Wen; Yanwei Yue; Morihiro Hayashida; Abdelkader Baggag; Halima Bensmail; Jiangning Song
Journal:  Brief Bioinform       Date:  2021-03-22       Impact factor: 11.622

3.  Chemical pulldown reveals dynamic pseudouridylation of the mammalian transcriptome.

Authors:  Xiaoyu Li; Ping Zhu; Shiqing Ma; Jinghui Song; Jinyi Bai; Fangfang Sun; Chengqi Yi
Journal:  Nat Chem Biol       Date:  2015-06-15       Impact factor: 15.040

4.  rRNA pseudouridylation defects affect ribosomal ligand binding and translational fidelity from yeast to human cells.

Authors:  Karen Jack; Cristian Bellodi; Dori M Landry; Rachel O Niederer; Arturas Meskauskas; Sharmishtha Musalgaonkar; Noam Kopmar; Olya Krasnykh; Alison M Dean; Sunnie R Thompson; Davide Ruggero; Jonathan D Dinman
Journal:  Mol Cell       Date:  2011-11-18       Impact factor: 17.970

5.  Meta-GDBP: a high-level stacked regression model to improve anticancer drug response prediction.

Authors:  Ran Su; Xinyi Liu; Guobao Xiao; Leyi Wei
Journal:  Brief Bioinform       Date:  2020-05-21       Impact factor: 11.622

6.  GlycoMinestruct: a new bioinformatics tool for highly accurate mapping of the human N-linked and O-linked glycoproteomes by incorporating structural features.

Authors:  Fuyi Li; Chen Li; Jerico Revote; Yang Zhang; Geoffrey I Webb; Jian Li; Jiangning Song; Trevor Lithgow
Journal:  Sci Rep       Date:  2016-10-06       Impact factor: 4.379

7.  iRNA-PseU: Identifying RNA pseudouridine sites.

Authors:  Wei Chen; Hua Tang; Jing Ye; Hao Lin; Kuo-Chen Chou
Journal:  Mol Ther Nucleic Acids       Date:  2016

8.  PIANO: A Web Server for Pseudouridine-Site (Ψ) Identification and Functional Annotation.

Authors:  Bowen Song; Yujiao Tang; Zhen Wei; Gang Liu; Jionglong Su; Jia Meng; Kunqi Chen
Journal:  Front Genet       Date:  2020-03-12       Impact factor: 4.599

  9 in total

1.  STALLION: a stacking-based ensemble learning framework for prokaryotic lysine acetylation site prediction.

Authors:  Shaherin Basith; Gwang Lee; Balachandran Manavalan
Journal:  Brief Bioinform       Date:  2022-01-17       Impact factor: 11.622

2.  ASPIRER: a new computational approach for identifying non-classical secreted proteins based on deep learning.

Authors:  Xiaoyu Wang; Fuyi Li; Jing Xu; Jia Rong; Geoffrey I Webb; Zongyuan Ge; Jian Li; Jiangning Song
Journal:  Brief Bioinform       Date:  2022-03-10       Impact factor: 13.994

3.  Deepm5C: A deep-learning-based hybrid framework for identifying human RNA N5-methylcytosine sites using a stacking strategy.

Authors:  Md Mehedi Hasan; Sho Tsukiyama; Jae Youl Cho; Hiroyuki Kurata; Md Ashad Alam; Xiaowen Liu; Balachandran Manavalan; Hong-Wen Deng
Journal:  Mol Ther       Date:  2022-05-06       Impact factor: 12.910

4.  Methylartist: Tools for Visualising Modified Bases from Nanopore Sequence Data.

Authors:  Seth W Cheetham; Michaela Kindlova; Adam D Ewing
Journal:  Bioinformatics       Date:  2022-04-28       Impact factor: 6.931

5.  m5CRegpred: Epitranscriptome Target Prediction of 5-Methylcytosine (m5C) Regulators Based on Sequencing Features.

Authors:  Zhizhou He; Jing Xu; Haoran Shi; Shuxiang Wu
Journal:  Genes (Basel)       Date:  2022-04-12       Impact factor: 4.141

6.  Computational analysis and prediction of PE_PGRS proteins using machine learning.

Authors:  Fuyi Li; Xudong Guo; Dongxu Xiang; Miranda E Pitt; Arnold Bainomugisa; Lachlan J M Coin
Journal:  Comput Struct Biotechnol J       Date:  2022-01-22       Impact factor: 7.271

7.  Recent development of machine learning-based methods for the prediction of defensin family and subfamily. (Review)

Authors:  Phasit Charoenkwan; Nalini Schaduangrat; S M Hasan Mahmud; Orawit Thinnukool; Watshara Shoombuatong
Journal:  EXCLI J       Date:  2022-05-05       Impact factor: 4.022

8.  Computational prediction and interpretation of druggable proteins using a stacked ensemble-learning framework.

Authors:  Phasit Charoenkwan; Nalini Schaduangrat; Pietro Lio'; Mohammad Ali Moni; Watshara Shoombuatong; Balachandran Manavalan
Journal:  iScience       Date:  2022-08-05

9.  Interferon inducible pseudouridine modification in human mRNA by quantitative nanopore profiling.

Authors:  Sihao Huang; Wen Zhang; Christopher D Katanski; Devin Dersh; Qing Dai; Karen Lolans; Jonathan Yewdell; A Murat Eren; Tao Pan
Journal:  Genome Biol       Date:  2021-12-06       Impact factor: 13.583
