Accurate identification of RNA D modification using multiple features.

Lijun Dou, Wenyang Zhou, Lichao Zhang, Lei Xu, Ke Han.

Abstract

As one of the common post-transcriptional modifications in tRNAs, dihydrouridine (D) has prominent effects on the flexibility of tRNA as well as on cancerous diseases. Because the sequencing techniques used to detect D modification are expensive and time-consuming, precise computational tools can greatly advance the study of molecular mechanisms and medical development. We propose a novel predictor, iRNAD_XGBoost, that identifies potential D sites using multiple RNA sequence representations. The method addresses class imbalance with the hybrid sampling method SMOTEENN, and the top 30 features selected by XGBoost are used to construct the model. In the jackknife test, the optimized model achieved high sensitivity (Sn) and specificity (Sp) values of 97.13% and 97.38%, respectively. In the independent test, these two metrics reached 91.67% and 94.74%. Compared with the iRNAD method, this model showed high generalizability and consistent prediction performance on positive and negative samples, yielding Matthews correlation coefficient (MCC) scores of 0.94 and 0.86 in the jackknife and independent tests, respectively. The results suggest that the chemical property and nucleotide density features (CPND), the electron-ion interaction pseudopotential features (EIIP and PseEIIP), and dinucleotide composition (DNC) are crucial to the recognition of D modification. The proposed predictor is a promising tool to help experimental biologists investigate molecular functions.
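The abstract names its sequence encodings and evaluation metrics without defining them. The following is a minimal sketch of two of the encodings (DNC and CPND) and of Sn, Sp, and MCC, using the definitions commonly found in the RNA-modification prediction literature. It is not the authors' code: the function names, the chemical-property table `NCP`, and the choice of prefix-based nucleotide density are assumptions made for illustration, and the SMOTEENN resampling and XGBoost feature-selection steps are not shown.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGU"

# Assumed chemical-property code per nucleotide:
# (ring structure, functional group, hydrogen bonding).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}


def dinucleotide_composition(seq):
    """Return the frequencies of the 16 dinucleotides (AA, AC, ..., UU)."""
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    pairs = ["".join(p) for p in product(ALPHABET, repeat=2)]
    counts = {p: 0 for p in pairs}
    total = len(seq) - 1  # number of overlapping dinucleotides
    for i in range(total):
        counts[seq[i:i + 2]] += 1  # KeyError on characters outside ACGU
    return [counts[p] / total for p in pairs]


def cpnd(seq):
    """Return 4 values per position: 3 chemical-property bits + density.

    Density at position i is the number of times the nucleotide at i has
    occurred in the prefix seq[:i], divided by i (1-based).
    """
    seen = {base: 0 for base in ALPHABET}
    features = []
    for i, base in enumerate(seq, start=1):
        seen[base] += 1
        features.extend(NCP[base])
        features.append(seen[base] / i)
    return features


def sn_sp_mcc(tp, fn, tn, fp):
    """Return sensitivity, specificity, and MCC from confusion counts."""
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, mcc


if __name__ == "__main__":
    example = "AUGCA"  # hypothetical toy sequence, not from the paper's data
    print(dinucleotide_composition(example))
    print(cpnd(example))
    print(sn_sp_mcc(tp=9, fn=1, tn=8, fp=2))
```

In a full pipeline, vectors such as these would be concatenated per sample before resampling and feature selection. The sketch stops at the encoding and scoring steps because those are the parts that can be stated without knowing the paper's exact settings.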

Keywords: Dihydrouridine; XGBoost; feature selection; imbalanced datasets; prediction

Year: 2021   PMID: 33729104   PMCID: PMC8632091   DOI: 10.1080/15476286.2021.1898160

Source DB: PubMed   Journal: RNA Biol   ISSN: 1547-6286   Impact factor: 4.652


