ReCGBM: a gradient boosting-based method for predicting human dicer cleavage sites.

Pengyu Liu, Jiangning Song, Chun-Yu Lin, Tatsuya Akutsu.

Abstract

BACKGROUND: Human dicer is an enzyme that cleaves pre-miRNAs into miRNAs. Several models, including PHDcleav and LBSizeCleav, have been developed to predict human dicer cleavage sites. Given an input sequence, these models predict whether it contains a cleavage site. However, they treat each sequence independently and lack interpretability. An accurate and explainable predictor that exploits relations between different sequences is therefore needed to improve understanding of the mechanism by which human dicer cleaves pre-miRNAs.
RESULTS: In this study, we develop ReCGBM, an accurate and explainable predictor of human dicer cleavage sites. We design relational features and class features as inputs to a LightGBM model. Computational experiments show that ReCGBM outperforms the existing methods. Further, we find that features close to the center of the pre-miRNA are more important and contribute significantly to the performance improvement of the developed method.
CONCLUSIONS: The results of this study show that ReCGBM is an interpretable and accurate predictor. In addition, the feature importance analyses suggest that future predictors may benefit from more informative features close to the center of the pre-miRNA.
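
To make the general idea of gradient boosting with feature importance concrete, here is a minimal from-scratch sketch in Python. It is not the ReCGBM implementation and does not use LightGBM or the paper's relational and class features, which the abstract does not specify. Everything in it is an assumption made for illustration: the 7-nucleotide window, the one-hot encoding, the helper names (`one_hot`, `fit_stump`, `fit_boosting`, `predict`), and the synthetic labeling rule. The labels are constructed to depend only on the center position, so the resulting importance ranking holds by construction and says nothing about real pre-miRNA data.

```python
import math
import random

ALPHABET = "ACGU"
WINDOW = 7            # hypothetical window length, not taken from the paper
CENTER = WINDOW // 2  # index of the middle position


def one_hot(seq):
    """Encode an RNA string as a flat 0/1 list, four entries per position."""
    return [1.0 if base == letter else 0.0 for base in seq for letter in ALPHABET]


def fit_stump(X, residuals):
    """Pick the binary feature whose split most reduces squared error."""
    n, total, best = len(X), sum(residuals), None
    for j in range(len(X[0])):
        right = [r for x, r in zip(X, residuals) if x[j] == 1.0]
        n_r, n_l = len(right), n - len(right)
        if n_r == 0 or n_l == 0:
            continue  # feature is constant, so it cannot split the data
        s_r = sum(right)
        s_l = total - s_r
        gain = s_l ** 2 / n_l + s_r ** 2 / n_r - total ** 2 / n
        if best is None or gain > best[0]:
            best = (gain, j, s_l / n_l, s_r / n_r)
    return best


def fit_boosting(X, y, rounds=20, lr=0.5):
    """Boost stumps on the logistic loss and accumulate gain per feature."""
    scores = [0.0] * len(X)
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(rounds):
        probs = [1.0 / (1.0 + math.exp(-s)) for s in scores]
        # Negative gradient of the logistic loss with respect to the score.
        residuals = [yi - pi for yi, pi in zip(y, probs)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, left, right = stump
        importance[j] += gain
        stumps.append((j, lr * left, lr * right))
        scores = [s + (lr * right if x[j] == 1.0 else lr * left)
                  for s, x in zip(scores, X)]
    return stumps, importance


def predict(stumps, x):
    """Sum the stump outputs and threshold the score at zero."""
    score = sum(right if x[j] == 1.0 else left for j, left, right in stumps)
    return 1 if score > 0 else 0


# Synthetic data: label is 1 exactly when the center base is G (toy rule).
random.seed(0)
seqs = ["".join(random.choice(ALPHABET) for _ in range(WINDOW)) for _ in range(200)]
X = [one_hot(s) for s in seqs]
y = [1 if s[CENTER] == "G" else 0 for s in seqs]

stumps, importance = fit_boosting(X, y)
top_position = importance.index(max(importance)) // len(ALPHABET)
accuracy = sum(predict(stumps, x) == yi for x, yi in zip(X, y)) / len(y)
print("most important position:", top_position, "training accuracy:", accuracy)
```

Accumulating split gain per feature is one common way to rank features in tree ensembles. A production model would more likely use a library such as LightGBM with deeper trees, regularization, and held-out evaluation rather than training accuracy.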

Keywords:  Cleavage sites; Dicer cleavage site; Gradient boosting machine; Machine learning

Year:  2021        PMID: 33568063      PMCID: PMC7877110          DOI: 10.1186/s12859-021-03993-0

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


  23 in total

1.  Vienna RNA secondary structure server.

Authors:  Ivo L Hofacker
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

2.  WebLogo: a sequence logo generator.

Authors:  Gavin E Crooks; Gary Hon; John-Marc Chandonia; Steven E Brenner
Journal:  Genome Res       Date:  2004-06       Impact factor: 9.043

3.  UNAFold: software for nucleic acid folding and hybridization.

Authors:  Nicholas R Markham; Michael Zuker
Journal:  Methods Mol Biol       Date:  2008

4.  The role of microRNA genes in papillary thyroid carcinoma.

Authors:  Huiling He; Krystian Jazdzewski; Wei Li; Sandya Liyanarachchi; Rebecca Nagy; Stefano Volinia; George A Calin; Chang-Gong Liu; Kaarle Franssila; Saul Suster; Richard T Kloos; Carlo M Croce; Albert de la Chapelle
Journal:  Proc Natl Acad Sci U S A       Date:  2005-12-19       Impact factor: 11.205

5.  Pripper: prediction of caspase cleavage sites from whole proteomes.

Authors:  Mirva Piippo; Niina Lietzén; Olli S Nevalainen; Jussi Salmi; Tuula A Nyman
Journal:  BMC Bioinformatics       Date:  2010-06-15       Impact factor: 3.169

6.  LabCaS: labeling calpain substrate cleavage sites from amino acid sequence using conditional random fields.

Authors:  Yong-Xian Fan; Yang Zhang; Hong-Bin Shen
Journal:  Proteins       Date:  2012-12-24

7.  SVM-based prediction of caspase substrate cleavage sites.

Authors:  Lawrence J K Wee; Tin Wee Tan; Shoba Ranganathan
Journal:  BMC Bioinformatics       Date:  2006-12-18       Impact factor: 3.169

8.  LBSizeCleav: improved support vector machine (SVM)-based prediction of Dicer cleavage sites using loop/bulge length.

Authors:  Yu Bao; Morihiro Hayashida; Tatsuya Akutsu
Journal:  BMC Bioinformatics       Date:  2016-11-25       Impact factor: 3.169

9.  Prediction of HIV-1 protease cleavage site using a combination of sequence, structural, and physicochemical features.

Authors:  Onkar Singh; Emily Chia-Yu Su
Journal:  BMC Bioinformatics       Date:  2016-12-23       Impact factor: 3.169

10.  PHDcleav: a SVM based method for predicting human Dicer cleavage sites using sequence and secondary structure of miRNA precursors.

Authors:  Firoz Ahmed; Rakesh Kaundal; Gajendra P S Raghava
Journal:  BMC Bioinformatics       Date:  2013-10-09       Impact factor: 3.169
