
Deep forest ensemble learning for classification of alignments of non-coding RNA sequences based on multi-view structure representations.

Ying Li, Qi Zhang, Zhaoqian Liu, Cankun Wang, Siyu Han, Qin Ma, Wei Du.

Abstract

Non-coding RNAs (ncRNAs) play crucial roles in multiple biological processes. However, the functions of only a few ncRNAs have been well studied. Because classifying ncRNAs is important for understanding their functions, a growing number of computational methods have been introduced to make classification more automatic and accurate. In this paper, we propose a novel deep fusion learning framework, the GcForest fusion method (GCFM), which builds on a convolutional neural network and a deep forest algorithm, the multi-grained cascade forest (GcForest), to classify alignments of ncRNA sequences for accurate clustering of ncRNAs. GCFM integrates a multi-view structure feature representation comprising sequence-structure alignment encoding, structure image representation and shape alignment encoding of structural subunits, enabling it to capture the potential specificity between ncRNAs. For the classification of pairwise alignments of two ncRNA sequences, the F-value of GCFM is 6% higher than that of an existing alignment-based method. Furthermore, ncRNA families are clustered based on the classification matrix generated by GCFM. The results suggest better performance (a 20% improvement in accuracy) than existing ncRNA clustering methods (RNAclust, Ensembleclust and CNNclust). Additionally, we apply GCFM to construct a phylogenetic tree of ncRNAs and to predict the probability of interactions between RNAs. Most ncRNAs are placed correctly in the phylogenetic tree, and the prediction accuracy for RNA interactions is 90.63%. A web server (http://bmbl.sdstate.edu/gcfm/) has been developed to maximize availability, and the source code and related data are available at the same URL.
© The Author(s) 2020. Published by Oxford University Press.
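
The abstract mentions two computations that a small example may help make concrete: scoring pairwise classifications with an F-value, and clustering sequences from a pairwise classification matrix. The following is a minimal sketch only, not the GCFM implementation. It assumes that the F-value is the usual harmonic mean of precision and recall, that the classification matrix holds a same-family probability for each pair of sequences, and that clusters are formed as connected components above a threshold. The function names, the 0.5 threshold and the toy matrix are all hypothetical; the abstract does not specify the clustering procedure actually used.

```python
from itertools import combinations


def f_value(true_pairs, predicted_pairs):
    """Harmonic mean of precision and recall over unordered pairs.

    Assumption: the abstract's "F-value" is taken to be the standard F1 score.
    """
    true_pairs = {frozenset(p) for p in true_pairs}
    predicted_pairs = {frozenset(p) for p in predicted_pairs}
    tp = len(true_pairs & predicted_pairs)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted_pairs)
    recall = tp / len(true_pairs)
    return 2 * precision * recall / (precision + recall)


def cluster_from_matrix(prob, threshold=0.5):
    """Group items whose pairwise same-family probability exceeds threshold.

    Uses connected components (single linkage) via union-find. This is an
    illustrative choice, not the clustering procedure from the paper.
    """
    n = len(prob)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if prob[i][j] > threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical 4x4 matrix: sequences 0,1 and sequences 2,3 form two families.
prob = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
clusters = cluster_from_matrix(prob)
score = f_value([(0, 1), (2, 3)], [(0, 1), (1, 2)])
```

With this toy matrix the sketch groups sequences 0 and 1 together and sequences 2 and 3 together. Connected components is the simplest way to turn pairwise decisions into groups; a hierarchical method applied to the same matrix would be a natural alternative.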

Keywords:  GcForest; deep fusion framework; multi-view structure feature representation; ncRNAs clustering; pairwise ncRNAs classification

Year:  2021        PMID: 33367506      PMCID: PMC8294561          DOI: 10.1093/bib/bbaa354

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


