
CPPred: coding potential prediction based on the global description of RNA sequence.

Xiaoxue Tong, Shiyong Liu.

Abstract

A rapid and accurate approach to distinguishing coding RNAs from ncRNAs is critical for analyzing the thousands of novel transcripts generated in recent years by next-generation sequencing technology. Previously developed methods (CPAT, CPC2 and PLEK) distinguish coding RNAs from ncRNAs very well, but perform poorly at distinguishing small coding RNAs from small ncRNAs. Herein, we report an approach, CPPred (coding potential prediction), which is based on an SVM classifier and multiple sequence features, including novel RNA features encoded by the global description. Owing to these novel RNA features, CPPred distinguishes coding RNAs from ncRNAs, and small coding RNAs from small ncRNAs, better than the state-of-the-art methods. A recent study proposed 1335 novel human coding RNAs from a large number of RNA-seq datasets. However, CPPred predicts only 119 of these transcripts as coding RNAs. That is, almost all of the proposed novel coding RNAs (91.1%) are predicted to be ncRNAs, which is consistent with previous reports. Remarkably, we also reveal that the features encoded by the global description (T2, C0 and GC) play an important role in the prediction of coding potential.
© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
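The abstract names GC content and composition- and transition-style features without defining them here. The sketch below is only a generic illustration of that family of sequence descriptors, not CPPred's actual feature definitions: the function names (`gc_content`, `composition`, `transitions`, `ncrna_fraction`) are hypothetical, and the exact meaning of T2 and C0 in the paper is not assumed. The last function simply reproduces the 91.1% figure from the counts quoted in the abstract (1335 proposed, 119 predicted coding).

```python
from itertools import combinations

ALPHABET = "ACGT"  # assumption: RNA input is normalised to the DNA alphabet


def _normalise(seq):
    """Upper-case the sequence and map U to T (illustrative convention)."""
    return seq.upper().replace("U", "T")


def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = _normalise(seq)
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def composition(seq):
    """Frequency of each nucleotide in the sequence."""
    seq = _normalise(seq)
    n = len(seq)
    return {b: (seq.count(b) / n if n else 0.0) for b in ALPHABET}


def transitions(seq):
    """Frequency of adjacent positions that switch between two given bases.

    For each unordered pair (a, b), count neighbouring positions holding
    a then b or b then a, divided by the number of adjacent pairs.
    """
    seq = _normalise(seq)
    pairs = list(zip(seq, seq[1:]))
    total = len(pairs)
    result = {}
    for a, b in combinations(ALPHABET, 2):
        hits = sum(1 for x, y in pairs if {x, y} == {a, b})
        result[a + b] = hits / total if total else 0.0
    return result


def ncrna_fraction(total, predicted_coding):
    """Share of transcripts NOT predicted as coding."""
    return (total - predicted_coding) / total


if __name__ == "__main__":
    example = "AUGGCCAUUGUAAUGGGCCGC"  # made-up sequence for illustration
    print(gc_content(example))
    print(composition(example))
    print(transitions(example))
    print(round(ncrna_fraction(1335, 119) * 100, 1))
```

In a classifier such as the SVM the abstract describes, values like these would be concatenated into a fixed-length feature vector per transcript; how CPPred selects and scales its features is not stated in this record.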

Year:  2019        PMID: 30753596      PMCID: PMC6486542          DOI: 10.1093/nar/gkz087

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


