Predicting Coding Potential of RNA Sequences by Solving Local Data Imbalance.

Xian-Gan Chen, Shuai Liu, Wen Zhang.   

Abstract

Non-coding RNAs (ncRNAs) play an important role in various biological processes and are associated with diseases. Distinguishing between coding RNAs and ncRNAs, also known as predicting the coding potential of RNA sequences, is critical for downstream biological function analysis. Many machine learning-based methods have been proposed for predicting the coding potential of RNA sequences. Recent studies reveal that most existing methods perform poorly on RNA sequences with short Open Reading Frames (sORF, ORF length < 303 nt). In this work, we analyze the distribution of ORF lengths of RNA sequences and observe that coding RNAs with sORF are scarce and far fewer than ncRNAs with sORF. Thus, RNA sequences with sORF suffer from a local data imbalance problem. We propose a coding potential prediction method, CPE-SLDI, which uses data oversampling techniques to augment the samples of coding RNAs with sORF so as to alleviate the local data imbalance. Compared with existing methods, CPE-SLDI achieves better performance, and studies reveal that data augmentation by various data oversampling techniques can enhance the performance of coding potential prediction, especially for RNA sequences with sORF. The implementation of the proposed method is available at https://github.com/chenxgscuec/CPESLDI.
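To make the idea of local data imbalance concrete, the following is a minimal sketch in Python and not the CPE-SLDI implementation. It assumes that ORF length is measured from ATG to the first in-frame stop codon (stop included) over the three forward frames, that sequences with no ORF count as sORF, and that plain random duplication stands in for the oversampling techniques the abstract mentions. The function names `longest_orf_length` and `oversample_sorf_coding` are hypothetical.

```python
import random

SORF_MAX_NT = 303  # threshold from the abstract: sORF means ORF length < 303 nt
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq):
    """Length in nt of the longest ATG-to-stop ORF over the three forward frames.

    Assumption: the stop codon is counted in the length; returns 0 if no ORF.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def oversample_sorf_coding(samples, seed=0):
    """Balance coding vs. non-coding samples within the sORF subset only.

    samples: list of (sequence, label) pairs, label 1 = coding, 0 = non-coding.
    Randomly duplicates coding sORF samples until they match the number of
    non-coding sORF samples; all original samples are kept unchanged.
    """
    rng = random.Random(seed)

    def is_sorf(seq):
        return longest_orf_length(seq) < SORF_MAX_NT

    coding_sorf = [(s, y) for s, y in samples if y == 1 and is_sorf(s)]
    noncoding_sorf = [(s, y) for s, y in samples if y == 0 and is_sorf(s)]
    deficit = len(noncoding_sorf) - len(coding_sorf)
    if not coding_sorf or deficit <= 0:
        return list(samples)
    extra = [rng.choice(coding_sorf) for _ in range(deficit)]
    return list(samples) + extra
```

Calling `oversample_sorf_coding` on a labeled training set leaves long-ORF samples untouched and only pads the coding sORF group, which is the "local" part of the imbalance described above. A real pipeline would likely oversample in feature space rather than duplicate raw sequences.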

Year:  2022        PMID: 32886613     DOI: 10.1109/TCBB.2020.3021800

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  2 in total

1.  AntiDMPpred: a web service for identifying anti-diabetic peptides.

Authors:  Xue Chen; Jian Huang; Bifang He
Journal:  PeerJ       Date:  2022-06-14       Impact factor: 3.061

2.  ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation.

Authors:  Xian-Gan Chen; Wen Zhang; Xiaofei Yang; Chenhong Li; Hengling Chen
Journal:  Front Genet       Date:  2021-06-30       Impact factor: 4.599
