Improved and Promising Identification of Human MicroRNAs by Incorporating a High-Quality Negative Set.

Leyi Wei, Minghong Liao, Yue Gao, Rongrong Ji, Zengyou He, Quan Zou.   

Abstract

MicroRNAs (miRNAs) play an important role as regulators of biological processes. Identifying (pre-)miRNAs helps in understanding these regulatory processes. Machine learning methods have been designed for pre-miRNA identification. However, most of them cannot provide reliable predictive performance on independent testing data sets. We hypothesized that this is because the training sets, especially the negative training sets, are not sufficiently representative. To generate a representative negative set, we proposed a novel negative sample selection technique and used it to collect negative samples of improved quality. Two recent classifiers rebuilt with the proposed negative set improved their predictive performance by ~6 percent, which supports this hypothesis. Based on the proposed negative set, we constructed a training set and developed an online system, miRNApre, specifically for human pre-miRNA identification. On updated human and non-human data sets, miRNApre achieved accuracies that were 34.3 and 7.6 percent higher, respectively, than those achieved by current methods. The results suggest that miRNApre is an effective tool for pre-miRNA identification. Additionally, by integrating miRNApre, we developed a miRNA mining tool, mirnaDetect, which can be applied to find potential miRNAs in genome-scale data. On human chromosome 19 data, mirnaDetect achieved mining performance comparable to that of existing methods.

Year:  2014        PMID: 26355518     DOI: 10.1109/TCBB.2013.146

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710

