High-accuracy splice site prediction based on sequence component and position features.

J L Li, L F Wang, H Y Wang, L Y Bai, Z M Yuan.

Abstract

Identification of splice sites plays a key role in the annotation of genes. Consequently, improvement of computational prediction of splice sites would be very useful. We examined the effect of the window size and the number and position of the consensus bases with a chi-square test, and then extracted the sequence multi-scale component features and the position and adjacent position relationship features of consensus sites. Then, we constructed a novel classification model using a support vector machine with the previously selected features and applied it to the Homo sapiens splice site dataset. This method greatly improved cross-validation accuracies for training sets with true and spurious splice sites of both equal and different proportions. This method was also applied to the NN269 dataset for further evaluation and independent testing. The results were superior to those obtained with previous methods, and demonstrate the stability and superiority of this method for prediction of splice sites.
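The abstract gives the pipeline only in outline, so the following is a minimal sketch of how its first two stages might look, not the authors' implementation. It scores each window position with a chi-square statistic on a 2 x 4 (class x base) contingency table, then builds a feature vector from k-mer frequencies at several scales plus one-hot codes at the highest-scoring positions. The function names, the choice of k-mer frequencies as the "multi-scale component" features, the toy sequences, and the top-2 position cut-off are all assumptions made for illustration. The adjacent-position relationship features and the support vector machine step are omitted.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def position_chi_square(true_seqs, false_seqs):
    """Return one chi-square statistic per window position.

    Each statistic comes from a 2 x 4 table: class (true/spurious) x base.
    """
    length = len(true_seqs[0])
    scores = []
    for pos in range(length):
        rows = [Counter(s[pos] for s in true_seqs),
                Counter(s[pos] for s in false_seqs)]
        row_totals = [sum(r[b] for b in BASES) for r in rows]
        col_totals = [rows[0][b] + rows[1][b] for b in BASES]
        total = sum(row_totals)
        stat = 0.0
        for r, row_total in zip(rows, row_totals):
            for b, col_total in zip(BASES, col_totals):
                expected = row_total * col_total / total
                if expected > 0:  # skip bases never seen at this position
                    stat += (r[b] - expected) ** 2 / expected
        scores.append(stat)
    return scores

def kmer_frequencies(seq, k):
    """Relative frequency of every k-mer over ACGT, in lexicographic order."""
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    n = max(len(seq) - k + 1, 1)
    return [counts[m] / n for m in kmers]

def encode(seq, informative_positions, scales=(1, 2)):
    """Concatenate multi-scale k-mer frequencies with one-hot position codes."""
    features = []
    for k in scales:
        features.extend(kmer_frequencies(seq, k))
    for pos in informative_positions:
        features.extend(1.0 if seq[pos] == b else 0.0 for b in BASES)
    return features

# Toy windows (hypothetical data): true donor-like sites carry GT at positions 3-4.
true_seqs = ["AAGGTAAG", "CAGGTGAG", "AAGGTAAG", "TAGGTAAG"]
false_seqs = ["ACGTTACG", "TTGCAAGC", "GACTCGTA", "CGATGCAT"]

scores = position_chi_square(true_seqs, false_seqs)
# Keep the two positions with the largest statistics (cut-off is an assumption).
top_positions = sorted(range(len(scores)), key=lambda i: -scores[i])[:2]
vector = encode(true_seqs[0], top_positions)
```

The resulting vectors could then be passed to any SVM implementation, for example `sklearn.svm.SVC`. Kernel and parameter choices are not stated in the abstract and would have to be taken from the full paper.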

Year:  2012        PMID: 23079837     DOI: 10.4238/2012.September.25.12

Source DB:  PubMed          Journal:  Genet Mol Res        ISSN: 1676-5680


  7 in total

1.  Systematic Computational Identification of Variants That Activate Exonic and Intronic Cryptic Splice Sites.

Authors:  Melissa Lee; Patrick Roos; Neeraj Sharma; Melis Atalar; Taylor A Evans; Matthew J Pellicore; Emily Davis; Anh-Thu N Lam; Susan E Stanley; Sara E Khalil; George M Solomon; Doug Walker; Karen S Raraigh; Briana Vecchio-Pagan; Mary Armanios; Garry R Cutting
Journal:  Am J Hum Genet       Date:  2017-05-04       Impact factor: 11.025

2.  Improved recognition of splice sites in A. thaliana by incorporating secondary structure information into sequence-derived features: a computational study.

Authors:  Prabina Kumar Meher; Subhrajit Satpathy
Journal:  3 Biotech       Date:  2021-10-31       Impact factor: 2.406

3.  A Study of Domain Adaptation Classifiers Derived From Logistic Regression for the Task of Splice Site Prediction.

Authors:  Nic Herndon; Doina Caragea
Journal:  IEEE Trans Nanobioscience       Date:  2016-01-28       Impact factor: 2.935

4.  An empirical study of ensemble-based semi-supervised learning approaches for imbalanced splice site datasets.

Authors:  Ana Stanescu; Doina Caragea
Journal:  BMC Syst Biol       Date:  2015-09-01

5.  HRGPred: Prediction of herbicide resistant genes with k-mer nucleotide compositional features and support vector machine.

Authors:  Prabina Kumar Meher; Tanmaya Kumar Sahu; K Raghunandan; Shachi Gahoi; Nalini Kanta Choudhury; Atmakuri Ramakrishna Rao
Journal:  Sci Rep       Date:  2019-01-28       Impact factor: 4.379

6.  Identification of donor splice sites using support vector machine: a computational approach based on positional, compositional and dependency features.

Authors:  Prabina Kumar Meher; Tanmaya Kumar Sahu; A R Rao; S D Wahi
Journal:  Algorithms Mol Biol       Date:  2016-06-01       Impact factor: 1.405

7.  Prediction of donor splice sites using random forest with a new sequence encoding approach.

Authors:  Prabina Kumar Meher; Tanmaya Kumar Sahu; Atmakuri Ramakrishna Rao
Journal:  BioData Min       Date:  2016-01-22       Impact factor: 2.522

