
A Study of Domain Adaptation Classifiers Derived From Logistic Regression for the Task of Splice Site Prediction.

Nic Herndon, Doina Caragea.   

Abstract

Supervised classifiers are highly dependent on abundant labeled training data. Alternatives for addressing the lack of labeled data include: labeling data (but this is costly and time consuming); training classifiers with abundant data from another domain (however, the classification accuracy usually decreases as the distance between domains increases); or complementing the limited labeled data with abundant unlabeled data from the same domain and learning semi-supervised classifiers (but the unlabeled data can mislead the classifier). A better alternative is to train classifiers in a domain adaptation setting, using the abundant labeled data from a source domain together with the limited labeled data, and optionally the unlabeled data, from the target domain. We propose two such classifiers, based on logistic regression, and evaluate them for the task of splice site prediction, a difficult and essential step in gene prediction. Our classifiers achieved high accuracy, with their highest areas under the precision-recall curve ranging from 50.83% to 82.61%.
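The domain adaptation setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it is one simple, commonly used way to combine the two labeled sets, assuming instance weighting in a single logistic regression. The helper names (`fit_weighted_logreg`, `fit_domain_adapted`, `average_precision`), the `target_weight` parameter, and the synthetic data are all hypothetical and chosen for illustration only. Unlabeled target data is not used here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_weighted_logreg(X, y, sample_weight, lr=0.1, n_iter=2000, l2=1e-3):
    """Gradient descent on a weighted, L2-regularized logistic loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    total = sample_weight.sum()
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        err = sample_weight * (p - y)          # weighted residuals
        w -= lr * (X.T @ err / total + l2 * w)
        b -= lr * (err.sum() / total)
    return w, b

def fit_domain_adapted(Xs, ys, Xt, yt, target_weight=0.8):
    """Assumed scheme: pool source and target labeled data so that the
    target domain carries `target_weight` of the total instance weight,
    even though it has far fewer labeled examples."""
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    sw = np.concatenate([
        np.full(len(ys), (1.0 - target_weight) / len(ys)),
        np.full(len(yt), target_weight / len(yt)),
    ])
    return fit_weighted_logreg(X, y, sw)

def average_precision(y_true, scores):
    """Average-precision estimate of the area under the precision-recall curve."""
    order = np.argsort(-scores)
    y_sorted = y_true[order]
    tp = np.cumsum(y_sorted)
    precision = tp / np.arange(1, len(y_sorted) + 1)
    return float((precision * y_sorted).sum() / y_sorted.sum())

# Synthetic stand-in data (not splice site sequences): the target domain
# has a shifted decision boundary and only a few labeled examples.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(500, 2))
ys = (Xs[:, 0] + Xs[:, 1] > 0).astype(float)
Xt = rng.normal(size=(20, 2))
yt = (Xt[:, 0] + 0.2 * Xt[:, 1] > 0.3).astype(float)
Xtest = rng.normal(size=(1000, 2))
ytest = (Xtest[:, 0] + 0.2 * Xtest[:, 1] > 0.3).astype(float)

w, b = fit_domain_adapted(Xs, ys, Xt, yt)
print("auPRC on held-out target data:", average_precision(ytest, sigmoid(Xtest @ w + b)))
```

Setting `target_weight` closer to 1 makes the model rely mostly on the scarce target labels, while values closer to 0 approach a classifier trained on the source domain alone. In practice this trade-off would be tuned on held-out target data.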


Year:  2016        PMID: 26849871      PMCID: PMC4894847          DOI: 10.1109/TNB.2016.2522400

Source DB:  PubMed          Journal:  IEEE Trans Nanobioscience        ISSN: 1536-1241            Impact factor:   2.935


