
Semi-supervised protein classification using cluster kernels.

Jason Weston, Christina Leslie, Eugene Ie, Dengyong Zhou, Andre Elisseeff, William Stafford Noble.

Abstract

MOTIVATION: Building an accurate protein classification system depends critically upon choosing a good representation of the input sequences of amino acids. Recent work using string kernels for protein data has achieved state-of-the-art classification performance. However, such representations are based only on labeled data (examples with known 3D structures, organized into structural classes), whereas in practice unlabeled data are far more plentiful.
RESULTS: In this work, we develop simple and scalable cluster kernel techniques for incorporating unlabeled data into the representation of protein sequences. We show that our methods greatly improve the classification performance of string kernels and outperform standard approaches for using unlabeled data, such as adding close homologs of the positive examples to the training data. Our methods match or exceed the performance of previously presented cluster kernel methods while being far more computationally efficient.
AVAILABILITY: Source code is available at www.kyb.tuebingen.mpg.de/bs/people/weston/semiprot. The Spider MATLAB package is available at www.kyb.tuebingen.mpg.de/bs/people/spider.
SUPPLEMENTARY INFORMATION: www.kyb.tuebingen.mpg.de/bs/people/weston/semiprot.
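
As an illustration of the general idea in the RESULTS paragraph, the following minimal Python sketch shows one way a base string kernel could be smoothed over neighborhoods drawn from a pool of unlabeled sequences. It is a sketch under stated assumptions, not the authors' implementation: the function names, the k-mer (spectrum) base kernel, and the similarity-threshold definition of a neighborhood are all hypothetical choices made here for illustration.

```python
from collections import Counter
from math import sqrt


def spectrum_features(seq, k=3):
    """Count every length-k substring (k-mer) of a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def dot(a, b):
    """Inner product of two sparse count vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(key, 0) for key, v in a.items())


def normalized_kernel(a, b):
    """Base string kernel on k-mer counts, scaled so that K(x, x) = 1."""
    denom = sqrt(dot(a, a) * dot(b, b))
    return dot(a, b) / denom if denom else 0.0


def neighborhood(seq, pool, k=3, threshold=0.5):
    """Hypothetical neighborhood: seq itself plus every pool sequence whose
    normalized similarity to seq is at least `threshold`.
    (Assumption: a real system would more likely define neighbors with a
    sequence-search tool than with the base kernel itself.)"""
    fx = spectrum_features(seq, k)
    nbrs = [seq]
    for other in pool:
        if other != seq and normalized_kernel(fx, spectrum_features(other, k)) >= threshold:
            nbrs.append(other)
    return nbrs


def neighborhood_kernel(x, y, pool, k=3, threshold=0.5):
    """Average the base kernel over all pairs drawn from the two neighborhoods."""
    nx = neighborhood(x, pool, k, threshold)
    ny = neighborhood(y, pool, k, threshold)
    total = sum(
        normalized_kernel(spectrum_features(a, k), spectrum_features(b, k))
        for a in nx
        for b in ny
    )
    return total / (len(nx) * len(ny))
```

With an empty pool, `neighborhood_kernel(x, y, [])` reduces to the base kernel between `x` and `y`; unlabeled sequences only change the value when they fall inside one of the two neighborhoods. The resulting values could be assembled into a Gram matrix for any classifier that accepts a precomputed kernel.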

Year: 2005    PMID: 15905279    DOI: 10.1093/bioinformatics/bti497

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  25 in total

1.  Maximum margin classifier working in a set of strings.

Authors:  Hitoshi Koyano; Morihiro Hayashida; Tatsuya Akutsu
Journal:  Proc Math Phys Eng Sci       Date:  2016-03       Impact factor: 2.704

2.  Identifying Cancer Biomarkers From Microarray Data Using Feature Selection and Semisupervised Learning.

Authors:  Debasis Chakraborty; Ujjwal Maulik
Journal:  IEEE J Transl Eng Health Med       Date:  2014-12-02       Impact factor: 3.316

3.  Semi-supervised learning improves gene expression-based prediction of cancer recurrence.

Authors:  Mingguang Shi; Bing Zhang
Journal:  Bioinformatics       Date:  2011-09-04       Impact factor: 6.937

4.  Exploiting physico-chemical properties in string kernels.

Authors:  Nora C Toussaint; Christian Widmer; Oliver Kohlbacher; Gunnar Rätsch
Journal:  BMC Bioinformatics       Date:  2010-10-26       Impact factor: 3.169

5.  Accelerating the Original Profile Kernel.

Authors:  Tobias Hamp; Tatyana Goldberg; Burkhard Rost
Journal:  PLoS One       Date:  2013-06-18       Impact factor: 3.240

6.  Objective sequence-based subfamily classifications of mouse homeodomains reflect their in vitro DNA-binding preferences.

Authors:  Miguel A Santos; Andrei L Turinsky; Serene Ong; Jennifer Tsai; Michael F Berger; Gwenael Badis; Shaheynoor Talukder; Andrew R Gehrke; Martha L Bulyk; Timothy R Hughes; Shoshana J Wodak
Journal:  Nucleic Acids Res       Date:  2010-08-12       Impact factor: 16.971

7.  Semi-supervised prediction of protein subcellular localization using abstraction augmented Markov models.

Authors:  Cornelia Caragea; Doina Caragea; Adrian Silvescu; Vasant Honavar
Journal:  BMC Bioinformatics       Date:  2010-10-26       Impact factor: 3.169

8.  Diversity and dispersal of a ubiquitous protein family: acyl-CoA dehydrogenases.

Authors:  Yao-Qing Shen; B Franz Lang; Gertraud Burger
Journal:  Nucleic Acids Res       Date:  2009-07-22       Impact factor: 16.971

9.  Searching remote homology with spectral clustering with symmetry in neighborhood cluster kernels.

Authors:  Ujjwal Maulik; Anasua Sarkar
Journal:  PLoS One       Date:  2013-02-15       Impact factor: 3.240

10.  Protein localization prediction using random walks on graphs.

Authors:  Xiaohua Xu; Lin Lu; Ping He; Ling Chen
Journal:  BMC Bioinformatics       Date:  2013-05-09       Impact factor: 3.169
