Vector quantization kernels for the classification of protein sequences and structures.

Wyatt T Clark, Predrag Radivojac.

Abstract

We propose a new kernel-based method for the classification of protein sequences and structures. We first represent each protein as a set of time series using several structural, physicochemical, and predicted properties, such as a sequence of consecutive dihedral angles, hydrophobicity indices, or predictions of disordered regions. A kernel function that exploits the principles of vector quantization is then computed for pairs of proteins and used with support vector machines for protein classification. Although our method requires a significant pre-processing step, it is fast in the training and prediction stages because kernel computation scales linearly with the length of the protein sequences. We evaluate our approach on two protein classification tasks: the prediction of SCOP structural classes and the prediction of catalytic activity according to the Gene Ontology. We provide evidence that the method is competitive with string kernels and useful for a range of protein classification tasks. Furthermore, the applicability of our approach extends beyond computational biology to any classification of time series data.
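To make the idea of a vector quantization kernel concrete, the following is a minimal sketch of one way such a kernel could be built. It is not the authors' published algorithm: the abstract does not give the kernel's exact form, so the sliding-window embedding, the k-means codebook, the histogram feature map, and every function name below are assumptions made purely for illustration. Each property series is cut into overlapping windows, each window is mapped to its nearest codeword, and two proteins are compared by the dot product of their normalized codeword histograms. With a fixed codebook, each histogram takes a single pass over the series, which is consistent with the linear-time kernel computation the abstract describes.

```python
import math
import random

def windows(series, w):
    """Return all overlapping length-w windows of a 1-D series (assumed embedding)."""
    return [tuple(series[i:i + w]) for i in range(len(series) - w + 1)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, v):
    """Index of the codeword closest to v: the vector quantization step."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], v))

def learn_codebook(vectors, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm), standing in for the pre-processing step."""
    rng = random.Random(seed)
    codebook = rng.sample(vectors, k)  # requires k <= len(vectors)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(codebook, v)].append(v)
        # Move each codeword to its group's mean; keep it unchanged if the group is empty.
        codebook = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else codebook[j]
            for j, g in enumerate(groups)
        ]
    return codebook

def vq_histogram(series, codebook, w):
    """L2-normalized codeword counts; one pass over the series for a fixed codebook."""
    counts = [0.0] * len(codebook)
    for v in windows(series, w):
        counts[nearest(codebook, v)] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def vq_kernel(s1, s2, codebook, w):
    """Dot product of the two histograms (a linear kernel on the assumed feature map)."""
    h1 = vq_histogram(s1, codebook, w)
    h2 = vq_histogram(s2, codebook, w)
    return sum(a * b for a, b in zip(h1, h2))

# Toy usage with made-up property series (hypothetical values, not real protein data).
series_a = [0.1, 0.9, 0.2, 0.8, 0.1, 0.9, 0.3, 0.7]
series_b = [0.5, 0.5, 0.6, 0.4, 0.5, 0.5, 0.6, 0.4]
codebook = learn_codebook(windows(series_a, 3) + windows(series_b, 3), k=2)
similarity = vq_kernel(series_a, series_b, codebook, 3)
```

Because the kernel is an inner product of explicit feature vectors, the resulting Gram matrix is positive semidefinite and could be passed to any support vector machine implementation that accepts precomputed kernels. A protein described by several property series could, under the same assumptions, be handled by summing one such kernel per property.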

Year:  2014        PMID: 24297558

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


  2 in total

1.  Kernel-based logistic regression model for protein sequence without vectorialization.

Authors:  Youyi Fong; Saheli Datta; Ivelin S Georgiev; Peter D Kwong; Georgia D Tomaras
Journal:  Biostatistics       Date:  2014-12-22       Impact factor: 5.279

2.  When loss-of-function is loss of function: assessing mutational signatures and impact of loss-of-function genetic variants.

Authors:  Kymberleigh A Pagel; Vikas Pejaver; Guan Ning Lin; Hyun-Jun Nam; Matthew Mort; David N Cooper; Jonathan Sebat; Lilia M Iakoucheva; Sean D Mooney; Predrag Radivojac
Journal:  Bioinformatics       Date:  2017-07-15       Impact factor: 6.937
