Prediction of transporter family from protein sequence by support vector machine approach.

H H Lin, L Y Han, C Z Cai, Z L Ji, Y Z Chen.

Abstract

Transporters play key roles in cellular transport and metabolic processes, and in facilitating drug delivery and excretion. These proteins are classified into families based on the transporter classification (TC) system. Determining the TC family of a transporter facilitates the study of its cellular and pharmacological functions. Methods for predicting TC family without sequence alignments or clustering are particularly useful for studying novel transporters whose function cannot be determined by sequence similarity. This work explores the use of a machine learning method, support vector machines (SVMs), for predicting the family of transporters from their sequence without the use of sequence similarity. A total of 10,636 transporters in 13 TC subclasses, 1,914 transporters in eight TC families, and 168,341 nontransporter proteins are used to train and test the SVM prediction system. Tests on a separate set of 4,351 transporters and 83,151 nontransporter proteins show that the overall accuracy for predicting members of these TC subclasses and families is 83.4% and 88.0%, respectively, and that for nonmembers is 99.3% and 96.6%, respectively. The accuracies for predicting members and nonmembers of individual TC subclasses are in the range of 70.7-96.1% and 97.6-99.9%, respectively, and those of individual TC families are in the range of 60.6-97.1% and 91.5-99.4%, respectively. A further test using 26,139 transmembrane proteins outside each of the 13 TC subclasses shows that 90.4-99.6% of these are correctly predicted. Our study suggests that the SVM is potentially useful for facilitating functional study of transporters irrespective of sequence similarity. © 2005 Wiley-Liss, Inc.
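The abstract reports accuracy separately for members and nonmembers of each TC subclass or family. The sketch below shows one plausible reading of those two figures, assuming member accuracy is the fraction of true members predicted as members and nonmember accuracy is the fraction of true nonmembers predicted as nonmembers. The function name and the toy labels are hypothetical and are not taken from the paper.

```python
def member_nonmember_accuracy(y_true, y_pred):
    """Return (member_accuracy, nonmember_accuracy) as fractions.

    y_true, y_pred: equal-length sequences of booleans, where True means
    "member of the TC subclass/family" and False means "nonmember".
    Assumed definitions (not stated in the abstract):
      member accuracy    = TP / (TP + FN)
      nonmember accuracy = TN / (TN + FP)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # members predicted as members
    fn = sum(1 for t, p in pairs if t and not p)      # members missed
    tn = sum(1 for t, p in pairs if not t and not p)  # nonmembers predicted as nonmembers
    fp = sum(1 for t, p in pairs if not t and p)      # nonmembers predicted as members

    member_acc = tp / (tp + fn) if (tp + fn) else float("nan")
    nonmember_acc = tn / (tn + fp) if (tn + fp) else float("nan")
    return member_acc, nonmember_acc


# Toy labels for illustration only: 4 members, 6 nonmembers.
y_true = [True, True, True, True, False, False, False, False, False, False]
y_pred = [True, True, True, False, False, False, False, False, False, True]

member_acc, nonmember_acc = member_nonmember_accuracy(y_true, y_pred)
print(f"member accuracy: {member_acc:.1%}, nonmember accuracy: {nonmember_acc:.1%}")
```

Reporting the two rates separately matters here because nonmembers greatly outnumber members in the test sets, so a single pooled accuracy would be dominated by the nonmember class.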

Year: 2006    PMID: 16287089    DOI: 10.1002/prot.20605

Source DB: PubMed    Journal: Proteins    ISSN: 0887-3585

