A novel approach to extracting features from motif content and protein composition for protein sequence classification.

Xing-Ming Zhao, Yiu-Ming Cheung, De-Shuang Huang.

Abstract

This paper presents a novel approach to extracting features from motif content and protein composition for protein sequence classification. First, we represent each protein sequence as a fixed-dimensional vector built from its motif content and protein composition. We then project these vectors into a low-dimensional space using Principal Component Analysis (PCA), so that each vector is expressed as a combination of the eigenvectors of their covariance matrix. Next, a Genetic Algorithm (GA) simultaneously selects a subset of biological and functional sequence features from the eigen-space and optimizes the regularization parameter of the Support Vector Machine (SVM). Finally, SVM classifiers assign protein sequences to their corresponding families based on the selected feature subsets. In experiments, our approach shows promising results in comparison with the existing PSI-BLAST and SVM-pairwise methods.
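The first and third steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how motifs are matched or how the GA encodes its candidates, so the regex motif patterns, the binary present/absent motif encoding, the function names, and the bit-string chromosome layout (a feature mask followed by bits for the SVM regularization parameter C on a log2 grid) are all assumptions made for illustration. The PCA projection and SVM training are omitted here.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def motif_content(sequence, motifs):
    """One value per motif: 1.0 if the pattern occurs, else 0.0.

    Assumption: motifs are given as regular expressions and encoded
    as present/absent; the paper may use a different scheme.
    """
    seq = sequence.upper()
    return [1.0 if re.search(pattern, seq) else 0.0 for pattern in motifs]


def featurize(sequence, motifs):
    """Fixed-dimensional vector: motif content followed by composition."""
    return motif_content(sequence, motifs) + composition(sequence)


def decode_chromosome(bits, n_features, exp_min=-5, exp_max=15):
    """Split a GA bit string into a feature mask and an SVM C value.

    Assumption: the first n_features bits select features, and the
    remaining bits encode the exponent of C between 2**exp_min and
    2**exp_max. This layout is hypothetical, not taken from the paper.
    """
    mask = list(bits[:n_features])
    c_bits = bits[n_features:]
    if not c_bits:
        raise ValueError("chromosome has no bits left to encode C")
    value = int("".join(str(b) for b in c_bits), 2)
    max_value = 2 ** len(c_bits) - 1
    exponent = exp_min + (exp_max - exp_min) * value / max_value
    return mask, 2.0 ** exponent


# Hypothetical motif patterns, used only to exercise the functions.
EXAMPLE_MOTIFS = ["N[^P][ST]", "C..C"]
vector = featurize("MKNGSCAAC", EXAMPLE_MOTIFS)
mask, c_value = decode_chromosome([1, 0, 1, 1, 1], n_features=3)
```

Encoding the feature mask and C in one chromosome is one way to let a single fitness evaluation (for example, cross-validated SVM accuracy) score both choices at once, which matches the abstract's description of selecting features and tuning the regularization parameter simultaneously. In the described pipeline the mask would apply to the PCA components rather than to the raw vector.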

Year:  2005        PMID: 16153801     DOI: 10.1016/j.neunet.2005.07.002

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


  11 in total

1.  FunSAV: predicting the functional effect of single amino acid variants using a two-stage random forest model.

Authors:  Mingjun Wang; Xing-Ming Zhao; Kazuhiro Takemoto; Haisong Xu; Yuan Li; Tatsuya Akutsu; Jiangning Song
Journal:  PLoS One       Date:  2012-08-24       Impact factor: 3.240

2.  Prediction of S-glutathionylation sites based on protein sequences.

Authors:  Chenglei Sun; Zheng-Zheng Shi; Xiaobo Zhou; Luonan Chen; Xing-Ming Zhao
Journal:  PLoS One       Date:  2013-02-13       Impact factor: 3.240

3.  Predicting protein-protein interactions from primary protein sequences using a novel multi-scale local feature representation scheme and the random forest.

Authors:  Zhu-Hong You; Keith C C Chan; Pengwei Hu
Journal:  PLoS One       Date:  2015-05-06       Impact factor: 3.240

4.  Prediction of protein-protein interactions from amino acid sequences using a novel multi-scale continuous and discontinuous feature set.

Authors:  Zhu-Hong You; Lin Zhu; Chun-Hou Zheng; Hong-Jie Yu; Su-Ping Deng; Zhen Ji
Journal:  BMC Bioinformatics       Date:  2014-12-03       Impact factor: 3.169

5.  iDPF-PseRAAAC: A Web-Server for Identifying the Defensin Peptide Family and Subfamily Using Pseudo Reduced Amino Acid Alphabet Composition.

Authors:  Yongchun Zuo; Yang Lv; Zhuying Wei; Lei Yang; Guangpeng Li; Guoliang Fan
Journal:  PLoS One       Date:  2015-12-29       Impact factor: 3.240

6.  Identification of Biomarkers for Predicting Lymph Node Metastasis of Stomach Cancer Using Clinical DNA Methylation Data.

Authors:  Jun Wu; Yawen Xiao; Chao Xia; Fan Yang; Hua Li; Zhifeng Shao; Zongli Lin; Xiaodong Zhao
Journal:  Dis Markers       Date:  2017-08-29       Impact factor: 3.434

7.  Accurate classification of membrane protein types based on sequence and evolutionary information using deep learning.

Authors:  Lei Guo; Shunfang Wang; Mingyuan Li; Zicheng Cao
Journal:  BMC Bioinformatics       Date:  2019-12-24       Impact factor: 3.169

8.  A robust hybrid approach based on estimation of distribution algorithm and support vector machine for hunting candidate disease genes.

Authors:  Li Li; Hongmei Chen; Chang Liu; Fang Wang; Fangfang Zhang; Lihua Bai; Yihan Chen; Luying Peng
Journal:  ScientificWorldJournal       Date:  2013-02-07

9.  A survey on evolutionary algorithm based hybrid intelligence in bioinformatics. (Review)

Authors:  Shan Li; Liying Kang; Xing-Ming Zhao
Journal:  Biomed Res Int       Date:  2014-03-06       Impact factor: 3.411

10.  Gene selection for cancer identification: a decision tree model empowered by particle swarm optimization algorithm.

Authors:  Kun-Huang Chen; Kung-Jeng Wang; Min-Lung Tsai; Kung-Min Wang; Angelia Melani Adrian; Wei-Chung Cheng; Tzu-Sen Yang; Nai-Chia Teng; Kuo-Pin Tan; Ku-Shang Chang
Journal:  BMC Bioinformatics       Date:  2014-02-20       Impact factor: 3.169
