
Genetic programming for creating Chou's pseudo amino acid based features for submitochondria localization.

Loris Nanni, Alessandra Lumini.

Abstract

For a protein localized in the mitochondria, knowing its submitochondrial location is important for understanding its function. In this work, we propose a submitochondria localizer whose feature extraction method is based on Chou's pseudo-amino acid composition. The pseudo-amino acid based features are obtained by combining pseudo-amino acid compositions with hundreds of amino-acid indices and amino-acid substitution matrices; from this large set of features, a small set of 15 "artificial" features is then created. The feature creation is performed by genetic programming, which combines one or more "original" features by means of mathematical operators. Finally, the set of combined features is used to train a radial basis function support vector machine. This method is named GP-Loc. We also propose a method with very few parameters, named ALL-Loc, in which all the "original" features are used to train a linear support vector machine. Under jackknife cross-validation, GP-Loc obtains an overall prediction accuracy of 89%, which exceeds the result reported in the literature on the same dataset (85.2%), while ALL-Loc obtains an overall prediction accuracy of 83.9%.
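As a rough illustration of the kind of "original" feature the abstract refers to, the sketch below computes a pseudo-amino acid composition vector in Chou's standard form (20 normalized composition terms plus lambda sequence-order correlation terms) from one amino-acid index. It is not the authors' implementation: the function name `pse_aac`, the parameters `lam` and `weight` with their default values, and the choice of the Kyte-Doolittle hydropathy scale as the single index are all assumptions made for this example. The abstract describes combining hundreds of indices and substitution matrices, which is not reproduced here.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale, used here only as an example index
# (an assumption; the abstract does not name the indices it uses).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(index):
    """Rescale an index to zero mean and unit variance over the 20 residues."""
    values = [index[a] for a in AMINO_ACIDS]
    mu, sigma = mean(values), pstdev(values)
    return {a: (index[a] - mu) / sigma for a in AMINO_ACIDS}


def pse_aac(sequence, indices, lam=5, weight=0.05):
    """Return a (20 + lam)-dimensional pseudo-amino acid composition vector.

    `lam` (number of correlation tiers) and `weight` are illustrative
    defaults, not values taken from the paper.
    """
    # Non-standard residue symbols are silently dropped in this sketch.
    seq = [r for r in sequence.upper() if r in AMINO_ACIDS]
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")

    scaled = [standardize(ix) for ix in indices]

    # Conventional amino acid composition: normalized occurrence frequencies.
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]

    # k-th tier correlation factor: mean squared difference of the
    # standardized index values between residues k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        total = 0.0
        for i in range(n - k):
            total += mean([(s[seq[i]] - s[seq[i + k]]) ** 2 for s in scaled])
        thetas.append(total / (n - k))

    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

Calling `pse_aac(seq, [KD], lam=3)` on a protein sequence returns a 23-element vector whose entries sum to 1. Repeating the call with different indices would yield the larger pool of features from which a method such as GP-Loc could build its combined features.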


Year:  2008        PMID: 18175047     DOI: 10.1007/s00726-007-0018-1

Source DB:  PubMed          Journal:  Amino Acids        ISSN: 0939-4451            Impact factor:   3.520



