Enzymes/non-enzymes classification model complexity based on composition, sequence, 3D and topological indices.

Cristian Robert Munteanu, Humberto González-Díaz, Alexandre L Magalhães.

Abstract

The large number of new proteins that need fast characterization of enzymatic activity creates a demand for theoretical protein QSAR models. The protein parameters that can be used for enzyme/non-enzyme classification include the simpler composition, sequence and connectivity indices (the last also called topological indices, TIs), as well as the computationally expensive 3D descriptors. A comparison of 3D versus lower-dimension indices with respect to their power to discriminate proteins according to enzyme action has not been reported. A set of 966 proteins (enzymes and non-enzymes) whose structural characteristics are provided by PDB/DSSP files was analyzed with Python/Biopython scripts, STATISTICA and Weka. The list of indices includes, but is not restricted to, pure composition indices (residue fractions), DSSP secondary structure protein composition and 3D indices (surface and access). We also used mixed indices such as composition-sequence indices (Chou's pseudo-amino acid compositions or coupling numbers), 3D-composition indices (surface fractions) and DSSP secondary structure amino acid composition/propensities (obtained with our Prot-2S Web tool). In addition, we extend and test for the first time several classic TIs for Randic's protein sequence Star graphs using our Sequence to Star Graph (S2SG) Python application. All the indices were processed with general discriminant analysis (GDA) models, neural networks (NN) and machine learning (ML) methods, and the results are presented versus complexity, average Shannon's information entropy (Sh) and data/method type. This study compares all these classes of indices for the first time to assess the ratio between model accuracy and index/model complexity in enzyme/non-enzyme discrimination. The results obtained with different methods and data complexities show that one cannot establish a direct relation between the complexity and the accuracy of the model.
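As an illustration of the simplest descriptor class named in the abstract, the pure composition indices (residue fractions), a minimal Python sketch follows. It is not the authors' code: the function names `residue_fractions` and `shannon_entropy` and the toy sequence are hypothetical, and the entropy here is taken over a single sequence's residue distribution, which may differ from how the paper defines its averaged Sh.

```python
from collections import Counter
from math import log2

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_fractions(sequence):
    """Return the fraction of each standard residue in a protein sequence.

    Hypothetical helper for illustration; non-standard letters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def shannon_entropy(fractions):
    """Shannon entropy (in bits) of a residue-fraction distribution.

    Assumption: computed per sequence; zero fractions contribute nothing.
    """
    return -sum(p * log2(p) for p in fractions.values() if p > 0)


if __name__ == "__main__":
    toy_sequence = "MKTAYIAKQR"  # made-up sequence, not from the paper's data set
    fractions = residue_fractions(toy_sequence)
    print({aa: round(f, 2) for aa, f in fractions.items() if f > 0})
    print(round(shannon_entropy(fractions), 3))
```

A 20-element vector like this could serve as the input row for a classifier; the sequence-dependent, topological and 3D indices described in the abstract need more information than composition alone.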

Year:  2008        PMID: 18606172     DOI: 10.1016/j.jtbi.2008.06.003

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691

