Searching for discrimination rules in protease proteolytic cleavage activity using genetic programming with a min-max scoring function.

Zheng Rong Yang, Rebecca Thomson, T Charles Hodgman, Jonathan Dry, Austin K Doyle, Ajit Narayanan, XiKun Wu.

Abstract

This paper presents an algorithm that extracts discriminant rules from oligopeptides for predicting protease proteolytic cleavage activity. The algorithm is developed using genetic programming and has three important components: a min-max scoring function, reverse Polish notation (RPN), and the use of minimum description length. The min-max scoring function uses amino acid similarity matrices to measure the similarity between an oligopeptide and a rule, where a rule is a complex algebraic equation of amino acids rather than a simple pattern sequence. The Fisher ratio is then calculated on the scoring values using the class labels associated with the oligopeptides, so the discriminant ability of each rule can be evaluated. The use of RPN makes the evolutionary operations simpler and therefore reduces the computational cost. To prevent overfitting, the concept of minimum description length is used to penalize over-complicated rules. The fitness function therefore combines the Fisher ratio with a minimum description length term for an efficient evolutionary process. Applied to four protease datasets (Trypsin, Factor Xa, Hepatitis C Virus and HIV protease cleavage site prediction), the algorithm is superior to C5, a conventional method for deriving decision trees.
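The abstract does not give the exact form of the scoring or fitness functions, so the following Python sketch is only an illustration of how the pieces might fit together. It assumes, hypothetically, that a rule is an RPN token list whose operands are (position, residue) pairs and whose operators are "min" and "max", that `toy_similarity` stands in for a real amino acid similarity matrix, and that the minimum description length term is a simple penalty proportional to rule length. None of these names or choices come from the paper.

```python
from statistics import mean, pvariance


def toy_similarity(a, b):
    """Stand-in for a similarity matrix lookup (assumption, not the paper's matrix)."""
    return 1.0 if a == b else 0.0


def score_rpn(rule, peptide, similarity=toy_similarity):
    """Score a peptide against a rule written in reverse Polish notation.

    Operands are (position, residue) pairs; operators are "min" and "max".
    """
    stack = []
    for token in rule:
        if token in ("min", "max"):
            right = stack.pop()
            left = stack.pop()
            stack.append(min(left, right) if token == "min" else max(left, right))
        else:
            position, residue = token
            stack.append(similarity(peptide[position], residue))
    if len(stack) != 1:
        raise ValueError("malformed RPN rule")
    return stack[0]


def fisher_ratio(scores, labels, eps=1e-9):
    """Squared difference of class means over the sum of class variances."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return (mean(pos) - mean(neg)) ** 2 / (pvariance(pos) + pvariance(neg) + eps)


def fitness(rule, peptides, labels, penalty=0.01):
    """Fisher ratio minus a length penalty (a hypothetical MDL-style term)."""
    scores = [score_rpn(rule, p) for p in peptides]
    return fisher_ratio(scores, labels) - penalty * len(rule)


# Hypothetical two-residue peptides with cleaved (1) / non-cleaved (0) labels.
peptides = ["RA", "AK", "AA", "RG"]
labels = [1, 1, 0, 0]
rule = [(0, "R"), (1, "K"), "max"]  # max(sim(p[0], R), sim(p[1], K))
print(fitness(rule, peptides, labels))
```

Because the rule is a flat token list, evolutionary operators such as mutation can act on list elements directly, which is one plausible reading of why RPN simplifies the evolutionary operations. The length penalty makes a longer rule with identical scores less fit than a shorter one.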

Year: 2003   PMID: 14642665   DOI: 10.1016/s0303-2647(03)00141-2

Source DB: PubMed   Journal: Biosystems   ISSN: 0303-2647   Impact factor: 1.973
