Ajit Narayanan (1), Xikun Wu, Z Rong Yang. (1) School of Engineering and Computer Sciences (Bioinformatics Laboratory), Old Library, University of Exeter, Exeter EX4 4PT, UK. a.narayanan@ex.ac.uk
Abstract
MOTIVATION: We aim to identify, through machine learning techniques, specific patterns in the amino acid residues of HIV and HCV viral polyproteins at the sites where the viral protease cleaves the polyprotein as it leaves the ribosome. Understanding viral protease specificity may aid the development of future anti-viral drugs based on protease inhibitors by identifying specific features of protease activity for further experimental investigation. Although viral sequence information is growing rapidly, there is still comparatively little understanding of how viral polyproteins are cut into their functional units. The work reported here investigates whether it is possible to generalise from known cleavage sites to unknown cleavage sites for two specific viruses, HIV and HCV. An understanding of proteolytic activity for specific viruses will contribute to our understanding of viral protease function in general, thereby leading to a greater understanding of protease families and their substrate characteristics.
RESULTS: Our results show that both artificial neural networks and symbolic learning techniques (See5) capture some fundamental and new substrate attributes, but that the neural networks outperform their symbolic counterpart.