Md Mehedi Hasan (1,2), Nalini Schaduangrat (3), Shaherin Basith (4), Gwang Lee (4,5), Watshara Shoombuatong (3), Balachandran Manavalan (4).
1. Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan.
2. Japan Society for the Promotion of Science, Chiyoda-ku, Tokyo 102-0083, Japan.
3. Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
4. Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea.
5. Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea.
Abstract
MOTIVATION: The failure of therapeutic peptides in clinical trials can be attributed to toxicity profiles such as hemolytic activity, which hamper their further progress as drug candidates. Accurately predicting hemolytic peptides (HLPs) and their activity from given peptide sequences is a challenging task in immunoinformatics and is essential for drug development and basic research. Although a few computational methods have been proposed for this purpose, none of them can identify HLPs and their activities simultaneously.
RESULTS: In this study, we propose a two-layer prediction framework, called HLPpred-Fuse, that can accurately and automatically predict both whether a peptide is hemolytic (HLP or non-HLP) and the activity of an HLP (high or low). More specifically, a feature representation learning scheme was used to generate 54 probabilistic features by integrating six different machine learning classifiers with nine different sequence-based encodings. These 54 probabilistic features were then fused to provide sufficiently converged sequence information, which was used as input to an extremely randomized tree classifier to develop two final prediction models that independently identify HLPs and their activity. Performance comparisons using cross-validation analysis, an independent test and a case study demonstrate that HLPpred-Fuse consistently outperformed state-of-the-art methods in the identification of hemolytic activity.
AVAILABILITY AND IMPLEMENTATION: For the convenience of experimental scientists, a web-based tool has been established at http://thegleelab.org/HLPpred-Fuse.
CONTACT: glee@ajou.ac.kr or watshara.sho@mahidol.ac.th or bala@ajou.ac.kr.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
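The data flow described in the RESULTS section can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HLPpred-Fuse implementation: the abstract does not name the nine encodings or the six classifiers, so amino acid composition stands in for every encoding, all base and final models are untrained stub functions, and every identifier (`fuse_probabilities`, `predict_two_layer`, `enc0`, `clf0`, and so on) is hypothetical. The sketch also reuses one fused vector for both final models for brevity, which the abstract does not confirm.

```python
from typing import Callable, Dict, List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

Encoder = Callable[[str], List[float]]            # peptide -> feature vector
ProbModel = Callable[[Sequence[float]], float]    # features -> probability


def amino_acid_composition(peptide: str) -> List[float]:
    """Fraction of each of the 20 standard residues (illustrative encoding)."""
    if not peptide:
        raise ValueError("peptide must be non-empty")
    return [peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def fuse_probabilities(
    peptide: str,
    encoders: Dict[str, Encoder],
    classifier_names: Sequence[str],
    base_models: Dict[Tuple[str, str], ProbModel],
) -> List[float]:
    """Concatenate one probability per (encoding, classifier) pair."""
    fused: List[float] = []
    for enc_name, encode in encoders.items():
        features = encode(peptide)
        for clf_name in classifier_names:
            fused.append(base_models[(enc_name, clf_name)](features))
    return fused


def predict_two_layer(
    fused: Sequence[float],
    hlp_model: ProbModel,
    activity_model: ProbModel,
    threshold: float = 0.5,
) -> str:
    """Layer 1 decides HLP vs non-HLP; layer 2 grades activity of HLPs."""
    if hlp_model(fused) < threshold:
        return "non-HLP"
    if activity_model(fused) >= threshold:
        return "HLP (high activity)"
    return "HLP (low activity)"


# Placeholder setup (assumption): nine copies of one encoding and six stub
# scorers, only to show the 9 x 6 = 54 layout. Real base models would be
# trained classifiers that output class probabilities.
encoders: Dict[str, Encoder] = {f"enc{i}": amino_acid_composition for i in range(9)}
classifier_names = [f"clf{j}" for j in range(6)]
base_models: Dict[Tuple[str, str], ProbModel] = {
    (enc, clf): (lambda x: max(x))  # stub: largest residue fraction, in [0, 1]
    for enc in encoders
    for clf in classifier_names
}


def mean_score(z: Sequence[float]) -> float:
    """Stub final model: the mean of the fused probabilities."""
    return sum(z) / len(z)


fused = fuse_probabilities("GIGAVLKVLTT", encoders, classifier_names, base_models)
label = predict_two_layer(fused, hlp_model=mean_score, activity_model=mean_score)
print(len(fused), label)
```

In a real pipeline, each stub would be replaced by a trained model, and the two final models would be extremely randomized tree classifiers fitted on the fused vectors. The label printed here carries no biological meaning because the stubs are not trained.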