Amir Toporik1, Itamar Borukhov2, Avihay Apatoff2, Doron Gerber2, Yossef Kliger2. 1. The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, 52900 Ramat-Gan and Compugen Ltd., 69512 Tel Aviv, Israel. 2. The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, 52900 Ramat-Gan and Compugen Ltd., 69512 Tel Aviv, Israel.
Abstract
MOTIVATION: Many secretory peptides are synthesized as inactive precursors that must undergo post-translational processing to become biologically active. Attempts to predict natural peptides are limited by the low performance of proteolytic site predictors and by the high combinatorial complexity of pairing such sites. To overcome these limitations, we analyzed the site-wise evolutionary mutation rates of peptide hormone precursors, calculated using the Rate4Site algorithm.
RESULTS: Our analysis revealed that, within their precursors, peptide residues are significantly more conserved than pro-peptide residues. This disparity alone enables the prediction of peptides with a precision of ∼60% at a recall of 40% [area under the receiver operating characteristic (ROC) curve, AUC, 0.79]. Combining the Rate4Site score with additional features and training a Random Forest classifier enables the prediction of natural peptides hidden within secreted human proteins with a precision of ∼90% at a recall of 50% (ROC AUC 0.96). This performance is high enough for the method to be applied to full secretomes to predict naturally occurring active peptides. Applying it to Homo sapiens revealed several putative peptides in the human secretome that are currently unannotated. Furthermore, the distinctive expression patterns of some of these peptides, including peptides that are highly expressed in endocrine glands, imply a potential hormone function.
AVAILABILITY AND IMPLEMENTATION: Pseudocode is available in the Supplementary Information.
CONTACT: doron.gerber@biu.ac.il or kliger@cgen.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
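The conservation signal described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method (their pseudocode is in the Supplementary Information). It assumes only that each residue has a per-site evolutionary rate in the style of Rate4Site output, where lower values indicate slower evolution and therefore higher conservation. The function names, the fixed window width, and the toy rates are all hypothetical. The sketch picks the most conserved window in a precursor and scores it against an annotated peptide using residue-level precision and recall.

```python
from statistics import mean


def window_scores(rates, width):
    """Return (start, mean rate) for every window of `width` residues."""
    return [
        (start, mean(rates[start:start + width]))
        for start in range(len(rates) - width + 1)
    ]


def precision_recall(predicted, actual):
    """Residue-level precision and recall for two sets of positions."""
    true_pos = len(predicted & actual)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(actual) if actual else 0.0
    return precision, recall


# Toy per-site rates (hypothetical data, not from the paper).
# Assumption: low rate = slowly evolving, i.e. conserved, residue.
rates = [1.8, 2.1, 1.5, 0.2, 0.1, 0.3, 0.2, 1.9, 2.2, 1.7]
annotated_peptide = {3, 4, 5, 6}  # hypothetical 0-based peptide positions

# Predict the peptide as the window with the lowest mean rate.
width = 4
start, score = min(window_scores(rates, width), key=lambda item: item[1])
predicted = set(range(start, start + width))

precision, recall = precision_recall(predicted, annotated_peptide)
print(f"predicted window starts at {start}, mean rate {score:.2f}")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A fixed window width keeps the example short; a real predictor would have to consider variable peptide lengths and, as the abstract notes, combine the conservation score with additional features in a trained classifier.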