MOTIVATION: The production of neuropeptides from their precursor proteins is the result of a complex series of enzymatic processing steps. The annotation of new neuropeptide genes from sequence information often outstrips biochemical assays, so bioinformatics tools can provide rapid information on the most likely peptides produced by a gene. Predicting the final bioactive neuropeptides from precursor proteins requires accurate algorithms to determine which locations in the protein are cleaved. RESULTS: Predictive models were trained on Apis mellifera and Drosophila melanogaster precursors using binary logistic regression, multi-layer perceptron and k-nearest neighbor models. The final predictive models included specific amino acids at locations relative to the cleavage sites. Correct classification rates ranged from 78% to 100%, indicating that the models adequately predicted cleaved and non-cleaved positions across a wide range of neuropeptide families and insect species. The model trained on D. melanogaster data had better generalization properties than the model trained on A. mellifera for the data sets considered. The reliable and consistent performance of the models in the test data sets suggests that the bioinformatics strategies proposed here can accurately predict neuropeptides in insects for which only sequence information is available, based on neuropeptides with both biochemical and sequence information in well-studied species.
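To make the idea of "specific amino acids at locations relative to the cleavage sites" concrete, the following is a minimal sketch of a k-nearest neighbor classifier over sequence windows. It is not the authors' implementation: the window size (four residues before and two after the candidate bond), the restriction of candidate sites to K and R residues, the Hamming distance, the value of k, and every sequence and label in the toy training set are assumptions made up for illustration, not data from the study.

```python
from collections import Counter

def candidate_sites(seq):
    """0-based indices of basic residues (K or R), treated here as candidate P1 positions (assumption)."""
    return [i for i, aa in enumerate(seq) if aa in "KR"]

def window(seq, pos, before=4, after=2, pad="-"):
    """Residues P4..P1 and P1'..P2' around the bond that follows seq[pos].

    Positions that fall off either end of the precursor are filled with `pad`.
    """
    padded = pad * before + seq + pad * after
    start = pos + 1  # index of P4 in the padded string
    return padded[start:start + before + after]

def hamming(a, b):
    """Number of window positions at which two equal-length windows differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k=3):
    """Majority vote among the k training windows closest to `query`."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training windows (True = cleaved, False = not cleaved).
# These are invented examples, not real precursor sequences.
TRAIN = [
    ("GSKRAP", True),
    ("LGKRSE", True),
    ("FGRRDL", True),
    ("ALPKPV", False),
    ("TESRPL", False),
    ("VDAKIP", False),
]

if __name__ == "__main__":
    precursor = "MAGGKRSPLE"  # invented sequence
    for pos in candidate_sites(precursor):
        w = window(precursor, pos)
        print(pos, w, knn_predict(TRAIN, w))
```

The same window encoding could feed a logistic regression or a multi-layer perceptron by one-hot encoding each window position; k-nearest neighbor is shown here only because it needs no fitting step and no external libraries.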