
AntiVPP 1.0: A portable tool for prediction of antiviral peptides.

Jorge Félix Beltrán Lissabet, Lisandra Herrera Belén, Jorge G Farias.

Abstract

Viruses are worldwide pathogens with a high impact on the human population. Despite constant efforts to fight viral infections, new drug candidates still need to be discovered and designed. Antiviral peptides are molecules with confirmed activity and constitute excellent alternatives for the treatment of viral infections. In the present study, we developed AntiVPP 1.0, an accurate bioinformatic tool that uses the Random Forest algorithm for antiviral peptide prediction. The AntiVPP 1.0 model uses several features of 1088 peptides for training and validation. During validation, the model achieved TPR = 0.87, SPC = 0.97, ACC = 0.93 and MCC = 0.87, which is indicative of a robust model. AntiVPP 1.0 is a fast, accurate and intuitive software tool for the assessment of antiviral peptide candidates. AntiVPP 1.0 is available at https://github.com/bio-coding/AntiVPP.
Copyright © 2019 Elsevier Ltd. All rights reserved.

Keywords:  Antiviral; Machine learning; Peptide; Prediction; Python; Software

Year:  2019        PMID: 30802694      PMCID: PMC7094449          DOI: 10.1016/j.compbiomed.2019.02.011

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


Introduction

Viruses are ancient and ubiquitous pathogens that cause high rates of infection and mortality in the human population [1]. The evolutionary success of viruses rests on three general attributes: genetic variation, the variety of their transmission routes, and efficient replication within host cells, which allows them to persist there [2,3]. Because of these attributes, controlling viral diseases has never been an easy task [4]. Although antiviral drugs exist, novel antiviral compounds are needed to control emerging viral pathogens [4,5]. In recent decades, peptides have become increasingly important in the design and delivery of drugs. Research in this area focuses on developing and refining techniques to design and identify synthetic and natural peptides as drug candidates [1,6]. Antiviral peptides (AVPs) act against various types of viruses and can come from synthetic combinatorial libraries or from segments of natural proteins [5,6]. AVPs have shown activity in several settings; for example, Enfuvirtide (also known as T20) was the first peptide inhibitor approved by the FDA against HIV-1 [7]. Antiviral activity has also been reported against other viruses, e.g. rabies virus [8], HCV [9], influenza A virus (H1N1, H3N2, H5N1, H7N7, H7N9), SARS-CoV and MERS-CoV [10], among others. Several databases now contain collections of AVPs, among them AVPpred [11], APD3 [12], CAMPR3 [13] and HIPdb [14], which constitute excellent opportunities for developing computational tools that predict these molecules. However, unlike the development of bioinformatics tools for antimicrobial peptide prediction (bacteria, fungi, animal cells) [15], the development of in silico tools for AVP prediction remains scarcely explored [11].
Currently, there are only three methods for predicting AVPs. The first is the AVPpred server, which uses a support vector machine (SVM) for its predictions [11]. The second is based on the Random Forest (RF) algorithm, and the resulting model performed better in the prediction of AVPs than AVPpred [16]. However, this model has no accompanying software that would let researchers outside the field of machine learning run predictions. The third method, AVP-IC50Pred, was developed by Qureshi and coworkers. AVP-IC50Pred is a regression-based method trained on experimentally validated datasets using multiple machine learning algorithms [17]. In this work, we developed a user-friendly, portable software tool based on the RF algorithm for the prediction of AVPs, with excellent performance measures.

Materials and methods

Datasets

To carry out this study, the data set reported by Thakur et al. was selected [11]. For training the model, the data set T544p+544n* was used (1088 peptides in total). 544p corresponds to a collection of 544 antiviral peptides with experimentally validated activity, while 544n* comprises 544 non-experimental negative peptides, which have been used in the development of prediction models of antiviral peptides [11,16]. For validation of the model, the independent data set V60p+60n* was selected, composed of 60 peptides with experimentally validated activity (V60p) and 60 non-experimental negative peptides (60n*) (120 peptides in total). The construction of the training and validation workflow is shown in Fig. 1.
Fig. 1

Architecture of the training and validation model based on the dataset reported by Thakur and coworkers [11].


Peptide features

For this study, the following features were evaluated: net charge [18], number of hydrogen bond donors [19], molecular weight [20] and hydropathy index [21]. The composition of charged (DEKHR), aliphatic (ILV), aromatic (FHWY), polar (DERKQN), neutral (AGHPSTY), hydrophobic (CVLIMFW), positively charged (HKR), negatively charged (DE), tiny (ACDGST), small (EHILKMNPQV) and large (FRWY) residues, as well as the relative frequency of all 20 natural amino acids, was also assessed. All features were computed using the Python 3.6 programming language (available at https://www.python.org/).
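Two of the sequence-level features listed above can be illustrated with a minimal Python sketch. It is not the AntiVPP 1.0 implementation: the function names are hypothetical, the net charge is a crude integer count (K and R as +1, D and E as -1, with histidine neutral by default), and the Kyte-Doolittle scale is assumed for the hydropathy index, since the scale is not named in the text.

```python
# Kyte-Doolittle hydropathy scale. Using this particular scale is an
# assumption of this sketch; the text only cites a hydropathy index [21].
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _validate(peptide: str) -> str:
    """Normalize to upper case and reject empty or non-standard sequences."""
    seq = peptide.strip().upper()
    if not seq:
        raise ValueError("empty peptide sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return seq


def net_charge(peptide: str, positive: str = "KR", negative: str = "DE") -> int:
    """Integer net charge: +1 per positive residue, -1 per negative residue.

    Treating histidine as neutral by default is an assumption; pass
    positive="HKR" to follow the positively charged group listed in the text.
    """
    seq = _validate(peptide)
    plus = sum(seq.count(aa) for aa in positive)
    minus = sum(seq.count(aa) for aa in negative)
    return plus - minus


def mean_hydropathy(peptide: str) -> float:
    """Average hydropathy over all residues of the peptide."""
    seq = _validate(peptide)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
```

For a hypothetical peptide such as "KKLLDE", `net_charge` counts two positive and two negative residues. A pH-dependent charge model would be a reasonable refinement, but it lies outside what the text specifies.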

Relative frequency (Rfre) of all 20 natural amino acids

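\[ \mathrm{Rfre}\,[a.a]_i = \frac{n_i}{N} \]

where Rfre [a.a] is the relative frequency of a natural amino acid of type i, n_i is the number of residues of type i in the peptide, and N is the total number of natural amino acids in the peptide (peptide length).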

Residues composition of peptides (PEP [comp])

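\[ \mathrm{PEP}\,[comp] = \sum_{i \in \text{group}} \mathrm{Rfre}\,[a.a]_i \]

where PEP [comp] is the sum of the Rfre [a.a] values of all amino acids belonging to a given residue group (e.g. aliphatic, ILV) in a peptide.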
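The two composition features described above can be sketched in Python as follows. This is an illustrative reading of the definitions, not the authors' code: the function names are hypothetical, frequencies are expressed as fractions between 0 and 1 (whether the original equations apply a percentage scaling is not known), and the residue groups are copied from the Peptide features section.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Residue groups as listed in the "Peptide features" section.
GROUPS = {
    "charged": "DEKHR",
    "aliphatic": "ILV",
    "aromatic": "FHWY",
    "polar": "DERKQN",
    "neutral": "AGHPSTY",
    "hydrophobic": "CVLIMFW",
    "positive": "HKR",
    "negative": "DE",
    "tiny": "ACDGST",
    "small": "EHILKMNPQV",
    "large": "FRWY",
}


def relative_frequencies(peptide: str) -> dict:
    """Return count/length for each of the 20 natural amino acids."""
    seq = peptide.strip().upper()
    if not seq:
        raise ValueError("empty peptide sequence")
    counts = Counter(seq)
    n = len(seq)  # N: peptide length
    return {aa: counts[aa] / n for aa in AMINO_ACIDS}


def group_composition(peptide: str) -> dict:
    """Sum the relative frequencies of the residues in each group."""
    rfre = relative_frequencies(peptide)
    return {
        name: sum(rfre[aa] for aa in members)
        for name, members in GROUPS.items()
    }
```

Because some residues belong to several groups (histidine, for instance, is listed as charged, aromatic, neutral, positively charged and small), the group compositions of a peptide do not sum to 1, whereas the 20 relative frequencies do.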

Training and validation

For the construction of the prediction models, the Random Forest (RF) algorithm was evaluated. The models were trained in the Python 3.6 programming language. The Anaconda 3 distribution (available at https://www.anaconda.com) was used to run the following libraries and components: ‘sklearn.ensemble’, ‘RandomForestClassifier’, ‘pandas’, ‘sklearn.externals’, ‘joblib’ and ‘score’. The ‘score’ function (accuracy) was used to select models with scores > 0.95 as the cut-off for subsequent validation. The score function measures the accuracy of the predictions and ranges from 0 to 1. For model validation, the following equations were used:

\[ \mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{SPC} = \frac{TN}{TN + FP}, \qquad \mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \]

where TP represents the true positives, TN the true negatives, FP the false positives and FN the false negatives. In addition to the equations above, the Matthews correlation coefficient (MCC) was calculated to validate the method:

\[ \mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \]

MCC is used to evaluate the performance of the predictor. Its value ranges from −1 to 1, and a larger MCC means a better prediction [22].
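The four validation measures follow their standard definitions and can be computed from the confusion-matrix counts with a short Python sketch. The function name and the example counts are hypothetical and serve only as an illustration; they are not results from this study.

```python
import math


def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute TPR, SPC, ACC and MCC from confusion-matrix counts."""
    tpr = tp / (tp + fn)                   # sensitivity: positives recovered
    spc = tn / (tn + fp)                   # specificity: negatives recovered
    acc = (tp + tn) / (tp + tn + fp + fn)  # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"TPR": tpr, "SPC": spc, "ACC": acc, "MCC": mcc}


# Hypothetical counts for a balanced set of 60 positives and 60 negatives.
example = confusion_metrics(tp=50, tn=55, fp=5, fn=10)
```

On a balanced set such as V60p+60n*, ACC equals the mean of TPR and SPC, which offers a quick consistency check on reported values.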

Software development

For the development of our application, we used the Python 3.6 programming language and WinPython, a free, open-source portable distribution of Python. AntiVPP 1.0 has a user-friendly interface and, in addition to discriminating between antiviral and non-antiviral peptides, can calculate various physicochemical characteristics of the peptides. The software, together with the instructions to run it, is available at https://github.com/bio-coding/AntiVPP.1.0.

Results

During training with the data set T544p + 544n*, we obtained several RF-based prediction models with scores > 0.95; each of these models was then validated on the independent data set V60p + 60n*. After evaluating each model on the validation data, we selected the model with the best balance of performance measures: TPR = 0.87, SPC = 0.97, ACC = 0.93 and MCC = 0.87. This model had a score of 0.993 during the training phase. Previously, we had analyzed the Support vector machine (SVM), Artificial neural network (ANN) and k-nearest neighbor (kNN) algorithms for the prediction of antiviral peptides and observed a better balance of performance measures with the RF algorithm (Table 1).
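The two-stage selection described above (a training-score cut-off followed by comparison on the independent validation set) could be sketched as follows. The text does not define "best balance" quantitatively, so ranking by MCC with ACC as a tie-breaker is an assumption of this sketch, and the class name, function name and candidate values are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Training score and validation measures of one trained model."""
    name: str
    train_score: float
    tpr: float
    spc: float
    acc: float
    mcc: float


def select_model(candidates: List[Candidate], cutoff: float = 0.95) -> Candidate:
    """Keep models whose training score exceeds the cut-off, then rank them.

    Ranking by MCC (ties broken by ACC) is an assumed stand-in for the
    "best balance in the performance measures" mentioned in the text.
    """
    eligible = [c for c in candidates if c.train_score > cutoff]
    if not eligible:
        raise ValueError("no model passed the training-score cut-off")
    return max(eligible, key=lambda c: (c.mcc, c.acc))


# Hypothetical candidates, for illustration only.
models = [
    Candidate("rf_a", 0.990, 0.85, 0.95, 0.90, 0.80),
    Candidate("rf_b", 0.940, 0.90, 0.98, 0.94, 0.88),  # fails the cut-off
    Candidate("rf_c", 0.970, 0.88, 0.95, 0.92, 0.83),
]
best = select_model(models)
```

Here `rf_b` is excluded despite its higher validation measures because its training score does not exceed 0.95, mirroring the order of the two filters in the text.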
Table 1

Prediction models of antiviral peptides obtained by different algorithms on the validation dataset (V60p+60n*).

Algorithm    Performance measurements
             TPR     SPC     ACC     MCC
RF           0.87    0.97    0.93    0.87
SVM          0.85    0.93    0.79    0.84
ANN          0.87    0.95    0.90    0.85
kNN          0.83    0.91    0.90    0.81

TPR: sensitivity, SPC: specificity, ACC: accuracy, MCC: correlation coefficient of Matthews, RF: Random Forest, SVM: Support vector machine, ANN: Artificial neural network, kNN: k-nearest neighbor.

Our software was developed with the Python 3.6 programming language. AntiVPP 1.0 is an application with a simple and intuitive interface, making it ideal for researchers who search for and design AVPs but lack knowledge of machine learning (Fig. 2). AntiVPP 1.0 returns two types of predictions: 'True' for positive cases and 'False' for negative cases. In addition, the software computes several peptide features, which are the characteristics the program uses to classify AVPs.
Fig. 2

Front of AntiVPP 1.0 (a). Button (PREDICT) for prediction of peptides as antiviral ['True'] or non-antiviral ['False'] (b). Button (CLEAN) to reset all the fields (c).


Discussion

Viral infections are among the most important risks to global health [23,24]. Over the last 50 years, extensive efforts have been dedicated to the development of antiviral drugs, and great success has been achieved for some viruses. Nevertheless, other viral infections, such as epidemic influenza, continue to spread worldwide, and new viral threats, as well as drug-resistant viruses, are continuously emerging [23]. Peptide-based drugs have been of great interest to the scientific community from the past decade to the present, as the modern pharmaceutical industry has come to appreciate the role of these molecules in addressing unmet medical needs. Peptides can be an excellent complement, or even a more suitable alternative, to small molecules and biological therapeutics [25]. Despite the potential of AVPs, there is a considerable lack of algorithms for AVP prediction compared with other areas, such as the investigation of antimicrobial peptides. To date, the RF-based algorithm has shown the best performance in the prediction of these molecules among those reported in the literature [11,16,17]. The comparison of the performance measures obtained in our study with the different algorithms supports the previous results on the robustness of RF for AVP prediction [16], as shown in Table 1. In this study, we evaluated the RF algorithm using new combinations of physicochemical characteristics of the AVPs and obtained an excellent model with the following performance measures during the validation phase: TPR = 0.87, SPC = 0.97, ACC = 0.93 and MCC = 0.87. We also confirmed the need to include the relative frequency to improve AVP predictions, as previously reported [16]. A comparison among the existing methods for the prediction of AVPs shows that AntiVPP 1.0 has the highest SPC (Table 2). Specificity is one of the most relevant measures in the construction of predictive models; it is the proportion of negative cases (non-AVPs) that are correctly identified [26].
Table 2

Comparison of the existing programs for prediction of AVPs.

Programs       Performance measurements                                   Ref.
               TPR            SPC            ACC            MCC
AntiVPP 1.0    0.87           0.97           0.93           0.87           *
AVPpred        0.93           0.92           0.93           0.85           [11]
Model          0.93           0.93           0.93           0.87           [16]
IC50Pred       Not reported   Not reported   Not reported   Not reported   [17]

TPR: sensitivity, SPC: specificity, ACC: accuracy, MCC: correlation coefficient of Matthews, RF: Random Forest, SVM: Support vector machine, ANN: Artificial neural network, kNN: k-nearest neighbor, *: current study.

In addition, we report for the first time the number of hydrogen bond donors as another important characteristic to be considered in the development of future AVP prediction algorithms, because it improved the performance measures during the testing of our prediction models. H-bond pairing has been shown to strongly influence ligand-binding affinity, improving the strength of ligand-receptor interactions [27]. For this reason, hydrogen bonds have played an important role in the design and discovery of new peptide-based drugs [28]. This feature is addressed in our work in a novel way, since it had not previously been used for the prediction of antiviral peptides.

Conclusion

AntiVPP 1.0 is a fast, accurate and intuitive tool for the prediction of antiviral peptides and an alternative to the current tools for this purpose. The number of hydrogen bond donors is an important feature to consider in future algorithms for the design and discovery of antiviral peptides. This software should be helpful for researchers developing peptide-based antiviral therapies, owing to its high success rates and user-friendliness.

Conflicts of interest

There is no conflict of interest to declare.

Notes

AntiVPP 1.0 is protected by copyright. This software is free for academic users. For commercial purposes, please contact: jorge.farias@ufrontera.cl.
  28 in total

Review 1.  Evolutionary history and phylogeography of human viruses.

Authors:  Edward C Holmes
Journal:  Annu Rev Microbiol       Date:  2008       Impact factor: 15.500

Review 2.  Synthetic therapeutic peptides: science and market.

Authors:  Patrick Vlieghe; Vincent Lisowski; Jean Martinez; Michel Khrestchatisky
Journal:  Drug Discov Today       Date:  2009-10-30       Impact factor: 7.851

3.  AAindex: Amino Acid Index Database.

Authors:  S Kawashima; H Ogata; M Kanehisa
Journal:  Nucleic Acids Res       Date:  1999-01-01       Impact factor: 16.971

Review 4.  The future of peptide-based drugs.

Authors:  David J Craik; David P Fairlie; Spiros Liras; David Price
Journal:  Chem Biol Drug Des       Date:  2013-01       Impact factor: 2.817

5.  Prediction of protein function from sequence properties. Discriminant analysis of a data base.

Authors:  P Klein; M Kanehisa; C DeLisi
Journal:  Biochim Biophys Acta       Date:  1984-06-28

Review 6.  Mining the tree of life: Host defense peptides as antiviral therapeutics.

Authors:  Jessica R Shartouny; Joshy Jacob
Journal:  Semin Cell Dev Biol       Date:  2018-03-13       Impact factor: 7.727

Review 7.  Enfuvirtide: the first therapy to inhibit the entry of HIV-1 into host CD4 lymphocytes.

Authors:  Tom Matthews; Miklos Salgo; Michael Greenberg; Jain Chung; Ralph DeMasi; Dani Bolognesi
Journal:  Nat Rev Drug Discov       Date:  2004-03       Impact factor: 84.694

8.  Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou's general PseAAC.

Authors:  Prabina Kumar Meher; Tanmaya Kumar Sahu; Varsha Saini; Atmakuri Ramakrishna Rao
Journal:  Sci Rep       Date:  2017-02-13       Impact factor: 4.379

Review 9.  Current scenario of peptide-based drugs: the key roles of cationic antitumor and antiviral peptides.

Authors:  Kelly C L Mulder; Loiane A Lima; Vivian J Miranda; Simoni C Dias; Octávio L Franco
Journal:  Front Microbiol       Date:  2013-10-31       Impact factor: 5.640

10.  AVP-IC50 Pred: Multiple machine learning techniques-based prediction of peptide antiviral activity in terms of half maximal inhibitory concentration (IC50).

Authors:  Abid Qureshi; Himani Tandon; Manoj Kumar
Journal:  Biopolymers       Date:  2015-11       Impact factor: 2.505

