Comparative analysis of epitope predictions: proposed library of putative vaccine candidates for HIV.

Arun Gupta, Dinesh Chaukiker, Tiratha Raj Singh.   

Abstract

Designing a vaccine is a crucial task that can cost millions or even billions of dollars and take several decades, yet offers no guarantee of success. Several pharmaceutical companies invest money and time in such efforts. Computational biology could be of great help by providing a library of plausible candidates that might actually elicit a positive response. MHC binding peptide prediction is one area where computing power could be used to achieve a breakthrough. To this end, many labs have developed databases and servers that predict MHC binding peptides. These short peptides on the antigen surface are recognized by the MHC molecule and presented to T-cell receptors to trigger a further immune response. Peptides that bind to a given MHC molecule share sequence similarity. Here we present a comparative study of servers that predict MHC binding peptides in a given antigen protein sequence. Based on this comparative analysis of HIV data, we propose a library of putative vaccine candidates for the env gp160 protein of HIV-1.

Keywords:  HIV-1; MHC; MHC binding peptides; epitopic library; putative vaccine candidates

Year:  2011        PMID: 21383906      PMCID: PMC3044427          DOI: 10.6026/97320630005386

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

With the progress and success of several genome projects, the number of known protein sequences is increasing exponentially. These protein sequence data also contain vital hidden information about the pathogenic response of an organism. Sequences from pathogens provide a huge number of potential vaccine candidates. The first step in the development of peptide vaccines is the identification of the immunodominant peptides along the protein sequence. Theoretically, every sub-sequence along the protein could be antigenic. The experimental identification of peptide binding affinity to Major Histocompatibility Complex (MHC) [1] molecules requires a binding assay for each peptide, which is a time-consuming and costly process. Therefore, a number of alternative research efforts have attempted to discover the rules governing the sequence patterns of binding peptides. Several computational analyses can be performed to screen for such peptides and develop libraries of them. Such libraries could support the research and development process of pharmaceutical companies, saving them money and time, and would also reduce the hazards of handling the pathogens. One of the major roles of the immune system is to recognize and destroy cells expressing non-self or mutated proteins. The cascade of the immune response begins when cytotoxic T lymphocytes (CTLs) recognize a specific antigen and trigger a specific immune response against it [2]. If antigens are not recognized properly at the early onset of infection, the infection might even lead to lethal disease. MHC molecules play an important role in the immune system: they help T lymphocytes to identify antigens and to trigger the immune response. The complex of bound peptide and MHC molecule induces naïve T cells to proliferate and differentiate into armed effector T cells that help to remove the antigens.
There is great diversity in the selectivity of peptides by MHC molecules, which makes it difficult for pathogens to escape the immune response. Each MHC molecule can bind a different set of peptides [3]. This brings into consideration the several supertypes of MHC molecules, many of which comprise different alleles. The env gene of HIV encodes a single protein, gp160. When gp160 is synthesized in the cell, cellular enzymes add complex carbohydrates and turn it from a protein into a glycoprotein, hence the name gp160 rather than ‘p160’. gp160 is found on the outer surface, or envelope, of HIV. It is composed of gp120, which protrudes from the envelope, and gp41, which is embedded in the envelope. gp160 is the unit that helps the virus to adhere to and interact with the surface proteins of the host. The majority of HIV subunit vaccines are based on the envelope proteins of HIV, namely gp120 and gp41, which form the gp160 glycoprotein complex, or on selected epitopes identified within these proteins [4, 5]. HLA-A2 is a human leukocyte antigen serotype within the HLA-A “A” serotype group. The serotype is determined by antibody recognition of the α2 subset of HLA-A α-chains. For A2, the alpha “A” chain is encoded by the HLA-A*02 allele group and the β-chain is encoded by the B2M locus [6]. The serotype identifies the gene products of many HLA-A*02 alleles, including the HLA-A*0201, *0202, *0203, *0206, and *0207 gene products. A2 is the most diverse serotype, showing diversity in Eastern Africa and Southwest Asia. While the frequency of A*0201 in Northern Asia is high, its diversity there is limited to A*0201 and the less common Asian variants A*0203 and A*0206. Because of this diversity, A2 is an interesting entity for studying antigen-antibody interactions. This study presents a comparative analysis of epitope and MHC binding peptide prediction on the envelope proteins of HIV-1.
The potential value of a preventative and cost-effective vaccine strategy to protect against HIV is undeniable. Based on this analysis, a library of putative vaccine candidates has been constructed, with a focus on the performance of the servers used and their predicted accuracy under their respective parameters. The generated library will be a useful resource in the process of vaccine design for HIV-1, and it will also help in the generation of similar libraries for other pathogens.

Material and methods

In this study, we submitted the same sequences to several servers under similar prediction conditions and compared the results obtained. The protein sequences were retrieved from the National Center for Biotechnology Information (NCBI) through its Entrez search engine. We considered HIV-1 surface and envelope proteins. We compared 8 comprehensive servers (see Table 1) on the basis of their results and the scores they assigned to the predicted peptides. Additional information provided by each server was also analyzed along with the results. The analysis was performed on the basis of the final score given by each server. A basic description of the servers and their working principles follows. ANNpred [7] is based on Artificial Neural Networks (ANNs) for 30 MHC alleles. ComPred is a comprehensive method for predicting MHC binding peptides or CTL epitopes for 67 MHC alleles. The prediction for 30 alleles is based on a hybrid approach combining Artificial Neural Networks (ANNs) and Quantitative Matrices (QM). The prediction for the remaining 37 MHC alleles is based on quantitative matrices. In both ComPred and ANNpred, the predicted MHC binders are filtered to potential CTL epitopes using proteasomal matrices. The Predep [8] algorithm uses the pair-wise potential table of Miyazawa & Jernigan [9] and is able to identify good binders only for MHC molecules with hydrophobic binding pockets. This server returns an “energy score” for each peptide, and the peptides are ranked according to this score (the lower the better). The RankPep [10] server predicts MHC-I and MHC-II peptide binders from a protein sequence or sequence alignments, using Position Specific Scoring Matrices (PSSMs), or profiles, derived from sets of aligned peptides known to bind to a given MHC molecule as the predictor of MHC-peptide binding. In addition, it predicts those MHC-I ligands whose C-terminal end is likely to be the result of proteasomal cleavage.
Profiles basically consist of a table listing the observed sequence-weighted frequency of each amino acid in every column of a sequence alignment. This server includes a selection of 102 and 80 PSSMs for the prediction of peptide binding to MHC-I and MHC-II molecules, respectively. HLA-BIND [11] allows users to locate and rank peptides that contain peptide-binding motifs for HLA class I molecules. By default, this server predicts 9-mer peptides, using a 20-by-9 coefficient matrix for the selected HLA molecule for scoring. For HLA-A2, the estimated numerical score of a subsequence is based on the half-time of dissociation of complexes containing the peptide at 37°C and pH 6.5. For other molecules, the estimate is based on the observed anchor residue preferences. ProPred1 [12] is a matrix-based method that allows the prediction of MHC binding sites in an antigenic sequence for 47 MHC class-I alleles. The matrices used in ProPred1 were obtained from the BIMAS [11] server and from the matrices described by Toes [13]. ProPred1 also allows the prediction of standard proteasome and immunoproteasome cleavage sites in an antigenic sequence, and it allows filtering for MHC binders that have a cleavage site at the C terminus. Most of the matrices in ProPred1 are of the multiplication type, where the score of each predicted peptide is calculated by multiplying the scores of each position. We carried out a multiple sequence alignment (MSA) of various HIV envelope protein sequences with ClustalX to select one representative sequence. We selected CAQ63623.1 as the most consistent sequence, based on the blocks that appeared among all the sequences in the MSA. The resulting computed epitopes were ranked according to a comparative and statistical analysis system. Additionally, all the epitopes were compared with the T-cell epitope collection of the LANL immunological database [14].
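As an illustration of the multiplication-type matrix scoring described above for HLA-BIND and ProPred1, the following minimal Python sketch slides a 9-residue window along a protein sequence and multiplies one coefficient per position. It is a sketch under stated assumptions, not the servers' implementation: the coefficients in TOY_MATRIX are invented placeholders (not BIMAS or ProPred1 values), unlisted residues are assumed to be neutral (1.0), and all function names are hypothetical.

```python
from math import prod

# Hypothetical toy coefficients, NOT real BIMAS/ProPred1 values.
# Layout: 0-based peptide position -> residue -> coefficient.
# Any residue/position not listed is treated as neutral (1.0).
TOY_MATRIX = {
    1: {"L": 10.0, "M": 5.0},            # assumed anchor at position 2
    8: {"V": 4.0, "L": 3.0, "I": 2.0},   # assumed C-terminal anchor
}


def nine_mers(sequence, k=9):
    """Yield (1-based start, peptide) for every k-residue window."""
    for i in range(len(sequence) - k + 1):
        yield i + 1, sequence[i:i + k]


def score(peptide, matrix):
    """Multiplication-type score: product of per-position coefficients."""
    return prod(
        matrix.get(pos, {}).get(residue, 1.0)
        for pos, residue in enumerate(peptide)
    )


def rank_peptides(sequence, matrix, top_n=25):
    """Return the top_n windows as (score, start, peptide), best first."""
    scored = [(score(pep, matrix), start, pep)
              for start, pep in nine_mers(sequence)]
    # Higher score is better; ties are broken by position in the sequence.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_n]
```

With these placeholder coefficients, `rank_peptides("ALLDTIAIAVG", TOY_MATRIX, top_n=1)` returns the window starting at position 2. For a server such as Predep, where a lower energy score is better, the sort order would simply be reversed.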
As the protein is an envelope protein that helps the virus attach to the host cell, our hypothesis is that if CTLs can trigger an immune response against the virus by targeting the envelope proteins, the virus will not be able to cause infection.

Results and discussion

Three-tier screening was performed on the results obtained from the various servers. In the first round (pre-screening), we collected the top 25 scoring peptides from each server. In the second round, out of the top 25 scoring peptides we selected the 20 most consistent and common peptide sequences and compared their ranks and positions from each server against those from all other servers. This yielded a comprehensive list of 158 peptides (data not shown). Finally, from this comprehensive list we selected the peptides with at least 50% occurrence and calculated their average rank (see Table 2). Following this approach, we were able to analyze the selected sequence through various approaches, viz. ANN, SVM, PSSM, etc., and against various databases as well. This three-tier screening increases the chances of accuracy and reduces false-positive hits, if any. Hence the library obtained represents the collective favorable results of all the servers together. An interesting aspect of this analysis was the extraction of a few plausible epitopes when the results were compared with the T-cell epitope collection of the LANL immunological database [14]. Four epitopic peptide sequences, viz. ‘WLWYIKIFI’, ‘NVWATHACV’, ‘LLDTIAIAV’ and ‘YIKIFIMIV’, were found in the LANL database. The most important appears to be ‘WLWYIKIFI’, as it was ranked second in our analytical system and is reported for HLA A*0201, A2 and A2.1. In addition, we propose another plausible epitopic sequence, ‘TLFNNSWTL’, which was ranked first in our system; it can be verified experimentally and might become part of immunological databases in the future. Other sequences could also be included in the immunological databases after successful experimental verification.

Conclusion and Future Prospects

The design and development of an effective HIV vaccine is challenging because of the complex host-virus pathogenesis. From the study carried out and the results obtained, we propose that the given library contains the most promising putative vaccine peptide candidates for HIV-1. If these peptide candidates can be targeted, it might be possible to obtain a substantial vaccine against this disease. The putative vaccine candidates obtained might support our proposed hypothesis and help in hindering the antigenic effect of the virus. Further, a similar study could be carried out to design a consolidated library at the whole-proteome level, which might provide many more putative candidates and increase the chances of success against the disease. This study also proposes a model for carrying out analyses on specific data with diverse techniques to extract noteworthy and decisive findings. We would like to extend this work by integrating various features of the servers used, and by applying more statistical techniques to make the results more significant. Structural aspects of epitopes will also be incorporated to make the results more robust.
  11 in total

1.  Structure-based prediction of binding peptides to MHC class I molecules: application to a broad range of MHC alleles.

Authors:  O Schueler-Furman; Y Altuvia; A Sette; H Margalit
Journal:  Protein Sci       Date:  2000-09       Impact factor: 6.725

2.  Examining the independent binding assumption for binding of peptide epitopes to MHC-I molecules.

Authors:  Björn Peters; Weiwei Tong; John Sidney; Alessandro Sette; Zhiping Weng
Journal:  Bioinformatics       Date:  2003-09-22       Impact factor: 6.937

3.  ProPred1: prediction of promiscuous MHC Class-I binding sites.

Authors:  Harpreet Singh; G P S Raghava
Journal:  Bioinformatics       Date:  2003-05-22       Impact factor: 6.937

4.  Prediction of MHC class I binding peptides using profile motifs.

Authors:  Pedro A Reche; John-Paul Glutting; Ellis L Reinherz
Journal:  Hum Immunol       Date:  2002-09       Impact factor: 2.850

Review 5.  Human tumor antigens recognized by T cells.

Authors:  P F Robbins; Y Kawakami
Journal:  Curr Opin Immunol       Date:  1996-10       Impact factor: 7.486

6.  Application of machine learning techniques in predicting MHC binders.

Authors:  Sneh Lata; Manoj Bhasin; Gajendra P S Raghava
Journal:  Methods Mol Biol       Date:  2007

7.  Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains.

Authors:  K C Parker; M A Bednarek; J E Coligan
Journal:  J Immunol       Date:  1994-01-01       Impact factor: 5.422

8.  The genetic control of HLA-A and B antigens in somatic cell hybrids: requirement for beta2 microglobulin.

Authors:  B Arce-Gomez; E A Jones; C J Barnstable; E Solomon; W F Bodmer
Journal:  Tissue Antigens       Date:  1978-02

9.  An HLA-directed molecular and bioinformatics approach identifies new HLA-A11 HIV-1 subtype E cytotoxic T lymphocyte epitopes in HIV-1-infected Thais.

Authors:  K B Bond; B Sriwanthana; T W Hodge; A S De Groot; T D Mastro; N L Young; N Promadej; J D Altman; K Limpakarnjanarat; J M McNicholl
Journal:  AIDS Res Hum Retroviruses       Date:  2001-05-20       Impact factor: 2.205

10.  Discrete cleavage motifs of constitutive and immunoproteasomes revealed by quantitative analysis of cleavage products.

Authors:  R E Toes; A K Nussbaum; S Degermann; M Schirle; N P Emmerich; M Kraft; C Laplace; A Zwinderman; T P Dick; J Müller; B Schönfisch; C Schmid; H J Fehling; S Stevanovic; H G Rammensee; H Schild
Journal:  J Exp Med       Date:  2001-07-02       Impact factor: 14.307
