Literature DB >> 19238199

Ranking of binding and nonbinding peptides to MHC class I molecules using inverse folding approach: implications for vaccine design.

Satarudra Prakash Singh¹, Bhartendu Nath Mishra.

Abstract

T cell recognition of the peptide-MHC complex initiates a cascade of immunological events necessary for immune responses. Accurate T-cell epitope prediction is an important part of the vaccine designing. Development of predictive algorithms based on sequence profile requires a very large number of experimental binding peptide data to major histocompatibility complex (MHC) molecules. Here we used inverse folding approach to study the peptide specificity of MHC Class-I molecule with the aim of obtaining a better differentiation between binding and nonbinding sequence. Overlapping peptides, spanning the entire protein sequence, are threaded through the backbone coordinates of a known peptide fold in the MHC groove, and their interaction energies are evaluated using statistical pairwise contact potentials. We used the Miyazawa & Jernigan and Betancourt & Thirumalai tables for pairwise contact potentials, and two distance criteria (Nearest atom >> 4.0 A & C-beta >> 7.0 A) for ranking the peptides in an ascending order according to their energy values, and in most cases, known antigenic peptides are highly ranked. The predictions from threading improved when used multiple templates and average scoring scheme. In general, when structural information about a protein-peptide complex is available, the current application of the threading approach can be used to screen a large library of peptides for selection of the best binders to the target protein. The proposed scheme may significantly reduce the number of peptides to be tested in wet laboratory for epitope based vaccine design.

Entities: Chemical Disease Species

Keywords: MHC; contact potential; epitope; template; threading

Year: 2008 PMID： 19238199 PMCID： PMC2639678 DOI： 10.6026/97320630003072

Source DB: PubMed Journal: Bioinformation ISSN： 0973-2063

Abbreviations

CTL - cytotoxic T lymphocytes, MHC - major histocompatibility complex , MJ - Miyazawa and Jernigan , BT - Betancourt and Thirumalai, PDB - Protein databank, MPID - MHC peptide interaction database.

Background

Development of epitope-based vaccines critically requires identification of regions in non-self and mutated proteins which are recognized by cytotoxic T lymphocyte (CTLs). The recognition of such regions by CTLs is a multistep processes where binding of peptides to MHC class I molecule is an important step and further transport of peptide–MHC complex to the antigen presenting cell surface [1]. Much of the information has accumulated regarding the specific binding of peptides to MHC class I molecules. A number of computational methods have been developed for the prediction of MHC binding peptide according to the data and computational approaches they apply i.e. sequence and structure based. Sequence based approaches includes motif, quantitative matrix and machine learning models have been successful applied in the discovery of novel T-cell epitopes involved in the cancer immunity [2, 3] . Although sequence based approaches are well established, but they require large sets of peptides that were tested experimentally and not feasible in situations where insufficient experimental binding data are available [4,5]. Availability of crystallographically solved MHC-peptide complexes provides the opportunities for inverse folding (threading) approach which do not rely on previously binding data but aim to take account of the contributions of individual amino acids along the peptide that prompt them to fit into the groove of MHC allele using structural considerations [6, 7,8]. In this paper, an approach developed to address the inverse protein folding problem is applied to prediction of potential binding peptides to a specific MHC molecule and their interaction energies [9] are evaluated using statistical pairwise contact potentials, MJ [10] and BT [11].The number of conformations the peptide can adopt in the binding groove is limited and defined by the peptide-MHC structure that imposes physical constraints on the peptide [12]. The residues were considered to be in contact or not according to two different distance criteria [6,13]. We also investigated whether using multiple template structures and taking the average improves the predictions or not. After these analysis, we found that using BT potential with any two atoms are closer than 4 Å and taking multiple peptide conformations into consideration improves the threading procedure in discriminating between binding and nonbinding peptides. Hence, the compatibility of the peptide sequence with the space in the binding groove has an important role in molecular recognition which implies that the peptide conformation should be taken into consideration to improve the predictions of threading methods.

Methodology

Template structures

The available data in the PDB are redundant and hence we created a non-redundant set from those entries with the best resolution for the related structural complexes having identical sequence information [14]. The non-redundant dataset consists of fifty four class I MHC-Peptide complexes (Table 1 in supplementary material). All the complexes chosen for the study were characterized using IMGT/3Dstructure-DB Structural Query tool [15] and MHC-Peptide Interaction Database (MPID) [16] including eleven 8-mer peptide-H2-Kb, seventeen 9-mer peptide-HLA-A*0201, twelve 9-mer peptide-H2-Db, four 10-mer peptide-HLA-A*0201 complexes. The MHC non-binding peptide data set for the selected alleles were retrieved from AntiJen database [17], which covered a large range of IC50 value from 5000-440000 (Table 2 in supplementary material). The interface of peptide–MHC complexes is defined using the parameters, Interface Area and Gap Index [18]. Interface area for class I MHC-peptide complexes was defined as the change in their solvent accessible surface area (delta ASA) when going from a monomeric MHC molecule to a dimeric MHC-peptide complex state whereas, Gap index is used as means to evaluate the complementarity of interacting surfaces. The gap index is calculated using the formula, Gap Index = Gap Volume / delta ASA.

Threading with a contact potential matrix

In this method, binding affinity of a peptide is predicted by the total energy of interaction with contact residues. The contacts of the peptide in the available template co-crystal structure are determined according to two different criteria 1), β-carbon atoms are closer than 7 Å [6] and 2) any two atoms are closer than 4 Å [13]. Then, the amino-acid sequence of the query peptide is threaded onto the coordinates of the peptide in the template using MODPROPEP web server [19]. The contacts are assumed to be conserved, and the total interaction energy is obtained by summing the interaction energy values of peptide residues using a contact potential matrix. The contacting residues are determined for the conformation in the known structure, and therefore are only approximate for different sequences threaded. Energy values for amino acid-to-amino acid interactions are taken from the table of statistical pairwise contact potentials derived by MJ and BT [10,11]. The experimental binding energies are correlated with binding affinity (IC50) using the expression, ΔGexp = -RT ln(IC50) where R is the gas constant and T the absolute temperature [20]. The predicted contact energies are given in dimensionless units of RT.

Results and Discussion

The peptide sequences in the test dataset (Table 1 and 2, see supplementary material) were threaded onto the crystal structures of the MHC class I peptide complexes. Different statistical potential matrices (MJ & BT) were used to obtain an estimate of the binding affinity of the threaded sequences, with the goal of ranking the binding and nonbinding sequences in the selected data set (see Methodology). We applied the method of Altuvia and colleagues [21] to score and rank the binding affinities of peptides to MHC class I molecules. Table 3 and 4(under supplementary material)gives the ranking of peptides according to the binding affinities predicted by MJ and BT threading algorithm and using the 1VAC, 1LEG (H2-Kb/8); 1INQ, 1JPG (H2-Db/9); 1HHI, 1AO7 (HLA-A*0201/9); 1I4F, 2CLR (HLA-A*0201/10) complex structure as the template for two different distance criteria (Nearest atom ≫ 4.0 Å & C-beta ≫ 7.0 Å) to define the contacting residues. Although it is reasonable to use the same distance criterion as in the parameterization of the statistical contact potentials, we have applied both distance criteria to enable a direct comparison of the results. Here, we found that the nearest atom ≫ 4.0 Å distances criterion to determine the contacting residues gives a better prediction compared to C-beta ≫ 7.0 Å distances (Table 3 and 4 in supplementary material). Surprisingly, although it still ranks high, the template structure 's own peptide does not have the highest score, indicating that this force field may not have adequate precision. Overall, there is a tendency that the nonbinding peptides are ranked lower than the binding ones, but it is not possible to differentiate the binder and non binder using these rankings. The pair wise potential is used to estimate the binding energies of peptide sequences threaded upon the different structural template. MJ pair-wise contact potential table puts much emphasis on hydrophobic interaction for the MHC alleles that contain various pockets of hydrophobic characters. Although most peptide are relatively buried within the binding groove of the MHC molecule, one can not assume that hydrophobic interaction are the mainly one that will tell binding from nonbinding peptides apart. So we have used the table of BT that has modified table of MJ by changing the reference state from solvent to a defined single solvent like molecule, the amino acid threonine and improved the ranking of template. However, in some cases (HLA-A*0201- 10/1I4F), the template structure's own peptide has a very bad score, and is predicted to have a binding affinity even lower than nonbinding peptides. The results of threading are very much dependent on the template structure used, as a peptide ranks high if its binding scheme is similar to the template peptide. Hence, using multiple templates potentially should provide a better fit for the binding peptides. Therefore, this crude force field is not accurate enough to distinguish the subtle differences between the various peptide sequences. For the other sequences, it is not possible to differentiate binding and nonbinding peptides based on energy using a single template; however, some binders have lower scores using one template and have high scores in the other. Using multiple templates provides more possible conformations accessible in the binding groove than the binding sequences can possibly assume. Therefore, taking the average of results from the two templates improves the results as seen in Table 5(supplementary material). The non binders are ranked lower than the binder, but once again, the binding and nonbinding peptides are not separated significant. In another test to justify the use of the threading method, we evaluated their performances using the rank analysis of binding peptide in the source protein sequences derived from the overlapping peptides. The BT potentials generally rank the template structure's own peptide high among all possible 8, 9 and 10-mers in the source protein (Table 6 in supplementary material).

Conclusions

Threading methodology employing two different statistical contact potentials (MJ and BT) and distance criteria (Nearest atom ≫ 4.0 Å and C-beta ≫ 7.0 Å) were applied to MHC class I molecules with a test set consisting of both its natural binding peptide and nonbinding peptide sequences The aim was to find which force field gives better predictions to rank and differentiate between the two groups in the test dataset, and hence determine which factors are important in the peptide recognition in MHC class I molecules. We found that using a BT force field, nearest atom ≫ 4.0 Å distance criteria and the average of results from multiple template structures gives better predictions. Nevertheless, we could not obtain results that could separate the binders from the nonbinders in the test dataset even when we used multiple templates. This leads to the idea that shapes, rather than certain amino acids, are recognized by the MHC. Although the MHC also adapts to bind different sequences, the binding groove restricts the conformations accessible to the bound peptide. The affinity of the peptide is thus affected by how well it can fit into the volume defined by the binding groove. This finding suggests that the “fitness” of a given peptide to the conformations accessible in the bound form is an important determinant of its binding affinity. This also indicates that the force field precisely defines the energy of the peptide when the exact conformation is available. Thus the inverse folding approach is advantageous for MHC alleles that lack binding data but have solved structure in complex with peptide, or alternatively, a structural model of the complex based on known structures. In this postgenomic era, the approach is potentially useful for screening a library of potential binding sequences to the newly discovered proteins to develop epitope based vaccines.

21 in total

1. The Protein Data Bank.

Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

2. Structure-based prediction of binding peptides to MHC class I molecules: application to a broad range of MHC alleles.

Authors: O Schueler-Furman; Y Altuvia; A Sette; H Margalit
Journal: Protein Sci Date: 2000-09 Impact factor: 6.725

3. IMGT/3Dstructure-DB and IMGT/StructuralQuery, a database and a tool for immunoglobulin, T cell receptor and MHC structural data.

Authors: Quentin Kaas; Manuel Ruiz; Marie-Paule Lefranc
Journal: Nucleic Acids Res Date: 2004-01-01 Impact factor: 16.971

4. Structure-based prediction of potential binding and nonbinding peptides to HIV-1 protease.

Authors: Nese Kurt; Turkan Haliloglu; Celia A Schiffer
Journal: Biophys J Date: 2003-08 Impact factor: 4.033

5. A structure-based approach for prediction of MHC-binding peptides.

Authors: Yael Altuvia; Hanah Margalit
Journal: Methods Date: 2004-12 Impact factor: 3.608

6. MPID-T: database for sequence-structure-function information on T-cell receptor/peptide/MHC interactions.

Authors: Joo Chuan Tong; Lesheng Kong; Tin Wee Tan; Shoba Ranganathan
Journal: Appl Bioinformatics Date: 2006

7. Knowledge-based structure prediction of MHC class I bound peptides: a study of 23 complexes.

Authors: O Schueler-Furman; R Elber; H Margalit
Journal: Fold Des Date: 1998

8. Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains.

Authors: K C Parker; M A Bednarek; J E Coligan
Journal: J Immunol Date: 1994-01-01 Impact factor: 5.422

9. Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading.

Authors: S Miyazawa; R L Jernigan
Journal: J Mol Biol Date: 1996-03-01 Impact factor: 5.469

Review 10. Epitope identification and vaccine design for cancer immunotherapy.

Authors: Alessandro Sette; Elissa Keogh; Glenn Ishioka; John Sidney; Shabnam Tangri; Brian Livingston; Denise McKinney; Mark Newman; Robert Chesnut; John Fikes
Journal: Curr Opin Investig Drugs Date: 2002-01

5 in total

Review 1. MHC class II epitope predictive algorithms.

Authors: Morten Nielsen; Ole Lund; Søren Buus; Claus Lundegaard
Journal: Immunology Date: 2010-04-12 Impact factor: 7.397

2. Limitations of Ab initio predictions of peptide binding to MHC class II molecules.

Authors: Hao Zhang; Peng Wang; Nikitas Papangelopoulos; Ying Xu; Alessandro Sette; Philip E Bourne; Ole Lund; Julia Ponomarenko; Morten Nielsen; Bjoern Peters
Journal: PLoS One Date: 2010-02-17 Impact factor: 3.240

3. Prediction of MHC binding peptide using Gibbs motif sampler, weight matrix and artificial neural network.

Authors: Satarudra Prakash Singh; Bhartendu Nath Mishra
Journal: Bioinformation Date: 2008-12-06

4. pDOCK: a new technique for rapid and accurate docking of peptide ligands to Major Histocompatibility Complexes.

Authors: Javed Mohammed Khan; Shoba Ranganathan
Journal: Immunome Res Date: 2010-09-27

5. EpicCapo: epitope prediction using combined information of amino acid pairwise contact potentials and HLA-peptide contact site information.

Authors: Thammakorn Saethang; Osamu Hirose; Ingorn Kimkong; Vu Anh Tran; Xuan Tho Dang; Lan Anh T Nguyen; Tu Kien T Le; Mamoru Kubo; Yoichi Yamada; Kenji Satou
Journal: BMC Bioinformatics Date: 2012-11-24 Impact factor: 3.169

5 in total