Rodrigo Ochoa1,2, Roman A Laskowski2, Janet M Thornton2, Pilar Cossio1,3.
Abstract
The prediction of peptide binders to Major Histocompatibility Complex (MHC) class II receptors is of great interest for studying autoimmune diseases and for vaccine development. Most approaches predict the affinities using sequence-based models trained on experimental data and multiple alignments from known peptide substrates. However, detecting activity differences caused by single-point mutations is a challenging task. In this work, we used interactions calculated from simulations to build scoring matrices for quickly estimating binding differences caused by single-point mutations. We modelled a set of 837 peptides bound to an MHC class II allele, and optimized the sampling of the conformations using the Rosetta backrub method by comparing the results to molecular dynamics simulations. From the dynamic trajectories of each complex, we averaged and compared structural observables for each amino acid at each position of the 9-mer peptide core region. With this information, we generated the scoring-matrices to predict the sign of the binding differences. We then compared the performance of the best scoring-matrix to different computational methodologies that range in computational cost. Overall, the fraction of correctly predicted activity differences caused by single-point mutations was lower than 60% for all the methods. However, combining the developed scoring-matrix with existing methods increased the performance, up to 86% with a scoring method that uses molecular dynamics.
Keywords: MHC class II; binding; simulations; single-point mutation; structural bioinformatics
Year: 2021 PMID: 34222328 PMCID: PMC8253603 DOI: 10.3389/fmolb.2021.636562
Source DB: PubMed Journal: Front Mol Biosci ISSN: 2296-889X
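The scoring-matrix idea summarized in the abstract (average a structural observable per amino acid and per core position, then compare the wild-type and mutant entries to predict the sign of a binding difference) can be illustrated with a minimal sketch. The function names, the toy observations, and the sign convention (a larger averaged observable read as stronger binding) are assumptions made for illustration and are not taken from the article.

```python
from collections import defaultdict

CORE_LENGTH = 9  # length of the peptide core region (9-mer)


def build_scoring_matrix(observations):
    """Average an observable per (core position, amino acid).

    observations: iterable of (core_sequence, per_position_values) pairs,
    e.g. hydrogen-bond counts averaged over the frames of one complex.
    Returns a list of dicts: matrix[position][amino_acid] = mean value.
    """
    sums = [defaultdict(float) for _ in range(CORE_LENGTH)]
    counts = [defaultdict(int) for _ in range(CORE_LENGTH)]
    for core, values in observations:
        for pos, (aa, value) in enumerate(zip(core, values)):
            sums[pos][aa] += value
            counts[pos][aa] += 1
    return [
        {aa: sums[pos][aa] / counts[pos][aa] for aa in sums[pos]}
        for pos in range(CORE_LENGTH)
    ]


def predict_sign(matrix, position, wild_type, mutant):
    """Return +1 if the mutant entry is larger, -1 if smaller, 0 otherwise.

    Assumption: a larger observable is interpreted as stronger binding.
    A residue never observed at that position also yields 0.
    """
    row = matrix[position]
    if wild_type not in row or mutant not in row:
        return 0
    diff = row[mutant] - row[wild_type]
    return (diff > 0) - (diff < 0)


# Hypothetical toy data: three cores with made-up per-position values.
toy_observations = [
    ("YVKQNTLKL", [2, 1, 1, 0, 1, 1, 0, 1, 2]),
    ("FVKQNTLKL", [1, 1, 1, 0, 1, 1, 0, 1, 2]),
    ("YVKQNTLKA", [2, 1, 1, 0, 1, 1, 0, 1, 1]),
]
toy_matrix = build_scoring_matrix(toy_observations)
sign_y_to_f = predict_sign(toy_matrix, 0, "Y", "F")  # first core position
sign_a_to_l = predict_sign(toy_matrix, 8, "A", "L")  # last core position
```

Because each prediction is a lookup of two matrix entries, scoring a mutation is essentially instantaneous once the matrix has been built from the simulations.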
FIGURE 1. (left) Examples of a peptide bound to an MHC class II receptor and conformations from the Backrub Rosetta simulations. (right) Schematic representation of the methodological steps involved in creating the scoring-matrices. First, an MD vs. Backrub comparison was performed to define the best Backrub setup. Then, the modelling and sampling of a set of known peptide binders was performed to obtain the observables for building the scoring-matrices.
FIGURE 2. Comparison of χ1 and χ2 side chain dihedral distributions for amino acid Leu9 from the peptide bound to MHC class II (PKYVKQNTLKLAT, PDB id: 1fyt). (A) Last 10 ns of MD, (B) Backrub using kT = 0.35, and (C) Backrub using kT = 1.2. The same analysis was done for amino acid Arg15 of another peptide (AAYSDQATPLLLSPR, PDB id: 1t5x). (D) Last 10 ns of MD, (E) Backrub using kT = 0.35, and (F) Backrub using kT = 1.2.
Average and fractional error of the number of contacts and hydrogen bonds (HB) made by the main chain atoms of the peptide-core amino acids bound to MHC class II, and sampled with MD or backrub (BR) using . The fractional error was calculated using the standard deviation from the simulations for each peptide core position. The last row shows an average value for all the structures.
| PDB | MD contacts | BR contacts | MD HB | BR HB |
|---|---|---|---|---|
| 1fyt | 147.9 ± 12.5 | 148.1 ± 9.8 | 8.7 ± 1.1 | 7.4 ± 1.1 |
| 1klg | 112.6 ± 9.9 | 104.0 ± 7.8 | 8.5 ± 1.1 | 8.3 ± 1.2 |
| 1sje | 130.3 ± 10.5 | 104.5 ± 7.3 | 10.5 ± 0.9 | 8.6 ± 1.1 |
| 1sjh | 115.1 ± 10.6 | 107.8 ± 9.9 | 8.4 ± 1.1 | 9.9 ± 1.0 |
| 1t5x | 135.4 ± 11.8 | 72.5 ± 10.1 | 7.8 ± 1.0 | 3.7 ± 1.0 |
| 2fse | 126.1 ± 13.0 | 96.8 ± 8.9 | 8.9 ± 1.1 | 6.4 ± 0.8 |
| 3pgd | 129.9 ± 13.5 | 133.3 ± 9.4 | 9.2 ± 1.1 | 8.7 ± 0.7 |
| 4aen | 114.6 ± 11.3 | 99.0 ± 7.6 | 8.2 ± 1.2 | 7.4 ± 1.1 |
| 4i5b | 134.8 ± 10.8 | 126.5 ± 9.6 | 8.9 ± 1.2 | 8.1 ± 1.1 |
| 4ov5 | 161.1 ± 13.8 | 133.9 ± 11.0 | 10.2 ± 1.2 | 8.5 ± 0.7 |
| Average | 130.7 ± 11.8 | 112.6 ± 9.2 | 8.9 ± 1.1 | 7.7 ± 1.0 |
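As a quick arithmetic check on the table above, the sketch below recomputes the column averages from the per-structure means and reports the relative difference between backrub and MD contacts for each structure. The values are copied from the table; the function names are illustrative. Because the printed entries are already rounded, a recomputed average may differ from the printed "Average" row in the last digit.

```python
from statistics import mean

# Per-structure means copied from the table above:
# (MD contacts, BR contacts, MD HB, BR HB)
table = {
    "1fyt": (147.9, 148.1, 8.7, 7.4),
    "1klg": (112.6, 104.0, 8.5, 8.3),
    "1sje": (130.3, 104.5, 10.5, 8.6),
    "1sjh": (115.1, 107.8, 8.4, 9.9),
    "1t5x": (135.4, 72.5, 7.8, 3.7),
    "2fse": (126.1, 96.8, 8.9, 6.4),
    "3pgd": (129.9, 133.3, 9.2, 8.7),
    "4aen": (114.6, 99.0, 8.2, 7.4),
    "4i5b": (134.8, 126.5, 8.9, 8.1),
    "4ov5": (161.1, 133.9, 10.2, 8.5),
}


def column_means(rows):
    """Mean of each column over all structures, rounded to one decimal."""
    return tuple(round(mean(col), 1) for col in zip(*rows.values()))


def relative_contact_difference(rows):
    """(BR - MD) / MD for the contact counts of each structure."""
    return {pdb: (v[1] - v[0]) / v[0] for pdb, v in rows.items()}


averages = column_means(table)
differences = relative_contact_difference(table)
# Structure whose backrub contacts deviate most from MD.
largest_deviation = max(differences, key=lambda pdb: abs(differences[pdb]))
```

In this data the largest deviation belongs to 1t5x, where the backrub ensemble makes roughly half as many contacts as MD.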
FIGURE 3. Information used to model peptides bound to the structure of the MHC class II allele DRB1*0101. (A) Logo representing the frequency of the amino acids within the 837 15-mer peptides that were modelled bound to the MHC class II structure. The taller the letter, the more relevant the amino acid is for improving binding. (B) Logo representing the probability of the amino acids at each position of the core region based on the number of hydrogen bonds made by the main chain. The colors represent categories of the amino acids based on physicochemical properties: blue (positively charged), red (negatively charged), green (small), fuchsia (asparagine) and black (aliphatic). (C) Prediction of the sign of the experimental binding differences, for a set of 56 peptides with single substitutions, using the scoring-matrix (SM-HB-BE) in combination with the state-of-the-art methodologies.
Prediction of the sign of the experimental activity differences by single-point mutations of the peptide core amino acids using the scoring-matrix calculated based on the hydrogen bonds made by main chain atoms (SM-HB) and the number of non-bonded contacts (SM-C). The comparisons include data for the four strategies to extract information from the 2,000 backrub frames.
| Strategy | Matches for SM-HB (fraction) | Matches for SM-C (fraction) |
|---|---|---|
| All the frames | 0.553 | 0.501 |
| Last half frames | 0.518 | 0.464 |
| Half frames with best energies | 0.589 | 0.501 |
| Best energy frame | 0.464 | 0.518 |
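The four frame-selection strategies listed in the table can be sketched as index selections over a trajectory. This is a minimal illustration, assuming that each frame has an associated energy and that lower energy means a better frame; the function names and the toy numbers are hypothetical.

```python
def select_frames(energies, strategy):
    """Return the sorted frame indices chosen by one of four strategies.

    energies: per-frame energies in trajectory order.
    Assumption: lower energy means a better frame.
    """
    n = len(energies)
    ranked = sorted(range(n), key=lambda i: energies[i])  # best first
    if strategy == "all":
        return list(range(n))
    if strategy == "last_half":
        return list(range(n // 2, n))
    if strategy == "best_half":
        return sorted(ranked[: n // 2])
    if strategy == "best_frame":
        return [ranked[0]]
    raise ValueError(f"unknown strategy: {strategy}")


def average_observable(values, energies, strategy):
    """Average a per-frame observable over the selected frames."""
    idx = select_frames(energies, strategy)
    return sum(values[i] for i in idx) / len(idx)


# Hypothetical four-frame trajectory (the study used 2,000 frames).
toy_energies = [-10.0, -30.0, -20.0, -40.0]
toy_hbonds = [1, 3, 2, 4]
averages_by_strategy = {
    s: average_observable(toy_hbonds, toy_energies, s)
    for s in ("all", "last_half", "best_half", "best_frame")
}
```

Each strategy feeds a different subset of frames into the per-position averages, which is why the same observable can yield the different match fractions reported above.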
Match values and bootstrapping standard deviations for the prediction of the sign of the experimental activity differences by single-point mutations of the peptide core amino acids for five state-of-the-art methodologies and the SM-HB-BE (i.e., scoring matrix from hydrogen bonds using half of the conformations with best energies). In addition, we include the computational costs, in days, for running the methods with the 56 pairs of mutated peptides. The strategies are the sequence motif matrix, the machine learning tool NetMHCIIpan, the MD/scoring and backrub/scoring approaches, and the MM-PBSA calculations (see Methods).
| Complementary strategy | Matched predictions | Computational cost (days) |
|---|---|---|
| Sequence matrix | 0.393 ± 0.067 | 0.05 |
| NetMHCIIpan | 0.536 ± 0.079 | 0.1 |
| Backrub/scoring | 0.536 ± 0.067 | 2 |
| MD/scoring | 0.571 ± 0.071 | 15 |
| MM-PBSA | 0.518 ± 0.062 | 15 |
| SM-HB-BE scoring matrix | 0.589 ± 0.065 | 0.05 |
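The bootstrapping standard deviations in the table can be approximated with a short resampling sketch. It assumes that a reported fraction of 0.589 over 56 peptide pairs corresponds to 33 correct sign predictions (an inference from the fraction, not a number stated in the source); the resampling count and seed are arbitrary choices.

```python
import random
from statistics import mean, stdev


def bootstrap_match(matches, n_resamples=1000, seed=0):
    """Return (match fraction, bootstrap standard deviation).

    matches: list of 1 (sign predicted correctly) or 0 (incorrectly).
    Each resample draws len(matches) outcomes with replacement.
    """
    rng = random.Random(seed)
    n = len(matches)
    fractions = [
        mean(rng.choices(matches, k=n)) for _ in range(n_resamples)
    ]
    return mean(matches), stdev(fractions)


# Assumed outcome vector: 33 matches out of 56 pairs (33/56 is about 0.589).
outcomes = [1] * 33 + [0] * 23
fraction, spread = bootstrap_match(outcomes)
```

With only 56 pairs, the spread of the resampled fractions is several percentage points, so the differences between the individual methods in the table should be read with that uncertainty in mind.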
Match values and bootstrapping standard deviations for the prediction of the sign of the experimental activity differences by single point mutations of the peptide core amino acids. The results are for the combination of the additional methodologies with the SM-HB-BE matrix. The strategies are the sequence motif matrix, the machine learning tool NetMHCIIpan, the MD/scoring and backrub/scoring approaches, and the MM-PBSA calculations (see Methods).
| Complementary strategy | Matched predictions in combination with the SM-HB-BE matrix |
|---|---|
| Sequence matrix | 0.607 ± 0.062 |
| NetMHCIIpan | 0.714 ± 0.064 |
| Backrub/scoring | 0.786 ± 0.051 |
| MD/scoring | 0.857 ± 0.047 |
| MM-PBSA | 0.786 ± 0.048 |
Percentage of amino acids per backrub configuration (kT = 0.35 and kT = 1.2) for each side chain dihedral that sampled the conformational space similarly to the MD simulations, across all 10 MHC class II crystal structures.
| Side chain dihedrals | kT = 0.35 (%) | kT = 1.2 (%) |
|---|---|---|
| χ1 | 19.2 | 80.8 |
| χ2 | 12.9 | 87.1 |