Martin Closter Jespersen, Swapnil Mahajan, Bjoern Peters, Morten Nielsen, Paolo Marcatili.
Abstract
B-cells can neutralize pathogenic molecules by targeting them with extreme specificity using receptors secreted or expressed on their surface (antibodies). This is achieved via molecular interactions between the paratope (i.e., the antibody residues involved in the binding) and the interacting region (epitope) of its target molecule (antigen). Discerning the rules that define this specificity would have profound implications for our understanding of humoral immunogenicity and its applications. The aim of this work is to produce improved, antibody-specific epitope predictions by exploiting features derived from the structures of antigens and their cognate antibodies, and combining them using statistical and machine learning algorithms. We have identified several geometric and physicochemical features that are correlated in interacting paratopes and epitopes, used them to develop a Monte Carlo algorithm to generate putative epitope-paratope pairs, and trained a machine-learning model to score them. We show that, by including the structural and physicochemical properties of the paratope, we improve the prediction of the target of a given B-cell receptor. Moreover, we demonstrate a gain in predictive power both in terms of identifying the cognate antigen target for a given antibody and the antibody target for a given antigen, exceeding the results of other available tools.
Keywords: B cell epitope; antibody; antibody specific epitope prediction; antigen; paratope; prediction
Year: 2019 PMID: 30863406 PMCID: PMC6399414 DOI: 10.3389/fimmu.2019.00298
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
Description of the features used to describe patches.
| Feature | Number of values | Description | Applied to |
|---|---|---|---|
| Amino acid composition | 20 | Frequency of each amino acid type in the patch | Both |
| Exposed Donors/Acceptors | 2 | Number of exposed donor/acceptor atoms | Both |
| Hydrophobicity score | 1 | Number of exposed carbon atoms at a distance >2.5 Å from an exposed donor/acceptor atom | Both |
| Aromatic/Positive/Negative residues | 3 | Number of aromatic, positively charged, and negatively charged residues | Both |
| Principal Components | 3 | Principal components calculated on x, y, z coordinates. | Both |
| Size | 1 | Number of residues within the patch | Both |
| Patch density | 1 | Average number of neighbors in patch | Both |
| RSA max, min & mean | 3 | Maximum, minimum and average RSA of patch residues. | Antigen |
| Structural Conjoint Triads | 196 | Structural conjoint triads based on neighboring residues on surface. | Both |
| Zernike Moments | 7 | 4th order Zernike Moments excluding 0th and 1st. | Both |
The first four rows (gray) are physicochemical features, applied in all models. The following three rows (white) are simple structural features, also applied in all models. The last three rows (light gray) are more complex structural features, used only in the Antigen and Full models.
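As a rough illustration of the simpler descriptors in the table, the sketch below computes amino acid composition, size, and patch density for a toy patch. The `Residue` record, the `patch_features` helper, and the 10 Å neighbor cutoff are assumptions made for this example; the text here does not state how neighbors are defined.

```python
from collections import Counter
from dataclasses import dataclass
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


@dataclass(frozen=True)
class Residue:
    """Hypothetical minimal residue record: one-letter type plus a 3D coordinate."""
    aa: str
    xyz: tuple


def patch_features(patch, neighbor_cutoff=10.0):
    """Return (composition, size, density) for a list of Residue objects.

    neighbor_cutoff (in angstroms) is an assumed value, not taken from the source.
    """
    if not patch:
        raise ValueError("patch must contain at least one residue")
    size = len(patch)
    counts = Counter(r.aa for r in patch)
    # Frequency of each amino acid type in the patch (20 values).
    composition = [counts[a] / size for a in AMINO_ACIDS]
    # Number of other patch residues within the cutoff, per residue.
    neighbors = [
        sum(1 for other in patch
            if other is not r and dist(r.xyz, other.xyz) <= neighbor_cutoff)
        for r in patch
    ]
    # Patch density: average number of neighbors in the patch.
    density = sum(neighbors) / size
    return composition, size, density


# Toy coordinates, made up for illustration.
toy_patch = [
    Residue("Y", (0.0, 0.0, 0.0)),
    Residue("S", (4.0, 0.0, 0.0)),
    Residue("Y", (0.0, 5.0, 0.0)),
    Residue("K", (30.0, 0.0, 0.0)),  # far from the other three
]
composition, size, density = patch_features(toy_patch)
```

In this toy patch the three clustered residues each have two neighbors and the distant lysine has none, so the density is 1.5. The remaining descriptors (principal components, conjoint triads, Zernike moments) need more structural context and are not sketched here.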
Figure 1. (A) Conjoint Triads amino acid classes and representation of the method at the sequence level. (B) Structural representation of Conjoint Triads classes mapped to an epitope patch. (C) The three principal components illustrated on an epitope patch. (D) Illustration of the shapes described by 4th order Zernike Moments, excluding orders 0 and 1.
Figure 2. Correlation matrix of structural and physicochemical features of the true paired paratope and epitope patches.
Figure 3. Box plot showing the distribution of the real epitope ranks within each antibody-antigen structure for the three prediction models: Antigen, Minimal, and Full.
Figure 4. The ability of the four models (Antigen: green, Minimal: pink, Full: purple, and DiscoTope-2.0: orange) to identify highly overlapping patches. The X-axis indicates the number of top predicted patches included, and the Y-axis shows the percentage of structures having at least one highly overlapping patch within the selected pool.
Description of the ranking measurements used to describe performance of the models.
| Measure | Description |
|---|---|
| Epitope Rank | The Frank of the real epitope patch against the ~300 Monte Carlo patches. |
| Antigen Rank | The Frank of the cognate epitope patch toward a given paratope against a pool of epitope patches from other antigens. |
| Antibody Rank | The Frank of the cognate paratope patch toward a given epitope against a pool of paratopes from other antibodies. |
| Structurally Similar Antibody Rank | The Frank of the cognate paratope patch toward a given epitope against a pool of paratopes from other antibodies with structurally similar paratopes. |
| Monte Carlo Antibody Rank | The Frank of the cognate paratope patch toward a given antigen against a pool of paratopes from antibodies with structurally similar paratopes. The paratope score is defined as the average of the top 5 scoring Monte Carlo patches. |
| Monte Carlo Antigen Rank | The Frank of the cognate epitope of an antigen toward a given paratope against a pool of epitopes from other antigens with structurally similar paratopes. The antigen's score is defined as the average of the top 5 scoring Monte Carlo patches. |
| First HO Patch | The Frank of the highest predicted patch highly overlapping the real epitope (target value above 0.25) in the list of predicted antigen patches sorted by prediction value. |
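The rank measures above all place one cognate score within a pool of competing scores. A minimal sketch follows, assuming the Frank is the percentage of competitors that score strictly higher than the cognate patch (0 = best). The names `frank` and `top_k_mean` are illustrative, and the tie handling and exact denominator used by the authors are not given here.

```python
def frank(cognate_score, competitor_scores):
    """Percentage of competitors scoring strictly higher than the cognate (0 = best)."""
    if not competitor_scores:
        raise ValueError("need at least one competitor score")
    higher = sum(1 for s in competitor_scores if s > cognate_score)
    return 100.0 * higher / len(competitor_scores)


def top_k_mean(scores, k=5):
    """Average of the k highest scores, as in the Monte Carlo rank variants."""
    if not scores:
        raise ValueError("need at least one score")
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / len(top)


# Toy scores, made up for illustration (not results from the paper).
decoys = [0.91, 0.40, 0.35, 0.20]
epitope_rank = frank(0.80, decoys)  # one of four decoys scores higher

mc_scores = [0.9, 0.7, 0.5, 0.3, 0.1, 0.0]
antigen_score = top_k_mean(mc_scores)  # mean of the five highest values
```

For the Monte Carlo variants, the value passed to `frank` would be the `top_k_mean` of each candidate's patch scores rather than a single patch score.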
Benchmark of 8 antibody-antigen PDB structures non-redundant to training, comparing DiscoTope-2.0, the Antigen Model and the Full Model developed here.
| 3RKD | 95.3 | 44.7 | 60.1 | 0 | 72.7 | 0.3 |
| 4EDW | 52.2 | 53 | 39.2 | 19 | 10.6 | 25.3 |
| 5B3J | 3.3 | 22 | 31.8 | 56.3 | 0.3 | 17.3 |
| 5DHV | 92 | 51 | 75 | 12.6 | 7.6 | 8.3 |
| 5SY8 | 5.3 | 0.3 | 53.4 | 0.3 | 22.2 | 0 |
| 5TZ2 | 72.8 | 50.3 | 11.6 | 10.6 | 2.6 | 9.6 |
| 5TZT | 0 | 39.3 | 19.9 | 5.0 | 87.3 | 20.0 |
| 5TZU | 54.2 | 43.7 | 20.2 | 0.6 | 2.6 | 0.3 |
| AVERAGE | 38 | 46.9 | 39 | 13.1 | 25.8 | 10.3 |
| MEDIAN | 44.2 | 53.2 | 35.5 | 7.8 | 9.1 | 9.1 |
Epitope Patch Rank indicates how well each model ranks the real epitope patch within the 300 Monte Carlo (MC) patches. First HO patch shows how high the first highly overlapping (HO) patch (target value > 0.25) ranks within the set of 300 MC patches. A rank of 0% means the patch was ranked highest; a rank of 100% means it was ranked lowest or that no HO patch exists for the structure.
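The First HO patch measure described above can be sketched as follows, assuming each patch is a `(prediction, target)` pair in which `target` is its overlap value with the real epitope. The function name and the choice to divide a zero-based position by the number of patches are assumptions for illustration.

```python
def first_ho_rank(patches, ho_threshold=0.25):
    """Rank (in %) of the best-scored patch whose target exceeds the threshold.

    patches: list of (prediction, target) tuples.
    Returns 100.0 when no patch is highly overlapping.
    """
    # Sort by prediction value, best first.
    ranked = sorted(patches, key=lambda p: p[0], reverse=True)
    for position, (_, target) in enumerate(ranked):
        if target > ho_threshold:
            return 100.0 * position / len(ranked)
    return 100.0


# Toy patches, made up for illustration.
toy = [(0.9, 0.10), (0.8, 0.30), (0.5, 0.60), (0.2, 0.00)]
rank_with_ho = first_ho_rank(toy)                           # second-best patch is HO
rank_without_ho = first_ho_rank([(0.9, 0.10), (0.2, 0.0)])  # no HO patch
```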
Benchmark of 8 antibody-antigen PDB structures non-redundant to training, comparing DiscoTope-2.0, ClusPro and the Full Model developed here.
| 3RKD | 86.6 | 4.5 | 87.5 | 25 | 0 | 75 | 75 |
| 4EDW | 79.1 | 100 | 37.5 | 37.5 | 19 | 25 | 12.5 |
| 5B3J | 59.8 | 22.3 | 0 | 0 | 43.3 | 87.5 | 50 |
| 5DHV | 33.3 | 37.5 | 12.5 | 0 | 100 | 50 | 62.5 |
| 5SY8 | 40 | 0 | 12.5 | 12.5 | 23.3 | 37.5 | 87.5 |
| 5TZ2 | 92.3 | 21.4 | 0 | 0 | 0 | 50 | 75 |
| 5TZT | 32.4 | 30 | 62.5 | 37.5 | 100 | 37.5 | 62.5 |
| 5TZU | 86.3 | 4.3 | 12.5 | 0 | 0 | 0 | 0 |
| AVERAGE | 63.8 | 27.5 | 28.1 | 14.1 | 35.7 | 45.3 | 53.1 |
| MEDIAN | 69.5 | 21.9 | 12.5 | 6.2 | 21.2 | 43.8 | 62.5 |
First HO patch shows how high the first highly overlapping (HO) patch (target value > 0.25) ranks within the set of non-redundant Monte Carlo (MC) patches. Antigen and Antibody ranking show how well the model can select the correct antigen given a paratope and the correct antibody given an epitope, respectively. Patch scores for DiscoTope-2.0 were calculated as described in the text. A rank of 0% means the patch was ranked highest; a rank of 100% means it was ranked lowest or that no HO patch exists for the structure.
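The AVERAGE and MEDIAN rows summarize the eight structures column by column. As a quick check using only the standard library, the snippet below recomputes both for the third numeric column of the table above; the variable names are illustrative.

```python
from statistics import mean, median

# Third numeric column of the table above, rows 3RKD through 5TZU.
third_column = [87.5, 37.5, 0, 12.5, 12.5, 0, 62.5, 12.5]

column_average = round(mean(third_column), 1)  # matches the AVERAGE row (28.1)
column_median = median(third_column)           # matches the MEDIAN row (12.5)
```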